Natural Language LfD & RL

So, I’m working with Ermo on applying reinforcement learning to text-based games. I was wondering whether, if our method works, we could eventually do text-based learning from demonstration with reinforcement learning. Basically, instead of pressing buttons, the user would describe what they wanted the system to do using English sentences. The user could then say yes or no to what the system is doing. Using natural language to train a multiagent system seems like it would be better. Especially since, once it works for text, it could naturally be extended to speech! Telling the robots what to do and what to pay attention to would be even better.

Bounty Hunting on a forum

Interesting, I found someone talking about bounty hunting as a non-exclusive task allocation mechanism.

http://forums.ltheory.com/viewtopic.php?t=2477&p=38248

Coke Robot

So, many of the people at my robotics lab buy soda from the vending machine. They just increased the price for a drink from 1.50 to 1.75! That is paying 10.50 for a six-pack! So, we were saying we should just buy a bunch of soda when it goes on sale. Then, of course, we could make our own dispensary.

It would be terribly fun to implement the bounty hunting task allocation for a Coke delivery robot on GMU’s campus. Instead of going to a vending machine, you would order on your phone and the robot would bring you your Coke product. You pay for the soda through the app and get a QR code, which you then present to the robot, and it dispenses the soda. You could also buy a Coke for someone else, and they would just show the code.
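A minimal sketch of the pay-then-show-the-code idea, assuming the QR code just encodes a random one-time string. The `SodaOrders` class and its method names are made up for illustration, and payment and QR rendering are left out entirely.

```python
import secrets


class SodaOrders:
    """Hypothetical order book kept by the app/robot (illustration only)."""

    def __init__(self):
        self._pending = {}  # one-time code -> drink that was paid for

    def purchase(self, drink):
        # Assumption: payment already succeeded. The returned string is what
        # the QR code would encode; it can be forwarded to a friend.
        code = secrets.token_urlsafe(16)
        self._pending[code] = drink
        return code

    def redeem(self, code):
        # The robot scans the code; each code dispenses at most once.
        # Returns the drink to dispense, or None for unknown/used codes.
        return self._pending.pop(code, None)
```

Since the code is the only thing the robot checks, buying a drink for someone else is just a matter of sending them the code.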

Of course this could be done with a ton of things. But since my campus only has Coke products, it would be really cool if we had this. It would be a great research problem too.

Inhaler

Wow, I took my inhaler a bit ago and I feel amazing now! I think for the past few weeks I must not have been getting enough oxygen. I’ve been tired and a bit slow. For the past few weeks I have been going to the gym regularly and running/ellipticalling/lifting. So, I’ve not been kind to my lungs. I should probably start taking my inhaler before going to the gym… That will probably help. What do you know, doctors are right haha. 🙂

Bounty Hunting and Cloud Robotics

Cloud robotics needs very stringent QoS guarantees and in certain cases is highly reliant on location to satisfy some of the requirements.

So, I was thinking a while back that maybe a bounty hunting based cloud robotics system could work like:

The robot registers with the bounty hunting service, the bondsman (a highly distributed system might have multiple bondsmen, or the robot itself could be the bondsman; this could be explored). The service then posts bounties describing the tasks requested by the robot, its location/IP, and its QoS requirements. Then, as the bounty rises, the different cloud services tell the bounty hunting service that they will go after a particular bounty. The cloud service then contacts the robot for the information required to complete the task (there could be a few bounty hunters, and the bondsman could limit them, etc.). If the robot replies with the needed info, the bounty hunter proceeds to complete the task. If it completes the task before the other bounty hunters, it gets the reward. If it does not, it learns not to go after the task (exactly how the current bounty hunters learn). These tasks are repeated, and they fall into particular task classes based on the attributes of the types of tasks the robot needs processed (from control to high-level planning).
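The flow above can be sketched as a tiny simulation. This is only a toy under stated assumptions: the bounty rises linearly per tick, a cloud service commits once the expected payout covers its cost, and the learning rule is a simple running average of wins per task class. That rule is a stand-in, not the actual bounty hunter learning method, and all names (`Bounty`, `Hunter`, `run_bounty`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Bounty:
    task_class: str
    base: float   # starting bounty
    rate: float   # assumed linear increase per tick
    age: int = 0

    def value(self):
        return self.base + self.rate * self.age


@dataclass
class Hunter:
    name: str
    cost: float            # expected payout needed to bother
    ticks_to_finish: int   # how long the task takes this service
    success: dict = field(default_factory=dict)  # task class -> win estimate

    def wants(self, bounty):
        p = self.success.get(bounty.task_class, 1.0)  # optimistic at first
        return p * bounty.value() >= self.cost

    def learn(self, task_class, won, alpha=0.5):
        # Stand-in update: running average of wins for this task class.
        old = self.success.get(task_class, 1.0)
        self.success[task_class] = old + alpha * ((1.0 if won else 0.0) - old)


def run_bounty(bounty, hunters, max_ticks=100):
    """Bondsman loop: raise the bounty until some hunter finishes the task."""
    finish_at = {}  # hunter name -> tick at which it would finish
    for tick in range(max_ticks):
        bounty.age = tick
        for h in hunters:
            if h.name not in finish_at and h.wants(bounty):
                finish_at[h.name] = tick + h.ticks_to_finish
        done = [h for h in hunters if finish_at.get(h.name) == tick]
        if done:
            winner = done[0]
            for h in hunters:
                if h.name in finish_at:  # only hunters that committed learn
                    h.learn(bounty.task_class, won=(h is winner))
            return winner.name, bounty.value()
    return None, 0.0
```

Running the same task class repeatedly, a hunter that keeps losing sees its win estimate shrink, so it waits for a higher bounty before committing again, which is roughly the "learns not to go after the task" behavior.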

The other neat thing is that many of the tasks are repeating. So, a task could be to provide a plan to get to a particular location along with a standard performance metric. Quality of the solution should also matter. That is something that bounty hunting did not consider at first. However, it could be integrated. What if the solution that each bounty hunter provides included a metric that is standard across the bounty hunters and quickly verifiable? The winner would be the one that produces a solution within the time requirements and has the highest quality.
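As a rough sketch of that winner rule: drop anything that missed the deadline, then take the highest quality. The `Submission` fields and `pick_winner` are hypothetical names, the quality score is assumed to have been verified already, and breaking ties by speed is my own addition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Submission:
    hunter: str
    elapsed_s: float  # time taken to return the solution
    quality: float    # shared metric, assumed already verified; higher is better


def pick_winner(submissions: List[Submission], deadline_s: float) -> Optional[Submission]:
    """Highest-quality solution among those that met the time requirement."""
    on_time = [s for s in submissions if s.elapsed_s <= deadline_s]
    if not on_time:
        return None  # nobody met the QoS deadline
    # Tie-break on speed (an assumption): among equal quality, faster wins.
    return max(on_time, key=lambda s: (s.quality, -s.elapsed_s))
```

For example, with a 1.0 s deadline, a 0.9-quality answer returned in 0.9 s beats a 0.7-quality answer returned in 0.4 s, while a 0.99-quality answer that took 1.5 s is not considered at all.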

So, the robot sort of acts as an arbiter. If the robot put a bounty out on a control-level task (like give me low-level actuator commands for doing this particular thing for the next 5 seconds), then there are two options:

1. whoever starts giving the commands first is the winner

2. there are multiple winners, as they are each able to produce parts of the task. Basically this is the case where it is good to be getting commands from multiple sources: if the current winner for some reason loses connection, then you have the other bounty hunter who is providing an equivalent solution but is faster or still exists or whatnot.
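Option 2 could look something like the sketch below, where the robot takes, at each control tick, the command from the highest-priority bounty hunter that actually delivered one. The data layout (one list of commands per hunter, `None` for a tick where nothing arrived) and the name `select_commands` are assumptions for illustration.

```python
def select_commands(streams, priority):
    """Pick one command per tick, falling back when a source drops out.

    streams:  dict of hunter name -> list of commands, None = nothing arrived
    priority: hunter names, most preferred first
    Returns a list of (hunter, command) pairs, or None for a tick with no command.
    """
    n_ticks = max((len(s) for s in streams.values()), default=0)
    chosen = []
    for t in range(n_ticks):
        pick = None
        for hunter in priority:
            stream = streams.get(hunter, [])
            if t < len(stream) and stream[t] is not None:
                pick = (hunter, stream[t])
                break
        chosen.append(pick)
    return chosen
```

If the preferred hunter loses its connection partway through the 5 seconds, the robot keeps moving on the second hunter's commands, and both would count as winners for the parts they supplied.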

The bounty seems like a good fit due to the variety of price structures and whatnot of different cloud services. The different cloud services can decide if it is worth their time to go after a particular bounty or not. The bondsman would also be able to learn how to adjust the base bounty and the rate of bounty increase based on the type of problem and its interaction with the different cloud providers. Another reason that the bounty model is good is that the different cloud providers will most likely complete the tasks using temporary resources, where the prices are highly elastic. So, having the bounty would work due to the nature of the pricing structures on the bounty hunters’ end.

I don’t know who to compare against. Maybe just show that the QoS guarantees were met/exceeded on the tasks even in a dynamic environment, that the cost was kept within an acceptable range, that the cloud providers and bondsmen could adapt to scale to large numbers of robots, etc.

This could be used for autonomous vehicles (millions of cars), for example by putting out a bounty for the fastest/scenic/etc. route, a parking spot, a charging location (for EVs), down to the most low-level control of the car itself. And of course for other robots. Would be interesting if co-located robots... This seems very interesting and exciting.

Some other ideas related to cloud robotics. One is the ability for modular robotics to really stand out. You have the case here where the robot itself could modify its physical structure and abilities and instantly be able to adjust its behavior due to all the modules being in the “cloud”.

AI and Creativity

So I just read an article stating that AI is nowhere near supplanting artists due to computers’ inability to “decide what is relevant”. I think that might be giving us AI researchers too much credit or going too soft on us. We have yet to develop non-noisy inputs in order to simulate the emotional and non-functional aspects of the brain. The closest we could get is to teach a computer based on an fMRI of the brain while experiencing art/music etc. Somewhat simpler is being able to recognize emotions and correlate what is happening with that emotion. That is even more difficult. That is when we are at the point that the machine can put itself in “another’s shoes,” as it were. That is at an entirely different level than where we are now. So, I don’t disagree with the author, I just think that she is only scratching the surface of what AI is unable to do currently, especially in a general, non-lab setting. However, I believe that given better inputs (and of course better algorithms) machines may develop human-like emotions and the ability to simulate others’ situations, and thus develop a connection and be inspired to create art. But, I’m pretty sure that won’t happen in my time :(.

http://www.technologyreview.com/view/542281/artificial-creativity/

Lentils Recipe

Lentils recipe I came up with:

cooked lentils
herb rice
orange chicken
cooked broccoli
seared pear (cooked with oil and honey alongside cashews and pecans)
Seasoned with cinnamon, very little nutmeg, and curry.

Haven’t tried it yet but will next week hopefully. High in fiber, protein and vitamin C.

Politics and MAS

Politics seems like a good real-world example of the multi-agent inverse problem and of trying to get agents to coordinate at a massive (country) scale. Basically, the multi-agent inverse problem is determining rules and behaviors at the low level that achieve a higher-level objective. This problem is made more difficult because the low-level behaviors of agents interact with each other, possibly causing unexpected emergent behavior.

Another thing that politics has is hierarchies…

Mainly this was prompted by the article on laws that pertain to the constitution and how they are interpreted.

Puzzle Space

I wonder what the space of problems that we consider puzzles is like. I mean, how big is it? What characteristics do they have in general? Does computational complexity correspond to how difficult the puzzle is? Puzzles usually require some degree of logic. So, I’d imagine that the most difficult puzzles correspond to those that have you solve NP-hard problems without you realizing it. Can we use a language that describes puzzles in general, like maybe some form of logic, and then maybe automatically generate puzzles? Interesting, especially when you consider multi-agent puzzles.

task resources

I think that I might talk a bit about:

Adaptive task resources allocation in multi-agent systems

Drew’s Borg

Prepare to be assimilated

Author: Drew Wicke