TV Channel

The animation for the channel number on our TV (you know how, when you change the channel, it tells you what channel you are now on) changed today.  It made me wonder how that happens.  I thought that was defined by the TV… Strange.

Third Year in a PhD

Advice for your third year in a PhD: http://chronicle.com/article/Your-Third-Year-in-a-PhD/143853/.  GMU emailed it to all CS PhD students.

This semester I’m taking stochastic processes, pattern recognition and CS colloquium.

Bounty Brokers

Bounty brokers exist in real life :).  They are the websites that profit from connecting the people who post bounties with the bounty hunters who complete them.

Take Fiverr: the bounty is the service that the users on the site provide.  Of course, the bounty hunter is looking to get the most out of the expense of doing the bounty.  (The analogy to robots: robots do tasks to collect bounties, and a task is only worth taking when the cost of completing it is less than the bounty pays.  In the case of Fiverr, the cost to me of drawing a good illustration is more than paying someone five dollars to do it for me.)  So the broker, in the case of Fiverr, is the website: it takes a cut of the worker's bounty.

Everyone gets paid.  So, bounty brokers exist solely to facilitate the interactions between bondsmen and bounty hunters.
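Here is a toy sketch of that payoff split; all the numbers, including the broker's cut, are made up for illustration.

```python
# A toy sketch of the Fiverr-style payoff split; every number here,
# including the broker's 20% cut, is a made-up illustration.
bounty = 5.00           # price the requester (bondsman) pays for the gig
hunter_cost = 2.00      # the hunter's cost (time, effort) to complete it
requester_cost = 20.00  # what doing the task alone would cost the requester
broker_cut = 0.20       # hypothetical fraction the site keeps

broker_profit = broker_cut * bounty                      # 1.00
hunter_profit = (1 - broker_cut) * bounty - hunter_cost  # 2.00
requester_savings = requester_cost - bounty              # 15.00

# The trade only happens when everyone comes out ahead.
assert broker_profit > 0 and hunter_profit > 0 and requester_savings > 0
```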

I’m pretty sure in economics they call these systems producers and consumers. 🙂


Bounties for Autonomous Traffic and Smart Grid

So, David reminded me of the problem of coopetition and how it would be an issue with using bounties with the cars.  I love this problem (I've looked at it off and on since undergrad).  Coopetition is basically the problem of how competing teams can learn to cooperate.  So, here are my thoughts on how a solution may emerge from the way bounties work and who the bondsmen are.

So, on to bounties and the idea of conflicting goals. This is why people shouldn't be driving :). People don't compromise because we have no way to really communicate our goals to other drivers other than by "signaling", essentially going slower or faster and annoying some people. It would be interesting to show in a simulator what the throughput through intersections, time to destination, fuel economy, etc. look like when the agents can't communicate, are greedy, and have different objectives. I think bad things happen, but it would be good to get a baseline of when things start going bad for the scenario.
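Here is a minimal sketch of what that baseline might look like: one intersection, two greedy streams of cars that can't communicate, and random "go" timing standing in for signaling. The arrival rate, the conflict rule, and all the constants are assumptions, not a real traffic model.

```python
import random

# A minimal baseline, not a real traffic model: two greedy streams cross a
# single intersection without communicating. Random "go" timing stands in
# for the signaling drivers do by speeding up or slowing down.
random.seed(0)
TICKS, ARRIVAL_P = 10_000, 0.4
queues = {"NS": 0, "EW": 0}
passed = queue_total = 0

for _ in range(TICKS):
    for d in queues:
        queues[d] += int(random.random() < ARRIVAL_P)   # new car arrives?
    # Each waiting stream independently decides to enter with probability 0.5.
    go = [d for d in queues if queues[d] > 0 and random.random() < 0.5]
    if len(go) == 1:              # exactly one stream entered: it gets through
        queues[go[0]] -= 1
        passed += 1
    # if both entered, they conflict and nobody passes this tick
    queue_total += sum(queues.values())

print(f"throughput: {passed / TICKS:.2f} cars/tick vs {2 * ARRIVAL_P:.1f} arriving")
print(f"avg cars stuck waiting: {queue_total / TICKS:.1f}")
```

Even in this toy, arrivals outpace what greedy, non-communicating agents can push through the intersection, so the queues grow without bound.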

The main problem, I think, is that bounties are locally greedy, while the "world" might want to put out a bounty that helps cause cooperation among conflicting goals. These bounties could adapt as the "world" observes the agents. The world could be an intersection or a road or something, and the road puts a bounty out for cars to get into some lane, turn left, etc.

This could be an entirely different way to work the problem. The cars declare their constraints, like fuel efficiency or speed, along with their destination, and the world directs them by alerting them to bounties they may be interested in doing in order to satisfy their constraints in a more cooperative manner. Then this becomes a multiagent planning problem. So, one of the roads could put out a bounty for x number of cars with certain properties, and then some intersections might put out bounties to those cars (the ones with the properties the road desired) to change lanes and turn onto that road. The car would then have to learn whether to take one of those bounties or continue on its original route.
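Here is a rough sketch of that interaction. The class names, the reward numbers, and the rule that a car accepts a bounty only when it matches the car's declared constraint and pays more than the detour costs are all my assumptions.

```python
from dataclasses import dataclass

# The names, numbers, and acceptance rule here are hypothetical.

@dataclass
class Bounty:
    poster: str    # e.g. an intersection or road segment
    task: str      # e.g. "change to the left lane and turn"
    reward: float  # paid in whatever currency the system uses
    wants: str     # the property the poster asked for

@dataclass
class Car:
    name: str
    profile: str        # the constraint the car declared: "fuel_efficient" or "fast"
    detour_cost: float  # estimated cost of leaving the original route

    def accept(self, b: Bounty) -> bool:
        # Greedy rule: take the bounty only if it targets this car's
        # declared constraint and pays more than the detour costs.
        return b.wants == self.profile and b.reward > self.detour_cost

bounty = Bounty("intersection_12", "turn left onto the low-traffic road",
                reward=4.0, wants="fuel_efficient")
for car in [Car("A", "fuel_efficient", detour_cost=2.5),
            Car("B", "fast", detour_cost=1.0)]:
    print(car.name, "accepts" if car.accept(bounty) else "declines")
# A accepts (right profile, reward beats detour); B declines (wrong profile)
```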

I think this sort of framework would be applicable to smart grid charging and discharging of batteries and consumer devices (like electric cars, air conditioners, etc.) in order to keep the grid optimal. The cars are the electrons 🙂, wires are the roads, batteries are intersections, and houses/appliances are destinations.  Or some variation…
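Reusing the classes from the sketch above, the grid version looks almost identical (the device names and numbers are made up): a feeder posts a bounty for storage to discharge during a peak, and a battery weighs the reward against its own cost of cycling.

```python
# Same Bounty/Car interface as above, read through the analogy: the
# "road" is a feeder, the "car" is a home battery, and the "detour cost"
# is the wear and lost charge from cycling. All values are hypothetical.
peak_bounty = Bounty("feeder_3", "discharge 2 kWh between 5 and 7 pm",
                     reward=0.60, wants="storage")
battery = Car("home_battery", profile="storage", detour_cost=0.35)
print("discharges" if battery.accept(peak_bounty) else "stays idle")  # discharges
```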

So, essentially I am separating those that are competing (the humans/cars) from those that are cooperating (the road system).  By doing so, the components of the road system are able to use the competitiveness of the agents to steer the system.


Another idea is that in large networks we may need a broker of bounties.  This would incorporate the work I did with Nil.  The packages in this case would be bounties…

The Flight to Joao Pessoa

It was a very, very long journey, about 27 hours, from Dulles to Joao Pessoa.  I made the trip with David Freelan and Stephen Arnold.  We flew from Dulles to Miami in 3 hours and then had a 5-hour layover there.  Then we had about an 8-hour red-eye flight to Rio de Janeiro, followed by an 8-hour layover there!  Finally, we had a short 3-hour flight to Joao Pessoa.

I carried the majority of the batteries for the robots in one of my carry-ons, and every time we went through security the guard had to take my bag to the side and check it out.  So, I got to tell a lot of people about how we were going to play soccer with robots and that those were the batteries for them.

We had some fun at all of the layovers.  During the layover in Miami, a friend of Stephen's from high school stopped by the airport and we ate pizza with her.  We found that the Miami airport is strange: you have to exit security to go from domestic to international flights.  We didn't go outside because it was rainy.  While at the airport we waited in an empty area around the corner from our gate.  An hour before our flight we decided to walk around and check out our gate.  We got there and there was no one there!  We walked over to the counter and found out that international flights board an hour early and that we were just on time.  We were glad we went to check it out.

The flight was reallllly long.  I had an aisle seat next to David, who had the window.  We each got a mini pillow and a fleece blanket, and each seat had its own screen with a pretty big selection of free movies, games, and TV shows.  This was cool.  For the first two hours I watched "The Book Thief".  It was a book I had always wanted to read but never got to, so it was fun to watch the movie.  The worst part of this leg was when the person in front of me reclined their seat; I had practically no room for my knees.  Next time I fly for this long I have to get a seat with nothing in front of me.  We were served food twice.  It was my first time eating plane food: not the best, but nice to eat something.

In Rio we had a super long layover, but first we had to go through customs.  We didn't run into any trouble.  They asked David to open his one Pelican case with the tools for the robots, but they only looked at the pictures Stephen had printed out of what was inside and didn't make him disassemble the foam packing.  They had stopped him because it was curious that he was bringing in so many tools, but he explained they were for the robots we had just brought in, so all went well.  After customs and re-checking our baggage it was a lot of waiting.  The airport was small and security was nice; we didn't have to take off our shoes like we do in the US.

The top three experiences at the Rio airport, I think, were the view of the mountains, trying an espresso (awful), and our first attempt at ordering food when the people behind the counter only understood Portuguese.  This was the start of our trip-long use of obrigado ("thank you" in Portuguese).  Also of note: toilet paper was disposed of in cans beside the toilet, and there was a person sitting in the elevator who pressed the floor button for you.  We also may or may not have been able to see the Jesus statue from the airport; we couldn't tell for sure what the big white statue-looking thing in the distance was.

Since security was simple and we were there for so long, I went outside and walked around the front of the airport.  I saw some construction, palm trees, and jagged mountains.  I also saw a car called LOGAN.

So, that pretty much summarizes my first international flight.  Overall I enjoyed it.

GMU’s Multicampus Smartgrid

This would be awesome!  GMU has multiple campuses and it is a state school, so it would be great if GMU could be funded to create a smart grid for its campuses.  Then I would be able to do live experiments rather than just simulating.  Well, I would probably still have to do a lot of simulation: simulate the various campuses, get the electricity data and weather patterns, and do assessments on the viability of various alternative energy sources, like solar, for installation at different places on the campuses.

This would be an awesome thesis.  I would be able to apply MAL and MAS to a live system after first simulating it.  It would cost a lot of money, but I believe the dividends for the school and state would make it worth it.

more ideas

So the idea is that we have these beacons.  Each has a broadcast radius, and within that radius it states what it wants to have done to it, i.e., moved to location (x, y).  As robots explore the environment they discover tasks.  They can decide to take on a task, or remember it and continue doing what they were doing.  As robots encounter other robots, they exchange information about the tasks they have encountered and when, thereby updating their beliefs about the state of the tasks.  Maps are not shared between the robots; only the coordinates of the task relative to the current location, and possibly any waypoints that may help the robot find it (like if you have to go in the opposite direction to get to the task).  This way the robots assume intelligence in one another and act more like humans.  This is how humans give directions, rather than sharing maps.
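Here is a sketch of that exchange, assuming a simple logical clock and a "keep the fresher report" merge rule; the field names and the merge rule are my assumptions.

```python
from itertools import count

_clock = count()  # logical timestamp standing in for real time

class Robot:
    def __init__(self, name):
        self.name = name
        # task_id -> {"seen_at": t, "waypoints": [...], "open": bool}
        self.beliefs = {}

    def observe(self, task_id, waypoints, is_open):
        self.beliefs[task_id] = {"seen_at": next(_clock),
                                 "waypoints": waypoints, "open": is_open}

    def encounter(self, other):
        # Merge beliefs: for each task either robot knows, keep the fresher report.
        for tid in set(self.beliefs) | set(other.beliefs):
            reports = [b for b in (self.beliefs.get(tid), other.beliefs.get(tid)) if b]
            best = max(reports, key=lambda b: b["seen_at"])
            self.beliefs[tid] = other.beliefs[tid] = best

r1, r2 = Robot("r1"), Robot("r2")
r1.observe("beacon_7", ["back out the door", "left at the hallway"], is_open=True)
r2.observe("beacon_7", [], is_open=False)   # r2 saw it completed, more recently
r1.encounter(r2)
print(r1.beliefs["beacon_7"]["open"])       # False: r1 learned the task is done
```

Note that the waypoints are relative directions, not map fragments, which is the "give directions like a human" part of the idea.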

The interesting part is when we have more tasks than robots.  The robots must cooperatively decide which tasks to take.  I think that when the robots encounter each other is when they have to decide how they will coordinate their actions and when to do cooperative tasks.
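One simple local rule the two robots could agree on when they meet (just one of many possibilities, assumed here for illustration): each known task goes to whichever robot is closer to it.

```python
# An assumed local agreement: when two robots meet, each known task goes
# to whichever robot is closer, so neither chases a task the other can
# reach more cheaply.

def split_tasks(tasks, pos_a, pos_b):
    """tasks: {task_id: (x, y)}; returns (task ids for A, task ids for B)."""
    dist = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    a, b = [], []
    for tid, loc in sorted(tasks.items()):
        (a if dist(pos_a, loc) <= dist(pos_b, loc) else b).append(tid)
    return a, b

print(split_tasks({"t1": (0, 1), "t2": (5, 5), "t3": (9, 0)},
                  pos_a=(0, 0), pos_b=(10, 0)))
# (['t1', 't2'], ['t3'])  -- each robot takes the tasks nearer to it
```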

The other idea I had was about essentially behavioral bootstrapping, only in the reinforcement learning area.  I know there was a paper on different levels of agents, and I know that at GMU they have looked at behavioral bootstrapping in the case of multi-robot learning from demonstration.  The main thing it is similar to would be cooperative coevolution.

Another idea is that we really don't want to have to continually publish the current bounty price.  We initially broadcast the starting price and then stop.  Any new agents on the field must ask other robots.  Whenever a new task becomes available, the bondsman decides when to announce it.
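A sketch of that announce-once protocol, with the mechanics assumed for illustration: the bondsman broadcasts a price one time to whoever is currently on the field, and latecomers pull prices from peers they encounter.

```python
# The mechanics here (on-field list, one-shot broadcast, peer query) are
# my assumptions for illustration.

class Bondsman:
    def __init__(self):
        self.on_field = []

    def announce(self, task_id, price):
        # One-shot broadcast: only agents currently on the field hear it.
        for agent in self.on_field:
            agent.prices[task_id] = price

class Agent:
    def __init__(self):
        self.prices = {}   # task_id -> last price this agent heard

    def ask(self, peer, task_id):
        # Late arrivals pull prices from peers instead of the bondsman.
        if task_id in peer.prices:
            self.prices[task_id] = peer.prices[task_id]

bondsman, veteran = Bondsman(), Agent()
bondsman.on_field.append(veteran)
bondsman.announce("move_beacon_3", price=10.0)

newcomer = Agent()                  # joins after the broadcast, heard nothing
bondsman.on_field.append(newcomer)
newcomer.ask(veteran, "move_beacon_3")
print(newcomer.prices)              # {'move_beacon_3': 10.0}
```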

The other idea is inspired by this book I am reading.  It makes a good point that we as humans answer simpler questions when solving a problem we don't know how to solve.  For example, when picking stocks, if you don't know how, you might think "I like Porsches," so you buy stock in Porsche.  Deciding which stock to buy based on what things you like is answering a simpler question: you know you want stock, and you know you like Porsche, but you don't know anything about picking stocks.  The nice thing with humans is that we can learn to adjust what questions we ask ourselves as we learn new things.  We not only learn the rewards for states and actions; we also learn better questions, how to ask better questions, and how to answer them.


Adaptive Mechanism Design

I found a Peter Stone paper that addresses part of my idea on adaptive auctions and mechanism design.  However, they only took it part of the way: they only looked at adapting the parameters of a single type of auction.  My idea was that not only would that happen, but the type of auction would also change to adapt to the bidders.  The nice thing is that they have discussed the application areas.

So, my original ideas are basically what would happen if we combined the following two papers.

http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/INFORMS10-pardoe.pdf

http://edoc.sub.uni-hamburg.de/hsu/volltexte/2014/3041/pdf/2014_02_Lang.pdf

I would want to apply it not only to auctions but to negotiation as well.
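Here is a sketch of what that combination might look like.  Everything here is hypothetical: the bidders are not strategic, the revenue model is a toy, and an epsilon-greedy bandit stands in for the learning.  The point is just that both the auction type and a parameter (the reserve price) adapt to the observed bidders, rather than the parameters of a single fixed format.

```python
import random

random.seed(1)
# Arms: (auction format, reserve price). Both the format and the
# parameter are adapted, not just the parameter.
ARMS = [(fmt, reserve) for fmt in ("first_price", "second_price")
        for reserve in (0.0, 0.3, 0.6)]
value = {arm: 0.0 for arm in ARMS}   # running mean revenue per arm
pulls = {arm: 0 for arm in ARMS}

def run_auction(fmt, reserve, bids):
    """Toy revenue model for two sealed-bid formats (bids are not strategic)."""
    live = sorted(b for b in bids if b >= reserve)
    if not live:
        return 0.0
    if fmt == "first_price":
        return live[-1]                               # highest live bid
    return live[-2] if len(live) > 1 else reserve     # second-price rule

for _ in range(5000):
    arm = (random.choice(ARMS) if random.random() < 0.1     # explore
           else max(ARMS, key=lambda a: value[a]))          # exploit
    bids = [random.random() for _ in range(3)]              # stand-in bidder population
    revenue = run_auction(*arm, bids)
    pulls[arm] += 1
    value[arm] += (revenue - value[arm]) / pulls[arm]       # incremental mean

print(max(ARMS, key=lambda a: value[a]))  # the (format, reserve) the mechanism settled on
```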