Reply with BCCs

I think you should be able to have BCC’s that when the person the email was TO: replies will also unknowingly be replying to the rest. This could be accomplished if the person replies to the original sender. The original sender will receive the mail and that person’s account knows that they BCC’ed others as well so if the original sender had checked that they wanted this feature the server would also cause the reply email to be sent to the BCCs.

Price of Anarchy

I am using Bounties as a distributed approach to assigning tasks and developing roles for robots in a multi-robot system. I am trying to figure out a good learning algorithm that will get good results for this system that will scale and be general. Therefore, since I’m not that smart I have to learn about some of the theory to do with bounties and how they operate so that I can program the bounty hunters to learn how to do well. One game theory tool I think I should apply is the concept of the price of anarchy.

The price of anarchy is the cost of allowing the agents to act selfishly rather than being controlled by a central authority. It is usually used in traffic routing or networks. One interesting result is Braess’ Paradox which describes the self evident fact that low latency routes when freed for everyone to use will result in a lower social welfare because when everyone uses that route it becomes the Nash equilibrium.

I believe that in the robot case we have that auctions are a centralized approach to robot task allocation (essentially solving the graph coloring problem) whereas bounties are the selfish distributed approach. Therefore, what is the social welfare cost of using this selfish approach. (Social welfare is what me as the designer is interested in since I don’t want my robots congesting the route to performing a task such that the other tasks don’t get done.)

So, can I learn the “congestedness” of particular tasks sort of like the idea that I will longer route if it is faster. Meaning I will take a bounty that will cause me to explore the environment even though there is a bounty that is close by but I know that the last time I went that way the route was congested with other bounty hunters trying to get to it. The bounty is essentially then the motivation for me to try to get the bounty even if I have encountered congestion in the past. The bounty is acting as a flag to be like hey, I’m not congested come get me. However, the bounty hunters have to know that all of the hunters also know that info. So, how do they react? Nash equilibrium probably becomes that they all take that bounty?.. The robots have an estimate of the cost to travel to the task location and deliver that box. They also integrate in the cost of congestion expected to task.

On a side note I can make the system heterogeneous by using the flockbots to grab the cans with the barcodes since the darwin’s can’t pick up cans, but flockbots can. Darwins can pickup blocks and flockbots can’t pick them up.

Some links:

http://theory.stanford.edu/~tim/papers/robust.pdf why is it useful how robust?

http://math.mit.edu/~olver/msc_thesis.pdf good explanation of the concept

https://www.mpi-sb.mpg.de/~gchristo/papers/ck05a.pdf — The price of anarchy in finite congestion games seems like this would be relevant to my problem since we have finite number of robots?

http://www.cs.cmu.edu/~avrim/Papers/ics_bbm.pdf — might be inspiration for my learning algorithm.

Bounties are Prisoners Dillema

For my CS880 project I’m making a bounty system that will hopefully enable robots to learn what roles they should do. I’m currently going to try providing robots a tit-for-tat (or some variation of it) to learn what tasks to do. I think this method might be good since in a real life scenario we don’t have time to do reinforcement learning. Anyway it seems like in the simplest case robots going after bounties are essentially in a “Prisoners Dilemma” scenario.

So, the idea is that the robots if they both cooperate (neither go after the same bounty) they end up both going to jail just not for too long since they can always go after a different bounty (this is modeling the expected loss they take by not going after that bounty). If both defect (they both go for the same bounty) then one of them is going to get it and the other won’t (since they will both be fighting over the bounty we are going to assume that it will cost them more even though one of them is going to get the reward, we are assuming that the reward was not great enough). If one defects and the other cooperates then they don’t compete and have thus decided their roles in the system.

Does the bails bondsman tell the winner about who else was trying to get the bounty? We would have to in order for the the defect-defect case to actually be relevant. Otherwise from the point of view of the winner he was participating in a Defect-Cooperate outcome game and from the losers they were participating in a Defect-Defect game.

I also need to make it a weighted tit-for-tat based off of the changing bounties. I might decided to go for the bounty if the bounty is large enough.

Adaptive Checkout Lights

[written in October 2013 but decided to publish now]

I think it would be awesome if checkout lights could be more useful. For example, if the light would change color depending on how full the line was. So, it could be red if it was really full, yellow if a little full, and green if empty or practically empty. The wouldn’t even need to be any software installed to control the light. The cashier could just flip a switch to change the light color. Also, there could be lines on the floor to let the cashier know when to change it. Of course I would prefer the software solution with learning etc. The software solution would be to put up sensors like those that open the automatic doors at each of the lines on the floor. But this way would be cheap and a good way to signal customers.

Also, (in my world 😉 ) in order to make checking people out fun there would be a friendly competition between the cashiers. The computers could keep track of stats of each of the cashiers, like throughput, number of errors, most saved or most number of items etc. Then you could

EMF in pipes and Mixing in Space

I was thinking that it would be cool experiment to try to see how much current would be produced by flushing magnetic particles through a system of pipes that were wrapped with wire and connected to batteries. Wonder how powerful the magnets would need to be and how the fact that they are traveling in water effects the field. Would be interesting to compare to tap water to see how much an emf field is present already.

Just not sure if it would work because:

the magnets would all clump together maybe not due to attraction and repulsion maybe end of
field not strong enough to induce a noticeable current

[update] Well I I talked with someone that knows a bit more and he reminded me that even though it might be possible the magnet would be slowing down the water and therefore not doing any good. He suggested that a mechanical method would be better rather than using magnets as well.

Was wondering how do you mix liquid things in zero gravity? I would imagine it is either really hard or not possible. Maybe we could use something like http://phys.org/news175873414.html. I guess we could just shake them together…

Well as to be expected they have done some experiments and they tried mixing oil in water by shaking them together. The cool thing is that they stayed together for 10 hours! On earth they would separate in 10 seconds. They also tried mixing water and water that had tea in it by pouring them into the same container. However, since no gravity they barely mixed at all. However, they did not shake the two. However, I’m assuming that they would mix and would not separate. (source: http://history.nasa.gov/SP-401/ch12.htm)

Trash Water Sewer oh my

Well, I was thinking that a fun multirobotics project would be to have robots that were trash cans and their goals would be:

Minimize expected time to find a trashcan
Stay out of the way
When full empty into “dumpster” or something
Stay charged
Minimal movement (don’t move when someone is using it)
(Some sort of tipping defense mechanism?)

They might be outfitted with the ability to identify and suck up trash (like a vacuum). They would have to coordinate so that they did not cluster. They would learn roles and adapt to changes in the number of robots.

This multirobot trash can could be used at high traffic events or locations. I guess the big thing is that people might actually steal the trash can! We would need body guards for the trash can haha ;). I do realize that this is sort of like the room clearing problem for multiple robots, so I’m thinking that some techniques that have been developed could be applied to solve this problem.

I call them CoWs, Cooperative Waste-cans :).

So, water and sewer plants when there are multiples that must cooperate to serve a community. Then the plants would be the agents and they could each communicate with the others to coordinate how/who/when they would do things. This would reduce risk by distributing the system. It is also agent based because the plants are selfish. It is not a distributed problem because the facilities may be different companies.

They mush cooperate to take better care of the environment (over taxing the river where the water comes) and customers. I wonder who owns the sewer pipes then? Would this be a traveling repairman problem? So, it would have to be distributed due to the joint ownership of the pipes by different companies, but separate management of the repairmen?

In the creation of the city that would also be an interesting problem of determining all the sewer layout and design… This seems like a Ford-Fulkerson solution with a Traveling Repairman problem and TSP. So, essentially we could evolve different pipe networks so that when the roads get built on top we will have created the best possible network with the given simulated info at the time.

Repairmen and Trees

I think I have finally found a problem that has not been studied and is important. I’m calling it the K-D-Traveling Repairman Problem. K, the number of repairmen, D, the number of types of repairmen. The repairman problem is a NP-hard problem and therefore heurstics are used to solve it. The Predictics group at CMU used Machine learning to approximate solutions to the TRP. I’ve found one other author that has looked at the K-TRP problem. I can’t find anyone that has looked at the K-D-TRP. This is a heterogeneous extension. Basically I’m saying that there are different types of repairmen that may need to overlap at nodes such in order to perform the repair.

Another interesting extension is a dynamic graph. Also, I’m interested in looking at saturation levels of repairmen. Non-deterministic repair times would also be another aspect to look at. The nice thing with this problem is that it is based on a real world problem that really needs to be solved.

Also, want to look Game theory, optimization theory and expander graphs seem like they will be useful in the theoretic aspect of the problem. I want to then develop a multi-agent learning algorithm where the agents learn how to solve the problem.

Also, I have been reading about some cool trees: KD-Trees and KD-B-Trees. They are pretty cool.

Bribes

Another idea is to use bribes to manipulate the system http://mpref2012.lip6.fr/proceedings/MaranMPREF2012.pdf show how to resist bribery. Can we use what they learned and instead use bribes for good as a means of inducing cooperation? The problem trying to solve is how to bring about cooperation of self-interested agents when the only/main reason they would help is because you are bribing them to do it. This is a weak form of cooperation, but could still produce cooperation. Bribery is usually used to have someone do something illegal, so one example would be to bribe members of a team to give you information that your team could use against them.

I noticed the other day that this is a topic in the MAS book page 323.

The Magic of the Unknown

I like this quote at the beginning of the chapter on regression models in my stats book:

So startling would his results appear to the uninitiated that until they learned the process by which he had arrived at them they might well consider him as a necromancer.

Dr. Watson speaking about Sherlock Holmes — A Study in Scarlet

Quote — Buckminster Fuller

When I’m working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong.

Buckminster Fuller

Came across this quote as I was looking into whether there exists N-dimensional hierarchies.

Found this paper http://www1.ceses.cuni.cz/benacek/3D-Apr05-full_version.pdf not the most useful, but interesting (haven’t read the whole thing).

Drew’s Borg

Prepare to be assimilated

thoughts