Wow, so it has been a long time. I’ve recently been looking at stocks again, and just two days ago I found a stock and thought, I should buy that. Then I didn’t. But I really, really should have, because it then proceeded to go up 20% in two days. So, this got me looking into algorithmic trading again. I found a couple of really good resources:
Quantopian lets you design your own trading algorithms, backtest them on historical data, and run them live through robinhood.io or Interactive Brokers. I think I like IB better, but I should start with robinhood.io since there are no fees. But this is awesome!
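To get a feel for what a backtest actually does before touching Quantopian's API, here is a minimal pure-Python sketch of a moving-average crossover strategy run over made-up prices. This is not Quantopian's API; the price series, window sizes, and starting cash are all invented for illustration.

```python
# Hypothetical moving-average crossover backtest on made-up prices.
# Buy when the short average crosses above the long one, sell on the reverse.

def sma(prices, window, i):
    """Simple moving average of the `window` prices ending at index i."""
    return sum(prices[i - window + 1 : i + 1]) / window

def backtest(prices, short=3, long=5, cash=1000.0):
    shares = 0.0
    for i in range(long - 1, len(prices)):
        fast, slow = sma(prices, short, i), sma(prices, long, i)
        if fast > slow and shares == 0:      # short average crosses above: buy
            shares, cash = cash / prices[i], 0.0
        elif fast < slow and shares > 0:     # crosses below: sell
            cash, shares = shares * prices[i], 0.0
    return cash + shares * prices[-1]        # final portfolio value

prices = [10, 9, 8, 8, 9, 11, 14, 18, 20, 19, 15, 12]
print(backtest(prices))
```

On a real platform the loop body would be the strategy callback and the data feed would come from the broker, but the logic is the same shape.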
I’ve had this idea for doing clustering and data mining of history texts. It would be interesting to create learning algorithms that can learn timelines and context. Essentially, history textbooks are structured, possibly geographically, and as a time series of events. Then we could do graphical analysis and social analysis on those structures. We could compare history texts to see what is left out, and which events each text places more emphasis on. Also, doing this across time would show how history texts have changed in what the historians themselves find interesting. There may be patterns in history itself that were not evident or obvious without algorithms that can crunch large volumes of text quickly. The usual sentiment analysis could be done as well.
This could then lead to producing a better picture of different countries and people groups and how they were formed. Possibly doing anomaly analysis or creating other types of filters to uncover gaps in the history texts themselves.
This of course seems like it should have been done already. The main issue is getting digital copies of the history books for the algorithms to work with, so it may not have been studied much. Creating learning algorithms that can understand human history seems like an important area of research. Especially as we are writing history now, it is important to maintain a grasp of the entire picture and how everything fits together.
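The "what is left out" comparison above can be sketched in a few lines: build word-frequency profiles for two texts, then report terms one text mentions that the other omits entirely, plus per-term differences in emphasis. The two sample "texts" below are invented stand-ins for digitized textbooks.

```python
# Rough sketch of comparing two history texts: which terms one text
# emphasizes (higher relative frequency) and which it leaves out entirely.

from collections import Counter

def term_freqs(text):
    words = [w.strip(".,").lower() for w in text.split()]
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def compare(text_a, text_b):
    fa, fb = term_freqs(text_a), term_freqs(text_b)
    left_out = sorted(set(fa) - set(fb))              # in A, absent from B
    emphasis = {w: fa[w] - fb.get(w, 0) for w in fa}  # A's extra emphasis
    return left_out, emphasis

text_a = "The revolution began in 1789. The revolution changed France."
text_b = "France saw upheaval in 1789."
left_out, emphasis = compare(text_a, text_b)
print(left_out)  # ['began', 'changed', 'revolution', 'the']
```

A real pipeline would use TF-IDF over whole books and strip stop words, but the gap-finding idea is the same.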
So, I’ve been thinking rather small lately. Especially with that autonomous mixer idea, I mean pathetic, am I right :p. I really want to go back to the reason I wanted to get into AI and multiagent systems, which is making autonomous vehicles! I’m sure everyone knows that car manufacturers, and even Google and Baidu, are attempting to make cars that are autonomous. This is great! I’m arguing that before full consumer acceptance of this happens, and to make it affordable and economical, we need to make it possible for consumers to modify their existing car to make it autonomous! This seemed to be the direction the DARPA Grand Challenge was heading in. It turns out some graduates from MIT had this same idea a year or so ago and have already made a company with a product (their website, wired article, machine learning job at their company). Obviously I’m excited about their product because of its simplicity and the fact that they are doing this now! It seems like it is currently meant for highways, though, so it still needs a lot of work.
EMG for robot gait creation.
Would be interesting to compare the writing styles, formatting, etc. of users on forums. It would be interesting if we could identify users with multiple accounts. Also, it would be interesting to look at the change in a person's style over time to see if their account was hacked or something else out of the ordinary happened. We could plot emotions over time. It would be interesting to see who has emotional sway over a forum or post (who can cause others to become emotional, etc.). I wonder how much work there is in this area. I would imagine a lot, since we have had forums for a long time, and I think this is something Facebook has been (or should be) doing.
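The multiple-accounts idea can be prototyped with character n-gram profiles, a standard stylometry trick: two accounts whose posts have very similar 3-gram distributions might be the same person. The sample posts below are made up for illustration.

```python
# Toy stylometry sketch: compare users by character 3-gram profiles.
from collections import Counter
import math

def ngram_profile(text, n=3):
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(p[g] * q[g] for g in set(p) & set(q))
    norm = math.sqrt(sum(v * v for v in p.values())) * \
           math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

user1 = "honestly i think this is fine... honestly."
user2 = "honestly i think that is fine... honestly!"
user3 = "Greetings. I respectfully disagree with the premise."

print(cosine(ngram_profile(user1), ngram_profile(user2)))  # high: likely same author
print(cosine(ngram_profile(user1), ngram_profile(user3)))  # low: different style
```

Detecting a hacked account would be the same comparison run over a sliding window of a single user's post history, flagging sudden drops in self-similarity.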
My brother got an internship as an accountant. One of his duties is pricing things for the marketing department: basically, figure out how much all of the components cost, then add in all the margins, expected demand, etc. It takes about 3 days to put together a single price. This is crazy. It is prone to human error, slow, and not robust. This problem could certainly be automated, because all of the pricing info for the materials and components is already in a database. All the info is there; it just takes a human a long time to search for everything, copy the pricing data, and add the margins. It would be ideal to also include info about competitors' prices. Then the marketers would instantly know the prices, which would let them sell products much faster, because the competition wouldn't even have had time to come up with a price.
I think this might be something our company could make. It might be worth talking to Mike about.
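The core of the automation is simple: look up each component's cost in the database, sum over the bill of materials, and apply the margins in one step instead of three days. Here is a minimal sketch; the component names, costs, and margin figures are all invented.

```python
# Sketch of automating the pricing workflow: pull component costs from a
# (mock) database table, sum them, and apply overhead and margin.

component_costs = {          # stand-in for the components table
    "motor": 42.50,
    "housing": 7.25,
    "controller": 19.00,
}

def quote_price(bill_of_materials, margin=0.30, overhead=0.10):
    """Base cost of the listed components, plus overhead and margin."""
    base = sum(component_costs[c] * qty for c, qty in bill_of_materials.items())
    return round(base * (1 + overhead) * (1 + margin), 2)

print(quote_price({"motor": 1, "housing": 2, "controller": 1}))  # 108.68
```

In practice the dict would be a SQL query against the existing pricing database, and competitor prices could be pulled in as another input to the margin calculation.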
So, I’m doing a pattern recognition project comparing different clustering techniques in R. I found that http://www.r-bloggers.com/anova-and-tukeys-test-on-r/ has a nice intro to using ANOVA and Tukey's test. The method they show also includes a way to plot the results (the confidence intervals)! So that's nice.
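The linked post does this in R; as a sanity check on what the test actually computes, here is the one-way ANOVA F statistic by hand in Python (Tukey's HSD omitted). The three "clustering methods" and their scores are made-up numbers, just to show the mechanics.

```python
# One-way ANOVA F statistic computed from scratch:
# F = (between-group variance) / (within-group variance).

def anova_f(groups):
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total observations
    grand = sum(sum(g) for g in groups) / n
    # between-group sum of squares (group means vs grand mean)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # within-group sum of squares (observations vs their group mean)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

kmeans = [0.71, 0.68, 0.74, 0.70]   # hypothetical quality scores per run
hier   = [0.65, 0.61, 0.66, 0.63]
dbscan = [0.55, 0.58, 0.52, 0.56]
print(anova_f([kmeans, hier, dbscan]))   # large F -> group means differ
```

A large F (compared against the F distribution with k-1 and n-k degrees of freedom) rejects "all clustering methods perform the same," which is when Tukey's test is used to find which pairs differ.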
I’m taking pattern recognition. We had a discussion on biometrics and came up with the idea of using the gyro info from a smartphone as a password. Users would shake and move their phone in some pattern in order to log into their system.
I found this paper that details how malicious apps can monitor when users are typing and gather gyro data to predict the keystrokes and thus possible passwords: http://www.cse.psu.edu/~szhu/papers/taplogger.pdf. By using motion-based passwords, users would not have to fear that their passwords were being stolen. Since websites can detect whether you are browsing on a phone, maybe they could ask you to input a motion-based password if you have previously associated one with your account.
They are doing something similar using Leap Motion: http://www.forbes.com/sites/michaelwolf/2013/02/06/could-that-shake-in-your-hand-replace-your-password-leap-motion-thinks-so/. This would be great for medical fraud detection. You can’t use dead people, and you would need the person to be there in order for the system to approve the claim.
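Matching a motion-based password is essentially a time-series comparison: nobody repeats a gesture at exactly the same speed, so something like dynamic time warping (DTW) is a natural fit. Below is a sketch over one gyro axis; the traces and the idea of a match threshold are invented for illustration.

```python
# Compare a recorded gyro trace against the enrolled one with dynamic
# time warping, which tolerates the user moving a bit faster or slower.

def dtw(a, b):
    """DTW distance between two 1-D sequences (e.g. one gyro axis)."""
    INF = float("inf")
    d = [[INF] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]

enrolled = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0]
attempt  = [0.0, 0.4, 1.1, 0.6, 0.0, -0.4, -1.0, -0.6, 0.1]  # same gesture, noisy
random   = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]     # different gesture

print(dtw(enrolled, attempt) < dtw(enrolled, random))  # attempt matches better
```

A real system would run this over all three gyro axes and enroll several repetitions to set a per-user acceptance threshold.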
I talked to Professor Dominicani about the fractal clustering idea (thinking of making it my pattern recognition class project) and she had great suggestions. She recommended http://snap.stanford.edu/data/, which has a bunch of graph datasets (social networks and the like). Since such networks tend to be self-similar, it would seem like a perfect use case for fractal clustering. She also suggested looking at subspace clustering, because it seemed similar to fractal clustering. http://www.cs.cmu.edu/~sguennem/ is the person to look at for subspace clustering. Also, she said she has some students doing research in this area, so I can always ask them questions.
I also found http://pajek.imfm.si/doku.php?id=data:urls:index which has a ton of resources for network data sources.
http://vlado.fmf.uni-lj.si/pub/networks/data/ is another resource for network data.
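To make the fractal clustering idea above concrete: the key quantity is a fractal (box-counting) dimension, which measures how a point set's "mass" scales with resolution. Here is a rough estimate for a 2-D point set; a real implementation would fit the log-log slope over many scales (and graph data would need an intrinsic-dimension analogue), but this shows the core computation. The scales and the diagonal-line example are made up.

```python
# Box-counting estimate of fractal dimension for a 2-D point set.
# Points on a line should come out near 1; a filled region near 2.

import math

def box_count(points, size):
    """Number of grid boxes of side `size` containing at least one point."""
    return len({(math.floor(x / size), math.floor(y / size)) for x, y in points})

def fractal_dim(points, s1=0.1, s2=0.01):
    # dimension ~ slope of log(box count) vs log(1/box size)
    n1, n2 = box_count(points, s1), box_count(points, s2)
    return (math.log(n2) - math.log(n1)) / (math.log(1 / s2) - math.log(1 / s1))

line = [(i / 1000, i / 1000) for i in range(1000)]  # points along a diagonal
print(round(fractal_dim(line), 2))  # close to 1.0
```

Fractal clustering then groups points by which cluster's fractal dimension changes least when the point is added, which is why self-similar data like the SNAP networks seems like a good fit.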