Riot Heatmap

I’d like to make a world heat map of the riots going on. A quick Google search didn’t turn up any results for such an app. I started thinking about riots because of a somewhat long story. There is a podcast I listen to called The White Vault, which has a voice actor who lives in Chile. Recently the podcast had to postpone an episode because there was a riot going on in Chile and it was unsafe for the voice actor to get to the studio to record. This got me curious about riots.

There are many questions I have about riots. How do they form? What is the root cause? What are the initial conditions? How do governments handle them? How useful are they for bringing about real change? What sort of metrics and data can we collect to answer these questions? Could we then use that data to predict when riots will form? Riots are interesting, and seemingly most of the world experiences them.

But for the heatmap, I think the heat should initially be based on the number of news articles. A second map could be based on NLP extraction of figures like the number of deaths or the monetary damage, to show how bad the situation actually is.
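
As a rough sketch of what I mean (assuming the per-country article counts have already been collected somehow; the numbers below are made up), something like Plotly’s choropleth could get a first version going:

```python
# Minimal sketch: a world "heat" map where the heat is just the number of
# news articles mentioning a riot per country. The counts are placeholders --
# in practice they would come from a news API or scraper.
import pandas as pd
import plotly.express as px

article_counts = pd.DataFrame({
    "iso_alpha": ["CHL", "FRA", "USA"],   # ISO 3166-1 alpha-3 country codes
    "articles": [120, 45, 30],            # made-up article counts
})

fig = px.choropleth(
    article_counts,
    locations="iso_alpha",
    color="articles",
    color_continuous_scale="Reds",
    title="Riot coverage by country (toy data)",
)
fig.show()
```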

Could look into doing it in JavaScript, sort of like:
https://hicsuntdra.co/blog/earth-global-wind-map/
https://earth.nullschool.net/#current/wind/surface/level/orthographic=-76.30,32.01,1308

But really, I’d also like to link to the list of news articles/outlets, etc.

I know D3 is pretty complex, but maybe something like:

http://d3.artzub.com/wbca/

where maybe the connectors would be the countries reporting on the riot.
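
As a data-prep sketch for that idea (the article records and field names here are invented; a real version would come from a news API), the connector data could just be counts of (reporting country, riot country) pairs:

```python
# Sketch: turn a list of news articles into weighted "connector" edges of
# who is reporting on riots where. The article records are placeholders.
from collections import Counter

articles = [
    {"outlet_country": "USA", "riot_country": "CHL"},
    {"outlet_country": "USA", "riot_country": "CHL"},
    {"outlet_country": "GBR", "riot_country": "CHL"},
    {"outlet_country": "FRA", "riot_country": "HKG"},
]

edges = Counter((a["outlet_country"], a["riot_country"]) for a in articles)
for (reporter, location), weight in edges.most_common():
    print(f"{reporter} -> {location}: {weight} articles")
```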

Started new job

I’m now working at Janssen Pharmaceuticals (a Johnson & Johnson company) as a machine learning scientist for drug discovery. What that means is that I’m applying state-of-the-art machine learning techniques to help discover new drugs with the right properties. I just finished my second week and am already planning to work on a reinforcement learning project!

I’m still learning what the bigger goals are and what overarching plan is directing our research. And since I’m not really a chemistry or biology expert in any way, I’m getting up to speed on a lot of that. Thankfully my boss has been able to explain everything I’ve needed to know so far. So, I’ll try to keep the blog updated.

As for Janssen itself, I like that I get to work one day a week at home and that I get free access to a bunch of museums, like MoMA in NYC. One downside is that the 16 hours of volunteer time off they give you can only be taken in 8-hour increments and doesn’t accrue, so it is going to be a bit harder to take advantage of that benefit than I thought. Also, the health insurance options don’t seem to be as good as I was expecting, especially for a pharma company.

A quick edit: I noticed I haven’t posted here in more than a year! Well, my last year was a crazy busy period due to my previous job at Raytheon BBN. While working there I got to do a variety of research on about five different proposals and worked on two different programs. So, hopefully this new job will be less stressful and I’ll have more time to post here.

Data with built-in functions

I think it might be helpful in the future for JSON data to contain not just the dataset itself but also pickled functions that the end user can use to easily access the data in a way that works for their application. Dill can be used to serialize a Python function or class (though it provides no security guarantees). Then stick that serialized function into the JSON and use it to read the dataset. It would be much nicer to just say that everyone needs to provide easily obtained accessor functions with their dataset. Since these are arbitrary functions, this is very, very dangerous, so I’d only recommend using it for data that you wrote yourself… which sort of defeats the purpose…
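
A minimal sketch of what I’m imagining (the field names and the example accessor are just assumptions on my part, not any standard):

```python
# Sketch: ship a dataset as JSON together with a dill-serialized accessor
# function. The loader deserializes and calls the function -- which is why
# this is only safe for data you wrote yourself.
import base64
import json
import dill

def get_by_year(data, year):
    """Example accessor the dataset author would provide."""
    return [row for row in data["rows"] if row["year"] == year]

# Author side: serialize the function and embed it in the JSON payload.
payload = {
    "rows": [{"year": 2018, "value": 1.0}, {"year": 2019, "value": 2.0}],
    "accessor": base64.b64encode(dill.dumps(get_by_year)).decode("ascii"),
}
blob = json.dumps(payload)

# User side: load the JSON, rebuild the function, and use it.
# (Executing arbitrary deserialized code is exactly the danger noted above.)
loaded = json.loads(blob)
accessor = dill.loads(base64.b64decode(loaded["accessor"]))
print(accessor(loaded, 2019))  # -> [{'year': 2019, 'value': 2.0}]
```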

Macro Scale Agent Based Modeling

I was reading about a book called Factfulness: Ten Reasons We’re Wrong About the World—and Why Things Are Better Than You Think by Hans Rosling, which Bill Gates recommended reading. From this I found Gapminder (which is a spin-off from Hans’s work) and their tool:

https://www.gapminder.org/tools/#$chart-type=bubbles

which lets you explore a dizzying number of statistics in order to get a better idea of the world from a macro perspective.

Open Numbers is a cool organization that hosts a lot of data and is where Gapminder pulls the data for its tool. In particular, this dataset:

https://github.com/open-numbers/ddf--gapminder--systema_globalis

Since I am into multi-agent systems and agent-based modeling, this seems like an amazing resource for providing real-world data to back up simulations. There are so many interesting things to try to model with this data, and with those models we could code up “what if” scenarios. Say, what if we taxed all the millionaires and billionaires 1% every year and somehow redistributed it to the poorest 6 billion? With this data we could see how nations could change and populations grow. We might even find that the people we tax grow even richer due to the increase in the number of people buying things. There are so many other things we could study with this sort of simulation: what would happen if we had trade tariffs, or natural disasters, or famines… We would see what would happen not just locally but globally, and not just to a particular sector but across a variety of variables. Clearly this would require a massive amount of research and more data than is currently available, but even just modeling the behavior of these datasets would help us understand how the world works, and it could aid decision making by revealing outcomes not previously thought of.
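
To make the “what if” idea concrete, here is a toy sketch of the millionaire-tax scenario. The wealth distribution, growth model, and all the parameters are completely made up; it is only meant to show the shape of this kind of macro-scale simulation loop, which real Gapminder/Open Numbers data could eventually calibrate.

```python
# Toy agent-based sketch of the "tax the richest and redistribute" scenario.
# Wealth values, growth rates, and population size are invented; the point
# is the structure of the simulation loop, not the numbers.
import numpy as np

rng = np.random.default_rng(0)
n_agents = 10_000
wealth = rng.lognormal(mean=10, sigma=1.5, size=n_agents)  # made-up distribution

TAX_RATE = 0.01          # 1% yearly tax on the richest...
RICH_FRACTION = 0.001    # ...0.1% of agents (stand-in for millionaires/billionaires)
POOR_FRACTION = 0.80     # redistributed evenly to the poorest 80%

for year in range(30):
    order = np.argsort(wealth)
    rich = order[-int(n_agents * RICH_FRACTION):]
    poor = order[:int(n_agents * POOR_FRACTION)]

    # Collect the tax from the richest agents and share it evenly.
    tax = wealth[rich] * TAX_RATE
    wealth[rich] -= tax
    wealth[poor] += tax.sum() / len(poor)

    # Crude stand-in for economic growth: everyone's wealth drifts upward.
    wealth *= rng.normal(loc=1.02, scale=0.05, size=n_agents)

top = np.sort(wealth)[-int(n_agents * RICH_FRACTION):]
print(f"median wealth after 30 years: {float(np.median(wealth)):,.0f}")
print(f"top {RICH_FRACTION:.1%} share of wealth: {top.sum() / wealth.sum():.1%}")
```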

Nix

I have been thinking about being able to reproduce results easily and quickly. As you can read in my previous post about Jupyter notebooks, they will, at least in Python, let you do so. However, for reproducing the entire set of dependencies for your software so that you can easily install it on another machine, there is Nix:

https://nixos.org/nix/about.html

There are obviously other ways of managing packages, but Nix installs packages within build environments so that you can isolate packages to particular projects. Then you can easily know that you have the list of packages needed to reproduce your project build on another computer. It is pretty neat, but as you may have guessed, it works with Linux and macOS, not Windows.

Jupyter notebooks

Some interesting projects:

Google has their own modified Jupyter notebook that integrates with Google Drive:

https://colab.research.google.com/

And there is Binder (beta), which will create an executable Jupyter environment from a GitHub repo containing Jupyter notebooks. Then anyone can easily run your code.

https://mybinder.org/