Python and Me
I’ve only got a month until I graduate from college! This semester has been a very interesting one, and I have the distinct honor of working on a capstone project that aims to determine whether we can track user behavior on Twitter. Why is this relevant? Without going into too much detail, knowing the “risk factors” for the spread of a behavior (or knowing when a cascade is about to take place) could be a very important piece of information in working out how to prevent that behavior. That’s as much as I can really say without breaking it down all the way. I am also studying the Osotua Giving model among the Maasai of East Africa for my last modeling class. Initially I thought it was boring, but I was proven wrong quite quickly, and I’ve grown fascinated that such a system can exist.
The whole experience of making this happen and learning to understand the data we have has opened my eyes to the wonders of programming – specifically, Python. Not only is it easy to pick up and very well documented, but there are tons of users who are always willing to share their skills. I’m using a combination of Python 2.7 and the extraordinary IPython Notebook (plenty more on that in future posts).
The versatility of Python
Let me tell you what I’ve created this semester (3 months at this point) using both Python 2.7 and IPython Notebook:
– An ‘Osotua Giving’ simulation, modeling behaviors among Maasai herders. We then used this to see how node connectivity can influence survival rates. This also taught me a great deal about data storage, dictionaries, and pickles.
– A module of my very own named bartle.py, used to log specified lines and variables of any script. I (sort of) named it after Herman Melville’s famous scrivener and I’ve used it on most of my code for debugging (including the ‘Osotua’ simulation which you can see more about at Osotua Fortress).
– Several brute force methods of making enormous data sets accessible. There is a network edge chart for the Twitter study that I initially attempted to load and manipulate as one variable (36 gigs), with predictably poor results. I then went back, approached it more mindfully, and have now put it to work. The results have been some fantastic and useful network graphs.
– Beautiful plots. I am an expert-level Excel user, but I don’t use it for anything other than work and budgets anymore. Amazingly, iterating over an array is more useful than auto-filling formulas down a column. The only thing I still really need Excel for is pivoting on data, although I’m sure I’ll find a way to do that in Python eventually.
– An egg timer. At one point I was getting so lost in my projects that I’d spend too much time on one thing and lose a whole afternoon. So I wrote 8 lines of code that remind me, every 30 minutes or every hour depending on what I specify, to get up and do something else.
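An egg timer like that really does take only a few lines. Here is a rough sketch, not the original script: the names `egg_timer` and `announce`, the parameters, and the reminder message are all made up for illustration, and it sticks to the standard library so it should behave the same on Python 2.7 and Python 3.

```python
import sys
import time


def announce(message):
    # Hypothetical helper: writes the reminder to the terminal.
    # sys.stdout.write behaves the same on Python 2.7 and Python 3.
    sys.stdout.write(message + "\n")


def egg_timer(interval_minutes=30, repeats=1, sleep=time.sleep, notify=announce):
    """Wait interval_minutes, send a reminder, repeat; return reminders sent."""
    sent = 0
    for _ in range(repeats):
        sleep(interval_minutes * 60)  # time.sleep expects seconds
        notify("Time's up! Get up and do something else.")
        sent += 1
    return sent
```

Calling `egg_timer(60, repeats=4)` would nag once an hour for four hours. The `sleep` and `notify` arguments are only there so the loop can be tried out without actually waiting.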
Python has become a huge part of my academic work, and I’m attempting to extend that further into other areas. I’ve begun to develop a list of projects to do once I graduate in order to grow my abilities with the language. Are you a Python user? If so, please get in touch/follow/leave a comment and let me know what you’re working on. It could be personal or professional, expert level or just tinkering. I’d love to see what Python can do!
I leave you with the coolest thing I’ve been able to do so far with the networkx module: finding the shortest path between two points on a connected grid. Cheers!
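For anyone who wants to try something similar, here is a minimal sketch of a shortest-path search on a grid with networkx. The 5x5 grid size and the corner-to-corner endpoints are assumptions chosen for illustration, not the setup from my project.

```python
import networkx as nx

# Build a 5x5 lattice; each node is a (row, column) tuple connected
# to its horizontal and vertical neighbours.
grid = nx.grid_2d_graph(5, 5)

# Hypothetical endpoints: opposite corners of the grid.
start, goal = (0, 0), (4, 4)

# One shortest path as a list of nodes, and its length in edges.
path = nx.shortest_path(grid, source=start, target=goal)
hops = nx.shortest_path_length(grid, source=start, target=goal)
```

On a grid like this there are usually many equally short routes, so `path` is just one of them, while `hops` is the same for all of them.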