Programming Logic for the Impatient, Part 2

Thanks for reading through to part 2 of this hasty primer on programming (you can get part 1 here). It’s a poorly planned and half-baked idea, the way most legendary code begins.

Here’s something totally counter-intuitive to think about. We are bombarded by ideas about complexity in technology. Marketing in the 80s and 90s fostered the idea that technology is complicated and sophisticated, and that it’s okay not to understand it because of this. We think to ourselves that machines are complex and some kind of nebulous threat. I even fell for that for a long time!

But it’s totally wrong. It doesn’t actually get any simpler.

Number Munchers. The bar was set very high.

In programming, a piece of code takes one small bit of information and makes some decision about it. That’s it. That’s literally it. The difference between a decision made by the human mind and one made by a piece of code is that the machine makes it as fast as the processor can carry out the order. That speed is beyond comprehension (for our purposes). Why is this important? Because it’s fundamental in seeing how machines understand the world – bit by bit.

Big decisions, but not really

Routinely you will see statements such as the following if you read your way through code.

  • If x and y are true, and z is less than some number, do this thing.
  • While x is less than y, do this thing. Otherwise, do these other 2 things, and maybe this other thing too.
  • If x is true, check the value of y. If y is less than 1, do z until y is greater than 1.
  • For everything in a list z, while y is greater than x, do this thing. Then after these things are done, do this other thing.

If this does not immediately make sense to you, that’s okay. Consider these alternate statements:

  • If you’re hungry, go to the fridge. Don’t go to the fridge more than 3 times a day, and wait at least 3 hours between visits.
  • If your bladder reaches max capacity, go empty it. If you empty your bladder, flush the waste from the toilet you used (and maybe lower the seat at some random interval because you’re courteous).
  • When the phone rings, answer it. If it’s your mom, stop everything else that you’re doing, sit on the couch, and talk to her.
  • Go to 6 meetings today and do these 10 important tasks, and then go home, but not before you’ve completed everything in some capacity.

These are referred to as conditional statements. They exist in absolutely everything. If you took math in 2nd grade, you’ve seen things like “greater than” and “less than”. You can’t decide anything without them. You used one when you started reading this:

  • I have ten minutes. I don’t have anything else to do at this moment. This article is something I want to read.
  • If I estimate that reading this will take me no longer than 10 minutes, I’m reading it.
  • If it gets boring, I’ll stop. Otherwise, I’ll finish it and share it with a friend.

This process happened so fast you barely even noticed it.

Now consider the following things you are probably familiar with, written as simple decisions and instructions:

  • If it is 7:00 am, wake me up.
  • If there is no coffee, don’t drink anything.
  • If there is a Dunkin Donuts, go get coffee.
  • If you are awake, go brush your teeth.
  • When you see the highway, get on it.
  • If you are going to work, travel west.
  • When you get to work, go inside.

These are just simple conditional statements. As building blocks, we can clearly see that they are just small bits of information to base decisions on. True or false, yes or no. In many cases these bits are actually so small that they may be without context – but that’s why we have descriptive variable names to perfectly understand what it is that we’re doing. Remember our last article? If I know what that bucket is supposed to be holding, I can refer to it by name.

We can add these statements together, and since we’re thinking like a machine we are going to be able to do all these things very quickly. I’m not going to add too much additional information, but we need to make sure that every outcome is covered so that we don’t do the wrong thing correctly, like we talked about in part 1. There are tricks and transformations that we should be using to make this even easier to understand (if you can believe that) but I choose to avoid them to make my point. I also committed myself to no code, but I don’t think formatting something as code is that big of a deal. So,

  1. if you are going to work and if it is 7 am,
    1. wake up
    2. drink coffee and then brush your teeth
      1. if there is no coffee
        1. brush your teeth
    3. if the time is later than 7:45
      1. leave the house
    4. if there was no coffee and you see a Dunkin Donuts
      1. get coffee
      2. if you weigh less than 200 pounds
        1. maybe a donut too
    5. when you see the highway,
      1. travel west
    6. when you arrive at work,
      1. go inside
  2. otherwise, roll over and go back to sleep
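For anyone who wants to peek behind the curtain, here is roughly how that nested list might look in Python. It’s only a sketch: the function name `morning_routine` and its inputs are made up for this example, the 7:45 clock check is assumed to have already passed, and the “when you see…” steps are treated as things that simply happen in order.

```python
def morning_routine(going_to_work, hour, has_coffee, sees_dunkin, weight_lbs):
    """Return the things to do, in order, as plain strings."""
    # Outer decision: everything else only happens on a work day at 7 am.
    if not (going_to_work and hour == 7):
        return ["roll over and go back to sleep"]

    actions = ["wake up"]
    if has_coffee:
        actions.append("drink coffee")
    actions.append("brush your teeth")
    # Assumption: by now it is later than 7:45, so we skip the clock check.
    actions.append("leave the house")

    # Nested decision: only matters when there was no coffee at home.
    if not has_coffee and sees_dunkin:
        actions.append("get coffee")
        if weight_lbs < 200:
            actions.append("maybe a donut too")

    actions.append("travel west on the highway")
    actions.append("go inside at work")
    return actions


print(morning_routine(True, 7, False, True, 180))
```

Every indented block is one of the small decisions from the list, sitting inside a bigger one.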

When we nest all this together we can see that it’s not terribly sophisticated at all. We did leave several outcomes out, but for this train of thought that’s okay. Just know that one of the fundamental building blocks in coding is making complex decisions by combining simpler ones. You are already doing this constantly!

One last thing about conditional statements, and one more observation about machines. Doing the same thing over and over and over and over and over and over and over and over and over and over and over and over and over and over and over and over and over and over and over again (my wrist hurts from all that Ctrl-V-ing) would destroy a human being’s brain eventually. We’re lucky that machines are willing to do this type of thing thanklessly for us. In programming we call such monotonous repetitions loops. They’re really just conditional statements with a twist: you order a machine to do something repeatedly until something else changes.

I’m a really big fan of outdated technology books. I don’t know why. They’ve got this weird anachronistic quality about them that I can’t get enough of. I have a Novell Netware book from 1994 that’s 6 inches thick propping up my monitor at my desk. I sometimes find gems floating around at thrift stores and I can’t resist.

I can’t (and wouldn’t) visit a thrift store every day, though. My wife would go crazy. Let’s talk about this in terms of a loop:

  1. I don’t have a 700 page book on WordPerfect published in 1990 yet.
  2. Until I find one, I’ll go to every thrift store every day.
  3. I’ll search through every book until I have one.

That’s a loop – do this thing constantly until you meet the reason to stop that I’ve given you. In this case, searching through every book at every thrift store is the thing to do, and the reason to stop is finding that sweet, sweet WordPerfect book that I can use to bookend my Visual Basic volumes and my TAB book about a self-programming robot named Rodney.
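If you’re curious what that looks like written down for a machine, here is a small Python sketch. The function name `hunt_for_book` and the book titles are placeholders I made up, and all the thrift stores are flattened into one long shelf to keep it short.

```python
def hunt_for_book(shelf, wanted):
    """Check books one at a time until the wanted one turns up."""
    found = False
    checked = 0
    # Keep going until we find the book or run out of books to check.
    while not found and checked < len(shelf):
        if shelf[checked] == wanted:
            found = True  # the reason to stop has been met
        checked += 1
    return found, checked


shelf = [
    "Networking doorstop (1994)",
    "Visual Basic volume",
    "WordPerfect doorstop (1990)",
    "Spreadsheet basics",
]
print(hunt_for_book(shelf, "WordPerfect doorstop (1990)"))  # (True, 3)
```

The `while` line is the whole idea: keep doing the thing, and check the reason to stop before every pass.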

Next time we’ll cover objects and functions. I hope you’re keeping up with this. Let me know in the comments below!

 

Programming Logic for the Impatient, Part 1

I am not a programmer. I very much enjoy programming and trying to solve problems, but I’m not a programmer. I really like manipulating code to do cool things, but I’m not a programmer!

The reason I’ve decided to write a (series of) blog posts about programming logic is that I know many others who are damn smart but don’t have a firm grasp on how machines think. I think everyone should at least have some real-world grounding in this idea. There’s an entire hidden world right at our fingertips.

Please note, this is not going to be a technical explanation by any stretch. I am not going to use many technical terms at all – in fact, I am challenging myself to write no code whatsoever as I go forward. My definitions are going to come right off the top of my head. I am going to make every possible attempt to explain how machines think without forcing you to go click to other sites in order to follow along. I am prepared to get my eyebrows burned off by the searing rage of professional programmers that think I’m a fool because I didn’t tell you about error handling, threading, or inheritance. That’s all okay by me. My goal is to get you to think like a machine thinks. After we’re done you’ll hopefully be able to go off and start learning the language of your choice with a good grip on the conversation you’re actually having with the machine following your orders.

Let’s start off with my first homemade definition.

What programming is

Programming is just a series of instructions given to a machine with the express purpose of performing some task. If I want a computer to open a web browser, I need to carefully explain to the computer that I want it to do this on demand when I push a button. It’s pretty basic stuff, and I can’t assume that the computer understands what I’m talking about unless I set down ground rules.

“Follow these orders, exactly as I’ve written them down.”

Make sure you read that correctly, because this is quite literally as tough as it gets.

Follow these orders, exactly as I’ve given them to you.

That’s a statement loaded with meaning. The biggest problems and confusion in coding come from the fact that we do a poor job of explaining ourselves in general. Can you think of a time in your life when you were unclear about something that you wanted and the end result got screwed up? That’s what happens with code all the time. Machines never make mistakes (well . . . ), but they do get confused if we don’t explain every single detail to them. They always follow orders exactly as we’ve given them.

How machines think

Machines are like children with no bias about anything. If I tell you to go play fetch with a king cobra, you will not do it. Why? Because there are many pieces of information that flow through your head immediately after I issue that order.

  • Snakes are predators
  • Cobras are venomous
  • If you mess with a venomous snake it’s going to bite you and you’re going to die
  • If you get too close it’s going to chase you
  • A snake won’t chase a stick, and it probably wouldn’t bring it back if it did
  • Why is there a snake in our yard
  • Isn’t there a dog park near here
  • I should probably do something about this snake

Now, if you were to tell a very young child to do the exact same thing, they may actually do it. Nothing is defined for them, they have no experience, and they don’t know about the dangers. As far as a kid knows, a venomous cobra would make an amazing companion.

That’s how a program functions. It only knows what you tell it. If you write code that tells the machine to do something, it’s going to do it. Unconditionally. Programming logic isn’t that complex compared to what we do – it’s just a lot clearer. Psychology spends decades trying to unravel very complex things into very simple ones. I guess they’re really just trying to debug mental illness!

Variables are just buckets of stuff

Let me give you an easy way to understand the notion of a variable, because you won’t get far if you don’t generally understand what they are. Let’s build on what we’ve already discussed.

Consider this: If there’s a fire, and you have a bucket of water, and you want to put the fire out, you pour the bucket over the flames. Easy.

As humans we speak in a very symbolic language. This is a pretty fantastic shortcut to understanding variables that I’m about to totally exploit.

Let’s reread that thing about the fire for just a second. You’re pouring the bucket on the fire. Shouldn’t that mean you’re pouring the water in the bucket on the fire? Stay with me in this moment, right before the light bulb goes off in your brain. Our symbolic way of thinking about things is the key. Anything could be in that bucket right now, including:

  • water
  • sand
  • nothing
  • gasoline
  • dynamite
  • a puppy
  • a cobra
  • and so forth.

We’ve got a fire, it’s raging, and I am issuing you a direct order to pour that bucket over the flames! Why? Because earlier, when we were preparing ourselves for this crisis, I filled that bucket up with water. Now all I have to do is say “take whatever is in that thing and do this other thing with it”. The variable is defined somewhere, we can change the value of that variable around any way we want, and then we use that value however we need to. I grab a bucket to put things in, I tell you what’s in the bucket, and now we can use the contents of that bucket to do other things. All I’ve got to do is help you keep track of what’s in which bucket. Make sense?
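For the curious, the bucket idea can be sketched in a few lines of Python. The names `bucket` and `pour_on_fire` are made up for this example, and the outcomes are deliberately cartoonish.

```python
def pour_on_fire(bucket):
    """Do the same thing with whatever the bucket happens to hold."""
    if bucket is None:
        return "nothing happens"
    if bucket in ("water", "sand"):
        return "the fire goes out"
    return "that was a mistake"


bucket = "water"             # fill the bucket ahead of time
print(pour_on_fire(bucket))  # the fire goes out

bucket = "gasoline"          # same bucket, different contents
print(pour_on_fire(bucket))  # that was a mistake
```

The order “pour the bucket” never changes. Only the contents do, which is why keeping track of what’s in which bucket matters so much.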

I’ll see you next time when I offer a basic, non-programmer’s explanation of conditional statements and loops!

What Excel Skills are Considered “Advanced”?

I’ve been away for over a year, holy cow. Time to dive right back in with a writing sample I wrote about – woohoo – Microsoft Excel!

Spreadsheets are a pretty amazing tool, and you probably have some experience using them or you wouldn’t be here! Although I can point you to several engineers who wouldn’t dream of booting up Excel for their tasks, basic “what-you-see-is-what-you-get” tools possess a fundamental beauty in bridging the gap between programmers and non-programmers. For example, I can demonstrate complex algebra and statistics ideas on a spreadsheet in a way that makes them perfectly clear to people who have not had much experience with them. This is critical in an age where we are all expected to have some competence with these tools no matter what we do.

In reality, not everyone is going to do complicated math in a spreadsheet. Most of what I see on a daily basis is just simple data in rows, some multiplication and division, and infrequent VLOOKUPS and SUMIFS. And charts. Oh, those charts. People just love charts.

From my perspective, going from basic to advanced means stepping over the line from using Excel as a spreadsheet/word processor/MS Paint for the sole purpose of presenting data to using it as a pseudo-database that can actually tell you something about that data. That “Advanced” moment is a sudden realization that keeping source information clean and exclusively categorized can actually provide you with so much more useful insight.

So, are you advanced?  Here are a few questions you can ask next time you’re busy building spreadsheets.

Are You Allowing the Spreadsheet to Think For You?

Many tasks involve looking at large lists of information and making decisions about how to analyze, categorize, or summarize that information in a way that gives you concise answers.  At the advanced level this requires indexing, if/then formulas, custom conditional formatting, and custom filters.

Are You Letting Your Variables Be Themselves?

=A1 + B1 works great for single sheet files with basic information.  What happens when you’re dealing with 17 sheets, plus 5 reporting sheets and a data validation variable (I speak from experience)? Advanced Excel demands that you name ranges and cells for clearer formulas and references, and eventually use INDIRECT when you start getting really complex.

Are You Leaving the Editing to The Professionals?

Many basic level users are perfectly happy learning how to use CONCATENATE and never looking back.  Eventually, they’ll realize that using a formula with parameters to join strings together just doesn’t make for an easily scalable sheet.  Advanced users know that it’s important to use formulas like FIND, ISNUMBER, and LEFT/MID/RIGHT to edit and produce the exact result they want dynamically.  I once made a sheet to help me write 301 redirects for 300 webpages.  Advanced text manipulation cut that down to 10 minutes.

Are You Giving Excel Your Dirty Work?

I haven’t been entirely honest with you up to this point – I actually do more at work than create spreadsheets. In fact, I use numerous different proprietary tools and sites day to day, but they do not assemble the information together in the way that I need them to, and they certainly never give me immediate insights.  Advanced users know that getting to know the Data Model in Excel and leveraging the raw reporting power of Pivot Tables can free up the time you need to do whatever it is you do at the highest possible level. And yes, for crying out loud – you can still do charts!

So get learning about these new ideas!

 

Life on the Long Tail

You’re going to hear a lot more from me now that I’ve graduated from college and I’ve found my home in digital analytics and marketing.

The semester wrapped up really well. I won’t go too in depth about the projects but I can tell you that the Twitter study didn’t turn up anything I considered valuable:

Twitter word distribution graph

This is a distribution of 2 different word categories across a month on Twitter. One category contained substance-use/abuse-related words, while the other contained words with similar monthly volume drawn at random from an English dictionary. Individual users do not use either type of word in any significantly different way – notice that p-value. During my presentation I pointed this out and one of the attendees (a very well-regarded scientist and mathematician at ASU) gave me something to think about. He explained that these types of words may be used in a very different way across different social media platforms. The “stream of consciousness” manner in which Twitter is generally used makes it very different from the “thoughtfulness” of a Facebook or Google Plus post, or the “curated” feel of a LinkedIn post. It was a commonsense observation that was easy to miss while I was digging away at the code. And of course this became blatantly obvious once I was entrusted with creating some social media content for a few clients at the agency.

The part that is stuck firmly in my craw is that the type of research we were doing was cutting edge stuff, and there wasn’t much available yet in the way of academic comparisons between platforms. Perhaps someday that will be available as more than blog posts – and not in any way to diminish specialist bloggers in “social forensics”, it’s just difficult to justify that sort of thing in academic literature. I can’t cite Moz or KISSmetrics.

Nonetheless, I’ve moved on. Now I turn my attention to statistics in a much more applied way. Although my position at the agency does not directly require me to perform research on traffic, I do get to spend a lot of time with keyword research. Specifically, I’ve fallen in love with the long tail of the distribution, something that turns statistical analysis on its head and instead takes us a murky step forward into the sociological side of the Internet.

Here’s the fun question that can’t really be answered in one word: What are users really searching for? Try typing in something like “cars” and whatever comes back on that results page clearly indicates that a lot of businesses are trying to get in front of a word that obvious. So the obvious question should be – why is someone typing in “cars”? Probably because they want to buy a car, right? Maybe. Or maybe because they are:

  • Interested in classic cars and want to look at some pictures but don’t know where to start,
  • Middle-schoolers who have to do a report on the history of cars and want some fun facts,
  • Car dealerships that are just entering the world of digital marketing and are taking the misguided step of seeing who ranks for a keyword with an estimated search volume of over 3 million,
  • Looking for Cars.com but can’t actually remember the web address for it (it happens),
  • Huge fans of the Pixar animated film, or perhaps watching it at that moment and trying to figure out the name of the washed-up comedian that did the voice of the tow truck,
  • Looking to sell, get repairs for, talk about, find parts for, read maintenance tips about, read funny stories about, read creepy fan-fic about, learn the history or business behind, or find good places to wash their cars.

Oh, and also those users that are looking for a dealership to buy cars from. “cars” is one of those words that exist in what is termed the “fat head” of the distribution. And frankly, it’s a way to get whacked by Google. That’s Web 1.0 hogwash. Altavista-Era foolishness. Pre-Panda-poppycock!  In fact, I’d say that “fat head” is the best name for it.

Let’s take a quick look at this using Google AdWords:

AdWords - The Long Tail

It should be pretty clear at this point, and hopefully very intuitive – the more specific the search term, the lower the search volume. That makes sense, right? Interestingly enough, many, many businesses do not understand the implications of this! They immediately attempt to get in front of those 3 million searches without realizing that this number may include everything below it in this list. They don’t know that they are rushing toward irrelevance! Couple that with interest in pay-per-click and the result is that a lot of companies that can’t afford to spend as much as they do on digital marketing are probably wasting the majority of it.

Now consider the impact of different posting types from social media.  Depending on where your interested users are coming in from, who knows if you’re even targeting the right long tail keywords?

I never anticipated finding so many interesting challenges in this field, but here we are!

A Semester of Python

Python and Me

I’ve only got a month until I graduate from college! This semester has been a very interesting one and I have the distinct honor to be working on a capstone project that aims to determine if we can track user behavior on Twitter. Why is this relevant? Without going into too much detail, knowing the “risk factors” for a spread of behavior (or knowing when a cascade is about to take place) could be a very important piece of information in determining how to prevent such behavior. That’s as much as I can really say without breaking it down all the way.  I am also involved with studying the Osotua Giving model among the Maasai of Africa for my last modeling class.  Initially I thought it was boring, but I was proven wrong quite fast and I’ve grown fascinated that such a system can exist.

The entire experience in making this happen and learning how to understand the data we have has opened my eyes to the wonders of programming – specifically, Python.  Not only is it easy to pick up and very well documented, but there are tons of users that are always willing to share their skills.  I’m using a combination of both Python 2.7 and the extraordinary IPython Notebook (plenty more on that in future posts).

The versatility of Python

Let me tell you what I’ve created this semester (3 months at this point) using both Python 2.7 and IPython Notebook:

– An ‘Osotua Giving’ simulation, modeling behaviors among Maasai herders. We then used this to see how node connectivity can influence survival rates. This also taught me a great deal about data storage, dictionaries, and pickles.
– A module of my very own named bartle.py, used to log specified lines and variables of any script. I (sort of) named it after Herman Melville’s famous scrivener and I’ve used it on most of my code for debugging (including the ‘Osotua’ simulation which you can see more about at  Osotua Fortress).
– Several brute force methods of making enormous data sets accessible. There is a network edge chart for the Twitter study that I initially attempted to load and manipulate as one variable (36 gigs), with predictably poor results. I then went back and approached it more mindfully and now it has been put to work. The results have been some fantastic and useful network graphs.
– Beautiful plots. I am an expert level Excel user and I don’t use it for anything but work and budgets anymore.  Amazingly, iterating over an array is more useful than auto-filling formulas down a column.  The only real usefulness of Excel now is pivoting on data, although I’m sure I’ll find a way to do such a thing in Python eventually.
– An egg timer. At one point I was getting so lost in my projects that I’d be spending too much time on one thing and lose a whole afternoon. So I wrote 8 lines of code to remind me every 30 minutes or hour, depending on what I specified, to get up and do something else.
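To give a feel for how small the egg timer can be, here is a sketch of the idea. It is not the original 8 lines: it assumes Python 3 rather than 2.7, and the name `egg_timer` and its `sleep` parameter (which lets you swap out the waiting while trying it) are my own inventions for this example.

```python
import time


def egg_timer(minutes, reminders, sleep=time.sleep):
    """Print a nudge every `minutes` minutes, `reminders` times in total."""
    for count in range(1, reminders + 1):
        sleep(minutes * 60)  # wait out the interval, in seconds
        print(f"Reminder {count}: get up and do something else!")
    return reminders


# Example: egg_timer(30, 4) nudges you every half hour for two hours.
```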

Python has become a huge part of my academic work, and I’m attempting to extend that further into other areas.  I’ve begun to develop a list of projects to do once I graduate in order to grow my abilities with the language.  Are you a Python user?  If so, please get in touch/follow/leave a comment and let me know what you’re working on.  It could be personal or professional, expert level or just tinkering.  I’d love to see what Python can do!

I leave you with the coolest thing I’ve been able to do so far with the networkx module, finding the shortest distance between two points on a connected grid.  Cheers!

This uses explicit positioning in networkx to find the shortest distance between 2 points on a grid.
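The idea behind that picture can be sketched without any extra libraries. The code below is not the networkx version from my project. It is a stand-in that runs a plain breadth-first search over a grid, and the name `shortest_grid_path` is made up for this example. In networkx itself, `grid_2d_graph` and `shortest_path` cover the same ground.

```python
from collections import deque


def shortest_grid_path(rows, cols, start, goal):
    """Breadth-first search over a rows x cols grid of (row, col) points."""
    came_from = {start: None}  # remembers how we reached each point
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            in_grid = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if in_grid and nxt not in came_from:
                came_from[nxt] = node
                queue.append(nxt)
    if goal not in came_from:
        return None  # the goal is not on the grid
    # Walk backwards from the goal to rebuild the path.
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = came_from[node]
    return path[::-1]


print(shortest_grid_path(4, 4, (0, 0), (3, 3)))
```

Because every step on the grid costs the same, the first time the search reaches the goal it has already found a shortest route.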