All Aboard

Posted by Liz A. on May 16, 2019

I’ve been thinking about data science for a few years. I don’t remember exactly when it started because it was such a gradual thing. I like math and I like asking questions, so I often search out answers to all kinds of questions like:

  • how many mice could fit in my car? (My daughter suggested buying mice to fill up the car and then we could just count them. I said that might be expensive and she said we only needed to buy two: one boy and one girl. She’s a problem solver.)
  • do the different octanes of gasoline have a linear relationship compared to cost? compared to…something else? “Wait, I don’t even know anything about what ‘octane’ really means”…which was followed by a long journey on Wikipedia.
  • how many sunflower seeds weigh the same as a bluejay? (This got pretty complicated. Turns out, not every sunflower seed weighs the same. Same with bluejays.)
  • how old would I be if I was 1 trillion seconds old? (I would be dead. By a lot. Interestingly, 1 million seconds is only about 11 and a half days. 1 billion is about 31 years.)

And so on.
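
The seconds question is the easiest one on that list to double-check, so here is a quick sketch of the arithmetic in Python. The function names are my own, and I'm assuming an average year of 365.25 days to account for leap years, so the year figures are approximate.

```python
SECONDS_PER_DAY = 60 * 60 * 24   # 86,400 seconds in a day
DAYS_PER_YEAR = 365.25           # assumption: average year length, leap years included


def seconds_to_days(seconds):
    """Convert a number of seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a number of seconds to (approximate) years."""
    return seconds_to_days(seconds) / DAYS_PER_YEAR


for label, seconds in [("1 million", 10**6), ("1 billion", 10**9), ("1 trillion", 10**12)]:
    days = seconds_to_days(seconds)
    years = seconds_to_years(seconds)
    print(f"{label} seconds is about {days:,.1f} days, or about {years:,.1f} years")
```

With those assumptions, a trillion seconds works out to somewhere between 31,000 and 32,000 years, which is why I would be dead. By a lot.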

Anyway, during one of my Google searches I ended up at this article: “Why ‘Random’ Shuffle Feels Far From Random”. The article explains that people don’t actually like random playlists because our brains are really good at picking up on anything that looks remotely like a pattern. So, “random” doesn’t actually feel random to us because we notice things like “Hey, the last song was by this same artist!” We expect random to mean that, from one song to the next, there aren’t any connections - not in the artist, not in the genre, nothing. That means that the song-choosing algorithm can’t actually randomly select a song - it has to figure out which songs don’t have major things in common and choose those. The article doesn’t say this, but it’s easy to imagine that the algorithm also learns things as the user skips songs or favorites them. It was a really interesting article to read, and it was my first real glimpse of what sort of things data science can do.
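
To make that idea concrete for myself, here is a tiny sketch of a shuffle that tries not to play the same artist twice in a row. This is only my own guess at the general idea, not the algorithm from the article or from any real music service. The function name `spread_out_shuffle` and the song and artist names are made up for illustration.

```python
import random


def spread_out_shuffle(songs, rng=None):
    """Shuffle (title, artist) pairs, avoiding back-to-back artists when possible.

    Hypothetical sketch: at each step, pick randomly among the remaining songs
    whose artist differs from the previous pick. If every remaining song is by
    that same artist, fall back to picking any of them.
    """
    rng = rng or random.Random()
    remaining = list(songs)
    playlist = []
    while remaining:
        last_artist = playlist[-1][1] if playlist else None
        candidates = [song for song in remaining if song[1] != last_artist]
        choice = rng.choice(candidates or remaining)
        remaining.remove(choice)
        playlist.append(choice)
    return playlist


# Made-up library: titles and artists are placeholders.
library = [
    ("Song A", "Artist X"),
    ("Song B", "Artist X"),
    ("Song C", "Artist Y"),
    ("Song D", "Artist Y"),
    ("Song E", "Artist Z"),
]

for title, artist in spread_out_shuffle(library):
    print(f"{title} by {artist}")
```

Because it only looks one song back, this can still end on two songs by the same artist if those are the only ones left. A real player would presumably look at more than the artist (genre, album, what the listener skips), but even this small change makes the result less random on purpose.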

The next big milestone in my data science journey was taking an online course about big data sets and SQL. Before that, I had only ever worked with Excel and very small datasets (with 100 data points or fewer). During the course, I got the chance to analyze a gigantic dataset (300,000 data points!) about Martian craters. Here’s one of my write-ups about it: Visualized Data. I very much enjoyed exploring the data and drawing conclusions about it (even if they were really basic conclusions). Every step counts in a journey, though, right?

Then my personal life got really busy (moving, getting married, changing jobs, etc.) and I put data science on the back burner for a while. I kept reading and thinking about it, and over the past couple of years my interest has just continued to grow until I was finally ready for the next milestone: signing up for a data science bootcamp.

So here I am, ready to learn and ready to change.