Bootstrapping your big data training - how I grade The Great Courses' big data analytics course

By Jon Reed, January 27, 2017
Summary:
To polish my analytics chops, I took a 24-lecture analytics course online. Here's my review of The Great Courses' Big Data Analytics offering - as well as some big data misconceptions highlighted by the instructor. Short version: this course works well for certain skill levels, but not others.

Career confessional: for a little while there, I considered "data science." I thought it might make a good backup plan if readers hated diginomica. Fortunately, readers don't hate us, because it turned out that becoming a kickass data scientist is really freakin' hard.

Between mastery of programming languages like R and fluency in advanced math/stats, I saw too many hurdles outside my core. So I decided to tip my cap in respect instead. My new plan: add a stronger analytics component to my enterprise commentary. Learn how to better apply numbers to decision making. And that's where education like Big Data: How Data Analytics Is Transforming the World from The Great Courses comes in.

I'm a big continuous learning guy. Formal degrees have their place, but give me a course I can take on the fly, especially with downloadable audio I can consume on the tarmac. You can get plenty of big data education for free online, but if you're like me, you don't mind paying a reasonable price for multi-media access. So here's my review, along with a few analytics takeaways.

The course itself has mixed reviews, so you need to know what you're getting into. The Great Courses bills the course as:

  • "A course for data users at all levels"
  • "You need no expertise in mathematics to follow this exciting story."

The second point on math expertise is definitely true. The first point is questionable.

The course is taught by Timothy Chartier, who is an applied mathematician with a focus in computer science and a frequent author. His specialty is sports analytics, so the course is chock full of entertaining examples drawing on Chartier's work with the NBA, ESPN's Sport Science, and fantasy sports.

There are twenty-four lectures in all (30 minutes apiece), delivered in an engaging manner by Chartier. Chartier has a passion for big data that rivals your cousin's passion for Tool or your niece's obsession with Taylor Swift. I always say if an instructor isn't bored, you probably won't be either.

You can view the chapter overview here, but the course gets into all the expected topics, from algorithms to sentiment analysis, from decision trees to clustering.

This is not the type of course where I can offer a simple thumbs up or down. It's all about your agenda and skill level.

Course strengths:

  • Accessible language and plenty of field examples make it easy to follow along. This is a much better course than it would be if Chartier were just a professor and not a practitioner who regularly builds his own data sets.
  • A solid overview of the essential topics of modern analytics, without obvious omissions.
  • Sense of humor without a forced comedy routine.
  • Frequent acknowledgement of the challenge of deriving value from data and where analytics can fall short.
  • Plenty of recommendations for open source data sets, and tools you can use to crunch your own.

Course weaknesses:

  • Lack of deep drill down on any one data science technique. Definitely an overview course.
  • The field examples are often tied to sports. Sports is entertaining, but as we've learned the hard way, from a predictive standpoint sporting outcomes are perpetually elusive. Chartier does cite examples from other industries, but you're going to hear a lot about March Madness brackets. He probably should have spent more time on Moneyball, which laid the groundwork for the advanced statistics that impact pro leagues today on just about every level, from talent scouting to in-game strategy.
  • The lack of advanced math minutiae is great for the layperson, but not for mathematicians or seasoned data professionals.

My recommended audience:

Looking at the negative reviews (the course averages 3.5 out of five stars), I think The Great Courses got themselves into trouble by billing this course as ideal for data users at all levels. If you dig into the negative reviews, most of them hammer at the basic level of the info. As in this review:

Unless you've been living under a rock for the past ten years, I must agree with other reviewers that you will not learn anything you don't already know about data management and analysis in this 24 lecture offering.

This five star review summed up the other side:

This is an excellent concept for a layperson what wants to understand what the "Big Data" fuss is about.

Therefore, I recommend this course for:

  • Executives who are under pressure to acquire a stronger analytics bent
  • Enterprise workers who find themselves interacting with analysts and want to ask them better questions
  • Those who are new to big data/analytics topics and want a solid overview
  • Anyone who likes to review fundamentals before moving on to advanced concepts (that's where I fall in)

I don't recommend this course for anyone who is already active in data analytics or data science. If you have any doubt, review the chapter headings. If you have some idea of most of the topics listed, you'll want to move on. I would also caution anyone who really dislikes sports to steer clear (that's the second reviewer gripe - narrow industry examples).

Before I wrap, I'd like to give you a conceptual sense of how Chartier approaches big data and its impact. He does a good job of explaining why this explosion of data happened, and why this trend has business and cultural significance. If anything, Chartier is a bit over-the-top with his excitement over petabytes. We get loads of data fetish stats like: "the amount of photos uploaded to Facebook in fifteen minutes is greater than the number of photographs stored in the New York public photo archives," as if that fact improves our quality of life in some way.

The best parts are when Chartier talks about the projects with his students that DIDN'T work, and the faulty assumptions they made and uncovered. You learn that when analyzing data sets, you have to question assumptions and try new tactics to test your premise. In chapter one, Chartier does a good job correcting six big data misconceptions:

  • Data analysis gives you an answer, not the answer. Chartier: "Unlike math, data analytics does not get rid of all the messiness. So, you create an answer anyway and try to glean what truths and insights it offers. But it’s not the only answer."
  • Data analysis requires your intuition as a data analyst. You are not simply crunching numbers.
  • There is no single best tool or method. Many times, figuring out which tool to use is part of the art of data science.
  • You do not always have the data you need in the way you need it. Just having the data is not enough. It may have errors, or be incomplete, or need to be merged. Cleansing data or getting it into the right format can be a big deal.
  • Not all data is equally available. The Internet has loads of freely available data sets, but sometimes the data you need most is in closed or restricted systems.
  • While an insight or approach may add value, it may not add enough value. Not every new and interesting insight is worth the time or effort needed to integrate it into existing work. And no insight is totally new: If everything is new, then something is probably wrong.
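To make the fourth point concrete, here's a rough sketch of what "cleansing and merging" can look like on a tiny scale. Everything in it is my own illustration, not from the course: the `merge_and_audit` helper, the field names, and the sample records are all made up.

```python
def merge_and_audit(primary, secondary, key, required):
    """Merge two lists of dict records on `key` and report records with gaps.

    Hypothetical helper for illustration; not from the course.
    """
    lookup = {row[key]: row for row in secondary}
    merged, problems = [], []
    for row in primary:
        combined = dict(row)
        # Fill in fields from the secondary source only where primary has a gap.
        for field, value in lookup.get(row[key], {}).items():
            if combined.get(field) in (None, ""):
                combined[field] = value
        # Record which required fields are still missing after the merge.
        missing = [f for f in required if combined.get(f) in (None, "")]
        if missing:
            problems.append((row[key], missing))
        merged.append(combined)
    return merged, problems


# Made-up records: one source has a missing value, the other lacks a team.
scores = [
    {"team": "Alpha", "points": 78},
    {"team": "Bravo", "points": None},
    {"team": "Charlie", "points": 64},
]
venues = [
    {"team": "Alpha", "venue": "home"},
    {"team": "Bravo", "venue": "away"},
]

merged, problems = merge_and_audit(
    scores, venues, key="team", required=["points", "venue"]
)
for team, missing in problems:
    print(f"{team}: missing {', '.join(missing)}")
```

Even with three records, two of them come out incomplete after the merge, which is the kind of thing you want to know before you start analyzing.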

Chartier tells of a classic glitch his data group at Davidson ran into:

We built a model that initially made complete sense to us for sports ranking. We were proud of the result and excited to see what results it would produce. But then we ran it; we didn’t recognize any of the top teams. What happened? We couldn’t just trust the results and leave it there. Instead, we trusted our intuition and doubted the work.

Scouring their weighted assumptions uncovered the problem:

Our method was giving high ranks to very, very weak teams that lost a lot... But the data analysis wouldn’t know; it’s like an opinion, and our method was creating a very sketchy opinion at best.
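Chartier doesn't spell out the model in the passages I've quoted, so I can't reproduce it. But the kind of gut check his group applied can be sketched in a few lines. In this hypothetical example, the `Team` class, the `suspicious_top_teams` function, the 50% win threshold, and all of the records and scores are my own assumptions, purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    wins: int
    losses: int
    model_score: float  # output of whatever ranking model is being checked


def suspicious_top_teams(teams, top_n=3, min_win_pct=0.5):
    """Return names of teams the model ranks in its top_n despite a losing record."""
    ranked = sorted(teams, key=lambda t: t.model_score, reverse=True)
    flagged = []
    for team in ranked[:top_n]:
        games = team.wins + team.losses
        win_pct = team.wins / games if games else 0.0
        if win_pct < min_win_pct:
            flagged.append(team.name)
    return flagged


# Made-up records and scores, for illustration only.
teams = [
    Team("Alpha", wins=25, losses=5, model_score=0.61),
    Team("Bravo", wins=4, losses=26, model_score=0.93),
    Team("Charlie", wins=20, losses=10, model_score=0.58),
    Team("Delta", wins=3, losses=27, model_score=0.88),
]

print(suspicious_top_teams(teams))  # prints ['Bravo', 'Delta']
```

A check like this won't tell you why the model is wrong. It just flags results that contradict what you already know, which is your cue to go back and question the assumptions.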

Final thoughts

I liked Chartier's attention to data privacy. He didn't let his excitability about big data blind him to the consequences of data breaches and privacy encroachment. Chapter 23, on data privacy, is one of the course highlights.

This course is not available in audio-only format. I do like The Great Courses phone apps though. You can download or stream the video.

For those who finish this course or who are already beyond it, there's good news. Another Great Courses offering, Mathematical Decision Making: Predictive Models and Optimization, is geared for more advanced students, and has a slew of five-star reviews. The price is right - to me anyway. That's where I'm headed next.