The Sloan Sports Analytics Conference is in town, and we're going to spend most of the weekend talking about sports analytics. Let's take a moment to talk about how we talk about sports analytics.
It's an exciting time for sports analytics. With next-gen data collection platforms operational in all of the major sports, the question is no longer whether analytics will have a role to play in sports, but how. With this legitimacy comes a need to move the conversation about analytics forward. From here on, it will be less important to fascinate audiences with the possibilities of analytics (that sale's been made), and more important to ground the conversation in intuition and practicalities; ultimately, we'll need a conversation that's less TED talk and more textbook. It's time to focus on how analytical methods can contribute meaningfully to sports decision-making and storytelling.
This weekend's conference will be a great place to push the discourse in the right direction. As we navigate the slew of analytical methods and products that are being shopped around, here are some of the questions we'll be asking.
What's your argument and why should I believe you? It's true that numbers don't lie, but that's because numbers don't make arguments. Like all science, sports analytics is numbers plus rhetoric -- by highlighting a particular metric or measurement, you are implying that this summary carries some insight about the sport that generalizes beyond the games we've seen. What sorts of predictions could be made on the basis of your metric and how do you justify that extrapolation? How could we test those predictions? The objectivity in analytics doesn't come from doing math; it comes from testing predictions against ground truth.
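To make "testing predictions against ground truth" concrete, here is a minimal sketch of one common check: does a metric measured in the first half of a season predict the same metric in the second half? Everything in it is an assumption for illustration. The data are simulated, and `simulate_halves`, `skill_sd`, and `noise_sd` are hypothetical names, not part of any real dataset or library.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_halves(n_players, skill_sd, noise_sd, rng):
    """Simulate a metric for each player in two half-seasons.

    Assumption: each observed value is a fixed player skill plus
    independent noise that is redrawn in each half.
    """
    skills = [rng.gauss(0, skill_sd) for _ in range(n_players)]
    first = [s + rng.gauss(0, noise_sd) for s in skills]
    second = [s + rng.gauss(0, noise_sd) for s in skills]
    return first, second


rng = random.Random(0)

# A metric driven mostly by skill: the first half should predict the second.
r_stable = pearson(*simulate_halves(500, skill_sd=1.0, noise_sd=0.5, rng=rng))

# A metric driven mostly by noise: the first half tells us little.
r_noisy = pearson(*simulate_halves(500, skill_sd=0.2, noise_sd=1.0, rng=rng))

print(f"skill-driven metric, half-to-half correlation: {r_stable:.2f}")
print(f"noise-driven metric, half-to-half correlation: {r_noisy:.2f}")
```

A metric whose first-half values say little about its second-half values is describing what happened, not making a claim that generalizes to games we haven't seen yet.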
What's the decision you want to support? Every metric or analysis is trying to support a decision, whether it's for team operations (Should I trade for this player?) or entertainment (Should I call this player the greatest of all time?). Modeling that decision is just as important as modeling the data. Every decision involves an implicit comparison between counterfactuals -- for example, if you're trading for a player, you need a sense of how your team's performance would differ between having player A versus player B. How does your metric help a decision-maker fill out both sides of that comparison?
Where is the variation coming from and which sources of variation matter? Analytics is often sold as an exercise in separating signal from noise, but signal and noise aren't fixed concepts. Depending on the story being told or the decision being made, different types of variation can be classified as either. If you're looking to help a GM pick up a player on the free agent market, you'll be most interested in the "portable" skills that the player can bring to your team, while you'll want to filter out variation attributable either to the player's former team or to luck. If you're writing a story about the best performances of a given season, you may not care so much whether a player was lucky or good -- in the end, all that mattered was that the contested shot at the buzzer went in. Telling a coherent story with analytics requires a clear definition of what numerical variation matters to you and what variation doesn't. See more of our thoughts on this in our Meta-Analytics work.
Sports analytics is coming of age, but to really reach maturity, conversations about analytics have to become more organized and rigorous. Making this transition can be painful -- introducing accountability always is -- but that's the price of moving from the armchair to the front office.