As with business data today, there seems to be an endless supply of sports data out there. How far, how fast, how accurately, at what times and temperatures, from which part of the field or court or pitch, for what salary and at what age?
If the Big Game holds to usual American football form, a cornerback or wide receiver will run the highest total distance of any player during the course of the game—around 1.25 miles per game is typical.
That’s not a lot compared to an NBA basketball player: Toronto Raptors guard Fred VanVleet leads the league with an average of 2.77 miles run per game over the first 52 games of the 2022/23 season. Of course, even this pales in comparison to the 10+ miles Croatia’s Marcelo Brozovic ran in one soccer match against Japan in the 2022 World Cup (breaking his own World Cup record).
Once the NFL season comes to an end with the Super Bowl, of course, there won’t be current games to write about for several months. So how do analysts in the sports industry continue to find, extract, and tell stories from overwhelming volumes of data?
To find out, we spoke with journalists Aaron Schatz, founder of Football Outsiders, and Steve Doig, longtime professor of data journalism at Arizona State University, who provided some insights into how to tell compelling stories with data.
The metrics firehose
In his reporting days, Doig didn’t focus on sports. “As an investigative reporter, to get data I had to use FOIA [the Freedom of Information Act], or sometimes go to court. On the sports side, today, you get the firehose,” he says.
Even in sports, of course, it wasn’t always this way. Schatz says that before the turn of the century, journalists as well as NFL teams and coaches mostly used very blunt measurements to analyze the game.
“Now, there are a ton of people in football analytics,” he says. By creating DVOA, which stands for “Defense-Adjusted Value Over Average,” Schatz and associates contributed to moving the discipline forward. DVOA compares the results of anything, from a single play to a full team over a whole season, against the league average in comparable situations: down-and-distance, score, time left in the game, and more.
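To make the idea of comparing results against the league average in comparable situations concrete, here is a minimal sketch. It is not the DVOA formula, which is not spelled out here. The play records, the situation keys, and the function names `league_baselines` and `value_over_average` are hypothetical, and every number is invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical play log: (team, down, distance_bucket, yards_gained).
# All values are invented for illustration only.
PLAYS = [
    ("A", 1, "long", 6),
    ("B", 1, "long", 4),
    ("C", 1, "long", 2),
    ("A", 3, "short", 1),
    ("B", 3, "short", 3),
    ("C", 3, "short", 5),
]


def league_baselines(plays):
    """Average yards gained for each (down, distance) situation."""
    by_situation = defaultdict(list)
    for _team, down, distance, yards in plays:
        by_situation[(down, distance)].append(yards)
    return {situation: mean(yards) for situation, yards in by_situation.items()}


def value_over_average(plays, team):
    """Mean fractional difference between a team's plays and the baseline.

    Assumes every baseline is non-zero, which holds for the sample data.
    """
    baselines = league_baselines(plays)
    diffs = [
        (yards - baselines[(down, distance)]) / baselines[(down, distance)]
        for play_team, down, distance, yards in plays
        if play_team == team
    ]
    return mean(diffs)


for team in ("A", "B", "C"):
    print(f"Team {team}: {value_over_average(PLAYS, team):+.1%} versus average")
```

The point of the sketch is the grouping step: a six-yard gain is judged only against other first-and-long plays, not against every play in the league.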
DVOA is now one of many advanced metrics applied across different sports. Soccer analysis and coverage often feature variations on xGa, for “eXpected Goals Against,” which reflects the quality of shots created, taken, scored, or stopped.
The rise of player and movement tracking systems such as early-2000s pioneer SportVU, which applied missile-tracking technology originally developed by the Israeli military, underlies many of these developments.
The net result of all this data sprawl is that a sports journalist sitting down to work on any given morning has access to these and other advanced metrics and raw data points, published by professional leagues, media sites, and subscription services. Increasingly, they can get even more from their own data analysts. “Today every media organization needs people to do this,” Doig says.
Finding and telling stories with data
Just having data to throw at the audience doesn’t guarantee that they’ll listen. Here are six journalism strategies for diving into that data and turning it into something that grabs the audience’s attention:
1. Start with a question
“The best way to find interesting data is to start with a ‘gee, I wonder’ question,” says Doig. “Imagine the interesting tidbit on which you could hang a story,” then dive into the data to see what’s there.
For example, he says, Axios publishes a sports data newsletter that often leads with questions such as “Who’s the only non-quarterback this century who won the Most Valuable Player award?” Even if the question has a simple answer, it can lead the audience to engaging stories and comparisons.
Of course, if the question isn’t interesting, then the resulting story is also likely to fall flat.
2. Contest common wisdom and perceptions
In 2020, Schatz says, Kansas City Chiefs quarterback Patrick Mahomes was much more effective against man defense than against zone. In the following season, his overall stats declined, and many in the media made an obvious connection: “The most popular story of 2021 was that cover-2 (zone) defenses have slowed down Patrick Mahomes.”
It was a story that passed the sniff test, Schatz says. But it was simply wrong.
Teams were indeed playing more cover-2 defenses against Mahomes; that much was correct. However, he actually averaged more than one full yard more per pass attempt against zone defenses than he did against man-to-man defenses, according to Schatz.
“What our eyes told us was completely wrong,” he says.
Schatz agrees that many of the best stories emerge when the data either completely backs up what people think they see or completely contradicts it.
“It looks like the Chargers offense has been neutered because they aren’t throwing downfield any more. Let me find some data that demonstrates that. That’s a good story,” he says. But so is “The Vikings’ record is nine wins, two losses, but here are some stats that say they’re actually a below-average team. So the story is why, and what does that mean going forward?”
3. Look for surprises
According to Schatz, “There’s no consistency year to year about which quarterbacks are good against zone or man-to-man.” Each year, different QBs rise to the top of performance rankings playing against either style of defense.
That’s something Schatz cites as a surprising bit of data, and surprises are among the best places to find stories. “Then you try to figure out, why is that the case?” he says. And the why, or sometimes even the process of examining it, is where a valuable and compelling story may be hidden.
An abrupt change, a performance that stands out, a data point that bucks the trend—those are potential hooks for a story. But data analysts also know that outlying data points do require some care. “Outliers do catch people’s attention, absolutely.” You have to remember statistical significance. “Outliers based on 50 pass attempts are interesting, maybe; 150 attempts are more meaningful. You don’t write about something based on six attempts,” Schatz says.
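A rough sketch of why the number of attempts matters: the uncertainty around an average shrinks with the square root of the sample size. The spread of 10 yards per attempt, the normal approximation, and the function name `margin_of_error` are assumptions made for illustration, not figures or methods from Schatz.

```python
import math

# Assumed spread of yards gained on a single pass attempt.
# This value is a placeholder for illustration, not a measured figure.
ASSUMED_STD_DEV_YARDS = 10.0


def margin_of_error(attempts, std_dev=ASSUMED_STD_DEV_YARDS, z=1.96):
    """Approximate 95% margin of error for a mean over `attempts` plays."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return z * std_dev / math.sqrt(attempts)


for attempts in (6, 50, 150):
    margin = margin_of_error(attempts)
    print(f"{attempts:>3} attempts: +/- {margin:.1f} yards per attempt")
```

Under these assumptions, the margin at six attempts is several times larger than at 150, so an apparent outlier built on six attempts could easily be noise.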
4. Invite readers in for more
Sometimes a good data visualization stands on its own. In those cases, Doig notes, “the reader should be able to look at the graph and immediately see the point.”
If the difference between numbers or lines is subtle, then color, size, or placement of visual elements may be needed to provide that at-a-glance clarity.
For more complex data sets and interactive presentations, the storytelling job may be about summarizing data in a way that invites more investigation. Doig’s example is from basketball. “When Kobe Bryant retired, the LA Times did a full-page heatmap of every shot he has taken from the floor, each represented by a dot.” That’s 30,699 dots. “It was a great graphic; people would really dive in,” he says. (The paper also published a glimpse at its process of creating that visualization, which dataheads may find a good story in and of itself.)
In the soccer world, visualizations can invite viewers to examine the style of play for Premier League teams, and the parts of the pitch where they are most effective.
5. Imitate examples from all over
Doig says there are lessons and examples in data storytelling everywhere, both for the data expert who wants to do a better job telling engaging stories and for the natural storyteller looking for an engaging way to learn analytical methods.
He mentions sources inside and outside the sporting world, including:
- Thomas Severini’s book Analytic Methods in Sports, applying lessons from “the dismal science” of economics
- Freakonomics, the best-seller subtitled “A Rogue Economist Explores the Hidden Side of Everything”
- John Allen Paulos’s A Mathematician Reads the Newspaper as well as Innumeracy, both of which “are filled with bad examples of using math—usually by journalists,” Doig says.
6. Always put your audience first
Schatz’s background includes a stint in public radio. What lessons from that experience apply in data storytelling? He mentions two.
The first thing is to serve your audience, who will be tasked with using the data to make decisions. “Don’t do what you like, do what they want,” he advises.
The second thing Schatz learned working in radio was “how to be succinct, how to summarize complex statements with shorter statements.”
Even when it’s tempting to expound in great detail, it’s better to be concise and let the audience’s questions determine where to delve deeper into the story. “Understand, I can go on forever if I want to,” Schatz says with a laugh, “but I try not to. You have to be succinct.”