Even True Statistics Lie

(published July 21, 2011)

“Past performance is not a guarantee of future results.”

That disclaimer should appear, early and often, alongside every piece of sports commentary.

Athletes are not lab rats, and human performance is not dictated by some discernible natural law.

Today’s exhibit:  the 2011 Minnesota Twins.

At the end of May, Tom Verducci wrote on SI.com that it takes just two months for the contenders and the pretenders to sort themselves out in baseball, and the Twins were one of eight teams already doomed to miss the playoffs.

He offered three pieces of statistical analysis for the 15 full wild-card seasons from 1996 to 2010, examining where every team stood on May 31:

(1) Of the 143 teams that were at least five games under .500 at the end of May, 136 did not make the playoffs, a 95.1 percent failure rate.

(2) Of the 184 teams at least five games out of a playoff spot – any playoff spot, division or wild-card – on May 31, 174 missed the playoffs, a 94.6 percent rate.

(3) Of the 126 teams that were both five games under and five games out, 121 didn’t make the playoffs, a 96.0 percent rate.
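For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes the three rates from the counts quoted above. The group labels and the function name are mine, not Verducci's, and the "about 1 in N" figure is just the same numbers turned around to count the teams that did make it.

```python
# (label, teams in the group, teams that missed the playoffs), as quoted above.
# The labels are illustrative shorthand, not Verducci's wording.
groups = [
    ("five games under .500", 143, 136),
    ("five games out of a playoff spot", 184, 174),
    ("both five under and five out", 126, 121),
]


def failure_rate(total, missed):
    """Percentage of teams in the group that missed the playoffs."""
    return 100 * missed / total


for label, total, missed in groups:
    made = total - missed  # teams that reached the playoffs anyway
    print(
        f"{label}: {failure_rate(total, missed):.1f}% missed, "
        f"{made} of {total} made it (about 1 in {total / made:.0f})"
    )
```

Read from the other side, the same counts say that 7, 10, and 5 teams in those groups reached the playoffs anyway.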

“Now you understand why the Twins are done,” Verducci wrote immediately after citing these facts.  “At 17-35, they are 18 games under .500 and 12 games out of a playoff spot.”

He wrote that on May 31.  The Twins lost that night, falling to 17-36, deader still.

Since then, the Twins are 29-15 – an outstanding but not outlandish record – raising them to 46-51, five and a half games out of first.  They’re in fourth place in the AL Central; one of the teams ahead of them, at 47-50, is the Chicago White Sox, who also met Verducci’s “five-and-five” criteria on May 31.

Would it be shocking if either of these teams won the division?  Of course not.

I don’t mean to single out Verducci, who is an excellent writer and reporter.  There are any number of people I could have cited who have been just as guilty of using statistical analysis to achieve an impossible certainty.

If he had written that the odds were against Minnesota coming back from its dreadful start, that would have been true, undeniable, and pretty dull.  If he’d ignored the “five-and-five” thing and looked merely at their record of 17-36, finding how many teams had reached the playoffs after falling 19 games under .500 by June 1st, he could have come to a similar conclusion.

So what?

It’s only happened five percent of the time over a fifteen-year period?  Then it’s happened one time in twenty.  It’s not impossible.  It’s not likely, but they’re not “done.”

And even if it has never happened, that doesn’t mean it can’t.  Ask the 2004 Yankees.

A season takes place in the present.  We can study, analyze, debate, and theorize to our hearts’ content, but two sets of people are going to play the games, and nothing is necessarily going to happen.  Anomalies abound.  Mistakes are made; great performances can come out of nowhere.  The percentages are likelihoods, but they aren’t facts.

Don’t get me wrong: I love sabermetrics, and my understanding of the game has been greatly enhanced by the writings of Bill James and Elias and Baseball Prospectus.

I can tell you it’s odd that the Twins have outscored their opposition by just 26 runs during their 29-15 stretch, so they’ll probably fade; that the Pirates are unlikely to maintain their current pace with a pitching staff that ranks last in the league in Ks per nine innings, and an offense with just one everyday player whose OPS is over .800; that the Giants’ .684 record in one-run games (26-12) will probably not continue – but those are predictions, not facts.

Unlikely things happen in baseball all the time.

The ultimate Revenge of the Nerds has been the extent to which their analytical methods have taken hold in the game.  “Moneyball” has become a shorthand term for this kind of breakdown of performance into its discrete component parts – surely an oversimplification of Billy Beane’s thinking as depicted in Michael Lewis’s book.  But ask yourself: Did you ever imagine a baseball movie in which Brad Pitt’s starring role would be the general manager?

In the long run, the generalizations are by and large true.  It’s hard to pitch effectively while striking out very few batters.  Stolen bases are a small marginal gain for an offense, but only if the success rate outweighs the damage done by a caught stealing.  Catchers and second basemen don’t age well.

Still, Randy Jones won a Cy Young Award while striking out fewer than three batters per nine innings; the most important play of the 2004 ALCS was a stolen base; and Carlton Fisk caught 465 games after turning 40 and had a .780 OPS in those six seasons.

One hundred and sixty-two games make for a long season, but they’re not “the long run.”

Statistics are indicative, not determinative.  Stuff happens, which is exactly why we watch.

People are not statistics, no matter how well we’re able to define them.



