Stephen Baker

The Numerati

Baseball to see new data explosion

April 4, 2011

In the beginning, there was the hit. Then the strikeout, the RBI, the batting average, the run scored, the win, the loss. This was the first generation of baseball data. It was the universe occupied by dead-ball era players, like Ty Cobb and Honus Wagner. Sometime early on, the first Numerati of the sport started to crunch some numbers. If each pitcher were to go a full nine-inning game, how many earned runs would he let in? That led to the earned-run average, or ERA. When I was a kid, the ERA was one of the more sophisticated numbers that I memorized from the back of baseball cards (such as the one of Johnny Callison, one of my early favorites).
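For readers who want to see the arithmetic, the ERA idea can be sketched in a few lines of Python. This is only an illustration of the standard formula (earned runs scaled to nine innings). The function name and the sample numbers are made up for the example, and it assumes innings pitched are given as an ordinary number rather than in the box-score ".1/.2" notation for thirds of an inning.

```python
def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Scale a pitcher's earned runs to a full nine-inning game.

    Assumes innings_pitched is a plain number (e.g. 6 + 2/3 for six
    and two-thirds innings), not box-score notation like 6.2.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical season line: 70 earned runs over 210 innings.
print(round(earned_run_average(70, 210), 2))  # prints 3.0
```

A pitcher with that line would be expected to give up three earned runs if he went the full nine.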

The second generation, as Michael Lewis described in Moneyball, came about in the '90s. Number-crunchers started to develop new enhanced statistics, which took on a life of their own at Baseball Prospectus. They brought in loads of new variables, and analyzed correlations. They could calculate, for example, the AEqR, "the number of equivalent runs scored by a team, adjusted for their opponents' pitching and defense."

But now comes the sensor revolution, which will bring to baseball (and the rest of our lives) mountains of new statistics. These, as Ira Boudway writes at Bloomberg, will measure players not by the traditional route--results--but instead by monitoring and measuring their behavior. The new monitoring, already in place at San Francisco's AT&T Park, is called Fieldf/x. Boudway writes:

Fieldf/x is a motion-capture system created by Chicago-based Sportvision. It uses four cameras perched high above the field to track players and the ball and log their movements, gathering more than 2.5 million records per game. That means you could find out whether Ichiro Suzuki truly gets the best jump on fly balls hit into the right-field gap, or if Derek Jeter really deserved that Gold Glove last year.

It's with systems like this that the Numerati establish their hegemony over businesses, including baseball. The reason is that the statistics are so rich and varied that only experts with advanced computer skills can analyze them. Of course, eventually, they build and sell the software to widen the markets to the rest of us. But their systems come to define the game.

In the end, though, they're still stats. While the correlations they find may define the past, there's no guarantee they'll predict the future. Does this mean that the old-fashioned gut will prevail? I'd say no. But the arguments about baseball, about whether Jeter is worth his contract or whether the Phillies should have traded J.A. Happ for Roy Oswalt, will now rage in the realm of enhanced stats. And, of course, they'll never replicate the complexity of the real game, where daydreams, hangovers, and even the appetite of a single mosquito can change the course of a pitch, a catch, a game, and a season. There will always be a push to collect more information, to come up with yet another generation of stats.



