Sunday, July 19, 2009

Fun With (Retrosheet's) Numbers

Great piece in Friday's WSJ on how statisticians have gone nuts with baseball and its voluminous amounts of data. In "Baseball Veers Into Left Field," Austin Kelley has pulled some of the more curious highlights:

When baseball dubbed shortstop Harold Reese "Pee Wee" and first baseman Fred Merkle "Bonehead," they probably weren't trying to lengthen the players' lives. But according to researchers at Wayne State University, major-league players who have nicknames live 2 1/2 years longer, on average, than those without them.

The nickname findings are part of the wide-ranging and often arcane academic research that deals with the national pastime. In another study, we learn that players whose first or last name begins with "K" strike out more than those without "K" initials. And in case you were wondering, research finds Democrats support the designated-hitter rule more than Republicans.

As any numbers geek knows, baseball has always been the wonkiest of sports, rife with statistics and theories. Whether baseball purists like it or not, scores of analysts and number crunchers have knocked down the gates of the hallowed game and are now climbing over the furniture with their protractors and measuring tapes.

Scores of armchair statisticians and Retrosheet, an organization dedicated to digitizing baseball history, have compiled hundreds of thousands of bytes of errors and sacrifices -- all easily downloadable by the public. The ballclubs themselves have taken notice and most, if not all, have statisticians on the payroll.

I also loved how an upcoming SABRmetricians' conference will feature David Smith, president of Retrosheet, who "will answer the question: If a pitcher reaches base and has to do a little running, does it affect his performance on the mound?" If any of you SoSG fans are going to this conference, please make sure to report back, stat.