Once again, please bear with me on this one, as sometimes it's necessary to endure a tedious first half to fully understand the more interesting second half. Those of you who've seen 'Cloverfield' know what I'm talking about.
I begin by introducing the Gini Index. Developed in the early 1900s by Italian mathematician Corrado Gini, the Gini Index is a statistic oft cited by the UN and other such organizations to measure the inequality of income distribution in countries around the world. It's calculated by taking everybody in a given society, lining them up from left to right in order of increasing income, cumulating their income as you go down the line, and measuring how much this running total differs from what the running total would be if the society's collective income were evenly distributed among all members. The end result is a value between 0 and 100, where 0 is perfect equality (everybody has exactly the same income) and 100 is perfect inequality (the collective income of the entire society belongs to only one member - everybody else makes zero). If you want more details on calculating the Gini Index, click here.
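For the curious, the line-them-up-and-keep-a-running-total recipe can be sketched in a few lines of Python. This is only a rough illustration of the standard calculation, not necessarily the exact method behind the numbers in this post; the function name `gini_index` and the sample inputs are made up for the example.

```python
def gini_index(incomes):
    """Gini index on a 0-100 scale for a list of non-negative values."""
    values = sorted(incomes)          # line everybody up, poorest first
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")

    running = 0.0                     # running total as we walk down the line
    area_under_curve = 0.0            # area under the cumulative-share curve
    for v in values:
        previous = running
        running += v
        # Trapezoid between this person and the previous one, with both
        # axes scaled to run from 0 to 1.
        area_under_curve += (previous + running) / (2 * total * n)

    # Under perfect equality the area is exactly 0.5, so the gap between
    # the two curves is (0.5 - area); doubling it gives a 0-to-1 value.
    return 100 * (1 - 2 * area_under_curve)


# Hypothetical inputs, not real data:
print(gini_index([1, 1, 1, 1]))    # everybody equal
print(gini_index([0, 0, 0, 10]))   # one member has everything
```

One wrinkle worth knowing: with a small group, the "one member has everything" case tops out a bit below 100 (at 100 × (n − 1) / n for n members), and only approaches 100 as the group gets large. That matters when the "society" is a baseball roster rather than a country.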
For those readers who don't use the Gini Index on a day-to-day basis (what rock do you live under?), below is a sampling of the Gini values for income from selected countries:
Country: | Gini Index: |
Denmark | 24.7 (lowest in world) |
India | 32.5 |
United States | 40.8 |
China | 44.7 |
Brazil | 58.0 |
Namibia | 74.3 (highest in world) |
So Denmark is the country with the most evenly distributed earning power among its citizens. Namibia on the other hand has the largest imbalance, with huge amounts of wealth concentrated among very few citizens. The United States lies somewhere in the middle.
Now that your Gini value processor is somewhat calibrated, let's apply the Gini Index to things it was never intended to be used for, such as the home run distribution of baseball teams. If we treated baseball teams as countries and home runs as income, we could quantify how much of a team's power is concentrated among the few or how evenly it is spread across the lineup.
Here are the 30 MLB teams, ranked in order from most evenly distributed to least evenly distributed (most concentrated) home run hitting. I've also added a column "HR Total Rank" to indicate how the teams ranked in terms of total home runs hit:
HR Distribution Rank: | Team | HR Total Rank: | HR Distribution (Gini Index): | Similar to Income Distribution of: |
1 (most distributed) | Texas | 8 | 30.8 | Netherlands |
2 | Atlanta | 12 | 36.0 | Italy |
3 | Baltimore | 23 | 37.2 | Vietnam |
4 | Seattle | 20 | 37.8 | Latvia |
5 | Oakland | 13 | 37.9 | Jamaica |
6 | Detroit | 13 | 38.2 | Portugal |
7 | Boston | 18 | 39.1 | Israel |
8 | Kansas City | 30 (fewest HRs) | 39.4 | Burkina Faso |
9 | Pittsburgh | 22 | 39.5 | Morocco |
10 | LA Dodgers | 26 | 40.3 | Trinidad and Tobago |
11 | Cleveland | 9 | 41.1 | United States |
12 | San Diego | 14 | 41.6 | Senegal |
13 | Arizona | 15 | 41.9 | Thailand |
14 | Tampa Bay | 7 | 42.8 | Iran |
15 | NY Yankees | 4 | 43.4 | Hong Kong |
16 | Milwaukee | 1 (most HRs) | 44.2 | Venezuela |
17 | Cincinnati | 3 | 44.3 | Cameroon |
18 | Washington | 27 | 44.4 | Ivory Coast |
19 | Toronto | 19 | 44.6 | China |
20 | NY Mets | 11 | 45.6 | Rwanda |
21 | St Louis | 24 | 46.4 | Philippines |
22 | Florida | 5 | 47.6 | Mexico |
23 | Colorado | 16 | 48.0 | Madagascar |
24 | San Francisco | 25 | 48.5 | Malaysia |
25 | LA Angels | 28 | 50.3 | Gambia |
26 | Houston | 17 | 50.4 | Malawi |
27 | Philadelphia | 2 | 50.7 | Niger |
28 | Chicago Cubs | 21 | 52.5 | Argentina |
29 | Chicago White Sox | 21 | 53.4 | Chile |
30 (least distributed) | Minnesota | 29 | 62.5 | Sierra Leone |
So the Texas Rangers are the Denmark of Major League Baseball (although statistically they are closer to the Netherlands), topping the list of most evenly distributed HR production. Their top 6 players account for just over half of the team's home runs. Contrast that with Minnesota, where it takes only their top 2 guys (Torii Hunter and Justin Morneau) to account for half. The Dodgers rank 10th on the list, with nobody demonstrating great power but with 7 guys producing moderate power. They boast the distinction of being the Trinidad and Tobago of baseball.
Another thing to note is that there's no obvious correlation between HR frequency and HR distribution. While Minnesota ranked at or near the bottom in both categories, many of the best power teams (Milwaukee, Cincinnati, NY Yankees) and many of the worst (Washington, LA Dodgers, Kansas City) congregated around the middle of the HR distribution rankings, as Orel pointed out.
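If you'd rather not take the eyeball test on faith, here is a quick sketch that puts a number on it by correlating the two rank columns from the table. The HR Total Rank values are copied as printed, repeated ranks included, so treat the result as a rough check rather than a rigorous statistic; the `pearson` helper is just the textbook correlation formula written out by hand.

```python
from math import sqrt

# HR Total Rank for each team, listed in HR Distribution Rank order (1 to 30),
# copied as printed in the table, repeated ranks included.
hr_total_rank = [8, 12, 23, 20, 13, 13, 18, 30, 22, 26,
                 9, 14, 15, 7, 4, 1, 3, 27, 19, 11,
                 24, 5, 16, 25, 28, 17, 2, 21, 21, 29]
distribution_rank = list(range(1, 31))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Correlating ranks with ranks: +1 would mean the big-power teams are also the
# most evenly spread, -1 the opposite, and something near 0 means no clear link.
rank_correlation = pearson(distribution_rank, hr_total_rank)
print(round(rank_correlation, 2))
```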
Well that's all the insight I have for now...if you have any, please share. Thanks for reading through.
7 comments:
This is cool. Is there a version of this stat that factors in total homers?
Not to my knowledge. I suspect it'd be tough to meaningfully fit the two on a common scale. All else being equal, I think most would agree that for total homers, the higher the better. But for homer distribution, all else being equal, it's not as clear which is better, being more dispersed or more concentrated. It'd be like trying to combine mean and standard deviation into one stat. There's probably something related out there in the world of math.
Maybe we could see how Gini correlates with winning percentage—do teams that share the HR load perform better?
Well, winning % is probably dominated by too many other factors, as even HR total doesn't show a clear correlation.
Here's a study I'd like to see but don't have the diligence to conduct: over the past 10 years or so, for teams at a given total HR level, is there a correlation between total runs scored and HR dispersion? However, one of the downsides of being too concentrated wouldn't be accounted for here, that being if one of your top HR hitters goes down, because their stats wouldn't be factored in anyway.
Anyway, I am heading to the airport and will try to check in from time to time over the next 2 weeks.
Are the Dodgers the only team which had a former player name that matched their country analogy (Trinidad Hubbard)?
Did Ruben Sierra ever play for the Twins?
I don't think Israel Valdez ever played for Boston nor Fernando Venezuela for Milwaukee, so I think so