Tuesday, May 27, 2008

Debunking the Myth: Winning Streaks

I generally don't believe in the concept of a team "getting hot" or "going cold". That doesn't mean a team isn't capable of, say, losing 5 consecutive games after winning 10 of 11; I just think attributing such streaks to the team "being in a groove" is largely misguided (though psychologically convenient).

Let's take winning streaks. How often do you see an otherwise mediocre team rip off 5 consecutive wins and hear people say things like "they're in a zone"? I think "being in a zone" or "hitting a groove" almost never* has anything to do with it. I instead believe winning streaks are due almost exclusively to two things:

    1) How good the team is (including injuries, suspensions, etc)
    2) Randomness
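To get a feel for how much item 2 can do on its own, here is a minimal sketch that treats a mediocre team as a fair coin and counts how often a season contains a winning streak of 5 or more. The 162-game season, the 0.5 win probability, the seed, and all function names are assumptions made for illustration; real games are not independent coin flips.

```python
import random

def longest_win_streak(results):
    """Return the longest run of consecutive wins (True values)."""
    best = current = 0
    for won in results:
        current = current + 1 if won else 0
        best = max(best, current)
    return best

def simulate_season(rng, games=162, p_win=0.5):
    """One season of independent games, each won with probability p_win (assumed)."""
    return [rng.random() < p_win for _ in range(games)]

def share_with_streak(min_streak=5, seasons=10_000, seed=2008):
    """Fraction of simulated seasons containing a win streak of at least min_streak."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        longest_win_streak(simulate_season(rng)) >= min_streak
        for _ in range(seasons)
    )
    return hits / seasons

if __name__ == "__main__":
    print(f"{share_with_streak():.1%} of simulated .500 seasons had a 5+ game winning streak")
```

Changing `p_win` or `min_streak` shows how quickly pure chance produces streaks that look like "a groove".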

So I begin by trying to debunk Hypothesis 1: a team improves its chance of winning a game if it's coming off a win in its previous game.

To evaluate this hypothesis, I looked at the games for each of the 30 MLB teams since the beginning of last season (roughly 210 games per team) to see if their winning % was any different in games coming off a win than it was overall. Here's the result:

[Graph: each team's winning percentage after a win plotted against its overall winning percentage]

The bubbles to the upper left of the diagonal represent teams that do indeed seem to show a higher winning percentage coming off a win. Well, exactly half of the 30 MLB teams are above the line, half below. The white dot indicates the overall average, which resides barely above the line (overall, MLB teams won 50.6% of games after a win vs 50% of all games overall).

Well, not exactly conclusive, but the fact that the white bubble lies barely above the line doesn't debunk the hypothesis. Maybe a team really does have a slightly better chance of winning a game when coming off a win. Let's continue.
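For anyone who wants to repeat the Hypothesis 1 comparison, here is a rough sketch of the calculation for a single team. It assumes the team's results are available as a chronological string of "W" and "L" characters; that input format and the function name are hypothetical, not the actual data source used for the graph.

```python
def pct_after_win(results):
    """results: chronological string such as 'WWLWL' (assumed format).

    Returns (overall win %, win % in games that directly follow a win).
    """
    overall = results.count("W") / len(results)
    # Pair each game with the one before it and keep games that follow a win.
    followers = [cur for prev, cur in zip(results, results[1:]) if prev == "W"]
    after_win = followers.count("W") / len(followers) if followers else float("nan")
    return overall, after_win

if __name__ == "__main__":
    overall, after_win = pct_after_win("WWLWLLWWWL")
    print(f"overall {overall:.1%}, after a win {after_win:.1%}")
```

A team plots above the diagonal when the second number is larger than the first.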

The next step was to look at Hypothesis 2: a team improves its chance of winning a game if it's coming off a win streak (i.e., is "hot"). To do this, I looked at teams' winning percentage after 2-game winning streaks. Turns out such teams won their next game 50.4% of the time, again barely over 50% but slightly less than the 50.6% above. So thus far, for small 1-game and 2-game streaks, results are still inconclusive and no concrete myth-debunking has occurred.

When I got to 3-game winning streaks, it started to get interesting. The overall win % in the next game after a 3-game winning streak dropped to 47.7%. And this is based on 776 cases, so it seems like a decent sample size (and I mean 'seems' in the strict statistical sense of the word).
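One hedged way to judge whether 776 cases is "decent" is to ask how far 47.7% sits from 50% in units of sampling error. The sketch below uses a normal approximation to the binomial and assumes each post-streak game is an independent trial with a true 50% win chance. That baseline is an assumption, and the function name is made up for this example.

```python
import math

def streak_z_test(n, observed, p0=0.5):
    """Normal-approximation test of an observed win rate against baseline p0.

    Assumes n independent games, each won with probability p0.
    Returns (standard error, z-score, two-sided p-value).
    """
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (observed - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return se, z, p_value

if __name__ == "__main__":
    se, z, p = streak_z_test(n=776, observed=0.477)
    print(f"standard error {se:.4f}, z = {z:.2f}, two-sided p = {p:.2f}")
```

With these inputs the gap works out to a bit more than one standard error, so under this simple model the 3-game number alone is suggestive rather than decisive.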

So it seems like the hypothesis is taking some hits. Let's keep going. Win % after a 4-game streak? 45.8%. And after a 5-game streak, it dropped further to 45.5%. I stopped there, as sample sizes beyond 5-game streaks were getting pretty thin.
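The streak-by-streak tally can be sketched as follows. It assumes a chronological "W"/"L" string per team and counts a game toward streak length k whenever the team has won at least its previous k games. That counting rule is an assumption for illustration and may differ from the one behind the numbers above.

```python
def next_game_pct_by_streak(results, max_streak=5):
    """For k = 1..max_streak, return {k: (games, win %)} for games played
    right after the team has won at least its previous k games."""
    tallies = {k: [0, 0] for k in range(1, max_streak + 1)}  # k -> [wins, games]
    streak = 0  # consecutive wins entering the current game
    for r in results:
        won = r == "W"
        # A team on a 4-game streak also counts as coming off 1, 2 and 3 straight wins.
        for k in range(1, min(streak, max_streak) + 1):
            tallies[k][1] += 1
            tallies[k][0] += won
        streak = streak + 1 if won else 0
    return {k: (g, w / g if g else float("nan")) for k, (w, g) in tallies.items()}

if __name__ == "__main__":
    for k, (games, pct) in next_game_pct_by_streak("WWWLWWWW", max_streak=3).items():
        print(f"after {k}+ straight wins: {games} games, {pct:.1%} won")
```

Returning the game count next to each percentage makes it easy to see where the samples get thin.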

Add to that the fact that teams with 3-, 4-, and 5-game winning streaks are more often than not among the better teams, and thus would tend to have an overall expected winning percentage greater than 50%. This makes their sub-50% win % after streaks even more telling and further weakens the myth. Factoring this in, let's check it out graphical-style:

[Graph: win % in the next game by length of winning streak]

I won't go so far as to say this proves anything, as a proper analysis probably involves Bernoulli Trials and R squared values and other stuff I'll leave to you mathematicians out there. But the bottom line is if an otherwise mediocre team is coming to LA on a 5-game winning streak, my response will continue to be: "Bring 'em On!"

*The '02 A's and '07 Rockies may have been in a zone.
For a related article regarding the '07 Rockies, click here.
For an article debunking the 'basketball-players-with-the-"hot hand"-myth', click here.

12 comments:

Dusto_Magnifico said...

Wow! Great stuff!

I guess my thought on "streaks" is that eventually the team will play to their average. If they win 5 in a row, they will likely lose 5 in a row if they are a .500 team.

Naturally I think the Dodgers will win 162 in a row. Is that asking for too much?

cigarcow said...

The Rockies of '07 are seem to me like an indicator that hot streaks do exist. Because clearly they are not a good team.

Math is hard.

cigarcow said...

Apparently English is hard, too.

QuadSevens said...

I have way too much free time at work...

I thought the win/loss relationship seemed very similar to the heads/tails relationship of flipping a coin. Fielding errors, bad pitching, and weak hitting account for the randomness in baseball, while coin rollage and hitting my foot, chair, or desk account for the randomness in coin flips. I simulated a full season with heads being a win and tails being a loss. I ended up with a record of 85-77 and a slightly better winning percentage after a win than overall. Then I looked at the records from last year and found that my record was the same as the Cubs'.

Guess I'll use this quarter to buy me some DoubleMint Gum since it won't last long in the playoffs.

Eric Karros said...

Dusto: thanks.

Cigarcow: Yeah, I agree the '02 A's and particularly the '07 Rockies may have hit a hot streak. I guess I'm saying 99.9% of the time when we collectively say a team is on a hot streak or "in the zone" it's really just being good and/or lucky. The other 0.1% of the time it's things like the '07 Rockies.

Quad: I see business at Red 5 Research is still slow. Hang in there. And you're right on with the coin flip analogy, particularly for 0.500 teams. Look up 'binomial distribution' on wikipedia (or wookieepedia if you prefer).

Steve Sax said...

EK, I have found yet another flaw to your logic: NBA Jam. After hitting three baskets in a row, one's player would hear "He's on fire!" and proceed to drain every outside shot from there on. In fact, the only thing the defense could do would be to foul the crap out of the player to stop the streak.

I haven't run your model on this dataset, but I don't think it holds.

Eric Karros said...

Dammit, dammit, dammit. You're right. Back to the drawing board...

Michael Lugo said...

The fact that teams tend to lose after they've won 3 or 4 in a row might have to do with pitching rotations. For a staff where one pitcher is notably inferior, it wouldn't surprise me to learn that a lot of 4-win streaks come in games where the #1, #2, #3, #4 starters pitch -- and then the team is more likely to lose when the fifth starter comes up.

Orel said...

But wouldn't that mean they're facing a fifth starter as well?

Unknown said...

Jim Albert has written some academic papers on streaks in baseball -- his findings agree with yours, that there is no more streakiness than would happen randomly. He has looked at hitting streaks for individual players and winning streaks for teams, I believe.

Within a game, I was a co-author of an article in Statistical Thinking in Sports which looked at momentum within a game. We didn't find evidence of momentum once we included controls for ability and other variables.

Momentum and streakiness are either hard to find or very rare.

Steve Sax said...

Rebecca,

That is an awesome comment (which I believe is your first on SoSG; welcome and thank you!).

I have a feeling your next comment will be equally awesome, and the one after that will maintain the awesome trend. Keep them coming!

Eric Karros said...

...and I stand ready to graph said trend in Rebecca's comment awesomeness.

Seriously, Rebecca, thanks for stopping by. I have actually come across your book before, though I have not read it. It's good to hear that real mathematicians have reached the same conclusion as I did (at least in this instance).