Thursday 9 May 2013

The "Hot Hand" - Reality or Fantasy


I don't really follow the NBA at all, but I am interested in other sports (baseball in particular) and in mathematics.  So I found a recent NY Times article on the phenomenon of "hot hands" in sports quite interesting.

The author, following the opening game of the Chicago Bulls - Miami Heat playoff series, commented on a recent paper analysing the question of whether such streaks exist or are just an imputed pattern amidst random outcomes.  In that game, Chicago's Nate Robinson, basically a 29-year-old journeyman of no particular note, poured in 27 points (more than twice his career and seasonal averages) to lead the Bulls to a surprise win over Miami.  The Heat are heavy favourites to win this series and the title.

Underlying the article is a recent analysis done by Yaari Gur and Shmuel Eisenmann, two computational biologists at Yale University.  The two analysed data from both the NBA (free throws) and the Pro Bowlers' Association in an effort to determine whether there are any data to support the belief in hot hands.

The research followed up on a 1985 paper co-authored by Amos Tversky, a professor of psychology at Stanford, that looked at the same question, based on field goal and free throw sequences.  (NB: I knew Professor Tversky casually during my time as a graduate student at Stanford.)  I read the paper 20 years ago as part of my own research into defining randomness via what is called Kolmogorov-Chaitin Complexity.  Put as simply as I can, K-C theory looks at the level of difficulty, measured by length, needed to define a long string of data.  The longer the shortest possible description, the more complex the string, and hence the more "random" it is.

As an example, consider a 10000-character string of "X".  In computer code, this can easily be described as the ASCII code for X (88 in decimal) followed by the hex code for 10000 (2710).  Thus, a 10000-character string compresses to 882710, a very, very short description, which marks the original string as non-random.
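K-C complexity itself cannot be computed in general, but an off-the-shelf compressor gives a rough upper bound on description length.  The sketch below is my own illustration, not anything from either paper: it assumes zlib as the stand-in compressor and uses a made-up function name (compressed_size) to compare the all-"X" string with a string of simulated coin flips of the same length.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Bytes needed by zlib for `text`: a crude upper bound on description length."""
    return len(zlib.compress(text.encode("ascii"), 9))


n = 10_000
constant = "X" * n

# Fixed, arbitrary seed so the comparison is repeatable.
rng = random.Random(0)
coin_flips = "".join(rng.choice("XO") for _ in range(n))

print("all X:     ", compressed_size(constant), "bytes")
print("coin flips:", compressed_size(coin_flips), "bytes")
```

The constant string shrinks to a few dozen bytes, while the coin-flip string cannot be squeezed much below one bit per character, which is the sense in which it is "more random."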

Tversky looked at sequences of makes and misses, in particular at runs of each, and at how likely such runs would be if each shot were an independent event.  According to Professor Tversky's paper, these runs are mathematically indistinguishable from random variation.
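One standard way to ask whether a make/miss sequence has too few runs (streaky) or too many (alternating) for independent shots is the Wald-Wolfowitz runs test.  I offer it only as a sketch of the general idea, not as a reconstruction of the paper's exact method; the function names and the ten-shot sequence are invented for illustration.

```python
import math


def count_runs(shots):
    """Number of maximal blocks of identical outcomes, e.g. 1,1,0,1 has 3 runs."""
    if not shots:
        return 0
    return 1 + sum(1 for a, b in zip(shots, shots[1:]) if a != b)


def runs_z_score(shots):
    """Wald-Wolfowitz z-score: negative means fewer runs than chance (streaky)."""
    makes = sum(shots)
    misses = len(shots) - makes
    n = makes + misses
    if makes == 0 or misses == 0:
        raise ValueError("need at least one make and one miss")
    expected = 2 * makes * misses / n + 1
    variance = 2 * makes * misses * (2 * makes * misses - n) / (n * n * (n - 1))
    return (count_runs(shots) - expected) / math.sqrt(variance)


# Made-up sequence (1 = make, 0 = miss): five makes, then five misses.
streaky = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(count_runs(streaky), round(runs_z_score(streaky), 2))
```

A z-score near zero is what independent shots would produce; the normal approximation behind it is only trustworthy for sequences much longer than this toy one.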

The newer research of Professor Gur approaches the problem from a slightly different angle.  Rather than looking at strings, he looked at free throw pairs/triples (in the NBA, players can be awarded three free throws if fouled on a three-point shot), and conditional probabilities of making the second/third free throw following the outcome of the first/second shot.

Conditional Probability Plot of Free Throws from Gur, 2009

The analysis compares P(1|0) (probability that the second shot is made, given that the first is a miss) vs. P(1|1) (probability that the second shot is made, given that the first is made).  The results are based upon a variation of what is called a Polya Urn, named for George Polya.  Balls are drawn from an urn containing B black balls and W white balls.  In the classic version, each ball drawn is returned to the urn along with another of the same colour, so that an early outcome shifts the odds of the later ones.
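To make the two conditional probabilities concrete, here is a small simulation.  It assumes the classic Polya urn (each drawn ball goes back with one extra of its colour), which is not necessarily the variation used in the paper, and the starting counts of 7 white and 3 black, the seed, and the function names are all my own choices for illustration.

```python
import random


def polya_pair(white, black, rng):
    """Two draws from a classic Polya urn; returns (first, second), 1 = white (make)."""
    outcomes = []
    for _ in range(2):
        made = rng.random() < white / (white + black)
        outcomes.append(1 if made else 0)
        # Reinforcement: the drawn ball goes back with one extra of its colour.
        if made:
            white += 1
        else:
            black += 1
    return tuple(outcomes)


def conditional_make_rates(pairs):
    """Estimate {1: P(1|1), 0: P(1|0)} from (first, second) outcome pairs."""
    after = {0: [0, 0], 1: [0, 0]}  # first outcome -> [second-shot makes, attempts]
    for first, second in pairs:
        after[first][0] += second
        after[first][1] += 1
    return {first: makes / attempts
            for first, (makes, attempts) in after.items() if attempts}


rng = random.Random(1)  # arbitrary fixed seed
pairs = [polya_pair(7, 3, rng) for _ in range(100_000)]
rates = conditional_make_rates(pairs)
print("P(1|1) ~", round(rates[1], 3), " P(1|0) ~", round(rates[0], 3))
```

For this toy urn the exact values are (W+1)/(W+B+1) = 8/11 after a make and W/(W+B+1) = 7/11 after a miss, so the simulated estimates should land close to those.  With independent shots the two numbers would be equal, and the gap between them is what the plot is measuring.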

The second paper concludes that there is a slight (7-12%) increase in the likelihood of a make following a make.  Thus, a 70% free-throw shooter improves to about a 72% shooter if he has made the first.  It's a very small difference.

Sports is a physical and mental endeavour, as Yogi Berra attested ("90% of this game is half mental"), so this result has face validity.  Any golfer also knows that a key to a good round is to find a good swing and then just repeat it.  As a golfer of middling ability, my own anecdotal evidence is that I can have a round where my swing just "feels" right, and I can duplicate the same pattern on most of the holes.  The opposite also is true.

The same phenomenon also is perhaps demonstrated in that a free throw shooter 'feels it,' when a good stroke yields a make, and a poor shot results in a rim-shaking brick.  A missed free throw is likely, psychologically, to result in the shooter attempting to make slight adjustments.  Thus, it seems that what is really being seen is not so much a hot streak, but the avoidance of cold ones - poor shots.

Does this apply to teams?  That seems less likely to me.  Neither Tversky nor Gur looked at winning streaks for teams or "runs" (periods of time in a game where one team seems to make all of its shots and scores a large number of unanswered points).

A back-of-the-envelope analysis was done by the baseball writer Bill James back in 1986, where he looked at the probability that a team that lost the first game of a playoff series came back to win the second.  Adjusting for winning percentages, home/road advantage, and other variables, James concluded that the team that LOST the first game had a slightly elevated probability of winning the second, which runs counter to the common wisdom about "hot streaks."  James's conjecture was that the team that lost the game was more likely to examine the events of the game and make adjustments - changing the lineup, for example - than the team that won the first game.

This lesson was lost on the NY Times author, who took the results of Gur and optimistically predicted that the Bulls, with the hot hand, would walk over the Heat.  Granted that Ms. Reynolds cops to being a Bulls fan and thus not objective, she appears to be reading into the data something that is not there.

Empirically, her optimism was not borne out, as Miami came back to blow out Chicago in Game 2, 115-78.
