
Last weekend I had the privilege of playing one of my bucket-list courses: Whistling Straits. I was really excited to play the Pete Dye masterpiece, and of course I also wanted to play well. Up to that point my tee shots had been stellar all weekend: I was consistently hitting fairways at distances up to 330 yards off the tee (downhill and downwind, but I’ll take it!).

On the first hole at Whistling Straits I hit a great drive down the center of the fairway. On the second hole, though, I pushed my drive well right. Same thing on the fourth hole, same thing on the eighth. I had missed the fairway well right on three of the four times I hit my driver on the front nine.

I was really frustrated. Normally my driver is a strength, and it had been particularly good this weekend. I wasn’t sure what to do. Should I abandon the driver on the back nine, or trust that I would return to my typical good play soon?

This is a common situation many golfers face: when should I ignore new information, and when should I make an adjustment? In an earlier post I discussed the problems with small sample sizes: a limited amount of data is generally not enough to draw firm conclusions. In golf, however, decisions still need to be made, often with only a small sample of data to go on. If the wind blows your approach shot 10 yards short on the first hole, that is useful information you should use when choosing your club on the second hole, even though it is a small sample. On the other hand, one bad tee shot should not cause you to completely abandon your driver. Recognizing a small sample size is important, but it does not tell you when to make an adjustment based on new data.

How to correct your estimates

Fortunately, there is a rigorous way we can determine how predictions should change with new data, using a statistical technique called Bayesian inference. In a nutshell, this technique is a systematic way of using new information to update prior beliefs.

The key requirement for Bayesian inference is what is called a prior probability distribution: the probabilities you would assign to the different outcomes a priori, before any observations occur.

This is best shown with an example. Suppose you are going out to play 9 holes of golf. How many greens do you think you will hit in regulation today? 2? 5? 8? You probably already know that it is pretty unlikely that you will hit all 9 greens in regulation, and that it’s also unlikely that you won’t hit any. As an exercise, try writing out what you think the percent chance is of each possible outcome, remembering that the probabilities must add up to 100%. Here is my example:

| Total Greens Hit | Percent Chance |
|------------------|----------------|
| 0 | 1% |
| 1 | 2% |
| 2 | 3% |
| 3 | 10% |
| 4 | 20% |
| 5 | 30% |
| 6 | 20% |
| 7 | 10% |
| 8 | 3% |
| 9 | 1% |

This is a prior probability distribution. It tells you how likely you think the different outcomes are before obtaining any data. Now suppose you miss the first green. How do things change? Well, you now know that the probability of hitting all 9 greens is zero. The probability of hitting 8 greens has also probably gone down, since you would now need to hit 8 in a row. And the probability of hitting 0 greens has probably gone up slightly.

Suppose you miss the second green as well. Now the probability of hitting 8 greens is zero too. It’s beginning to look like you may be having a bad day, and perhaps your estimate of the chance of hitting even 5 out of 9 was a bit too high. We should keep updating the probability estimates as the round continues, getting closer and closer to a true picture of how you are performing.
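If you want to follow along in code, a prior like this is just a list of ten probabilities that add up to 100%. Here is a quick Python sketch (the numbers are mine; swap in your own):

```python
# Prior probability of hitting k greens in regulation (k = 0, 1, ..., 9),
# taken from the table above.
prior = [0.01, 0.02, 0.03, 0.10, 0.20, 0.30, 0.20, 0.10, 0.03, 0.01]
assert abs(sum(prior) - 1.0) < 1e-9  # the probabilities must add up to 100%
```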

Bayesian inference provides a rigorous method for doing this updating. The formal definition of Bayes’s theorem is relatively easy to state:

\[ P(A\mid B) = \frac{P(B\mid A)P(A)}{P(B)} \]

In plain English, \(P(A\mid B)\) is the probability of event \(A\) happening given the new information \(B\), \(P(B\mid A)\) is the probability of \(B\) happening assuming event \(A\) is true, \(P(A)\) is the prior probability of \(A\), and \(P(B)\) is the probability of \(B\) occurring overall.

In the example above, \(A\) represents the total number of greens hit out of 9, and \(B\) represents the new information that a green was missed. Writing Bayes’s theorem in this specific case gives:

\[ P(\text{A greens total} \mid \text{missed green}) = \frac{P(\text{missed green} \mid \text{A greens total})P(\text{A greens total})}{P(\text{missed green})} \]

Let’s use Bayes’s theorem to update the probability of hitting 5 greens total. First we need the probability of missing a green assuming that you will hit 5. If 5 greens are hit, 4 are missed, so the probability of missing one given that you hit 5 is 4/9 = 44%. Next we need the prior probability of hitting 5 greens. This was the estimated value from the table above, 30% (if you are following along with your own table, you can use your own value here).

Finally, we need to compute the overall probability of missing a green, accounting for every possible greens total. This is a longer calculation, but it is not hard. We look at the probability of missing a green given each possible total, which I have put in the table below:

| Total Greens Hit | Chance of Missing Any One Green |
|------------------|---------------------------------|
| 0 | 100% |
| 1 | 89% |
| 2 | 78% |
| 3 | 67% |
| 4 | 56% |
| 5 | 44% |
| 6 | 33% |
| 7 | 22% |
| 8 | 11% |
| 9 | 0% |

Then, to find the total probability of missing a green, we take these probabilities, multiply each by the prior probability of hitting that many greens, and add them all up (a weighted sum over all the possible outcomes). This looks like the following:

\[ \begin{aligned} P(\text{missed green}) ={}& (100\%\times 1\%)+(89\%\times 2\%)+(78\%\times 3\%)+(67\%\times 10\%) \\ &+(56\%\times 20\%)+(44\%\times 30\%)+(33\%\times 20\%)+(22\%\times 10\%) \\ &+(11\%\times 3\%)+(0\%\times 1\%) \end{aligned} \]

This calculation gives the total probability of missing the green as 45%.
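If you would rather not do that arithmetic by hand, the weighted sum is a few lines of Python (a sketch using the prior from my table above; substitute your own values if you made your own table):

```python
# Prior probability of hitting k greens (k = 0..9), from the table above.
prior = [0.01, 0.02, 0.03, 0.10, 0.20, 0.30, 0.20, 0.10, 0.03, 0.01]

# P(missed green | k greens hit) = (9 - k) / 9, as in the second table.
likelihood = [(9 - k) / 9 for k in range(10)]

# Weighted sum over every possible total: P(missed green) comes out near 45%.
p_missed = sum(l * p for l, p in zip(likelihood, prior))
print(f"P(missed green) = {p_missed:.0%}")
```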

Now we have all the pieces to use Bayes’s theorem. The probability of hitting 5 greens given that we missed one is now:

\[ P(\text{5 greens}\mid\text{missed green}) = \frac{44\%\times 30\%}{45\%} = 29\% \]

In other words, missing the first green lowers the chances of hitting 5, since hitting 5 out of 9 is easier than hitting 5 out of 8, but it doesn’t lower the chances by much. This value is called the posterior probability, since it is the probability computed after we include the new information.

We can compute the posterior probabilities for each outcome in the same way as we did for 5 greens. The results are shown below along with the assigned prior probabilities:

| Total Greens Hit | Prior Probability | Probability After a Missed Green |
|------------------|-------------------|----------------------------------|
| 0 | 1% | 2% |
| 1 | 2% | 4% |
| 2 | 3% | 5% |
| 3 | 10% | 15% |
| 4 | 20% | 24% |
| 5 | 30% | 29% |
| 6 | 20% | 15% |
| 7 | 10% | 5% |
| 8 | 3% | 1% |
| 9 | 1% | 0% |

As you can see, the chances of hitting only a few greens have gone up, while the chances of hitting 7 or 8 greens have dropped significantly. Remember that one data point is a small sample and shouldn’t affect our probabilities very much, and indeed the overall picture is mostly the same: it is still very likely that you will hit between 4 and 6 greens, and unlikely that you will hit none.
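The same calculation in Python, applied to every outcome at once, reproduces this table (a self-contained sketch; small differences are due to rounding):

```python
prior = [0.01, 0.02, 0.03, 0.10, 0.20, 0.30, 0.20, 0.10, 0.03, 0.01]
likelihood = [(9 - k) / 9 for k in range(10)]  # P(missed green | k greens)
p_missed = sum(l * p for l, p in zip(likelihood, prior))

# Bayes's theorem applied to every outcome at once.
posterior = [l * p / p_missed for l, p in zip(likelihood, prior)]
for k, (pri, post) in enumerate(zip(prior, posterior)):
    print(f"{k} greens: prior {pri:4.0%} -> posterior {post:4.0%}")
```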

What if I miss the second green as well? The beauty of Bayes’s theorem is that we can apply it again, using our previous posterior probabilities as our new prior probabilities. (Note that the likelihood changes from hole to hole: with 8 holes left, hitting \(k\) greens total means missing the next green with probability \((8-k)/8\), and so on.) Below I add to the previous table the posterior probabilities after missing the first two, and then the first three, greens.

| Total Greens Hit | Prior | Missed First Hole | Missed First Two Holes | Missed First Three Holes |
|------------------|-------|-------------------|------------------------|--------------------------|
| 0 | 1% | 2% | 5% | 10% |
| 1 | 2% | 4% | 7% | 13% |
| 2 | 3% | 5% | 8% | 12% |
| 3 | 10% | 15% | 20% | 24% |
| 4 | 20% | 24% | 26% | 24% |
| 5 | 30% | 29% | 24% | 14% |
| 6 | 20% | 15% | 8% | 2% |
| 7 | 10% | 5% | 1% | 0% |
| 8 | 3% | 1% | 0% | 0% |
| 9 | 1% | 0% | 0% | 0% |

As you can see, before starting we predicted hitting 5 greens as the most likely outcome, but after 3 misses the situation has changed significantly: hitting 3 or 4 greens is now most likely. The chance of missing every green roughly doubles after each miss, while the chance of hitting 6 or more greens rapidly declines.

You may recall that in one of my earlier posts I mentioned my instructor told me not to make any changes until I saw a pattern of 3. This heuristic appears to be pretty well supported by the example above. After only one miss, the percentages haven’t changed much. After two in a row, they lean a little more toward a low total. After three missed greens in a row, there is still a meaningful chance of hitting 5, but the chances of hitting fewer than 3 have gone from 6% up to 35%.
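This repeated updating is easy to script as well. Below is a sketch in Python (`update_after_miss` is a name I made up for illustration); it assumes, as the calculations above do, that a player who hits \(k\) greens total is equally likely to hit them on any of the holes:

```python
def update_after_miss(beliefs, holes_played, total_holes=9):
    """One Bayesian update after missing a green.

    beliefs[k] is the current probability of hitting k greens total;
    holes_played is how many holes were completed before this miss.
    """
    remaining = total_holes - holes_played
    # P(miss this green | k greens total): all k greens must fall on
    # the remaining holes, each assumed equally likely to be a hit.
    likelihood = [max(remaining - k, 0) / remaining for k in range(total_holes + 1)]
    evidence = sum(l * b for l, b in zip(likelihood, beliefs))
    # The old posterior becomes the new prior.
    return [l * b / evidence for l, b in zip(likelihood, beliefs)]

beliefs = [0.01, 0.02, 0.03, 0.10, 0.20, 0.30, 0.20, 0.10, 0.03, 0.01]
for hole in range(3):  # miss the greens on the first three holes
    beliefs = update_after_miss(beliefs, hole)
    print([f"{b:.0%}" for b in beliefs])
```

Each pass through the loop reproduces one column of the table above.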

We could have chosen a different prior distribution. Here are two examples. The first assumes a uniform prior probability (all outcomes are equally likely):

| Total Greens Hit | Prior | Missed First Hole | Missed First Two Holes | Missed First Three Holes |
|------------------|-------|-------------------|------------------------|--------------------------|
| 0 | 10% | 20% | 30% | 40% |
| 1 | 10% | 18% | 23% | 27% |
| 2 | 10% | 16% | 18% | 17% |
| 3 | 10% | 13% | 13% | 10% |
| 4 | 10% | 11% | 8% | 5% |
| 5 | 10% | 9% | 5% | 2% |
| 6 | 10% | 7% | 3% | 0% |
| 7 | 10% | 4% | 1% | 0% |
| 8 | 10% | 2% | 0% | 0% |
| 9 | 10% | 0% | 0% | 0% |

The second is a “boom or bust” prior, assuming a high or low number of greens hit is more likely than something in the middle:

| Total Greens Hit | Prior | Missed First Hole | Missed First Two Holes | Missed First Three Holes |
|------------------|-------|-------------------|------------------------|--------------------------|
| 0 | 30% | 60% | 70% | 76% |
| 1 | 10% | 18% | 18% | 17% |
| 2 | 4% | 6% | 5% | 4% |
| 3 | 3% | 4% | 3% | 2% |
| 4 | 3% | 3% | 2% | 1% |
| 5 | 3% | 3% | 1% | 0% |
| 6 | 3% | 2% | 1% | 0% |
| 7 | 4% | 2% | 0% | 0% |
| 8 | 10% | 2% | 0% | 0% |
| 9 | 30% | 0% | 0% | 0% |

Both of these priors produce posterior distributions that weight the probability of hitting no greens much more heavily than the original one did. The initial prior treated the middle values as most probable and the extremes as unlikely, so even after three missed greens in a row, hitting 3, 4, or 5 total is still more probable. On the other hand, the “boom or bust” prior considered the chance of missing every green to be relatively high, and missing 3 in a row only increases that likelihood. The uniform prior gave no preference to any outcome, so its posterior chance of missing every green is high, but not nearly as high as with the “boom or bust” prior.
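Reproducing these two tables takes no new machinery: re-run the three-miss loop from the `update_after_miss` sketch above, starting from a different list (the values come from the two prior columns):

```python
# Alternative priors from the two tables above; feed either into the
# three-miss loop from the earlier update_after_miss sketch.
uniform = [0.10] * 10
boom_or_bust = [0.30, 0.10, 0.04, 0.03, 0.03, 0.03, 0.03, 0.04, 0.10, 0.30]
```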

A significant criticism of Bayesian statistics is its reliance on the prior probability distribution, which must be chosen somewhat subjectively rather than derived rigorously. As the examples above demonstrate, different priors do produce different posterior predictions. Defenders of Bayesian statistics argue that prior probabilities align more with reality, where we usually have a sense of what to expect from a given situation. If I give you a coin out of my pocket to flip, you will inherently assume a high probability that it is a fair coin, given your experience with other coin flips. If the coin comes up heads on all of its first 4 flips, you might start to be a little suspicious, but you will probably still think it is most likely bad luck, since your prior is so strong. Additionally, with enough data, Bayesian inference will converge to the same posterior result regardless of the prior (so long as the prior doesn’t assign zero probability to the true outcome). If after 10 million flips my coin shows heads 9 million times, you will probably be convinced that it isn’t a fair coin, and that the probability of heads is 9/10, regardless of how trustworthy you thought my coin was at the beginning.
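The coin example is easy to make concrete. A standard Bayesian treatment of coin flips is the Beta-Bernoulli model; the sketch below uses it, and both the model choice and the Beta(100, 100) prior are my illustrative assumptions:

```python
def posterior_heads_prob(heads, flips, a=100.0, b=100.0):
    """Posterior mean of P(heads) under a Beta(a, b) prior.

    a = b = 100 encodes a strong prior belief that the coin is fair,
    roughly like having already watched 200 balanced flips.
    """
    return (a + heads) / (a + b + flips)

print(posterior_heads_prob(4, 4))                    # 4 heads in a row: ~0.51, barely moved
print(posterior_heads_prob(9_000_000, 10_000_000))   # the data swamps the prior: ~0.90
```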

Applying it

Bayesian inference is the rigorous way to update your beliefs about the likelihood of different events, but you don’t need to do the math in your head on the course to use its way of thinking. The core principle is that beliefs you hold with high confidence shouldn’t change much in response to a little new data, while beliefs based on little information should be influenced much more strongly.

Here is an example that contrasts the two extremes and uses the thought process behind Bayesian inference. Suppose Paula and Stacy are going to play a round of golf at a course they have never seen before. Paula gets to the course early to practice her putting. She hits a lot of putts and gets a good sense of the speed of the greens, to the point where she can consistently lag putts to within a few inches of the hole. Stacy is running late and arrives just before their tee time with no ability to practice her putting on these new greens.

Despite all of her practice, on the first green Paula’s lag travels 10 ft past the hole! Should Paula try to make an adjustment on her second putt, intentionally hitting it softly, or should she mostly ignore what just happened and treat her next putt as ordinary? Since Paula had a lot of success on the practice green before her round, her prior probability of accurately judging the speed should be high. Therefore, one bad putt should not drastically change her calculation, and she shouldn’t intentionally hit the ball softly. (How many times have you seen a first putt travel well past the hole, only for the second putt to come up short? That is likely an overreaction to new information.)

Stacy, having had no chance to practice before the round, leaves her lag putt on the first green 10 ft short of the hole. Should she mostly ignore this information like Paula did, or should she try to hit her next putt a little harder than feels natural? Unlike Paula, Stacy didn’t get a chance to warm up on the greens, so her prior probability of knowing the correct speed should be a lot lower than Paula’s. Therefore the putt that came up short should have a much stronger influence on her decision for the next putt. It is much more likely than in Paula’s case that Stacy does not know the proper speed of the greens, and so she should hit the next putt a little harder than she normally would.

As the two players gather more information, their posterior probabilities are continually updated, and their strategies should change accordingly. Although Paula shouldn’t change her putting strategy at first, if she finds herself consistently hitting putts too hard over several holes, she should gradually start hitting the ball more softly (sometimes the practice green really is a little slower than the greens on the course, since it often gets extra water to keep it looking good near the clubhouse). As Stacy plays more holes, her feel will adjust to the greens, to the point where she no longer needs to consciously hit her putts harder. The two players started with different priors, but with appropriate updating they will eventually converge to the same correct feel.

The next time you play, consider the beliefs about your game for which you feel you have a strong prior (perhaps how far you hit your irons, how often you get up and down from the bunker, how likely you are to hit the fairway, etc.). These beliefs shouldn’t be adjusted very much unless a lot of significant information appears (your irons have to come up short on several holes before you start thinking of changing clubs). Also consider the situations where your priors are weaker (perhaps the amount of wind today, the amount of roll on the fairways, the wetness of the bunkers, etc.). These beliefs should be updated more readily as you gather information during your round (if the wind blows your tee shot significantly off line, you should definitely take that into account on your second shot).

If you want to be more rigorous, you could use the math of Bayesian inference to prepare for a big round or tournament, charting out how much you would adjust to different anticipated events. But even without any math, thinking like a Bayesian can help you adapt to changes on the course.