Probability calculation

Posted by: wfaulk

Probability calculation - 29/08/2011 18:45

I was having a discussion with TonyC this morning about birthdays and he pointed out that his office had a statistical anomaly wherein nine people in his office of 32 had birthdays in August. I started to try and figure out what the odds of that happening were, but I got stuck.

How would you go about figuring those odds?

The first thing I looked at was the odds of one person having a birthday in August. To simplify, let's ignore the length of months and any non-even birth distribution, and assume that it's 1 in 12. And then it seems that if you have two people, the odds rise to 2 in 12, and so on. But this can't be the case: if you have 12 people, that would imply that the odds were 12 in 12, and it's perfectly reasonable for a group of 12 people to have no one born in August. I'm guessing that this mathematical analysis is just flat-out wrong.

So maybe we have to look at the odds of not being born in August. That would be 11 in 12 for an individual. And then square that for two people, which yields something very close to 10 in 12, and then the next power and the next power until you get to the 12th power, which comes out to about 35.2%, which would mean that in a group of 12, there's about a 65% chance that someone was born in August. That sounds more reasonable. If we take that to 32 people, that's about a 94% chance that (at least) one person was born in August, which, again, sounds reasonable.
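
A quick way to check those numbers is a minimal Python sketch, assuming (as above) that every month is equally likely. The function name p_at_least_one is made up for illustration.

```python
def p_at_least_one(n, months=12):
    """Chance that at least one of n people was born in a given month,
    assuming each month is equally likely (1 in `months`)."""
    p_nobody = ((months - 1) / months) ** n  # everyone misses that month
    return 1 - p_nobody

# Print the chance for a few group sizes, including the 12 and 32 cases.
for n in (1, 2, 12, 32):
    print(n, round(p_at_least_one(n), 3))
```

For 12 people this comes out near 65%, and for 32 people near 94%, matching the figures above.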

But now how do we calculate the odds of multiple people being born in August? If the above formula, O(n) = 1 - (11/12)^n, is correct, does it make sense that the odds of two people in a group of 32 would be the odds of one person in that group times the odds of one person in the remaining group? That sounds reasonable, but using that formula, aren't we just finding the odds of someone being born in a random particular month, and not specifically August? How do we tie the results of the first set of odds to the next set?

It's at this point that my brain shuts down.

Help?

Posted by: hybrid8

Re: Probability calculation - 29/08/2011 19:03

I thought I remembered this from a finite math class in High School...

This specific example is from The Wizard of Odds: http://wizardofodds.com/askthewizard/probability.html
Quote:

Five persons are in a room. What is the probability that at least 2 of them were born in the same birth month?

To keep things simple let's assume that each person has a 1/12 probability of being born in each month. The probability that all five people are born in different months is (11/12)*(10/12)*(9/12)*(8/12) = 0.381944. So the probability of a common month is 1 - 0.381944 = 0.618056.

And here's a Yahoo Answers where someone breaks it all down: http://answers.yahoo.com/question/index?qid=20080101071810AAyM5LB
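
The quoted Wizard of Odds arithmetic can be reproduced with a short Python sketch, under the same assumption of equally likely months. The name p_all_different is hypothetical; exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def p_all_different(people, months=12):
    """Chance that `people` persons all have different birth months,
    assuming each month is equally likely."""
    p = Fraction(1)
    for k in range(people):
        # The (k+1)-th person must avoid the k months already taken.
        p *= Fraction(months - k, months)
    return p

p_diff = p_all_different(5)
print(float(p_diff))      # all five in different months
print(float(1 - p_diff))  # at least two share a month
```

With more than 12 people the product picks up a zero factor, so the chance of everyone having a different month drops to exactly 0.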

Posted by: Glen_L

Re: Probability calculation - 29/08/2011 19:10

I'm not sure that this will do much more than confirm that the math can get complicated quickly for a seemingly simple question:

Birthday Problem

Happy birthday, by the way!

Posted by: peter

Re: Probability calculation - 30/08/2011 16:04

The probability that exactly two people of N are born in a given month (if N<13, see end) is:

(1/12)^2 * (11/12)^(N-2) * N(N-1)/2

where the first factor is the two people being born in that given month, the second is all the rest not being born in that month, and the third is the number of ways that can happen: N people who could be one person born in that month, N-1 remaining people who could be the second person born in that month, and divide by 2 because we've counted each pair of people twice (there are 496 different pairs among 32 people, not 992).

For three people, it would be:

(1/12)^3 * (11/12)^(N-3) * N(N-1)(N-2)/6

where the final factor is N who could be the first, N-1 remaining who could be the second, N-2 remaining who could be the third, and divide by six because we've counted each trio six times.

In general, the odds that precisely X people of N were all born in a given month are:

(1/12)^X * (11/12)^(N-X) * N! / (X! * (N-X)!)

where the final factor has become the number of ways of choosing X items from N equivalent ones (a binomial coefficient).

For N=32, X=9, this evaluates to 0.000734, which multiplied by 12 (because there's nothing really special about August -- you'd be equally surprised if the nine shared any other birth month) is 0.0088, meaning that there's about a 1 in 113 chance that, of 32 randomly-selected people, exactly 9 will share a birth month.
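
For anyone who wants to check the arithmetic, here is a small Python sketch of the general formula, assuming 12 equally likely months. The function name p_exactly is just an illustrative choice; math.comb (Python 3.8+) supplies the binomial coefficient.

```python
from math import comb

def p_exactly(x, n, months=12):
    """Chance that exactly x of n people were born in one given month,
    assuming each month is equally likely."""
    p = 1 / months
    # (ways to pick the x people) * (they hit the month) * (the rest miss it)
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(comb(32, 2))              # number of distinct pairs among 32 people
p_august = p_exactly(9, 32)     # exactly 9 of 32 born in August
p_any_month = 12 * p_august     # rough figure for "any one month"
print(p_august, p_any_month, 1 / p_any_month)
```

Multiplying by 12 treats the twelve months as non-overlapping cases, which is only approximately true.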

Except that type of calculation is not exactly accurate -- for example, if N=13 or more then the chance that two or more people will share a birth month is 100%, by the pigeonhole principle. What I've left out is that the other people -- the 30 others in the N=32, X=2 case, or the 23 others in the N=32, X=9 case -- might also include X further people who share a birth month. In the X=2, N>=13 case, that's a certainty, so the result is dead wrong. But in the N=32, X=9 case, that's rather unlikely (maybe one chance in 2,000), so the answer I've given is, I think, close to correct.
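
One way to see both points is a rough Python sketch, again assuming equally likely months. The names naive_any_month and simulate_some_month_has_exactly are made up here, and the seed and trial count are arbitrary. The first function is the "multiply by 12" shortcut; the second just deals out random birth months and counts how often some month ends up with exactly X people.

```python
import random
from collections import Counter
from math import comb

def naive_any_month(x, n, months=12):
    """The shortcut: 12 times the single-month binomial figure."""
    p = 1 / months
    return months * comb(n, x) * p**x * (1 - p)**(n - x)

def simulate_some_month_has_exactly(x, n, trials=100_000, months=12, seed=1):
    """Monte Carlo estimate of the chance that at least one month
    contains exactly x of the n birthdays."""
    rng = random.Random(seed)  # fixed seed so reruns give the same estimate
    hits = 0
    for _ in range(trials):
        counts = Counter(rng.choices(range(months), k=n))
        if x in counts.values():
            hits += 1
    return hits / trials

print(naive_any_month(2, 32))   # exceeds 1, so it cannot be a probability
print(naive_any_month(9, 32))   # the shortcut for X=9
print(simulate_some_month_has_exactly(9, 32))  # simulated estimate for X=9
```

For X=2 the shortcut gives a "probability" above 1, which is the overcounting described above; for X=9 the shortcut and the simulation should land close to each other, up to sampling noise.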

Peter