Understanding Bayes Theorem With Ratios

My first intuition about Bayes Theorem was "take evidence and account for false positives". Does a lab result mean you're sick? Well, how rare is the disease, and how often do healthy people test positive? Misleading signals must be considered.
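As a rough sketch of that intuition in ratio form: start with the odds of being sick, then multiply by how much more often sick people test positive than healthy people do. The numbers below (1% of people sick, 80% of sick people test positive, 10% of healthy people test positive) and the function name `posterior_probability` are made up for illustration and are not taken from the article.

```python
from fractions import Fraction

def posterior_probability(prior, true_positive_rate, false_positive_rate):
    """Chance of being sick after a positive test, computed via odds."""
    # Odds of sick : healthy before seeing any test result.
    prior_odds = prior / (1 - prior)
    # How much more often a sick person tests positive than a healthy one.
    likelihood_ratio = true_positive_rate / false_positive_rate
    # Bayes in odds form: posterior odds = prior odds * likelihood ratio.
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)

# Hypothetical inputs, chosen only to show the mechanics.
p = posterior_probability(Fraction(1, 100), Fraction(8, 10), Fraction(1, 10))
print(p, float(p))  # 8/107, roughly 7.5%
```

With these assumed numbers, a positive result still leaves the chance of being sick well under 10%, because the healthy group is so much larger that its false positives outnumber the true positives.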

This is a companion discussion topic for the original entry at http://betterexplained.com/articles/understanding-bayes-theorem-with-ratios/

““Aha!” you may say, “But they each have an equal chance of winning BEFORE the winner is drawn”. But how can they each have an equal chance of winning if only one wins and the others lose – whether before or after the draw?”

Because that’s what the definition of chance is? Chance is the word used to describe the expected result of unknown outcomes. “One is the winner and the others have lost” describes a statement about a past event with a known outcome: you can identify the winners and losers, and the random draw that selected them.

Before the draw takes place, there are no winners, and no losers. To speak of “the winner” before the winner has been decided is to speak nonsense; one can only talk about the candidates and the potential outcomes. Furthermore, to speak of “the winner” when you do not know who won is to speak nonsense; one can again only speculate on what the outcome was. Probability comes from assigning quantities to potential outcomes and then taking the average over repeated trials. The outcome is decidedly random, and this randomness is described and quantified using reference to ‘chance’.

Only if you presume that the draw, and ultimately life itself, is entirely and fully deterministic can you come to the conclusion that there is a winner and a loser decided beforehand… and even if you did that, there would be no practical value in stating such, since a fully deterministic system can’t be identified as such by actors in the system, nor can the outcomes be determined a priori by said actors. One must still rely on probabilities to make any predictive statements.

“But I will point out that the recent Global Financial Crisis resulted from reliance on “probabilities”!”

And I can point out that the capital of France is Timbuktu; that doesn’t make it true.

It is absurd to maintain, simply because we do not know which one(s) will win, that each has an equal chance.

The point is that, before a lottery, it is known that there will be winners and losers - it is therefore nonsensical to say that EACH has an EQUAL chance, since it is known before the lottery that they don’t.

The reason that saying Timbuktu is the capital of France does not make it true is simply that it is not true.

Hi David, no worries – I really appreciate the comment. I’d love to do some more about Diff Eqs, it’s a topic that’s bothered me for a long time (never formally studied it), but I hope to start soon. Also hoping to work on a few more books :).

Sorry, my comment was rather curt.

It’s the most sensible thing that you’ve said, because it’s the most useful thing that you’ve said. It is (I think) the only thing that you’ve said that allows for one to actually USE probabilities other than 0 or 1, in other words to actually use probability theory in a nontrivial way.

Actually, I should have quoted more: ‘We know in advance that in our lottery only 1 ticket will win, while 999,999 will not. So it is overwhelmingly unlikely that any given ticket will win. That’s for sure, and it certainly indicates probability accurately, and can inform one’s choice to take a ticket or not.’. You’re not only talking about a probability that is slightly greater than 0 (in the second sentence); you’re also linking it to knowledge (in the first sentence), which is exactly how probability becomes useful (as in the third sentence). Our knowledge in advance leads us to assign, to ANY given ticket (even the one that eventually turns out to win!), the overwhelmingly unlikely probability of 0.000001 that it will win.

This directly contradicts what you wrote next: ‘But to say in advance that each ticket has an equal chance, given we know in advance that only 1 will win, is nonsensical.’. In fact, giving each ticket an equal chance is exactly what you did in the second sentence, where ANY given ticket has an overwhelmingly unlikely chance to win. And that’s exactly the sensible thing to do in advance, that is BEFORE we have any information that distinguishes the tickets. It’s only AFTER we learn which ticket won that we change this probability from 0.000001 to 0 for 999999 of the tickets, and change it from 0.000001 to 1 for 1 of the tickets.

If you’re going to use probability to make decisions, then this is how you do it, with probability depending not only on the event in question but also on the knowledge that you have at the moment.

>it is overwhelmingly unlikely that any given ticket will win

This is the most sensible thing you’ve said!

@ Toby
Perhaps you could say something sensible yourself instead of catcalling from the sidelines and coming up with gibberish such as your:

“Well, I would never say that. “Aha!” I would say instead, “But they each have an equal chance of winning BEFORE I learn the identity of the winner”. This makes it clear that lottery probabilities are facts about my knowledge about the lottery, not facts directly about the lottery itself”.

Since you have made no intelligent or intelligible contributions, we must conclude you have none to make.

I find it reprehensible that doctors, for example, can virtually frighten a patient to death with self-fulfilling prophecies, such as falsely telling a smoker that they have a 1 in 5 chance of dying from smoking based on the observation that 1 in 5 smokers seems to do so. They should instead correctly say “1 in 5 smokers dies from smoking - you may be one of them”. Again, each smoker has either a 0% or 100% chance.

That’s OK, Toby - I would rather have been less intemperate in my previous response, but I also wish you’d posted your latest well-reasoned and thoughtfully-written view instead of your “curt” comment, which raised my ire and made me think “why do I bother?”.

You are right, and I have never stated otherwise - BEFORE the lottery draw it SEEMS that each ticket has an equal chance and, even though we know that only one will win, we don’t know which. If the lottery were “fixed”, those who fixed it would know in advance which ticket would win, while we wouldn’t. So to the same ticket they assign 100% chance, we give only 0.0001%. So it is a question of advance knowledge - or not.

As I have previously written, this is how we constantly make decisions, and to do so is generally “sensible”, as you say, based on what we know. But we also KNOW BEFORE the draw that 1 ticket has 100% chance and the rest have 0% - we just don’t know which.

This leads to my main point - about falsely particularising the general. While it may be true that 1 in 5 smokers will develop lung cancer, it is wrong - and certainly not “sensible” - to tell each smoker they each have an equal 1 in 5 probability, simply because we don’t know the outcome. It IS correct and sensible to tell them that 1 in 5 smokers develops lung cancer - but no more. It IS correct to say that 1 ticket in a million will win, but not - based only on our lack of foreknowledge - that each has the same chance. Again, the decisions made by bankers on the basis of “probability” led to the recent Global Financial Crisis - while they may say they acted “sensibly” on what they knew or believed (or chose to believe), others say with hindsight that they acted recklessly. Perhaps that is why these bankers have not been criminally charged - their defence is they merely acted on what they knew, or “sensibly” believed.

You bring up a good point. How can multiple outcomes have equal chances when only a select few will ever occur? I’m no mathematician, but there are a couple of things I’d like to bring up that may prove relevant. Firstly, statistics is rather misleading in that it does not really deal with probabilities, but rather averages, and the ability to make an educated guess.
Say I obtain a raffle ticket from a pool of ten tickets. What is my chance of winning the raffle? In a sense, my chance is obviously either 100% or 0% - I can’t “maybe” win the raffle. I either win or I don’t. So, in this sense, there is not much probability involved. Nine tickets have a 0% chance of winning, and one has a 100% chance of winning. Before the ticket is drawn, we can estimate the “chance” of a certain ticket winning, but the end result is the same, and so, in a roundabout sort of way, the tickets already “knew” whether they were going to win or not. But that’s not necessarily what probability is about. I know that one of the ten tickets is going to win. What probability allows me to do is decide whether getting one of these tickets is worth it.
Let’s say that I repeat this process over and over; i.e., I buy one raffle ticket out of ten total tickets, every time the drawing takes place. In the first drawing, one of the ten tickets wins. Fine. That may have been mine - we don’t know, as this is purely hypothetical. In the second drawing, one of the ten tickets wins (surprise!). If there are ten drawings, then I will have bought ten tickets, and, according to probability, I will have won about one raffle. Will I have actually won exactly one raffle? Probably not. But if there are one hundred drawings, then I will have bought one hundred tickets, and I will have won about ten raffles. The number of raffles I have won rarely matches the projected number exactly, but as the number of drawings increases, the projection grows more accurate. This is what probability does: it gives us averages.
Now, let’s say that the prize for the raffle is a ten dollar chocolate cake. Each ticket costs three dollars. We can use probability to figure out whether it is statistically advantageous for me to enter the drawing. If the drawing takes place one hundred times, I will have won about ten of them, giving me about $100 worth of chocolate cakes (mmm, delicious!). However, that will have meant that I spent $300 on raffle tickets. Obviously, it was not in my favor to enter the raffle - unless, of course, I won more than 30 chocolate cakes. It’s certainly possible that I did win a ridiculous number of cakes; however, basic probability tells me that I would be better off spending my money on something more reliable. So, wouldn’t the chance of me getting back my money’s worth be 100% (if I won at least 30 cakes) or 0% (if I didn’t)? Yes, in the long run, it would. But since I didn’t have any way of seeing how many cakes I would win, use of statistics was my best bet. It’s not perfect, but it’s necessary.
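The raffle argument above can be tried out with a small simulation. This is only a sketch: the function name `simulate_raffles`, the fixed random seed, and the choice of drawing counts are my own assumptions, while the ten tickets, the $10 cake and the $3 ticket come from the comment.

```python
import random
from fractions import Fraction

TICKETS = 10      # tickets per raffle (from the comment)
TICKET_COST = 3   # dollars per ticket (from the comment)
CAKE_VALUE = 10   # dollars per prize cake (from the comment)

def simulate_raffles(drawings, tickets=TICKETS, seed=0):
    """Count how often my single ticket wins across repeated drawings."""
    rng = random.Random(seed)  # seeded only so reruns give the same numbers
    # Treat ticket number 0 as mine; each drawing picks one ticket uniformly.
    return sum(1 for _ in range(drawings) if rng.randrange(tickets) == 0)

for drawings in (10, 100, 10_000, 100_000):
    wins = simulate_raffles(drawings)
    net = wins * CAKE_VALUE - drawings * TICKET_COST
    print(f"{drawings:>7} drawings: {wins} wins, "
          f"win rate {wins / drawings:.4f}, net per drawing ${net / drawings:.2f}")

# The long-run average the simulation should drift toward:
expected_net_per_drawing = Fraction(1, TICKETS) * CAKE_VALUE - TICKET_COST
print(expected_net_per_drawing)  # -2, i.e. an average loss of $2 per ticket
```

With few drawings the win rate can land far from 1 in 10; with many drawings it should settle near it, and the net result per drawing should approach the $2 average loss.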

Crystals and subjectivity are a problem for Greene’s naïve ‘entropy = disorder’ idea of entropy, not for Boltzmann’s theory. To be fair to Greene, it’s not his idea originally, but he should know better. Boltzmann’s ideas were also naïve, but only because he was coming up with new concepts and couldn’t possibly know better yet; even so, he was closer to the truth! He said something quite close to ‘the entropy of a system is proportional to the lack of information about that system’ (to quote Ralph above): he said that it was proportional to the logarithm of the number of ways the system could be on a microscopic scale, given our information about it on a macroscopic scale. This is naïve, because it essentially assumes that each microscopic state is equally probable (given information about the macroscopic state), although this turns out to be a rather good approximation for most realistically large systems.

Great article, thanks Kalid! Bayes can be really useful, and isn’t easy to think about.

I’d like to weigh in on the lottery thing:

While I certainly can’t disagree that probabilities are misused to justify bad choices, I think Ralph’s resentment of probabilities themselves is a little off-base. We’re talking about predictors here - given what we know, what are the outcomes that we can predict? Sometimes, as with Bayes and patient survival, we’re just going on prior observation, and sometimes we can calculate the exact array of possibilities, as with a lottery number. These predictors are then used in assessment of risk, and as noted in the article, the cost of failure has to be a factor when making a decision.

I think that in society the problem usually comes not from a reliance on the odds, but from trivializing the cost of failure. Risks are taken because the odds are good, but if failure will cause an unacceptable loss (like the loss of a business, for example) then playing it safe is usually a sounder strategy.

If we’re talking about the semantics, then I think “probability” is actually a fantastic name. The possibilities in the case of a fatal disease are “I will survive this disease” and “I will die from this disease”. You can’t really ask “how possible is it that I will die?”, but you can ask “how probable (likely) is it that I will die?” And that information is valuable - if you’re likely to die, then it’s a good idea to get your affairs in order. But the really valuable information is “can I affect my chance of living?”, not “will I die?”

And a nitpick: in most lotteries, every ticket does have an equal chance to win, so long as everyone picks the same numbers. It’s perfectly possible to have more than one winning ticket - it’s the combination of numbers that can win or lose.

A nice little application of Bayes’ rule in odds form can be downloaded from


We know in advance that in our lottery only 1 ticket will win, while 999,999 will not. So it is overwhelmingly unlikely that any given ticket will win. That’s for sure, and it certainly indicates probability accurately, and can inform one’s choice to take a ticket or not. But to say in advance that each ticket has an equal chance, given we know in advance that only 1 will win, is nonsensical.

Ah, I can breathe again! Great summary!
Thanks Kalid for breaking that down in such a clear way, and thanks to Ralph for sparking the discussion!

This is a great website! I recently bought Kalid’s book, which was also fantastic, and then I found his website, which contains a wealth of math topics explained at a very intuitive level for free. Thank you, Kalid, for providing us with such a great resource. I’m looking forward to your next book. :)

I apologize for going off topic here, but I couldn’t find an email address or a more appropriate place to post this. Nevertheless, I’d like to request that Kalid provide us with an intuitive explanation of Differential Equations. His calculus explanations are just so good that I really would love to read his take on Differential Equations. (Once again, sorry for being off topic.)

Ian, regarding your ‘nitpick’: you have cited a specialised lottery where you pick your numbers, and say that IF several tickets have the WINNING numbers, they have an equal chance of winning (which is obviously 100%). Although I was not referring to such lotteries which may have several winners, my argument is unchanged, and I have read nothing so far which refutes it, so will not use time and space to re-state it.

But I will point out that the recent Global Financial Crisis resulted from reliance on “probabilities”!

@Ralph, @Rich: Thanks for the discussion! You bring up an interesting point about the meaning of a probability – there are several interpretations, with different philosophical implications.

One interpretation is “If we repeat the experiment many times, we expect to get this outcome some known fraction of the time (e.g., coin flips)” and another is “Probability reflects our level of certainty about our knowledge” (as Rich notes).

So in the coin-flip case, we can say “I’m 50% certain that it will be heads”, which, in practical terms, means I’d be indifferent to winning 2x my money when betting on heads (better odds, like paying 3x, and I’ll play all day. Worse odds, like paying 1.5x, means I’ll never play).
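Here is a quick sketch of that break-even point. It assumes “winning 2x my money” means a winning bet returns twice the stake in total (stake included), which is my reading and not something stated above; `expected_net` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_net(p_win, payout_multiple, stake=1):
    """Average profit per bet: expected total return minus the stake paid."""
    return p_win * payout_multiple * stake - stake

heads = Fraction(1, 2)  # being "50% certain" of heads

for payout in (Fraction(3, 2), 2, 3):
    print(f"payout {float(payout)}x: expected net per $1 = {expected_net(heads, payout)}")
# 1.5x -> -1/4 (never play), 2x -> 0 (indifferent), 3x -> 1/2 (play all day)
```

Under that assumption, the 2x payout is exactly the point where the average profit is zero, which is one practical way to read “50% certain”.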

There are even more philosophical implications around Bayesian vs. Frequentist probabilities that I’m not well-versed in, but want to understand further.

@Jeff: Glad you enjoyed it – awesome insights. Yes, I think ratios/odds keep us in “probability mode” more than raw percentages (which puts my brain in “calculating” mode).

And cool note about the programming. That’s exactly it: computers have limited precision in the numbers they can represent, so drastically-shrinking percentages are a no go [in the spam case, especially, where we’ll have a multiplication by a small fraction for each word in the message!]. We can use tricks like taking the logarithm of large multiplications (i.e., adding logs) to keep numbers within reasonable bounds.
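A small sketch of the logarithm trick, assuming ordinary double-precision floats. The per-word probabilities and the names `naive_product` and `log_score` are invented for illustration; a real spam filter would estimate these values from data.

```python
import math

def naive_product(probs):
    """Multiply the probabilities directly (prone to underflow)."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def log_score(probs):
    """Add logarithms instead of multiplying; same ordering, no underflow."""
    return sum(math.log(p) for p in probs)

# Hypothetical per-word probabilities for a 400-word message.
spam_word_probs = [0.02] * 400  # chance of each word under "spam"
ham_word_probs = [0.01] * 400   # chance of each word under "not spam"

# Both direct products are far smaller than the tiniest double, so both
# collapse to 0.0 and can no longer be compared.
print(naive_product(spam_word_probs), naive_product(ham_word_probs))  # 0.0 0.0

# The log scores stay finite, so the comparison still works.
print(log_score(spam_word_probs) > log_score(ham_word_probs))  # True
```

Because the logarithm is increasing, whichever hypothesis has the larger sum of logs also has the larger product, so nothing is lost by comparing the sums.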