Bayes' theorem (or Bayes' formula) is one of the basic theorems of elementary probability theory. It determines the probability of an event given that another, statistically interrelated, event has occurred. In other words, the Bayes formula allows one to recompute a probability more accurately by taking into account both previously known information and the data of new observations. The Bayes formula can be derived from the basic axioms of probability theory, in particular from the definition of conditional probability. A peculiarity of Bayes' theorem is that its practical application requires a large number of calculations, so Bayesian estimates came into active use only after the revolution in computing and network technologies.
When Bayes' theorem appeared, the probabilities in it were given a number of interpretations. In one of them, the derivation of the formula was tied directly to a special approach to statistical analysis. Under the Bayesian interpretation of probability, the theorem shows how a personal degree of belief can change radically in light of the events that have occurred. This conclusion of Bayes became the foundation of Bayesian statistics. However, the theorem is used not only in Bayesian analysis but also in many other calculations.
Psychological experiments [1] have shown that people often incorrectly estimate the probability of an event based on their experience (the a posteriori probability), because they ignore the probability of the assumption itself (the a priori probability). Therefore, the correct result given by the Bayes formula can differ greatly from the intuitively expected one.
Bayes' theorem is named after its author, Thomas Bayes (1702-1761), an English mathematician and minister who was the first to use the theorem to correct beliefs based on updated data. His work "An Essay towards solving a Problem in the Doctrine of Chances" was first published in 1763 [2], two years after the author's death. Before Bayes' posthumous work was read to the Royal Society, it was significantly edited and updated by Richard Price. However, these ideas did not gain wide currency until they were rediscovered and developed by Laplace, who first published the modern statement of the theorem in his 1812 book Théorie analytique des probabilités (Analytical Theory of Probability).
Sir Harold Jeffreys wrote that Bayes' theorem "is to the theory of probability what the Pythagorean theorem is to geometry" [3].
Statement

For events A and B with P(B) ≠ 0, Bayes' theorem states:

P(A|B) = P(B|A)·P(A) / P(B),

where P(A) is the a priori probability of hypothesis A, P(A|B) is the a posteriori probability of A given that B has occurred, P(B|A) is the probability of B given hypothesis A, and P(B) is the total probability of event B.
Proof
The Bayes formula follows from the definition of conditional probability. The probability of the joint event A∩B can be expressed in two ways through conditional probabilities:

P(A∩B) = P(A|B)·P(B) = P(B|A)·P(A).

Consequently,

P(A|B) = P(B|A)·P(A) / P(B).
Calculation
In problems and statistical applications, P(B) is usually calculated by the formula of total probability over several mutually exclusive hypotheses A_1, …, A_N whose probabilities sum to 1:

P(B) = Σ_{i=1..N} P(B|A_i)·P(A_i),

where the probabilities under the summation sign are known or can be estimated experimentally.

In this case, the Bayes formula is written as follows:

P(A_j|B) = P(B|A_j)·P(A_j) / Σ_{i=1..N} P(B|A_i)·P(A_i).
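This calculation can be sketched in code; the function name `bayes_posterior` and the numbers in the usage example are illustrative, not taken from the text.

```python
def bayes_posterior(priors, likelihoods):
    """Posterior P(A_i | B) from priors P(A_i) and likelihoods P(B | A_i).

    The hypotheses are assumed mutually exclusive with priors summing to 1.
    """
    # Formula of total probability: P(B) = sum_i P(B | A_i) * P(A_i)
    p_b = sum(p * l for p, l in zip(priors, likelihoods))
    # Bayes formula for each hypothesis
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

# Illustrative numbers: two hypotheses with priors 0.3 / 0.7
# and likelihoods 0.8 / 0.1 for the observed event B.
print(bayes_posterior([0.3, 0.7], [0.8, 0.1]))
```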
Physical Meaning and Terminology
The Bayes formula allows one to "swap cause and effect": given the known fact of an event, to calculate the probability that it was produced by a particular cause.
Events reflecting the action of "causes" are in this case called hypotheses, since they are the presumed events that brought the observed event about. The unconditional probability that a hypothesis is valid is called a priori (how likely the cause is in general), and the conditional probability, taking the observed event into account, is called a posteriori (how likely the cause turns out to be given the data on the event).
Examples
Example 1
Let event B be "the car will not start" and hypothesis A be "there is no fuel in the tank". Obviously, the probability P(B|A) that the car will not start if there is no fuel in the tank is equal to one. Then the posterior probability that there is no fuel in the tank given that the car does not start is

P(A|B) = P(B|A)·P(A) / P(B) = P(A) / P(B),

that is, the ratio of the a priori probability that there is no fuel in the tank to the probability that the car will not start. For example, if the a priori probability that there is no fuel in the tank is 1%, the probability that the car will not start is 2%, and a randomly selected car does not start, then the probability that there is no fuel in its tank is 50%.
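The arithmetic of this example can be verified with a short sketch (the variable names are mine):

```python
# Example 1: posterior probability that the tank is empty given the car won't start.
p_no_fuel = 0.01                  # a priori P(A): no fuel in the tank
p_wont_start = 0.02               # P(B): the car will not start
p_wont_start_given_no_fuel = 1.0  # P(B|A): an empty tank guarantees a non-start

# Bayes formula: P(A|B) = P(B|A) * P(A) / P(B)
posterior = p_wont_start_given_no_fuel * p_no_fuel / p_wont_start
print(posterior)  # 0.5
```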
Example 2
Let the first worker produce defective parts with probability p_1, the second with probability p_2, and the third with probability p_3. Suppose the first made n_1 parts, the second n_2 parts, and the third n_3 parts. The shop manager takes a part at random, and it turns out to be defective. With what probability was this part made by the third worker?

Let event B be "the part is defective" and event A_i be "the part was produced by worker i". Then P(A_i) = n_i / (n_1 + n_2 + n_3) and P(B|A_i) = p_i.

By the formula of total probability,

P(B) = Σ_{i=1..3} P(B|A_i)·P(A_i).

By the Bayes formula, we get:

P(A_3|B) = P(B|A_3)·P(A_3) / P(B) = p_3·n_3 / (p_1·n_1 + p_2·n_2 + p_3·n_3).
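The numeric data of this example did not survive in this copy of the text, so the sketch below uses assumed defect rates and production counts purely for illustration:

```python
# Example 2 sketch with assumed numbers (the article's original figures are missing):
p = [0.01, 0.02, 0.03]   # assumed defect probabilities p_1, p_2, p_3
n = [100, 200, 300]      # assumed production counts n_1, n_2, n_3

# P(A_3 | B) = p_3*n_3 / (p_1*n_1 + p_2*n_2 + p_3*n_3);
# the common factor 1/(n_1 + n_2 + n_3) cancels out.
weights = [p_i * n_i for p_i, n_i in zip(p, n)]
posterior_third = weights[2] / sum(weights)
print(posterior_third)  # 9/14 for these assumed numbers
```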
Example 3
An entomologist suspects that a beetle may belong to a rare subspecies, since it has a pattern on its body. In the rare subspecies, 98% of beetles have the pattern, i.e. P(Pattern|Rare) = 0.98. Among ordinary beetles, only 5% have the pattern: P(Pattern|Common) = 0.05. The rare subspecies accounts for only 0.1% of the entire population. What is the probability that a beetle having the pattern belongs to the rare subspecies, i.e. P(Rare|Pattern)?
From the extended form of Bayes' theorem (any beetle is either rare or common), we get:

P(Rare|Pattern) = P(Pattern|Rare)·P(Rare) / (P(Pattern|Rare)·P(Rare) + P(Pattern|Common)·P(Common))
= 0.98 × 0.001 / (0.98 × 0.001 + 0.05 × 0.999) ≈ 0.019,

so only about 1.9% of patterned beetles belong to the rare subspecies.
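The same computation in code (variable names are mine):

```python
# Example 3: probability that a patterned beetle belongs to the rare subspecies.
p_pattern_given_rare = 0.98
p_pattern_given_common = 0.05
p_rare = 0.001
p_common = 1 - p_rare   # any beetle is either rare or common

p_rare_given_pattern = p_pattern_given_rare * p_rare / (
    p_pattern_given_rare * p_rare + p_pattern_given_common * p_common
)
print(round(p_rare_given_pattern, 4))  # 0.0192
```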
Example 4: the paradox of Bayes' theorem
Let there be a disease whose frequency in the general population is 0.001, and a diagnostic test that detects the disease in a sick person with probability 0.9, but also has a probability of 0.01 of incorrectly indicating the disease in a healthy person. Find the probability that a person is healthy if the test found him to be ill.
Let us denote by B the event that the person is sick, by "B" the event that the test showed the person to be sick, and by H the event that the person is healthy. Then the given conditions can be rewritten as follows:

- P("B"|B) = 0.9;
- P("B"|H) = 0.01;
- P(B) = 0.001, hence P(H) = 0.999.

The probability that a person is healthy, given that he was found to be sick, is the conditional probability P(H|"B").
To find it, we first calculate the total probability of being found sick:

P("B") = P("B"|H)·P(H) + P("B"|B)·P(B) = 0.999 × 0.01 + 0.001 × 0.9 = 0.01089 ≈ 1.089%.
The probability that a person is healthy given the result "sick":

P(H|"B") = P("B"|H)·P(H) / P("B") = 0.999 × 0.01 / (0.999 × 0.01 + 0.001 × 0.9) ≈ 91.7%.
Thus, 91.7% of people whose test returned the result "sick" are actually healthy. The reason is that, by the conditions of the problem, the false-positive probability, although small, is an order of magnitude greater than the proportion of sick people in the examined population.
If erroneous test results can be considered random, then a repeated test of the same person will give a result independent of the first. In this case, to reduce the share of false positives, it makes sense to retest people who received the result "sick". The probability that a person is healthy after receiving a second result "sick" can also be calculated by the Bayes formula:

P(H|"B","B") = 0.999 × 0.01 × 0.01 / (0.999 × 0.01 × 0.01 + 0.001 × 0.9 × 0.9) ≈ 10.98%.
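Both the single-test and the repeated-test figures can be checked with a short sketch; the helper function is mine:

```python
# Example 4: probability of being healthy after k independent "sick" results.
p_sick = 0.001
p_healthy = 1 - p_sick          # 0.999
p_pos_given_sick = 0.9          # P("B" | B)
p_pos_given_healthy = 0.01      # P("B" | H), the false-positive rate

def p_healthy_given_positives(k):
    """P(H | k positive results), assuming the test errors are independent."""
    healthy_branch = p_healthy * p_pos_given_healthy ** k
    sick_branch = p_sick * p_pos_given_sick ** k
    return healthy_branch / (healthy_branch + sick_branch)

print(round(p_healthy_given_positives(1), 3))  # 0.917
print(round(p_healthy_given_positives(2), 3))  # 0.11
```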
Interpretations of Probability in Bayes' Theorem
Mathematically, Bayes' theorem relates the probabilities of events A and B, P(A) and P(B), to the conditional probabilities of A given B and of B given A, P(A|B) and P(B|A).

In general form, the Bayes formula is as follows:

P(A|B) = P(B|A)·P(A) / P(B).
The meaning of the expression depends on how the probabilities in the given formula are interpreted.
Bayesian Interpretation

In the Bayesian interpretation, probability measures a level of confidence. Bayes' theorem links the confidence in an assumption before and after accounting for the evidence. For example, suppose someone claims that a tossed coin lands tails twice as often as heads. Initially, the level of confidence in this claim might be 50%; if the claim is confirmed by evidence, the level of confidence can rise to, say, 70%.
For an assumption (hypothesis) A and evidence B:
- P (A) is the a priori probability of hypothesis A, the initial level of confidence in assumption A;
- P (A | B) is the posterior probability of hypothesis A when event B occurs;
- the ratio P(B|A) / P(B) shows how much event B changes the level of confidence in assumption A.
Frequency Interpretation
In the frequency interpretation, Bayes' theorem relates the proportions of certain outcomes of an experiment. Suppose an experiment has been carried out many times, in some cases producing outcome A and/or outcome B. Then:

- P(A) - the proportion of cases when the experiment led to result A.
- P(B) - the proportion of cases when the experiment led to result B.
- P ( B | A ) - the proportion of cases with result B among cases with result A.
- P ( A | B ) - the proportion of cases with result A among cases with result B.
The role of Bayes' theorem is best understood from tree diagrams. Two such diagrams partition the same outcomes in different orders, by the presence or absence of results A and B, and Bayes' theorem acts as the link between these two decompositions.
Forms
Events
Simple form
For events A and B, provided that P(B) ≠ 0,

P(A|B) = P(B|A)·P(A) / P(B).
In many applications of Bayes' theorem, event B is observed and we want to understand how this observation changes our confidence in event A. In that case the denominator of the last expression, the probability of event B, is fixed; what changes is our belief in A. Bayes' theorem then shows that the posterior probability is proportional to the numerator:

- P(A|B) ∝ P(B|A)·P(A) (proportionality over A for a given B).

In short: the posterior probability is proportional to the prior probability times the likelihood (see Lee, 2012, Chapter 1).
If the events A_1, A_2, … are mutually exclusive and exhaustive, that is, exactly one of them must occur and no two can occur together, we can determine the proportionality coefficient from the fact that their probabilities must sum to one. For example, for a given event A, the event A itself and its complement ¬A are mutually exclusive and exhaustive. Denoting the proportionality coefficient by C, we have:

- P(A|B) = C·P(B|A)·P(A) and P(¬A|B) = C·P(B|¬A)·P(¬A).

Combining these two formulas, we get that:

C = 1 / (P(B|A)·P(A) + P(B|¬A)·P(¬A)) = 1 / P(B).
Extended form
Often the space of events (such as {A_j}) is described in terms of P(A_j) and P(B|A_j). In this case it is useful to compute P(B) by the formula of total probability:

P(B) = Σ_j P(B|A_j)·P(A_j).

In particular,

- P(A_i|B) = P(B|A_i)·P(A_i) / Σ_j P(B|A_j)·P(A_j).
Continuous Random Variables
Consider a sample space Ω generated by two random variables X and Y. In principle, Bayes' theorem applies to the events A = {X = x} and B = {Y = y}. However, these events have probability zero at points where the variable has a continuous probability density. To continue using Bayes' theorem usefully, it can be formulated in terms of the corresponding densities (see Derivation of formulas).
Simple form
If X is continuous and Y is discrete, then

f_X(x | Y = y) = P(Y = y | X = x)·f_X(x) / P(Y = y).

If X is discrete and Y is continuous, then

P(X = x | Y = y) = f_Y(y | X = x)·P(X = x) / f_Y(y).

If both X and Y are continuous, then

f_X(x | Y = y) = f_Y(y | X = x)·f_X(x) / f_Y(y).
Extended form
A continuous event space is often described in terms of the quantities appearing in the numerator, and it is then useful to eliminate the denominator by means of the formula of total probability. For f_Y(y) this becomes an integral:

f_Y(y) = ∫ f_Y(y | X = x)·f_X(x) dx.
Bayes' Rule

Bayes' rule is Bayes' theorem rewritten in terms of odds:

O(A_1 : A_2 | B) = O(A_1 : A_2) · Λ(A_1 : A_2; B),

where

Λ(A_1 : A_2; B) = P(B | A_1) / P(B | A_2)

is called the likelihood ratio (Bayes factor). The odds of two events are simply the ratio of the probabilities of those two events. Thus,

- O(A_1 : A_2) = P(A_1) / P(A_2),
- O(A_1 : A_2 | B) = P(A_1 | B) / P(A_2 | B).
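As a sketch, the odds form can be applied to the numbers of Example 4 above, taking A_1 = "healthy", A_2 = "sick", and B = "positive test result" (the variable names are mine):

```python
# Bayes' rule in odds form, with the numbers from Example 4.
prior_odds = 0.999 / 0.001           # O(A1 : A2) = P(healthy) / P(sick)
likelihood_ratio = 0.01 / 0.9        # Lambda = P(B | healthy) / P(B | sick)
posterior_odds = prior_odds * likelihood_ratio

# Converting odds back to a probability: P = O / (1 + O)
p_healthy_given_positive = posterior_odds / (1 + posterior_odds)
print(round(p_healthy_given_positive, 3))  # 0.917
```

The answer agrees with the direct application of the Bayes formula in Example 4, as it must.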
Derivation of formulas
For events
Bayes' theorem can be obtained from the definition of conditional probability:

P(A|B) = P(A∩B) / P(B), if P(B) ≠ 0,
P(B|A) = P(A∩B) / P(A), if P(A) ≠ 0.

Expressing P(A∩B) from the second equality and substituting it into the first gives

P(A|B) = P(B|A)·P(A) / P(B).
For random variables
For two continuous random variables X and Y, Bayes' theorem can similarly be derived from the definition of conditional density:

f_X(x | Y = y) = f_{X,Y}(x, y) / f_Y(y),
f_Y(y | X = x) = f_{X,Y}(x, y) / f_X(x).

Consequently,

f_X(x | Y = y) = f_Y(y | X = x)·f_X(x) / f_Y(y).
See also
- Bayesian spam filtering
- Bayesian programming
- Bayesian belief network
- Bayesian probability
- Improper prior distribution
- The Monty Hall Paradox
- The paradox of laws
Notes
- ↑ Kahneman et al., 2005, pp. 153-160.
- ↑ Bayes, Thomas, and Price, Richard (1763). "An Essay towards solving a Problem in the Doctrine of Chances. By the late Rev. Mr. Bayes, communicated by Mr. Price, in a letter to John Canton, M.A. and F.R.S." Philosophical Transactions of the Royal Society of London 53: 370-418. (inaccessible link). Retrieved April 21, 2010. Archived April 10, 2011.
- ↑ Jeffreys, Harold (1973), Scientific Inference (3rd ed.), Cambridge University Press, p. 31, ISBN 978-0-521-18078-8
Literature
- Gmurman V.E. Probability Theory and Mathematical Statistics. Moscow: Higher Education, 2005.
- Judgment under Uncertainty: Heuristics and Biases / Daniel Kahneman et al. 21st printing. Cambridge University Press, 2005. 555 p. ISBN 978-0-521-28414-1.
- Eliezer Yudkowsky. A visual explanation of Bayes' theorem.
For further study
- McGrayne, Sharon Bertsch. The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines & Emerged Triumphant from Two Centuries of Controversy. Yale University Press, 2011. ISBN 978-0-300-18822-6.
- Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin (2003), Bayesian Data Analysis, Second Edition, CRC Press.
- Charles M. Grinstead and J. Laurie Snell (1997), "Introduction to Probability (2nd edition)", American Mathematical Society (free pdf available [1]).
- Pierre-Simon Laplace. (1774/1986), Memoir on the Probability of the Causes of Events, Statistical Science 1 (3): 364-378.
- Peter M. Lee (2012), Bayesian Statistics: An Introduction, Wiley.
- Rosenthal, Jeffrey S. (2005): "Struck by Lightning: the Curious World of Probabilities." HarperCollins.
- Stephen M. Stigler (1986), Laplace's 1774 Memoir on Inverse Probability, Statistical Science 1 (3): 359-363.
- Stone, JV (2013). Chapter 1 of book “Bayes' Rule: A Tutorial Introduction” , University of Sheffield, England.
Links
- The Theory That Would Not Die by Sharon Bertsch McGrayne New York Times Book Review by John Allen Paulos on 5 August 2011
- Weisstein, Eric W. Bayes' Theorem on Wolfram MathWorld .
- Bayes' theorem on the PlanetMath website .
- Bayes Theorem and the Folly of Prediction
- A tutorial on probability and Bayes' theorem devised for Oxford University psychology students
- An Intuitive Explanation of Bayes' Theorem by Eliezer S. Yudkowsky