Bayes' theorem (or Bayes' formula) is one of the basic theorems of elementary probability theory. It determines the probability of an event given that another, statistically interrelated, event has occurred. In other words, the Bayes formula allows one to recompute a probability more accurately by taking into account both previously known information and the data of new observations. The Bayes formula can be derived from the basic axioms of probability theory, in particular from the definition of conditional probability. A peculiarity of Bayes' theorem is that its practical application requires a large number of calculations, so Bayesian estimates came into active use only after the revolution in computing and network technologies.
When Bayes' theorem appeared, the probabilities in it were given a number of interpretations. In one of them, the derivation of the formula was tied directly to a special approach to statistical analysis. Under the Bayesian interpretation of probability, the theorem shows how a personal degree of belief can change radically in light of the events that have occurred. This conclusion of Bayes became the foundation of Bayesian statistics. However, the theorem is used not only in Bayesian analysis but also in many other calculations.
Psychological experiments [1] have shown that people often incorrectly estimate the probability of an event based on their experience (the a posteriori probability), because they ignore the probability of the assumption itself (the a priori probability). Therefore, the correct result given by the Bayes formula can differ greatly from the intuitively expected one.
Bayes' theorem is named after its author, Thomas Bayes (1702-1761), an English mathematician and minister who was the first to use the theorem to correct beliefs based on updated data. His work "An Essay towards solving a Problem in the Doctrine of Chances" was first published in 1763 [2], two years after the author's death. Before Bayes' posthumous work was read to the Royal Society, it was significantly edited and updated by Richard Price. However, these ideas did not gain wide currency until they were rediscovered and developed by Laplace, who first published the modern statement of the theorem in his 1812 book Théorie analytique des probabilités (Analytical Theory of Probability).
Sir Harold Jeffreys wrote that Bayes' theorem "is to the theory of probability what the Pythagorean theorem is to geometry" [3].
Statement

For events A and B with P(B) ≠ 0, Bayes' theorem states:

P(A|B) = P(B|A)·P(A) / P(B),

where P(A) is the a priori probability of hypothesis A, P(A|B) is the a posteriori probability of A given that B has occurred, P(B|A) is the probability of B given hypothesis A, and P(B) is the total probability of event B.
Proof
The Bayes formula follows from the definition of conditional probability. The probability of the joint event A∩B can be expressed in two ways through conditional probabilities:

P(A∩B) = P(A|B)·P(B) = P(B|A)·P(A).

Consequently,

P(A|B) = P(B|A)·P(A) / P(B).
Calculation
In problems and statistical applications, P(B) is usually calculated by the formula of total probability over several mutually exclusive hypotheses A_1, …, A_N whose probabilities sum to 1:

P(B) = Σ_{i=1..N} P(B|A_i)·P(A_i),

where the probabilities under the summation sign are known or can be estimated experimentally.

In this case, the Bayes formula is written as follows:

P(A_j|B) = P(B|A_j)·P(A_j) / Σ_{i=1..N} P(B|A_i)·P(A_i).
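This calculation can be sketched in code; the function name `bayes_posterior` and the numbers in the usage example are illustrative, not taken from the text.

```python
def bayes_posterior(priors, likelihoods):
    """Posterior P(A_i | B) from priors P(A_i) and likelihoods P(B | A_i).

    The hypotheses are assumed mutually exclusive with priors summing to 1.
    """
    # Formula of total probability: P(B) = sum_i P(B | A_i) * P(A_i)
    p_b = sum(p * l for p, l in zip(priors, likelihoods))
    # Bayes formula for each hypothesis
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

# Illustrative numbers: two hypotheses with priors 0.3 / 0.7
# and likelihoods 0.8 / 0.1 for the observed event B.
print(bayes_posterior([0.3, 0.7], [0.8, 0.1]))
```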
Physical Meaning and Terminology
The Bayes formula allows one to "swap cause and effect": given the known fact of an event, to calculate the probability that it was produced by a particular cause.
Events reflecting the action of "causes" are in this case called hypotheses, since they are the presumed events that brought the observed event about. The unconditional probability that a hypothesis is valid is called a priori (how likely the cause is in general), and the conditional probability, taking the observed event into account, is called a posteriori (how likely the cause turns out to be given the data on the event).
Examples
Example 1
Let event B be "the car will not start" and hypothesis A be "there is no fuel in the tank". Obviously, the probability P(B|A) that the car will not start if there is no fuel in the tank is equal to one. Then the posterior probability that there is no fuel in the tank given that the car does not start is

P(A|B) = P(B|A)·P(A) / P(B) = P(A) / P(B),

that is, the ratio of the a priori probability that there is no fuel in the tank to the probability that the car will not start. For example, if the a priori probability that there is no fuel in the tank is 1%, the probability that the car will not start is 2%, and a randomly selected car does not start, then the probability that there is no fuel in its tank is 50%.
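The arithmetic of this example can be verified with a short sketch (the variable names are mine):

```python
# Example 1: posterior probability that the tank is empty given the car won't start.
p_no_fuel = 0.01                  # a priori P(A): no fuel in the tank
p_wont_start = 0.02               # P(B): the car will not start
p_wont_start_given_no_fuel = 1.0  # P(B|A): an empty tank guarantees a non-start

# Bayes formula: P(A|B) = P(B|A) * P(A) / P(B)
posterior = p_wont_start_given_no_fuel * p_no_fuel / p_wont_start
print(posterior)  # 0.5
```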
Example 2
Let the first worker produce defective parts with probability p_1, the second with probability p_2, and the third with probability p_3. Suppose the first made n_1 parts, the second n_2 parts, and the third n_3 parts. The shop manager takes a part at random, and it turns out to be defective. With what probability was this part made by the third worker?

Let event B be "the part is defective" and event A_i be "the part was produced by worker i". Then P(A_i) = n_i / (n_1 + n_2 + n_3) and P(B|A_i) = p_i.

By the formula of total probability,

P(B) = Σ_{i=1..3} P(B|A_i)·P(A_i).

By the Bayes formula, we get:

P(A_3|B) = P(B|A_3)·P(A_3) / P(B) = p_3·n_3 / (p_1·n_1 + p_2·n_2 + p_3·n_3).
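The numeric data of this example did not survive in this copy of the text, so the sketch below uses assumed defect rates and production counts purely for illustration:

```python
# Example 2 sketch with assumed numbers (the article's original figures are missing):
p = [0.01, 0.02, 0.03]   # assumed defect probabilities p_1, p_2, p_3
n = [100, 200, 300]      # assumed production counts n_1, n_2, n_3

# P(A_3 | B) = p_3*n_3 / (p_1*n_1 + p_2*n_2 + p_3*n_3);
# the common factor 1/(n_1 + n_2 + n_3) cancels out.
weights = [p_i * n_i for p_i, n_i in zip(p, n)]
posterior_third = weights[2] / sum(weights)
print(posterior_third)  # 9/14 for these assumed numbers
```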
Example 3
An entomologist suspects that a beetle may belong to a rare subspecies, since it has a pattern on its body. In the rare subspecies, 98% of beetles have the pattern, i.e. P(Pattern|Rare) = 0.98. Among ordinary beetles, only 5% have the pattern: P(Pattern|Common) = 0.05. The rare subspecies accounts for only 0.1% of the entire population. What is the probability that a beetle having the pattern belongs to the rare subspecies, i.e. P(Rare|Pattern)?
From the extended form of Bayes' theorem (any beetle is either rare or common), we get:

P(Rare|Pattern) = P(Pattern|Rare)·P(Rare) / (P(Pattern|Rare)·P(Rare) + P(Pattern|Common)·P(Common))
= 0.98 × 0.001 / (0.98 × 0.001 + 0.05 × 0.999) ≈ 0.019,

so only about 1.9% of patterned beetles belong to the rare subspecies.
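The same computation in code (variable names are mine):

```python
# Example 3: probability that a patterned beetle belongs to the rare subspecies.
p_pattern_given_rare = 0.98
p_pattern_given_common = 0.05
p_rare = 0.001
p_common = 1 - p_rare   # any beetle is either rare or common

p_rare_given_pattern = p_pattern_given_rare * p_rare / (
    p_pattern_given_rare * p_rare + p_pattern_given_common * p_common
)
print(round(p_rare_given_pattern, 4))  # 0.0192
```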
Example 4: the paradox of Bayes' theorem
Let there be a disease whose frequency in the general population is 0.001, and a diagnostic test that detects the disease in a sick person with probability 0.9, but also has a probability of 0.01 of incorrectly indicating the disease in a healthy person. Find the probability that a person is healthy if the test found him to be ill.
Let us denote by B the event that the person is sick, by "B" the event that the test showed the person to be sick, and by H the event that the person is healthy. Then the given conditions can be rewritten as follows:

- P("B"|B) = 0.9;
- P("B"|H) = 0.01;
- P(B) = 0.001, hence P(H) = 0.999.

The probability that a person is healthy, given that he was found to be sick, is the conditional probability P(H|"B").
To find it, we first calculate the total probability of being found sick:

P("B") = P("B"|H)·P(H) + P("B"|B)·P(B) = 0.999 × 0.01 + 0.001 × 0.9 = 0.01089 ≈ 1.089%.
The probability that a person is healthy given the result "sick":

P(H|"B") = P("B"|H)·P(H) / P("B") = 0.999 × 0.01 / (0.999 × 0.01 + 0.001 × 0.9) ≈ 91.7%.
Thus, 91.7% of people whose test returned the result "sick" are actually healthy. The reason is that, by the conditions of the problem, the false-positive probability, although small, is an order of magnitude greater than the proportion of sick people in the examined population.
If erroneous test results can be considered random, then a repeated test of the same person will give a result independent of the first. In this case, to reduce the share of false positives, it makes sense to retest people who received the result "sick". The probability that a person is healthy after receiving a second result "sick" can also be calculated by the Bayes formula:

P(H|"B","B") = 0.999 × 0.01 × 0.01 / (0.999 × 0.01 × 0.01 + 0.001 × 0.9 × 0.9) ≈ 10.98%.
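Both the single-test and the repeated-test figures can be checked with a short sketch; the helper function is mine:

```python
# Example 4: probability of being healthy after k independent "sick" results.
p_sick = 0.001
p_healthy = 1 - p_sick          # 0.999
p_pos_given_sick = 0.9          # P("B" | B)
p_pos_given_healthy = 0.01      # P("B" | H), the false-positive rate

def p_healthy_given_positives(k):
    """P(H | k positive results), assuming the test errors are independent."""
    healthy_branch = p_healthy * p_pos_given_healthy ** k
    sick_branch = p_sick * p_pos_given_sick ** k
    return healthy_branch / (healthy_branch + sick_branch)

print(round(p_healthy_given_positives(1), 3))  # 0.917
print(round(p_healthy_given_positives(2), 3))  # 0.11
```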
Interpretations of Probability in Bayes' Theorem
Mathematically, Bayes' theorem relates the probabilities of events A and B, P(A) and P(B), to the conditional probabilities of A given B and of B given A, P(A|B) and P(B|A).

In general form, the Bayes formula is as follows:

P(A|B) = P(B|A)·P(A) / P(B).
The meaning of the expression depends on how the probabilities in the given formula are interpreted.
Bayesian Interpretation

In the Bayesian interpretation, probability measures a level of confidence. Bayes' theorem links the confidence in an assumption before and after accounting for the evidence. For example, suppose someone claims that a tossed coin lands tails twice as often as heads. Initially, the level of confidence in this claim might be 50%; if the claim is confirmed by evidence, the level of confidence can rise to, say, 70%.
For an assumption (hypothesis) A and evidence B:
- P (A) is the a priori probability of hypothesis A, the initial level of confidence in assumption A;
- P (A | B) is the posterior probability of hypothesis A when event B occurs;
- the ratio P(B|A) / P(B) shows how much event B changes the level of confidence in assumption A.
Frequency Interpretation
In the frequency interpretation, Bayes' theorem relates the proportions of certain outcomes of an experiment. Suppose an experiment has been carried out many times, in some cases producing outcome A and/or outcome B. Then:

- P(A) - the proportion of cases when the experiment led to result A.
- P(B) - the proportion of cases when the experiment led to result B.
- P ( B | A ) - the proportion of cases with result B among cases with result A.
- P ( A | B ) - the proportion of cases with result A among cases with result B.
The role of Bayes' theorem is best understood from tree diagrams. Two such diagrams partition the same outcomes in different orders, by the presence or absence of results A and B, and Bayes' theorem acts as the link between these two decompositions.
Forms
Events
Simple form
For events A and B, provided that P(B) ≠ 0,

P(A|B) = P(B|A)·P(A) / P(B).
In many applications of Bayes' theorem, event B is observed and we want to understand how this observation changes our confidence in event A. In that case the denominator of the last expression, the probability of event B, is fixed; what changes is our belief in A. Bayes' theorem then shows that the posterior probability is proportional to the numerator:

- P(A|B) ∝ P(B|A)·P(A) (proportionality over A for a given B).

In short: the posterior probability is proportional to the prior probability times the likelihood (see Lee, 2012, Chapter 1).
If the events A_1, A_2, … are mutually exclusive and exhaustive, that is, exactly one of them must occur and no two can occur together, we can determine the proportionality coefficient from the fact that their probabilities must sum to one. For example, for a given event A, the event A itself and its complement ¬A are mutually exclusive and exhaustive. Denoting the proportionality coefficient by C, we have:

- P(A|B) = C·P(B|A)·P(A) and P(¬A|B) = C·P(B|¬A)·P(¬A).

Combining these two formulas, we get that:

C = 1 / (P(B|A)·P(A) + P(B|¬A)·P(¬A)) = 1 / P(B).
Extended form
Often the space of events (such as {A_j}) is described in terms of P(A_j) and P(B|A_j). In this case it is useful to compute P(B) by the formula of total probability:

P(B) = Σ_j P(B|A_j)·P(A_j).

In particular,

- P(A_i|B) = P(B|A_i)·P(A_i) / Σ_j P(B|A_j)·P(A_j).
Continuous Random Variables
Consider a sample space Ω generated by two random variables X and Y. In principle, Bayes' theorem applies to the events A = {X = x} and B = {Y = y}. However, these events have probability zero at points where the variable has a continuous probability density. To continue using Bayes' theorem usefully, it can be formulated in terms of the corresponding densities (see Derivation of formulas).
Simple form
If X is continuous and Y is discrete, then

f_X(x | Y = y) = P(Y = y | X = x)·f_X(x) / P(Y = y).

If X is discrete and Y is continuous, then

P(X = x | Y = y) = f_Y(y | X = x)·P(X = x) / f_Y(y).

If both X and Y are continuous, then

f_X(x | Y = y) = f_Y(y | X = x)·f_X(x) / f_Y(y).
Extended form
A continuous event space is often described in terms of the quantities appearing in the numerator, and it is then useful to eliminate the denominator by means of the formula of total probability. For f_Y(y) this becomes an integral:

f_Y(y) = ∫ f_Y(y | X = x)·f_X(x) dx.
Bayes' Rule

Bayes' rule is Bayes' theorem rewritten in terms of odds:

O(A_1 : A_2 | B) = O(A_1 : A_2) · Λ(A_1 : A_2; B),

where

Λ(A_1 : A_2; B) = P(B | A_1) / P(B | A_2)

is called the likelihood ratio (Bayes factor). The odds of two events are simply the ratio of the probabilities of those two events. Thus,

- O(A_1 : A_2) = P(A_1) / P(A_2),
- O(A_1 : A_2 | B) = P(A_1 | B) / P(A_2 | B).
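As a sketch, the odds form can be applied to the numbers of Example 4 above, taking A_1 = "healthy", A_2 = "sick", and B = "positive test result" (the variable names are mine):

```python
# Bayes' rule in odds form, with the numbers from Example 4.
prior_odds = 0.999 / 0.001           # O(A1 : A2) = P(healthy) / P(sick)
likelihood_ratio = 0.01 / 0.9        # Lambda = P(B | healthy) / P(B | sick)
posterior_odds = prior_odds * likelihood_ratio

# Converting odds back to a probability: P = O / (1 + O)
p_healthy_given_positive = posterior_odds / (1 + posterior_odds)
print(round(p_healthy_given_positive, 3))  # 0.917
```

The answer agrees with the direct application of the Bayes formula in Example 4, as it must.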
Derivation of formulas
For events
Bayes' theorem can be obtained from the definition of conditional probability:

P(A|B) = P(A∩B) / P(B), if P(B) ≠ 0,
P(B|A) = P(A∩B) / P(A), if P(A) ≠ 0.

Expressing P(A∩B) from the second equality and substituting it into the first gives

P(A|B) = P(B|A)·P(A) / P(B).
For random variables
For two continuous random variables X and Y, Bayes' theorem can similarly be derived from the definition of conditional density:

f_X(x | Y = y) = f_{X,Y}(x, y) / f_Y(y),
f_Y(y | X = x) = f_{X,Y}(x, y) / f_X(x).

Consequently,

f_X(x | Y = y) = f_Y(y | X = x)·f_X(x) / f_Y(y).
See also
- Bayesian spam filtering
- Bayesian programming
- Bayesian belief network
- Bayesian probability
- Improper prior distribution
- The Monty Hall Paradox
- The paradox of laws
Notes
- ↑ Kahneman et al., 2005, pp. 153-160.
- ↑ Bayes, Thomas, and Price, Richard (1763). "An Essay towards solving a Problem in the Doctrine of Chances. By the late Rev. Mr. Bayes, communicated by Mr. Price, in a letter to John Canton, M.A. and F.R.S." Philosophical Transactions of the Royal Society of London 53: 370-418. (inaccessible link). Retrieved April 21, 2010. Archived April 10, 2011.
- ↑ Jeffreys, Harold (1973), Scientific Inference (3rd ed.), Cambridge University Press, p. 31, ISBN 978-0-521-18078-8
Literature
- Gmurman V.E. Probability Theory and Mathematical Statistics. Moscow: Higher Education, 2005.
- Judgment under Uncertainty: Heuristics and Biases / Daniel Kahneman et al. 21st printing. Cambridge University Press, 2005. 555 p. ISBN 978-0-521-28414-1.
- Eliezer Yudkowsky. A visual explanation of Bayes' theorem.
For further study
- McGrayne, Sharon Bertsch. The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines & Emerged Triumphant from Two Centuries of Controversy. Yale University Press, 2011. ISBN 978-0-300-18822-6.
- Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin (2003), Bayesian Data Analysis, Second Edition, CRC Press.
- Charles M. Grinstead and J. Laurie Snell (1997), "Introduction to Probability (2nd edition)", American Mathematical Society (free pdf available [1]).
- Pierre-Simon Laplace. (1774/1986), Memoir on the Probability of the Causes of Events, Statistical Science 1 (3): 364-378.
- Peter M. Lee (2012), Bayesian Statistics: An Introduction, Wiley.
- Rosenthal, Jeffrey S. (2005): "Struck by Lightning: the Curious World of Probabilities." HarperCollins.
- Stephen M. Stigler (1986), Laplace's 1774 Memoir on Inverse Probability, Statistical Science 1 (3): 359-363.
- Stone, JV (2013). Chapter 1 of book “Bayes' Rule: A Tutorial Introduction” , University of Sheffield, England.
Links
- The Theory That Would Not Die by Sharon Bertsch McGrayne New York Times Book Review by John Allen Paulos on 5 August 2011
- Weisstein, Eric W. Bayes' Theorem on Wolfram MathWorld .
- Bayes' theorem on the PlanetMath website .
- Bayes Theorem and the Folly of Prediction
- A tutorial on probability and Bayes' theorem devised for Oxford University psychology students
- An Intuitive Explanation of Bayes' Theorem by Eliezer S. Yudkowsky