# Mathematics and Probability

http://www.maths.lth.se/matstat/staff/bengtr

## Basics

In probability theory, or rather probability calculus, one starts with given probabilities and computes new ones. As long as one agrees to measure probability on an increasing linear scale between 0 and 1, the following rules are obvious.
1. Probabilities are numbers between 0 and 1, endpoints included.
2. If an event surely occurs, its probability equals 1.
3. If an event can be decomposed into smaller events, its probability equals the sum of the probabilities of the subevents.
• Example:

P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2)

where X might denote the number of customers in a shop at a given moment.
As a consequence, the probability that an event does not occur equals one minus the probability that it does occur.
• For instance,

P(X > 2) = 1 − P(X ≤ 2).

4. To compute the probability that two events occur simultaneously, one multiplies the probability of one of them by the conditional probability of the other one, given the first.
• Example: An urn contains 5 white and 3 black balls, and one draws two balls without replacement. The probability that both are white is given by

P(A and B) = P(A)⋅P(B|A) = 5/(5+3) ⋅ 4/(4+3) = 20/56 = 5/14

where A denotes "first ball white" and B denotes "second ball white", and the | sign denotes "given" or "conditionally on".

The values of P(A) and P(B|A) might have been obtained from the previous rule and symmetry; at each step all the remaining balls have the same probability of being drawn. Note also, to avoid potential misunderstandings, that P(B) = 5/8.
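The urn computation can be checked with a short Python sketch. It is only an illustration: the ball labels and variable names are assumptions, and the enumeration relies on the symmetry argument that all ordered draws of two distinct balls are equally likely.

```python
from fractions import Fraction
from itertools import permutations

# Urn from the example: 5 white and 3 black balls (labels are illustrative).
balls = ["W"] * 5 + ["B"] * 3

# Multiplication rule: P(A and B) = P(A) * P(B|A).
p_a = Fraction(5, 5 + 3)          # first ball white
p_b_given_a = Fraction(4, 4 + 3)  # second ball white, given the first was white
p_both = p_a * p_b_given_a

# Check by symmetry: all ordered draws of two distinct balls are equally likely.
draws = list(permutations(range(len(balls)), 2))
both_white = sum(1 for i, j in draws if balls[i] == "W" and balls[j] == "W")
second_white = sum(1 for i, j in draws if balls[j] == "W")

p_both_enum = Fraction(both_white, len(draws))
p_b = Fraction(second_white, len(draws))  # unconditional P(B)

print(p_both, p_both_enum, p_b)  # prints: 5/14 5/14 5/8
```

Exact fractions are used so that the two routes to P(A and B) can be compared without rounding error.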
The above means that probabilities are treated axiomatically; for instance, if one claims

P(rain tomorrow) = 0.7,

then, whether this means that 70% of all days are rainy, or that 70% of the days at this time of the year are rainy, or that some meteorological method has been used, or that it is merely based on a pessimistic feeling, one has to admit that

P(not rain tomorrow) = 0.3,

since, otherwise, one contradicts oneself. In probability theory the above reasoning is formalised into rules which guarantee consistent, i.e. contradiction-free, results.

## Relation to measure

The events considered are of the type: The outcome of some observed random phenomenon falls within a given set.
• For some events one is faced with given probabilities according to

P(outcome ∈ A) = μ(A)

where A is a subset of all possible outcomes of the phenomenon in question and μ is a suitable function, which is thought of as assigning probability mass to the subsets. For instance:
• In elementary courses one has

μ(A) = ∑_{k ∈ A} p_k,    with p_k ≥ 0 and ∑_{k} p_k = 1,

if the outcome is an integer, and

μ(A) = ∫_{x ∈ A} f(x)dx,    with f(x) ≥ 0 and ∫_{x} f(x) dx = 1,

if the outcome is a real number. If the outcome is a pair of two numbers, one has a double sum and p_{jk} or a double integral and f(x, y).
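As a rough sketch of the two elementary cases, the following Python code evaluates μ(A) as a sum and as an integral. The values of p_k and the exponential density f are assumptions chosen for illustration, and the integral is approximated by a simple midpoint rule.

```python
import math

# Hypothetical probability function p_k (assumed values, summing to 1).
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.25, 4: 0.15}

def mu_discrete(A):
    """mu(A) = sum of p_k over k in A."""
    return sum(p_k for k, p_k in p.items() if k in A)

def f(x):
    """Assumed density: f(x) = exp(-x) for x >= 0, and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0

def mu_interval(a, b, steps=10_000):
    """mu([a, b]) = integral of f over [a, b], approximated by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

print(mu_discrete({0, 1, 2}))  # P(X <= 2), about 0.6
print(mu_interval(0.0, 1.0))   # close to 1 - exp(-1)
```

Any other p_k or f satisfying the stated conditions could be substituted; only the normalisation to total mass 1 matters.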

In exercises, one is given the p or the f and is required to compute some interesting probability. Sometimes no μ is given. Instead one has to use symmetry, for instance when a die is thrown or when one counts the number of heads when tossing a coin.
Since

P(outcome ∈ A or outcome ∈ B) = P(outcome ∈ A ∪ B),
P(outcome ∈ A and outcome ∈ B) = P(outcome ∈ A ∩ B),
and
P(outcome ∉ A) = P(outcome ∈ A')

where A' denotes the complementary set to A, the above rules for probabilities correspond to the following rules for μ, called axioms:
1. μ(Ω) = 1, where Ω is the set of all possible outcomes, and, if μ(A) and μ(B) are defined, then so are μ(A'), μ(A∪B), and μ(A∩B).
2. μ(A) ≥ 0.
3. μ(A∪B) = μ(A)+ μ(B) if A and B are disjoint.
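For a finite set of outcomes the three axioms can be verified by brute force. The sketch below assumes a fair die, with μ obtained from symmetry, and checks every subset; the names `omega`, `mu` and `subsets` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Fair die, using symmetry: each outcome gets mass 1/6.
omega = frozenset(range(1, 7))

def mu(A):
    """Probability mass of a subset A of omega."""
    return Fraction(len(A & omega), len(omega))

# All 2**6 = 64 subsets of omega; here mu is defined on every one of them.
subsets = [frozenset(c) for r in range(len(omega) + 1)
           for c in combinations(sorted(omega), r)]

axiom1 = mu(omega) == 1
axiom2 = all(mu(A) >= 0 for A in subsets)
axiom3 = all(mu(A | B) == mu(A) + mu(B)
             for A in subsets for B in subsets if not (A & B))

# Consequence of axioms 1 and 3: mu(A') = 1 - mu(A).
complement_rule = all(mu(omega - A) == 1 - mu(A) for A in subsets)

print(axiom1, axiom2, axiom3, complement_rule)  # prints: True True True True
```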

## Probability theory

Luckily, it turns out (Kolmogorov (1933) calls this the Fundamental Theorem) that practically all μ's encountered in practice satisfy the following:
• Axiom of continuity:
If an infinite intersection A_1 ∩ A_2 ∩ … is empty, then

μ(A_1 ∩ A_2 ∩ … ∩A_n) → 0      as n tends to infinity.

Such a μ is called a probability measure defined on an algebra of subsets of a given set. The Extension Theorem says that μ's domain of definition can be extended so that one remains within it even after forming infinite unions and intersections. The extended μ is called a probability measure defined on a σ-algebra. This implies:
• One can compute new probabilities by repeated use of

P(outcome ∈ A) = P(outcome ∈ A_1) + P(outcome ∈ A_2) + ...

when A = A_1 ∪ A_2 ∪ ... with the A_n pairwise disjoint, and of

P(outcome ∈ A') = 1 - P(outcome ∈ A),

without any risk of contradicting oneself, since the result would always be

P(outcome ∈ A) = μ(A).
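Both the axiom of continuity and the infinite-sum rule can be illustrated numerically. The sketch below assumes a geometric distribution on {1, 2, 3, ...} with an arbitrary parameter q = 0.5; neither the distribution nor the parameter comes from the text above.

```python
# Assumed example: geometric distribution on {1, 2, 3, ...},
# p_k = q**(k - 1) * (1 - q), with an illustrative q = 0.5.
q = 0.5

def p(k):
    """Probability of the single outcome k."""
    return q ** (k - 1) * (1 - q)

def tail(n):
    """mu(A_n) for A_n = {n, n+1, ...}; closed form q**(n - 1)."""
    return q ** (n - 1)

# Continuity: A_1 ∩ ... ∩ A_n = A_n and the infinite intersection is empty,
# so mu(A_n) should tend to 0.
tails = [tail(n) for n in (1, 5, 10, 20, 40)]

# Countable additivity: {X even} is the disjoint union of {2}, {4}, {6}, ...
# The partial sums approach q / (1 + q).
partial_even = sum(p(k) for k in range(2, 200, 2))

# The complement rule gives the odd outcomes without a second infinite sum.
p_odd = 1 - partial_even

print(tails)
print(partial_even, q / (1 + q), p_odd)
```

The partial sum is truncated at k = 198, so it agrees with q / (1 + q) only up to a negligible remainder of order q**199.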

Link to Kolmogorov's "Grundbegriffe der Wahrscheinlichkeitsrechnung" (1933), translated into English, at www.mathematik.com
Kolmogorov's Fundamental Theorem of Probability Calculus
Bengt Ringnér Centre for Mathematical Sciences, Lund University, Lund, Sweden