Sampling§

Repeated Sampling§

A Pair of Dice

Suppose we are given two 6 sided dice, then our sample space is a set of a pairing of possible outcomes from the first and second dice,

$$ \Omega = \{(i, j) | i, j \in \{1 \dots 6\}\} $$

where $(i, j)$ are an ordered pair, i.e. $(i, j) \neq (j, i).$

Then, $$ \forall i, j, P((i, j)) = \frac{1}{6^2} $$

Let $$ \begin{align*} D &= \{\text{sum of dice is }8\}\\ &= \{(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)\} \end{align*} $$

Then, $$ P(D) = \sum_{d \in D} P(d). $$

Because each pair is equally likely, $\forall d \in D, P(d) = \frac{1}{36},$ therefore

$$ P(D) = \frac{5}{36}. $$

Flipping a Coin Multiple Times

Suppose we flipped a coin 3 times in a row,

$$ \Omega = \{0, 1\}^{3}, $$

where $0$ is heads and $1$ tails, so $\forall w \in \Omega, P(w) = \frac{1}{8}$

Cartesian Product

Given sets $A_1, A_2, \dots, A_n$ then we define the Cartesian product $$ A_1 \times A_2 \times \dots \times A_n $$

as the set of ordered $n$-tuples

$$ (a_1, a_2, \dots, a_n) $$

where $a_1 \in A_1, a_2 \in A_2, \dots, a_n \in A_n.$

As a consequence, we can rewrite $\Omega = \{(i, j) | i, j \in \{1 \dots 6\}\}$ from the first example as $\Omega = \{0, 6\}^{2}.$ Here, exponentiation means the repeated Cartesian product with itself.

The Cartesian product is useful as probability often deals with repetitions of a sample.

Random Sampling§

Random Sampling

Choosing elements at random from a given set is called sampling.

Cardinality

If a set $S$ contains $n$ elements, then we say that the cardinality of $S$ to be $n,$ and we write

$$ |S| = n. $$

If all outcomes of a random experiment are equally likely, then

$$\forall \omega \in \Omega, P(\omega) = |\Omega|^{-1} \iff \sum_{w \in \Omega} P(\omega) = 1.$$

If outcomes are equally likely, then we can say that it was chosen uniformly at random.

Fact

If $\Omega$ is finite and each outcome equally likely, then for any subset $A \subseteq \Omega,$

$$ P(A) = \sum_{a \in A} P(a) = \frac{|A|}{|\Omega|}. $$

Generally, random outcomes are not equally likely.

Sampling Mechanisms§

Ordered Sampling

Objects are chosen one at a time and the order is tracked.

Unordered Sampling

Objects are chosen one at a time and only the identity of objects matter, not their order.

Sampling With or Without Replacement

After each choice, we have another decision to make: to return the object or to discard, i.e. to replace or to not replace.

Urn Example Setup§

For the next few sections, suppose we are given an urn containing numbered balls, and we reach in and retrieve one ball. Also assume that the balls are uniformly random.

Ordered Sampling with Replacement§

Suppose we pick a ball from the urn, copy the number, put it back, and repeat $k$ times. This yields an ordered $k$-tuple of numbers,

$$ \omega = (s_1, s_2, \dots, s_k). $$

Alternatively, $\Omega = S^k$ where $S = \{1\dots n\},$ and $$ |\Omega| = |S|^k = n^k. $$

As we picked uniformly at random, $$ \forall \omega \in \Omega, P(\omega) = \frac{1}{n^k}. $$

With Replacement

Suppose the urn contains $4$ balls, and we pick $k = 3$ times.

$$ \Omega = \{1..4\}^3,\quad |\Omega| = 64. $$

Then, each triplet is equally likely, for example $$ P((1, 1, 1)) = \frac{1}{64}. $$

Ordered Sampling without Replacement§

Suppose we pick a ball from the urn, copy the number, discard the ball, and repeat $k$ times. Clearly for this to work, $k \leq n,$ and this yields a $k$-tuple of numbers.

At step $i, s_i \in \{1..n\},$ but $s_1, s_2, \dots, s_k$ must be distinct,

$$ \Omega = \{(s_1, s_2, \dots, s_k) | s_i \in \{1..n\} \wedge \forall i, j,\,s_i \neq s_j\}. $$

A natural question is the cardinality of $\Omega,$ at step $1$ we have $n$ choices, at $2$ we have $n-1$ choices, and so on. At step $k,$ we have $n - k + 1$ choices. Then, $\Omega$ must contain $n(n-1)(n-2)\dots(n-k+1)$ elements, i.e.

$$ |\Omega| = \prod_{i=0}^{k-1}(n - i) = (n)_k $$

Descending Factorial

The descending factorial is written using the notation $(n)_k$ and is defined by

$$ (n)_k = \prod_{i=0}^{k-1}(n - i). $$

By assuming equally likely outcomes,

$$ \forall \omega \in \Omega, P(\omega) = \frac{1}{(n)_k}. $$

Without Replacement

Suppose we sample 3 balls without replacement from an urn containing a total of 4 balls numbered from 1 to 4, distinctly.

  1. What is the probability that we get the sequence $2, 1, 4?$

    $$ P((2, 1, 4)) = \frac{1}{(4)_3} = \frac{1}{4 \cdot 3 \cdot 2}, $$

    because of our assumption of uniformity.

  2. What is the probability that we get the sequence $1, 2, 1?$ $$ P((1, 2, 1)) = 0, $$ it is impossible as we sampled $1$ first then removed it, so we cannot get it again.

Unordered Sampling without Replacement§

Suppose we sampled $k$ balls without replacement, and only recorded which balls appeared and not the order. Again, $k \leq n.$

The outcome will be some subset from the set $S,$ with cardinality $k,$ i.e. the sample space is

$$ \Omega = \{\omega | \omega \in \mathcal{P}(S), |\omega| = k\}. $$

Then, the number of outcomes of the sample space is given by

$$ |\Omega| = \binom{n}{k} = \frac{n!}{(n-k)!\,k!} $$

and for our example all outcomes equally likely means that the probability measure is given by

$$ \forall \omega \in \Omega,\,P(\omega) = \binom{n}{k}^{-1}. $$

Alternatively, instead thinking about the situation as taking out one ball at a time $k$ times, we can think of it as:

  1. Randomize order of $n$ balls.
  2. Look at the first $k$ balls.
  3. Note the identity of the $k$ balls, ignoring order.

Number of outcomes

How exactly do we compute the probability of $k$ balls being $\{s_1, s_2, \dots, s_k\}?$

$$ \overbrace{\underbrace{s_1, s_2, s_3, s_4, \dots, s_k}_{k}, \underbrace{\dots, s_n}_{n-k}}^{n} $$

  1. There are $n!$ possible arrangements of $n$ balls.
  2. The first $k$ balls have an ordering $\{s_1, s_2, \dots, s_k\}$ with $k!$ possible arrangements.
  3. The final $n - k$ balls have $(n - k)!$ possible arrangements.

So the number of favourable orderings such that the first $k$ balls are $\{s_1, s_2, \dots, s_k\}$ is given by

$$ k!\,(n-k)! $$

$$ P(\{s_1, s_2, \dots, s_k\}) = \frac{\text{\# favourable}}{\text{\# total}} = \frac{k!\,(n-k)!}{n!} $$

There are many ways to solve problems, sometimes adding additional structure such as order may help solve a problem. In any case, we must pick one and stick with it.