Probability; Languages

New Beginnings: Week 6
Lecture Notes by Bart Massey
Copyright (c) 2014 Bart Massey

2014-08-04: Midterm; Midterm Analysis

  • Here is an annotated copy of the Midterm.

2014-08-05: Random Numbers; Probability

Random and Pseudo-Random Numbers

  • Hard to define "random number":

    • Best bet: A random sequence is one that correlates no better than chance would predict with any sequence you can discover.

    • Problem: A computer is carefully designed to avoid random behavior.

  • Perhaps we can settle for deterministically generating a sequence that is correlated with no generator except its own? This is a "Pseudo-Random Number Generator" (PRNG).

    • We should allow such a generator to generate different sequences somehow, so that successive runs don't repeat the original sequence.

An Example PRNG: MLCG

  • Multiplicative Linear Congruential Generator

    s ← some “seed” value in 1..m-1

    To compute randrange(n):
        r ← s mod n
        s ← (s ✕ a) mod m
        return r

  • Choice of a and m matters a lot.

  • Note that s must never be 0: once s = 0, every subsequent state is also 0.

  • Sample output (a=55, m=251, s=12, n=15):

    12, 8, 6, 1, 5, 6, 9, 4, 12, 10, 0, 12

  • Coefficients from paper:

    Pierre L'Ecuyer, "Tables of linear congruential generators of different sizes and good lattice structure," Mathematics of Computation 68 (1999), 249-260.

Exercise: Program This MLCG
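  • One possible solution sketch in Python, using the sample parameters from above (a=55, m=251, s=12); the class name and interface here are just illustrative choices:

    ```python
    class MLCG:
        """Multiplicative Linear Congruential Generator."""

        def __init__(self, a, m, seed):
            # The state must stay in 1..m-1; a zero state sticks at zero.
            assert 0 < seed < m, "seed must be in 1..m-1"
            self.a = a
            self.m = m
            self.s = seed

        def randrange(self, n):
            """Return a pseudo-random value in 0..n-1 and advance the state."""
            r = self.s % n
            self.s = (self.s * self.a) % self.m
            return r

    gen = MLCG(55, 251, 12)
    print([gen.randrange(15) for _ in range(12)])
    # → [12, 8, 6, 1, 5, 6, 9, 4, 12, 10, 0, 12]
    ```

    This reproduces the sample output above exactly, since the generator is deterministic given the seed.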

Cryptographic PRNG

  • MLCG is totally predictable after observing just a few elements.

  • Crypto PRNG is not predictable except by knowing its state s.

Choosing s

  • Terrible Ways To Choose s

    • System clock: predictable.
    • PID: small and predictable.
    • Hash of memory: surprisingly predictable.
    • User timing: Under user control.
  • Good Ways To Choose s

    • True or likely “entropy” sources.
    • Hardware RNG:

      • Pick a phenomenon (e.g. thermal noise) that is truly random.
      • Use measurements of that phenomenon.
      • Problems: slow / colored / expensive / fiddly.
      • We're working on it. :-)

Probability

  • If some sequence is random, then one cannot know for sure what the next element will be.

  • What we can know:

    • Likelihood of each possible next element, from principle.
    • Overall distribution of elements over the sequence, from measurement.
  • Some 17th-century gamblers figured out there's good math here.

Sample Space; Events

  • The sample space in a probability question is the set of all possible outcomes of a measurement.

  • An event is a subset of the sample space that we are interested in.

  • Example:

    • Let H be the set of all two-card hands from a standard deck.

    • Let E be the event (subset of H) such that both cards in the hand have rank 'A'.

    • |H| = choose(52, 2) = 52 * 51 / 2 = 1326

    • |E| = choose(4, 2) = 4 * 3 / 2 = 6

  • The usual set operations (union, intersection, complement, cardinality) are useful here.
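  • The hand counts above are easy to check by brute force. A sketch, representing the deck as (rank, suit) pairs:

    ```python
    from itertools import combinations

    # Standard 52-card deck: 13 ranks x 4 suits.
    deck = [(rank, suit) for rank in "A23456789TJQK" for suit in "SHDC"]

    # H: all two-card hands. E: hands where both cards have rank 'A'.
    H = list(combinations(deck, 2))
    E = [hand for hand in H if hand[0][0] == "A" and hand[1][0] == "A"]

    print(len(H), len(E))
    # → 1326 6
    ```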

Probability of an Event

  • The probability Pr[S](E) of an event E in a sample space S with equally likely outcomes is just |E| / |S|.

  • Consequences:

    • forall S, E . 0 <= Pr[S](E) <= 1
    • forall S, E1, E2 . E1 isect E2 = empty ->
      Pr(E1) + Pr(E2) = Pr(E1 union E2) = Pr(E1 or E2)
    • forall S, E1, E2 . E1, E2 independent ->
      Pr(E1) Pr(E2) = Pr(E1 isect E2) = Pr(E1 and E2)

      (Note that the product rule needs independence, not disjointness: if E1 and E2 are disjoint, then Pr(E1 isect E2) = 0.)
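  • Both rules can be checked exactly on a small sample space. A sketch using two dice (36 equally likely outcomes) and exact fractions:

    ```python
    from fractions import Fraction
    from itertools import product

    S = set(product(range(1, 7), repeat=2))   # two die rolls: 36 outcomes

    def pr(E):
        # Probability in an equiprobable sample space: |E| / |S|.
        return Fraction(len(E), len(S))

    snake_eyes = {(1, 1)}
    boxcars = {(6, 6)}
    first_even = {(a, b) for (a, b) in S if a % 2 == 0}
    second_even = {(a, b) for (a, b) in S if b % 2 == 0}

    # Sum rule: snake eyes and boxcars are disjoint events.
    assert pr(snake_eyes) + pr(boxcars) == pr(snake_eyes | boxcars)

    # Product rule: the two dice are independent measurements.
    assert pr(first_even) * pr(second_even) == pr(first_even & second_even)
    ```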

Conditional Probability

  • We define conditional probability by

    Pr(E|H) Pr(H) = Pr(E isect H) = Pr(E and H)

    and read "probability of E given H".

  • Thus when S is equiprobable

    Pr[S](E|H) = |E isect H| / |H|

  • Example: Probability of rolling a 7 on two dice when one of the dice is a four.

    • S is space of 36 die rolls.

    • Let H be the event that one of the dice is a four.

      • 12 possible rolls (6 with a four on the first die, 6 with a four on the second), but (4,4) is counted twice, so 11.
    • Let E be the event that a seven was rolled:

      • 6 possible rolls.

      • Of these, 2 contain a four.

    • Then we have

      Pr(E|H) = |E isect H| / |H| = 2 / 11 = 0.181818...

Exercise: Check the Example Calculation

  • 100000 rolls should do it. Answer will be approximate.
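  • A sketch of one way to do the check (the fixed seed is just so the run is repeatable):

    ```python
    import random

    random.seed(0)
    TRIALS = 100000

    fours = 0    # rolls in H: at least one die is a four
    sevens = 0   # rolls in E isect H: sum is seven and a die is a four

    for _ in range(TRIALS):
        a = random.randrange(1, 7)
        b = random.randrange(1, 7)
        if a == 4 or b == 4:
            fours += 1
            if a + b == 7:
                sevens += 1

    print(sevens / fours)   # should be close to 2/11 = 0.1818...
    ```

    Note that we estimate the conditional probability by dividing only by the number of rolls in H, not by the total number of trials.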

Bayes' Rule

  • Note the symmetry in

    Pr(E|H) Pr(H) = Pr(E isect H) = Pr(H|E) Pr(E)

  • We get

    Pr(H|E) = Pr(E|H) Pr(H) / Pr(E)

  • For one reason to care, interpret E as "evidence" and H as "hypothesis"

  • Classic Example:

    • Let H be the hypothesis that Patient X has a rare disease, say Pr(H) = 1/1000000

    • Let E be the evidence: X has an unusual symptom, say Pr(E) = 1/100

    • Let the unusual symptom be really common in people with the rare disease, say Pr(E|H) = 98/100

    • Now, calculate the probability that X has the disease given the symptom:

      Pr(H|E) = Pr(E|H) Pr(H) / Pr(E) = (98/100) (1/1000000) / (1/100) = (98/100) / 10000 = 0.0098%

    • This may be counterintuitive: the disease is so rare that even given this strong symptom, X almost certainly does not have it.
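  • The arithmetic is easy to verify with exact fractions:

    ```python
    from fractions import Fraction

    pr_h = Fraction(1, 1000000)        # prior: X has the disease
    pr_e = Fraction(1, 100)            # probability of the symptom
    pr_e_given_h = Fraction(98, 100)   # symptom rate among the diseased

    # Bayes' Rule: Pr(H|E) = Pr(E|H) Pr(H) / Pr(E)
    pr_h_given_e = pr_e_given_h * pr_h / pr_e

    print(float(pr_h_given_e))
    # → 9.8e-05, i.e. 0.0098%
    ```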