# Lecture 6 .pdf


**Giuliana Cortese**

This PDF 1.3 document was generated by pdftopdf filter / Mac OS X 10.9.2 Quartz PDFContext, and was uploaded to fichier-pdf.fr on 12/10/2015 at 18:38, from IP address 93.34.x.x.
This file's download page has been viewed 801 times.

Document size: 23.8 MB (26 pages).

Privacy: public file

### Document preview

Lecture 6

Probability Distributions

Random variable, probability distribution

• Probability is how frequently we expect different outcomes to occur if we repeat the experiment over and over (the "frequentist" view).

• A random variable is a quantity whose value is not fixed, but can take on different values with different probabilities.

• A probability distribution is used to describe the probabilities of different values occurring.

Random Variable (example)

Tossing a coin: we could get Heads or Tails. Let's give them the values Heads = 0 and Tails = 1, and we have a random variable "X": the outcome is random (not fixed) and there are 2 possible outcomes, each of which occurs with probability 0.5 (50%).

Rolling a die: the outcome is random (not fixed) and there are 6 possible outcomes, each of which occurs with probability one-sixth.

Choose a person at random: one random variable may be the person's height.

Random variables can be discrete or continuous

• Discrete random variables have a countable number of outcomes. Examples: dead/alive, treatment/placebo, dice, counts, etc.

• Continuous random variables have an infinite continuum of possible values. Examples: blood pressure, weight, the speed of a car, a real number taken from 1 to 6, time to an event.

Random Variable (r.v.)

A random variable has a whole set of possible values, and it could take on any of those values, randomly.

Example: throw a die once.

Random variable: X = "the score shown on the top face". X could be 1, 2, 3, 4, 5 or 6, so the sample space, the set of all possible values, is {1, 2, 3, 4, 5, 6}.

Symbols:

• X = random variable
• x = a single value that the r.v. can assume

Probability

Probability is a number between 0 and 1. It can be interpreted as the likelihood that the event will occur: 0 means impossible, 1 means certain.

• Closer to 1: the value x is more likely to occur.
• Closer to 0: the value x is less likely to occur.

The sum of the probabilities of all values must be 1.

Probability distribution

A probability distribution is an assignment of probabilities

• to each distinct value of a discrete random variable,
• to each interval of values of a continuous random variable.

Probability functions

Probability density function (pdf): a probability function maps the possible values of x to their respective probabilities of occurrence. The sum of the probabilities of all values must be 1.

Discrete case:

• $p(x) = P(X = x)$

Continuous case:

• The probability that X is any exact particular value x is 0.
• Probabilities are given for a range of values, rather than a particular value (e.g., the probability of getting a math score between 700 and 800 is 2%).

Probability density function (pdf)

Discrete case

The pdf, p(x), of a discrete random variable is a mathematical function whose values sum to 1:

$$\sum_{x} p(x) = 1$$

Continuous case

The pdf, f(x), of a continuous random variable is a continuous mathematical function that integrates to 1:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

The probabilities associated with continuous functions are just areas under the curve (integrals!).

Cumulative distribution function

Cumulative distribution function: $P(X \le A)$, where A is a given value.

Expected Value and Variance

All probability distributions are characterized by an expected value (mean) and a variance (standard deviation squared).

Typical notation (these symbols are used interchangeably):

• $E(X) = \mu$
• $\mathrm{Var}(X) = \sigma^2$

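Expected value, or mean

The expected value (mean) µ is the weighted average of the random variable X, where the weights are the probabilities.

Imagine placing the masses p(x) at the points x on a beam; the balance point of the beam is the expected value of X.

Expected value, formally

Discrete case:

$$E(X) = \sum_{x} x\,p(x)$$

Continuous case:

$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

The sample mean is a special case of expected value, with every observation given the same weight 1/n:

$$\bar{x} = \sum_{i=1}^{n} x_i \cdot \frac{1}{n}$$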
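The weighted-average definition can be checked on the fair die used earlier in the lecture. The following is a minimal sketch; the helper name `expected_value` and the three-value sample are hypothetical and only serve the illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted average of the values, using the probabilities as weights."""
    return sum(x * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean_die = expected_value(die)  # 7/2, i.e. 3.5

# Sample mean as a special case: every observation gets weight 1/n.
sample = [2, 4, 9]  # hypothetical observations
n = len(sample)
sample_mean = sum(x * Fraction(1, n) for x in sample)

print(mean_die, sample_mean)
```

Exact fractions are used so that the result is not blurred by floating-point rounding.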

A few notes about Expected Value

If c is a constant number (i.e., not a variable) and X and Y are any random variables:

• E(c) = c (there's no randomness)
• E(cX) = c E(X)
• E(c + X) = c + E(X)
• E(X + Y) = E(X) + E(Y)

E(c + X) = c + E(X)

If the casino throws in a free drink worth exactly $5.00 every time you play a game, you always expect to (and do) gain an extra $5.00 regardless of the outcome of the game.

E(cX) = c E(X)

If the casino charges $10 per game instead of $1, then the casino expects to make 10 times as much on average from the game.

A few notes about Expected Value (Examples)

Example: the lottery (1)

A certain lottery (also known as a tax on people who are bad at math…) works by picking 6 numbers from 1 to 49. It costs $1.00 to play the lottery, and if you win, you win $2 million after taxes.

If you play the lottery once, what are your expected winnings or losses?

Example: the lottery (2)

Calculate the probability of winning in 1 try: there are C(49, 6) = 13,983,816 equally likely ways to pick 6 numbers out of 49, so P(win) = 1 / 13,983,816 ≈ 7.2 × 10⁻⁸.

The probability function (note that it sums to 1.0):

| x ($) | p(x) |
|---|---|
| -1 | 0.999999928 |
| +2 million | 7.2 × 10⁻⁸ |

Example: the lottery (3)

Expected value:

E(X) = P(win) × $2,000,000 + P(lose) × (-$1.00)

= 7.2 × 10⁻⁸ × 2,000,000 + 0.999999928 × (-1) = 0.144 - 0.999999928

= -$0.86

Negative expected value is never good! You shouldn't play if you expect to lose money!

E(X + Y) = E(X) + E(Y): if you play the lottery twice, you expect to lose -$0.86 + (-$0.86) = -$1.72.

If you play the lottery every week for 10 years (52 × 10 = 520 times), what are your expected winnings or losses?

520 × (-0.86) = -$447.20
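The lottery arithmetic can be reproduced in a few lines. This is a sketch under the assumptions stated in the example (6 numbers out of 49, a $1.00 ticket, a $2,000,000 prize); the variable names are illustrative. Because it keeps full precision instead of rounding to -$0.86 first, the 10-year figure comes out slightly different from -$447.20.

```python
from math import comb

# One winning combination out of C(49, 6) equally likely picks.
p_win = 1 / comb(49, 6)
p_lose = 1 - p_win

# Net gain: +$2,000,000 on a win, -$1.00 on a loss (as in the slide).
expected_gain = p_win * 2_000_000 + p_lose * (-1.00)

# E(X1 + ... + X520) = 520 * E(X): one ticket a week for 10 years.
ten_years = 520 * expected_gain

print(round(expected_gain, 2), round(ten_years, 2))
```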

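Variance/standard deviation

Variance: "the expected squared distance/deviation from the mean".

Discrete case:

$$\mathrm{Var}(X) = \sigma^2 = \sum_{x} (x - \mu)^2\,p(x)$$

Continuous case:

$$\mathrm{Var}(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2\,f(x)\,dx$$

Similarity to empirical variance: both are a weighted squared deviation from the mean!

Handy calculation formula!

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$

Proof (for discrete variables), using E(X) = µ and the rules of expected value E(X + Y) = E(X) + E(Y), E(cX) = cE(X) and E(c) = c:

$$
\begin{aligned}
E[(X-\mu)^2] &= E(X^2 - 2\mu X + \mu^2) \\
&= E(X^2) - E(2\mu X) + E(\mu^2) \\
&= E(X^2) - 2\mu E(X) + \mu^2 \\
&= E(X^2) - 2\mu\,\mu + \mu^2 \\
&= E(X^2) - \mu^2 \\
&= E(X^2) - [E(X)]^2
\end{aligned}
$$

(your calculation formula!)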

A few notes about Variance

If c is a constant number (i.e., not a variable) and X and Y are random variables, then

• Var(c) = 0
• Var(c + X) = Var(X)
• Var(cX) = c² Var(X)
• Var(X + Y) = Var(X) + Var(Y) ONLY IF X and Y are independent!
• Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) in general, which is the form to use if X and Y are not independent.

Var(c + X) = Var(X) (c constant)

Adding a constant to every outcome of a random variable doesn't change the variability. It just shifts the whole distribution by c. If everybody grew 5 inches suddenly, the variability in the population would still be the same.

Var(cX) = c² Var(X) (c constant)

Multiplying every outcome of the random variable by c makes its distribution c times wider, which corresponds to c² times the variance (deviation squared). If everyone suddenly became twice as tall, there'd be twice the deviation and 4 times the variance in heights in the population.

Var(X + Y) = Var(X) + Var(Y) ONLY IF X and Y are independent!

With two random variables, you have more opportunity for variation, unless they vary together (are dependent, and have covariance):

Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)

[Figure: a distribution of x with the mean marked and the standard deviation shown as the average distance from the mean.]

What's the variance and standard deviation of the roll of a die?
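The die question can be worked through with both the definition of variance and the calculation formula Var(X) = E(X²) - [E(X)]². The sketch below assumes a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

faces = range(1, 7)
p = Fraction(1, 6)  # fair die: every face equally likely

mean = sum(x * p for x in faces)         # E(X) = 7/2
mean_sq = sum(x * x * p for x in faces)  # E(X^2) = 91/6

# Calculation formula: E(X^2) - [E(X)]^2
var_shortcut = mean_sq - mean ** 2
# Definition: expected squared deviation from the mean
var_definition = sum((x - mean) ** 2 * p for x in faces)

sd = sqrt(var_shortcut)

print(var_shortcut, var_definition, round(sd, 2))
```

Both routes give 35/12 (about 2.92), so the standard deviation is about 1.71.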

Exercise (1)

A roulette wheel has the numbers 1 through 36, as well as 0 and 00. If you bet $1.00 that an odd number comes up, you win or lose $1.00 according to whether or not that event occurs.

If X denotes your net gain, X = 1 with probability 18/38 and X = -1 with probability 20/38. The mean is -$0.053. What's the variance of X?

Answer

E(X²) = (1)² × 18/38 + (-1)² × 20/38 = 1, so Var(X) = E(X²) - µ² = 1 - (-0.053)² ≈ 0.997.

Standard deviation is √0.997 ≈ $0.998. Interpretation: on average, you're either 1 dollar above or 1 dollar below the mean, which is just below zero (µ = -0.053). Makes sense!

Exercise (2)

Find the variance and standard deviation for the number of ships to arrive at the harbor (recall that the mean is 11.3).

Var(X) = E(X²) - [E(X)]² = 129.5 - (11.3)² = 1.81, so the standard deviation is √1.81 ≈ 1.35.

Interpretation: on an average day, we expect 11.3 ships to arrive in the harbor, plus or minus 1.35. This gives you a feel for what would be considered a usual day!

Covariance: joint probability

The covariance measures the strength of the linear relationship between two variables X and Y (the dependence between X and Y).

The covariance between two random variables:

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

The sample covariance:

$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

Interpreting Covariance

• Cov(X,Y) > 0: X and Y are positively correlated
• Cov(X,Y) < 0: X and Y are inversely correlated
• Cov(X,Y) = 0: X and Y are uncorrelated. Independent variables always have zero covariance, but zero covariance alone does not guarantee independence.
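The sign of the covariance can be illustrated with a small hand-rolled sample covariance (sum of products of deviations, divided by n - 1). The function name and all data below are hypothetical and chosen only to show a positive, a negative and a zero case.

```python
def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3, 4]                                 # hypothetical data
cov_up = sample_covariance(xs, [2, 4, 6, 8])      # y rises with x: positive
cov_down = sample_covariance(xs, [8, 6, 4, 2])    # y falls as x rises: negative

# y = x^2 on symmetric x values: y depends on x, yet the covariance is 0,
# because covariance only picks up the linear part of a relationship.
cov_zero = sample_covariance([-1, 0, 1], [1, 0, 1])

print(cov_up, cov_down, cov_zero)
```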

Examples of discrete probability distributions:

The binomial distribution

Binomial

An important discrete distribution: yes/no outcomes (dead/alive, treated/untreated, smoker/nonsmoker, sick/well, etc.)

Binomial: Definitions

• n fixed independent trials are performed (e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed).
• A binary random variable is observed in each trial. The result can be a "success" with probability p or a "failure" with probability 1 - p.
• The probability p is constant in every trial (e.g., the probability of getting a tail is the same each time we toss the coin).

The total number of successes, X, is a binomial random variable with parameters n and p: X ~ Bin(n, p).

The probability that X = r (i.e., that there are exactly r successes in n trials) is:

$$P(X = r) = \binom{n}{r}\,p^{r}\,(1-p)^{n-r}$$

General pattern: if you have only two possible outcomes (call them 1/0 or yes/no or success/failure) in n independent trials, then this is the probability of exactly r "successes", where

• n = number of trials
• r = number of successes out of n trials
• p = probability of success
• 1 - p = probability of failure
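The binomial probability, "number of arrangements times p to the successes times (1 - p) to the failures", translates directly into code. This is a minimal sketch; the function name `binomial_pmf` is an illustrative choice, not something from the lecture.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): C(n, r) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Exactly 3 heads in 5 tosses of a fair coin.
p_three_heads = binomial_pmf(3, 5, 0.5)

# The probabilities over r = 0..n must sum to 1.
total = sum(binomial_pmf(r, 5, 0.5) for r in range(6))

print(p_three_heads, total)
```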

Definitions: Bernoulli distribution

Bernoulli trial: if there is only 1 trial with probability of success p and probability of failure 1 - p, this is called a Bernoulli distribution (a special case of the binomial with n = 1).

Probability of success: $P(X = 1) = p$

Probability of failure: $P(X = 0) = 1 - p$

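Binomial: expected value and variance

If X follows a binomial distribution with parameters n and p, X ~ Bin(n, p), then:

$$E(X) = np, \qquad \mathrm{Var}(X) = np(1-p), \qquad \mathrm{SD}(X) = \sqrt{np(1-p)}$$

Bernoulli: expected value and variance

For Bernoulli (n = 1):

E(X) = p

Var(X) = p(1 - p)

Variance Proof

For Y ~ Bernoulli(p), with Y = 1 if yes and Y = 0 if no:

$$E(Y) = 1 \cdot p + 0 \cdot (1-p) = p$$

$$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = p - p^2 = p(1-p)$$

For X ~ Bin(n, p), write X as the sum of n independent variables $Y_i$ ~ Bernoulli(p):

$$\mathrm{Var}(X) = \mathrm{Var}\Big(\sum_{i=1}^{n} Y_i\Big) = \sum_{i=1}^{n} \mathrm{Var}(Y_i) = np(1-p)$$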

Binomial: example

Take the example of 5 coin tosses. What's the probability that you get exactly 3 heads in 5 coin tosses?

X = the number of heads observed in 5 coin tosses

Solution:

One way to get exactly 3 heads: HHHTT. What's the probability of this exact arrangement?

$$P(\text{head}) \cdot P(\text{head}) \cdot P(\text{head}) \cdot P(\text{tail}) \cdot P(\text{tail}) = (1/2)^3 (1/2)^2$$

This is the probability of each single outcome that has exactly 3 heads and 2 tails (no matter the order). So the overall probability of 3 heads and 2 tails is $(1/2)^3 (1/2)^2 + (1/2)^3 (1/2)^2 + \dots$, with one term for each arrangement. But how many arrangements are there?

| Outcome | Probability |
|---|---|
| THHHT | (1/2)³ × (1/2)² |
| HHHTT | (1/2)³ × (1/2)² |
| TTHHH | (1/2)³ × (1/2)² |
| HTTHH | (1/2)³ × (1/2)² |
| HHTTH | (1/2)³ × (1/2)² |
| HTHHT | (1/2)³ × (1/2)² |
| THTHH | (1/2)³ × (1/2)² |
| HTHTH | (1/2)³ × (1/2)² |
| HHTHT | (1/2)³ × (1/2)² |
| THHTH | (1/2)³ × (1/2)² |
| Total | 10 arrangements × (1/2)³ × (1/2)² |

The probabilities of the single outcomes are all equal, and there are $\binom{5}{3} = 10$ ways to arrange 3 heads in 5 trials:

$$P(\text{3 heads, 2 tails}) = \binom{5}{3}\,P(\text{head})^3\,P(\text{tail})^2 = 10 \cdot (1/2)^5 = 0.3125$$
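The count of 10 arrangements can be confirmed by brute force: list all 2⁵ = 32 equally likely sequences of 5 fair-coin tosses and keep those with exactly 3 heads. This is only a sketch of the counting argument, with illustrative variable names.

```python
from itertools import product

# Every sequence of 5 tosses, e.g. "HHTHT"; all 32 are equally likely.
outcomes = ["".join(seq) for seq in product("HT", repeat=5)]

# Keep the sequences with exactly 3 heads (and therefore 2 tails).
three_heads = [o for o in outcomes if o.count("H") == 3]

probability = len(three_heads) / len(outcomes)

print(len(three_heads), probability)
```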

Binomial: example

[Figure: probability distribution of X = the number of heads obtained in 5 coin tosses; horizontal axis: number of heads x.]

Binomial: example

As voters exit the polls on Nov. 4, you ask a representative random sample of 6 voters if they voted for Obama. If the true percentage of voters who vote for Obama on Nov. 4 is 55.1%, what is the probability that, in your sample, exactly 2 voted for Obama and 4 did not?

Solution: n = 6, p = P(Obama) = 0.551

| Outcome | Probability |
|---|---|
| OONNNN | (0.551)² × (0.449)⁴ |
| NOONNN | (0.449)¹ × (0.551)² × (0.449)³ = (0.551)² × (0.449)⁴ |
| NNOONN | (0.449)² × (0.551)² × (0.449)² = (0.551)² × (0.449)⁴ |
| NNNOON | (0.449)³ × (0.551)² × (0.449)¹ = (0.551)² × (0.449)⁴ |
| NNNNOO | (0.449)⁴ × (0.551)² = (0.551)² × (0.449)⁴ |
| … | … |
| Total | 15 arrangements × (0.551)² × (0.449)⁴ |

$$P(X = 2) = \binom{6}{2}\,(0.551)^2\,(0.449)^4 = 15 \cdot (0.551)^2 \cdot (0.449)^4 \approx 0.185 = 18.5\%$$

Binomial distribution: example

If I toss a coin 20 times, what's the probability of getting exactly 10 heads?

Binomial distribution: example

If I toss a coin 20 times, what's the probability of getting 2 or fewer heads?
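Both 20-toss questions follow from the binomial formula with n = 20 and p = 0.5, where every sequence of tosses has probability (1/2)²⁰. The sketch below assumes a fair coin; the variable names are illustrative.

```python
from math import comb

n = 20  # tosses of a fair coin; each sequence has probability (1/2)^20

# Exactly 10 heads: C(20, 10) favorable sequences.
p_exactly_10 = comb(n, 10) / 2**n

# 2 or fewer heads: P(X = 0) + P(X = 1) + P(X = 2).
p_at_most_2 = sum(comb(n, k) for k in range(3)) / 2**n

print(round(p_exactly_10, 4), round(p_at_most_2, 6))
```

So exactly 10 heads happens about 17.6% of the time, while 2 or fewer heads happens only about 0.02% of the time.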

Binomial distribution: example

You are performing a cohort study. If the probability of developing disease in the exposed group is 0.05 for the study duration, then if you sample (randomly) 500 exposed people, how many do you expect to develop the disease? Give a margin of error (+/- 1 standard deviation) for your estimate.

X ~ Binomial(500, 0.05) (n = 500, p = 0.05)

E(X) = np = 500 × 0.05 = 25

Var(X) = np(1 - p) = 500 × 0.05 × 0.95 = 23.75

SD(X) = √23.75 ≈ 4.87

25 ± 4.87, i.e. (25 - 4.87, 25 + 4.87) = (20.13, 29.87)

Binomial distribution: example

What's the probability that at most 10 exposed subjects develop the disease?

This is asking for a CUMULATIVE PROBABILITY: the probability that 0 subjects get the disease, or 1, or 2, or 3, or 4, or up to 10 subjects.

$$P(X \le 10) = P(X=0) + P(X=1) + \dots + P(X=10) = \sum_{k=0}^{10} \binom{500}{k}\,(0.05)^{k}\,(0.95)^{500-k}$$
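A cumulative binomial probability such as P(X ≤ 10) for X ~ Bin(500, 0.05) is tedious by hand but short in code. The following sketch simply adds up the individual terms; `binomial_cdf` is a hypothetical helper name.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p), as the sum P(X=0) + ... + P(X=k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Cohort example: 500 exposed people, disease probability 0.05.
p_at_most_10 = binomial_cdf(10, 500, 0.05)

print(p_at_most_10)
```

With an expected count of 25 and a standard deviation of about 4.87, 10 cases is roughly 3 standard deviations below the mean, so the result is a very small probability.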

Binomial distribution: example

If the probability of being a smoker among a group of cases with lung cancer is 0.6,

• what's the probability that in a group of 8 cases you have fewer than 2 smokers? More than 5?
• What are the expected value and variance of the number of smokers?
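The smoker questions (8 cases, probability 0.6 of being a smoker) can be answered from the full list of binomial probabilities. A minimal sketch, with illustrative variable names:

```python
from math import comb

n, p = 8, 0.6  # 8 lung-cancer cases, P(smoker) = 0.6

# pmf[k] = P(exactly k smokers among the 8 cases)
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

p_less_than_2 = pmf[0] + pmf[1]           # 0 or 1 smokers
p_more_than_5 = pmf[6] + pmf[7] + pmf[8]  # 6, 7 or 8 smokers

mean = n * p                # E(X) = np
variance = n * p * (1 - p)  # Var(X) = np(1 - p)

print(round(p_less_than_2, 4), round(p_more_than_5, 4), mean, variance)
```

This gives about 0.0085 for fewer than 2 smokers, about 0.315 for more than 5, an expected value of 4.8 and a variance of 1.92.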

Examples of continuous probability distributions:

• The normal distribution
• The standard normal distribution

The normal distribution

• "Bell shaped"
• Symmetrical
• Mean, median and mode are equal
• The continuous random variable X has infinite range

The normal random variable: X ~ N(µ, σ²)

[Figure: bell-shaped curve f(X) against X, with the mean, median and mode at the center µ.]

The Normal Distribution (pdf)

The normal distribution is defined by its mean µ and standard deviation σ:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\;e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

[Figure: for example, a bell-curve (normal) distribution f(X) against X, with the mean (µ) and one standard deviation from the mean (σ) marked.]

The Normal Distribution

• Changing µ shifts the distribution left or right.
• Changing σ increases or decreases the spread.

The beauty of the normal curve:

No matter what µ and σ are,

• the area between µ-σ and µ+σ is about 68%;
• the area between µ-2σ and µ+2σ is about 95%;
• the area between µ-3σ and µ+3σ is about 99.7%. Almost all values fall within 3 standard deviations from the mean.

Sigma Rules! (from last lesson)

How good is this rule for real data? Check with a data example:

• n = 120 runners
• mean of the weight of the runners = 127.8 lbs
• standard deviation (SD) = 15.5 lbs

Assuming normality, 68% of 120 = 0.68 × 120 ≈ 82 runners. In fact, 79 runners fall within 1 SD (15.5 lbs) of the mean.

Assuming normality, 95% of 120 = 0.95 × 120 = 114 runners. In fact, 115 runners fall within 2 SDs of the mean.
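The 68%, 95% and 99.7% figures can be checked with Python's standard-library `statistics.NormalDist`. The sketch below works on the standard normal curve, which is enough because the areas within k standard deviations do not depend on µ and σ; the runner count of 120 comes from the example, the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

# Runners expected within 1 SD and 2 SD of the mean, out of 120.
expected_1sd = within[1] * 120
expected_2sd = within[2] * 120

print({k: round(v, 4) for k, v in within.items()})
print(round(expected_1sd), round(expected_2sd))
```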

The Normal Distribution: example

Suppose SAT scores roughly follow a normal distribution in the U.S. population of college-bound students (with range restricted to 200-800), and the average math SAT is 500 with a standard deviation of 50. Then:

• 68% of students will have scores between 450 and 550
• 95% will be between 400 and 600
• 99.7% will be between 350 and 650

The standard normal random variable:

µ = 0, σ = 1,

Z ~ N(0,1)

The Normal Distribution: example

BUT… what if you wanted to know the math SAT score, say x90, corresponding to the 90th percentile (= 90% of students have a math score lower than or equal to x90)?

Random variable X = math SAT scores

P(X ≤ x90) = 0.90 … what is x90?

Difficult to solve!!! … Yikes! … see later!
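With software, the percentile question is a single call to the inverse of the cumulative distribution function. This sketch uses the standard library's `statistics.NormalDist` with the SAT parameters from the example (µ = 500, σ = 50); it is one way to get the number, not necessarily the method the lecture develops later.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=50)  # math SAT scores, as in the example

# x90 is the score with P(X <= x90) = 0.90, i.e. the inverse cdf at 0.90.
x90 = sat.inv_cdf(0.90)

print(round(x90, 1))
```

The result is a score of about 564, roughly 1.28 standard deviations above the mean.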

The Standard Normal Distribution (Z)

Z ~ N(0,1), with µ = 0, σ = 1

The Standard Normal Distribution (pdf):

$$f(z) = \frac{1}{\sqrt{2\pi}}\;e^{-\frac{z^2}{2}}$$

All normal distributions can be converted into the standard normal curve by subtracting the mean and dividing by the standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

Somebody calculated all the integrals for the standard normal and put them in a table! So we never have to integrate!

Even better, computers now do all the integration.

Standard normal distribution: example

What's the probability of getting a math SAT score of 575 or less, if µ = 500 and σ = 50?

X = math SAT score, x = 575, P(X ≤ x) = P(X ≤ 575) = ?

Solution:

Compute

$$z = \frac{x - \mu}{\sigma} = \frac{575 - 500}{50} = 1.5$$

i.e., a score of 575 is 1.5 standard deviations above the mean.

From x = 575, we find z = 1.5 (the corresponding value in the standard normal curve). Look up z = 1.5 in the standard normal chart and find the corresponding probability P!

That is the solution, since P(X ≤ x) = P(Z ≤ z) and P(X ≤ 575) = P(Z ≤ 1.5).
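The equality P(X ≤ 575) = P(Z ≤ 1.5) can be verified numerically. The sketch below standardizes the score by hand and compares the result with a direct evaluation on N(500, 50²), using the standard library's `statistics.NormalDist`; variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 500, 50, 575

# Standardize: how many standard deviations x lies above the mean.
z = (x - mu) / sigma

p_via_z = NormalDist().cdf(z)           # P(Z <= 1.5) on the standard normal
p_direct = NormalDist(mu, sigma).cdf(x) # P(X <= 575) on N(500, 50^2)

print(z, round(p_via_z, 4), round(p_direct, 4))
```

Both routes give about 0.9332, which is also the value listed for z = 1.5 in a standard normal table.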

Comparing X and Z units

[Figure: the same normal curve on two scales. X (µ = 100, σ = 50): 100 and 200 marked. Z (µ = 0, σ = 1): 0 and 2.0 marked.]

Z units and cumulative distribution function

Exercise

If birth weights in a population are normally distributed with a mean of 109 oz (ounces) and a standard deviation of 13 oz,

• What is the chance of obtaining a birth weight of 141 oz or heavier when sampling birth records at random?
• What is the chance of obtaining a birth weight of 120 oz or lighter?

Answer

What is the chance of obtaining a birth weight of 120 or lighter?

$$z = \frac{120 - 109}{13} = 0.846$$

From the chart or computer software, find P for z = 0.846!

z = 0.846 is positive. Find P(Z ≤ 0.846) = ? This P corresponds to the left tail of the standard normal curve for values lower than z = 0.846.

P(Z ≤ 0.846) ≈ 0.80 (80%); the chart entry for z = 0.85 is 0.8023.

Answer

What is the chance of obtaining a birth weight of 141 oz or heavier when sampling birth records at random?

$$z = \frac{141 - 109}{13} = 2.46$$

From the chart or computer software, find P for z = 2.46!

z = 2.46 is positive. Find P(Z ≥ 2.46) = ? This P corresponds to an area, the right tail of the standard normal curve for values greater than z = 2.46.

P(Z ≥ 2.46) = 1 - 0.9931 = 0.0069 (0.69%)
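The two birth-weight probabilities can be computed without a chart. This is a sketch using `statistics.NormalDist` with the stated mean of 109 oz and standard deviation of 13 oz; because the z values are not rounded to two decimals first, the left-tail result can differ from a table lookup in the third decimal.

```python
from statistics import NormalDist

weights = NormalDist(mu=109, sigma=13)  # birth weights in ounces

# 120 oz or lighter: left tail below 120.
z_120 = (120 - 109) / 13
p_120_or_lighter = weights.cdf(120)

# 141 oz or heavier: right tail above 141, i.e. 1 minus the left tail.
z_141 = (141 - 109) / 13
p_141_or_heavier = 1 - weights.cdf(141)

print(round(z_120, 3), round(p_120_or_lighter, 4))
print(round(z_141, 2), round(p_141_or_heavier, 4))
```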

"

"

Are my data "normal"?

Not all continuous random variables are normally distributed!! It is important to evaluate how well the data are approximated by a normal distribution.

Are my data normally distributed?

1. Look at the histogram! Does it appear bell shaped?
2. Compute descriptive summary measures: are mean, median, and mode similar?
3. Do 2/3 of observations lie within 1 std dev from the mean? Do 95% of observations lie within 2 std dev from the mean?
4. Look at a normal probability plot! Is it approximately linear?
5. Run tests of normality (such as Kolmogorov-Smirnov). But be cautious: they are highly influenced by sample size!

Data from one class…

• Normal probability plot, math love: not a straight line! Concave down (indicates left-skewed) and bimodal. Not a normal distribution!
• Normal probability plot, coffee: not a straight line! Right-skewed (concave up). Not a normal distribution!

7.7 +/- 4*1.8=

0.5 – 10+

Data from one class…

• Normal probability plot, Clinton: closer to normal, but slight left skew…
• Normal probability plot, sleep: closest to a straight line… It is reasonable to assume a Normal distribution!

The Normal Probability Plot

Normal probability plot:

1. Order the data.
2. Find corresponding standardized normal quantile values.
3. Plot the observed data values against normal quantile values.
4. Evaluate the plot for evidence of linearity.

If straight line, assume data come from a Normal distribution.
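The coordinates of a normal probability plot (ordered data paired with standard normal quantiles) can be sketched as follows. The plotting positions (i - 0.5)/n are one common convention and an assumption here, since the slides do not say which one they use; the function name and the sleep-hours data are hypothetical.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each ordered observation with a standard normal quantile."""
    ordered = sorted(data)  # order the data
    n = len(ordered)
    # Quantile for the i-th smallest value, at plotting position (i - 0.5)/n
    # (an assumed convention; other conventions shift these slightly).
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

hours = [7.5, 6.0, 8.0, 7.0, 6.5]  # hypothetical hours of sleep
points = normal_plot_points(hours)

for q, value in points:
    print(round(q, 3), value)
```

Plotting the second coordinate against the first and checking whether the points fall near a straight line is the visual check for normality.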

Normal approximation of the binomial

When you have a binomial distribution where n is large and p is middle-of-the-road (not too small, not too big, closer to 0.5), then the binomial starts to look like a normal distribution!

Rule of thumb: mean = np > 5.

Normal approximation to the binomial

If the probability of being a smoker among a group of cases with lung cancer is 0.6, what's the probability that in a group of 8 cases with lung cancer you have fewer than 2 smokers?

[Figure: bar chart of the binomial probabilities for the number of smokers among 8 cases; horizontal axis 1 to 8, vertical axis 0 to 0.27.]

Starting to have a normal shape even with fairly small n. You can imagine that if n got larger, the bars would get thinner and thinner and this would look more and more like a continuous function, with a bell curve shape. Here np = 8 × 0.6 = 4.8.

Formal tests for normality

Results:

• Math Love: strong evidence of non-normality (p < 0.01)
• Coffee: strong evidence of non-normality (p < 0.005)
• Clinton: moderate evidence of non-normality (p = 0.02)
• Sleep: no evidence of non-normality (p > 0.25)

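Normal approximation to binomial

[Figure: bar chart of the binomial probabilities for the number of smokers among 8 cases; horizontal axis 1 to 8, vertical axis 0 to 0.27.]

What is the probability of fewer than 2 smokers?

Exact binomial probability (from before): P(X = 0) + P(X = 1) = 0.00065 + 0.00786 ≈ 0.0085

Normal approximation probability:

$$\mu = np = 4.8, \qquad \sigma = \sqrt{np(1-p)} = \sqrt{1.92} \approx 1.39$$

$$P(X < 2) \approx P\left(Z < \frac{2 - 4.8}{1.39}\right) = P(Z < -2.014) = 0.022$$

From the standard normal chart!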
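The exact and approximate answers for "fewer than 2 smokers among 8 cases, p = 0.6" can be put side by side. This sketch follows the calculation above (no continuity correction) and uses full precision for σ, so its z value differs slightly from -2.014; variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 8, 0.6

# Exact binomial: P(X = 0) + P(X = 1).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2))

# Normal approximation with mu = np and sigma = sqrt(np(1 - p)).
mu = n * p
sigma = sqrt(n * p * (1 - p))
z = (2 - mu) / sigma
approx = NormalDist().cdf(z)

print(round(exact, 4), round(z, 3), round(approx, 3))
```

With np = 4.8, just under the rule-of-thumb threshold of 5, the approximation (about 0.022) is noticeably larger than the exact value (about 0.0085).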

Exercise

You are performing a cohort study. If the probability of developing disease in the exposed group is 0.25 for the study duration, then if you sample (randomly) 500 exposed people, what's the probability that at most 120 people develop the disease?

By hand (yikes!):

$$P(X \le 120) = \sum_{k=0}^{120} \binom{500}{k}\,(0.25)^{k}\,(0.75)^{500-k}$$

OR, use normal approximation:

µ = np = 500 (0.25) = 125; σ² = np(1 - p) = 93.75; σ = 9.68

$$Z = \frac{120 - 125}{9.68} = -0.52$$

P(Z < -0.52) = 0.3015

From the standard normal chart!

Proportions…

The binomial distribution forms the basis of statistics for proportions.

A proportion is just a binomial count divided by n. For example, if we sample 200 cases and find 60 smokers, X = 60 but the observed proportion = 60/200 = 0.30.
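For the cohort exercise with n = 500 and p = 0.25, the "by hand" sum for P(X ≤ 120) and the normal approximation can both be computed directly. This is a sketch with illustrative variable names; like the worked solution it applies no continuity correction, and it keeps full precision for z rather than rounding to -0.52.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.25

# Exact: add up P(X = k) for k = 0..120.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(121))

# Normal approximation: mu = np, sigma = sqrt(np(1 - p)).
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist().cdf((120 - mu) / sigma)

print(round(sigma, 2), round(exact, 3), round(approx, 3))
```

Here np = 125 is far above the rule-of-thumb threshold of 5, and the two answers land close to each other.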

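Stats for proportions

Statistics for proportions are similar to binomial counts, but differ by a factor of n.

For the binomial count X:

$$\mu_X = np, \qquad \sigma_X^2 = np(1-p), \qquad \sigma_X = \sqrt{np(1-p)}$$

For the sample proportion $\hat{p} = X/n$ (P-hat stands for "sample proportion"):

$$\mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

It all comes back to Z…

Statistics for proportions are based on a normal distribution, because the binomial can be approximated as normal if np > 5.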