Lecture 6.pdf

Original name: Lecture 6.pdf. Title: Lecture6. Author: Giuliana Cortese.

This PDF 1.3 document was generated by pdftopdf filter / Mac OS X 10.9.2 Quartz PDFContext and was uploaded to fichier-pdf.fr on 12/10/2015 at 18:38.
Document size: 23.8 MB (26 pages).

Lecture 6
Probability Distributions

Random Variable,
probability distribution

Probability is how frequently we expect different
outcomes to occur if we repeat the experiment over
and over (“frequentist” view)

A random variable is a quantity whose value is not
fixed, but can take on different values with different
probabilities.

A probability distribution is used to describe the
probabilities of different values occurring.

Random Variable (example)

Tossing a coin: we could get Heads or Tails.

Let's give them the values Heads=0 and Tails=1 and we have a
Random Variable "X":

the outcome is random (not fixed) and there are 2 possible outcomes,
each of which occurs with probability 0.5 (50%).

Rolling a die: the outcome is random (not fixed) and there are 6
possible outcomes, each of which occurs with probability one-sixth.
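
The "frequentist" reading of these probabilities can be illustrated with a small simulation. This is only a sketch: the function name, the number of rolls and the fixed seed are assumptions made for the illustration, not part of the lecture.

```python
import random
from collections import Counter

def simulate_die_rolls(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times; return the relative frequency of each face."""
    rng = random.Random(seed)  # fixed seed (arbitrary choice) so the run is reproducible
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

freqs = simulate_die_rolls(60_000)
for face, freq in freqs.items():
    # With many repetitions each relative frequency should be close to 1/6 (about 0.167)
    print(face, round(freq, 3))
```

The more rolls are simulated, the closer the relative frequencies tend to get to one-sixth.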


Random variables can be discrete
or continuous

Discrete random variables have a countable
number of outcomes.
Examples: Dead/alive, treatment/placebo, dice,
counts, etc.

Continuous random variables have an
infinite continuum of possible values.
Examples: blood pressure, weight, the speed of
a car, a real number taken from 1 to 6, time to
an event.
Choose a person at random: one random variable may be the
person's height.

Random Variable (r.v.)
A Random Variable has a whole set of values, and it
could take on any of those values, randomly.
Example: Throw a die once
Random Variable:
X = "The score shown on the top face".
X could be 1, 2, 3, 4, 5 or 6
So the Sample Space, the set of all possible values,
is {1, 2, 3, 4, 5, 6}
Symbols: X = random variable
x = a single value that the r.v. can assume




Probability can be interpreted as the likelihood that
the event will occur.
Probability is a number between 0 and 1:
closer to 1: the value x is more likely to occur;
closer to 0: the value x is less likely to occur.
The sum of the probabilities of all values
must be 1.



Probability distribution

A probability distribution is an
assignment of probabilities
to each distinct value of a discrete
random variable,
to each interval of values of a continuous
random variable.

Probability functions

Probability density function (pdf): a probability
function maps the possible values of x against their
respective probabilities of occurrence.
The sum of the probabilities of all values must be 1.
• Discrete case: p(x) = P(X = x)
• Continuous case:
  • The probability that X is any exact particular
    value x is 0;
  • Probabilities are given for a range of values,
    rather than a particular value (e.g., the
    probability of getting a math score between
    700 and 800 is 2%).

Probability density function (pdf)

Discrete case
The pdf, p(x), is a mathematical function that sums to 1
(for a discrete variable it is also called a probability mass function):
\[ \sum_{x} p(x) = 1 \]

Continuous case
The pdf, f(x), of a continuous random variable is a continuous
mathematical function that integrates to 1:
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
The probabilities associated with continuous functions are just
areas under the curve (integrals!).

Cumulative distribution function

The cumulative distribution function gives the probability that X
is at most a given value A:
P(X ≤ A)

Expected Value and Variance

All probability distributions are characterized by
an expected value (mean) and a variance
(standard deviation squared).

Typical notation
E(X) = µ (these symbols are used interchangeably)
Var(X) = σ² (these symbols are used interchangeably)

Expected value, or mean
The expected value (mean) µ is the weighted average
of the random variable X, where the weights are the
probabilities p(x).
Imagine placing the masses p(x) at the points x on a
beam; the balance point of the beam is the expected
value of X.

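Expected value, formally

Discrete case:
\[ E(X) = \sum_{x} x\,p(x) \]

Continuous case:
\[ E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx \]

Sample Mean is a special case of
Expected Value: each of the n observations gets the weight 1/n.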
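
As a minimal sketch of the discrete formula, the expected value of a fair die can be computed as a probability-weighted sum. The helper name `expected_value` and the small list of observations are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X) = sum over x of x * p(x), for a discrete random variable given as {x: p(x)}."""
    return sum(x * p for x, p in pmf.items())

# Fair die: each face has probability 1/6
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die_pmf)
print(die_mean)  # 7/2, i.e. 3.5

# Sample mean as a special case: every observation gets weight 1/n
observations = [2, 4, 4, 5]  # made-up data
n = len(observations)
sample_mean = sum(x * Fraction(1, n) for x in observations)
print(sample_mean)  # 15/4
```

Exact fractions are used here so that the results are not blurred by floating-point rounding.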

A few notes about Expected Value
If c is a constant number (i.e., not a variable) and
X and Y are any random variables:
E(c) = c (there's no randomness)
E(cX) = c E(X)
E(c + X) = c + E(X)
E(X + Y) = E(X) + E(Y)

A few notes about Expected Value (Examples)

E(c + X) = c + E(X)

If the casino throws in a free drink worth exactly $5.00 every time
you play a game, you always expect to (and do) gain an extra
$5.00 regardless of the outcome of the game.

E(cX) = c E(X)

If the casino charges $10 per game instead of $1, then the casino
expects to make 10 times as much on average from the game.



Example: the lottery (1)

A certain lottery (also known as a tax on people
who are bad at math…) works by picking 6
numbers from 1 to 49.
It costs $1.00 to play the lottery, and if you win,
you win $2 million after taxes.
If you play the lottery once, what are your
expected winnings or losses?

Example: the lottery (2)
Calculate the probability of winning in 1 try:
\[ \frac{1}{\binom{49}{6}} = \frac{1}{13{,}983{,}816} \approx 7.2 \times 10^{-8} \]

The probability function (note that it sums to 1.0):
x               p(x)
+ $2 million    7.2 x 10^-8
- $1            0.999999928

Example: the lottery (3)
E(X) = P(win) * $2,000,000 + P(lose) * (-$1.00)
= 7.2 * 10^-8 * 2,000,000 + 0.999999928 * (-1) = 0.144 - 0.999999928
= -$0.86
Negative expected value is never good! You shouldn't play if you
expect to lose money!
Since E(X + Y) = E(X) + E(Y), if you play the lottery twice, you expect to lose: -$0.86 + -$0.86 = -$1.72.
If you play the lottery every week for 10 years (52 * 10 = 520 plays), what are your
expected winnings or losses?
520 * (-0.86) = -$447.20
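
The lottery arithmetic can be re-checked in a few lines. This is a sketch under the slide's own assumptions ($1.00 ticket, $2 million prize, 6 numbers out of 49); the variable names are made up for the illustration.

```python
from math import comb

ticket_price = 1.00
prize = 2_000_000.00

n_combinations = comb(49, 6)   # number of ways to pick 6 numbers out of 49
p_win = 1 / n_combinations     # about 7.2e-8

# E(X) = P(win) * prize + P(lose) * (-ticket price), as in the slide
expected_gain = p_win * prize + (1 - p_win) * (-ticket_price)
ten_year_total = 520 * round(expected_gain, 2)

print(n_combinations)            # 13983816
print(round(expected_gain, 2))   # -0.86
print(round(ten_year_total, 2))  # -447.2
```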

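Variance/standard deviation

\[ \mathrm{Var}(X) = \sigma^2 = E[(X-\mu)^2] \]
"The expected squared distance/deviation
from the mean"

Discrete case:
\[ \sigma^2 = \sum_{x} (x-\mu)^2\,p(x) \]

Continuous case:
\[ \sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2\,f(x)\,dx \]

Handy calculation formula!

For discrete variables ...
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 \]

Use rules of expected value: E(X + Y) = E(X) + E(Y), E(cX) = cE(X), E(c) = c, with E(X) = µ:

\[
\begin{aligned}
E[(X-\mu)^2] &= E(X^2 - 2\mu X + \mu^2) \\
&= E(X^2) - E(2\mu X) + E(\mu^2) \\
&= E(X^2) - 2\mu E(X) + \mu^2 \\
&= E(X^2) - 2\mu\,\mu + \mu^2 \\
&= E(X^2) - \mu^2 \\
&= E(X^2) - [E(X)]^2 \quad \text{(your calculation formula!)}
\end{aligned}
\]

Similarity to empirical variance

… weighted squared deviation from the mean!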

A few notes about Variance

If c is a constant number (i.e., not a variable) and X and Y are
random variables, then

Var(c) = 0
Var(c + X) = Var(X)
Var(cX) = c² Var(X)
Var(X + Y) = Var(X) + Var(Y)
ONLY IF X and Y are independent!
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
IF X and Y are not independent.

Var(c + X) = Var(X) (c constant)

Adding a constant to every outcome of a random variable doesn't
change the variability. It just shifts the whole distribution by c.
If everybody grew 5 inches suddenly, the variability in the
population would still be the same.

Var(cX) = c² Var(X) (c constant)

Multiplying every outcome of the random variable by c makes its
distribution c times wider, which corresponds to c² times the
variance (deviation squared).
If everyone suddenly became twice as tall, there'd be twice the
deviation and 4 times the variance in heights in the population.

Var(X + Y) = Var(X) + Var(Y)
ONLY IF X and Y are independent!

With two random variables, you have more opportunity for
variation, unless they vary together (are dependent, and
have covariance):
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)



Standard deviation: roughly, the average distance from the mean.

What's the variance and standard deviation of
the roll of a die?
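
One way to work this out is to apply both the definition and the calculation formula to the six equally likely faces. The sketch below assumes a fair die; the variable names are chosen only for the illustration.

```python
from fractions import Fraction
from math import sqrt

# Fair die: each face has probability 1/6
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in die_pmf.items())                      # E(X)
var_definition = sum((x - mean) ** 2 * p for x, p in die_pmf.items())  # E[(X - mu)^2]
var_shortcut = sum(x ** 2 * p for x, p in die_pmf.items()) - mean ** 2  # E(X^2) - [E(X)]^2
sd = sqrt(float(var_definition))

print(mean)            # 7/2
print(var_definition)  # 35/12, about 2.92
print(round(sd, 2))    # 1.71
```

Both routes give the same variance, which is the point of the calculation formula.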




Exercise (1)

A roulette wheel has the numbers 1 through 36, as well
as 0 and 00.
If you bet $1.00 that an odd number comes up, you win
or lose $1.00 according to whether or not that event
occurs.
If X denotes your net gain, X = 1 with probability 18/38
and X = -1 with probability 20/38. The mean is -$0.053.
What's the variance of X?

\[ \mathrm{Var}(X) = E(X^2) - \mu^2 = 1 - (-0.053)^2 \approx 0.997 \]

Standard deviation is about $0.999. Interpretation: On average,
you're either 1 dollar above or 1 dollar below the mean,
which is just below zero (µ = -0.053). Makes sense!


Exercise (2)

Find the variance and standard deviation for the number of ships to
arrive at the harbor (recall that the mean is 11.3).

Var(X) = E(X²) - [E(X)]² = 129.5 - (11.3)² = 1.81
Standard deviation = √1.81 ≈ 1.35

Interpretation: On an average day, we expect 11.3 ships to arrive in the harbor, plus
or minus 1.35. This gives you a feel for what would be considered a usual day!




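Covariance: joint probability

The covariance measures the strength of the
linear relationship between two variables
X and Y (the dependence between X and Y).

Covariance between two random variables:
\[ \mathrm{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)] \]

The Sample Covariance

The sample covariance:
\[ s_{xy} = \frac{\sum_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y})}{n-1} \]

Interpreting Covariance

Cov(X,Y) > 0: X and Y are positively correlated

Cov(X,Y) < 0: X and Y are inversely correlated

Cov(X,Y) = 0: X and Y are uncorrelated (independent variables always have zero covariance, but zero covariance alone does not guarantee independence)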
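
A small sketch of the sample covariance may help with the sign interpretation. The function name and the data are made up for the illustration, and the divisor n - 1 is the usual sample convention, assumed here.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - mean_x)(y - mean_y), divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves with x
falling = [10, 8, 6, 4, 2]   # moves against x

print(sample_covariance(x, rising))   # 5.0  (positive)
print(sample_covariance(x, falling))  # -5.0 (negative)

# Check Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on the same data
total = [a + b for a, b in zip(x, rising)]
var_x = sample_covariance(x, x)
var_y = sample_covariance(rising, rising)
var_total = sample_covariance(total, total)
print(var_total, var_x + var_y + 2 * sample_covariance(x, rising))  # 22.5 22.5
```

The variance of a sample is just its covariance with itself, which is why the same helper can be reused for the last check.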

Examples of discrete probability distributions

The binomial distribution

An important discrete distribution
Yes/no outcomes
(dead/alive, treated/untreated, smoker/nonsmoker, sick/well, etc.)

Binomial: Definitions
n fixed independent trials are performed
• 15 tosses of a coin; 20 patients; 1000 people surveyed
Binary random variable is observed in each trial. The result can be:
• a "success" with probability p.
• a "failure" with probability 1 - p.
Constant probability p in every trial
• e.g., Probability of getting a tail is the same each time we toss
the coin
The total number of successes, X, is a binomial random variable
with parameters n and p: X ~ Bin(n, p)
The probability that X = r (i.e., that there are exactly r successes in n
trials) is:
\[ P(X = r) = \binom{n}{r}\,p^{r}(1-p)^{n-r} \]

Binomial: Definitions

General pattern: if you have only two possible outcomes (call
them 1/0 or yes/no or success/failure) in n independent trials, then
the probability of exactly X "successes" =
\[ \binom{n}{X}\,p^{X}(1-p)^{n-X} \]
n = number of trials
X = number of successes out of n trials
p = probability of success
1 - p = probability of failure

Definitions: Bernoulli distribution

Bernoulli trial: If there is only 1 trial with
probability of success p and probability of failure
1 - p, this is called a Bernoulli distribution (special
case of the binomial with n = 1).
Probability of success: P(X = 1) = p
Probability of failure: P(X = 0) = 1 - p

Binomial: expected value and variance
If X follows a binomial distribution with
parameters n and p: X ~ Bin(n, p)
E(X) = np
Var(X) = np(1 - p)

Bernoulli: expected value and variance
For Bernoulli (n = 1)
E(X) = p
Var(X) = p(1 - p)

Variance Proof
For Y ~ Bernoulli(p)
Y = 1 if yes
Y = 0 if no
\[ E(Y) = 1 \cdot p + 0 \cdot (1-p) = p, \qquad E(Y^2) = 1^2 \cdot p + 0^2 \cdot (1-p) = p \]
\[ \mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = p - p^2 = p(1-p) \]

For X ~ Bin(n, p), X is the sum of n independent trials Y_i ~ Bernoulli(p):
\[ X = \sum_{i=1}^{n} Y_i, \qquad \mathrm{Var}(X) = \sum_{i=1}^{n} \mathrm{Var}(Y_i) = n\,p(1-p) \]



Binomial: example
Take the example of 5 coin tosses. What's the probability that
you get exactly 3 heads in 5 coin tosses?
X = the number of heads observed in 5 coin tosses
One way to get exactly 3 heads: HHHTT
What's the probability of this exact arrangement?

P(head) * P(head) * P(head) * P(tail) * P(tail) = (1/2)^3 * (1/2)^2

Binomial: example
(1/2)^3 * (1/2)^2 is the probability of each single outcome that
has exactly 3 heads and 2 tails (no matter the order).
So, the overall probability of 3 heads and 2 tails is:
(1/2)^3 * (1/2)^2 + (1/2)^3 * (1/2)^2 + (1/2)^3 * (1/2)^2 + ...
for as many single arrangements as there are,
but how many are there?
\[ \binom{5}{3} = 10 \text{ ways to arrange 3 heads in 5 trials} \]

Binomial: example

The probability of each single outcome is (1/2)^3 * (1/2)^2 (note: they are all equal), so

P(3 heads, 2 tails) = 10 arrangements * (1/2)^3 * (1/2)^2
= 10 * P(head)^3 * P(tail)^2 = 10 * (1/2)^5 = 0.3125
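
The count of 10 arrangements can be checked by brute force: list all 32 head/tail sequences of 5 tosses and keep those with exactly 3 heads. This is only an illustrative sketch assuming a fair coin; the variable names are not from the lecture.

```python
from itertools import product
from math import comb

# All 2^5 = 32 equally likely sequences of 5 tosses
outcomes = list(product("HT", repeat=5))
three_heads = [seq for seq in outcomes if seq.count("H") == 3]

p_each = 0.5 ** 5                      # probability of any single sequence
p_three_heads = len(three_heads) * p_each

print(len(outcomes))      # 32
print(len(three_heads))   # 10, which equals comb(5, 3)
print(p_three_heads)      # 0.3125
```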

Binomial: example

[Figure: probability distribution of X = the number of heads obtained in 5 coin tosses; horizontal axis: number of heads]

Binomial: example

As voters exit the polls on Nov. 4, you ask a
representative random sample of 6 voters if they
voted for Obama. If the true percentage of voters
who vote for Obama on Nov. 4 is 55.1%, what is
the probability that, in your sample, exactly 2
voted for Obama and 4 did not?

n = 6, p = P(Obama) = 0.551

Every arrangement with 2 Obama voters and 4 other voters has the same probability, (0.551)^2 * (0.449)^4, and there are 15 such arrangements:
\[ P(X = 2) = \binom{6}{2}(0.551)^2(0.449)^4 = 15 \times (0.551)^2 \times (0.449)^4 = 18.5\% \]


Binomial distribution: example

If I toss a coin 20 times, what's the probability of getting
exactly 10 heads?

Binomial distribution: example

If I toss a coin 20 times, what's the probability of getting 2 or
fewer heads?
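
Both coin questions can be answered with the binomial formula. The sketch below uses hypothetical helper names (`binom_pmf`, `binom_cdf`) and assumes a fair coin; the last line re-checks the exit-poll example with n = 6 and p = 0.551.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): comb(n, r) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def binom_cdf(r, n, p):
    """P(X <= r): add up the probabilities of 0, 1, ..., r successes."""
    return sum(binom_pmf(k, n, p) for k in range(r + 1))

p_exactly_10 = binom_pmf(10, 20, 0.5)
p_at_most_2 = binom_cdf(2, 20, 0.5)
p_exit_poll = binom_pmf(2, 6, 0.551)

print(round(p_exactly_10, 4))  # 0.1762
print(round(p_at_most_2, 4))   # 0.0002
print(round(p_exit_poll, 3))   # 0.185
```

Even the single most likely outcome, 10 heads, happens less than 18% of the time, while 2 or fewer heads is extremely rare.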

Binomial distribution: example

You are performing a cohort study. If the probability of
developing disease in the exposed group is 0.05 for the study
duration, then if you sample (randomly) 500 exposed people,
how many do you expect to develop the disease? Give a
margin of error (+/- 1 standard deviation) for your estimate.

X ~ Binomial(500, 0.05)
(n = 500, p = 0.05)
E(X) = np = 500 * 0.05 = 25
Var(X) = np(1 - p) = 500 * 0.05 * 0.95 = 23.75
SD(X) = √23.75 = 4.87
So we expect 25 cases, plus or minus 4.87: (25 - 4.87, 25 + 4.87)

Binomial distribution: example
What’s the probability that at most 10 exposed subjects
develop the disease?
This is asking for a CUMULATIVE PROBABILITY:
the probability that 0 subjects get the disease, or 1 or 2 or 3 or 4 or up to 10 subjects.
P(X≤10) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)+….+ P(X=10)=
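
Summing eleven binomial terms by hand is tedious, so here is a sketch of how software would do it for X ~ Bin(500, 0.05). The variable names are chosen for the illustration only.

```python
from math import comb, sqrt

n, p = 500, 0.05

mean = n * p                   # expected number of cases
sd = sqrt(n * p * (1 - p))     # one standard deviation

# P(X <= 10) = P(X = 0) + P(X = 1) + ... + P(X = 10)
p_at_most_10 = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(11))

print(round(mean, 2), round(sd, 2))  # 25.0 4.87
print(p_at_most_10)                  # a very small probability
```

Since 10 cases is about three standard deviations below the expected 25, the cumulative probability comes out very small.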

Binomial distribution: example
If the probability of being a
smoker among a group of
cases with lung cancer is 0.6,
• what's the probability that
in a group of 8 cases you
have fewer than 2 smokers?
More than 5?
• What are the expected
value and variance of the
number of smokers?
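
A possible way to check answers to this exercise is to tabulate the whole distribution of X ~ Bin(8, 0.6). This is a sketch; the variable names are assumptions made for the illustration.

```python
from math import comb

n, p = 8, 0.6

# pmf[k] = P(X = k) for k = 0, 1, ..., 8
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

p_fewer_than_2 = pmf[0] + pmf[1]   # P(X = 0) + P(X = 1)
p_more_than_5 = sum(pmf[6:])       # P(X = 6) + P(X = 7) + P(X = 8)
mean = n * p                       # E(X) = np
variance = n * p * (1 - p)         # Var(X) = np(1 - p)

print(round(p_fewer_than_2, 4))  # 0.0085
print(round(p_more_than_5, 4))   # 0.3154
print(round(mean, 2), round(variance, 2))  # 4.8 1.92
```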

Examples of continuous
probability distributions:

The normal distribution
The standard normal distribution

The normal distribution

"Bell shaped"
Mean, Median and
Mode are equal
Continuous random
variable X has infinite range

The normal random variable: X ~ N(µ, σ²)
The Normal Distribution (pdf)
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]

Normal distribution is defined by
its mean and standard dev.

For example, bell-curve (normal) distribution:
[Figure: bell curve marking the mean (µ) and one standard deviation from the mean (σ)]

The Normal Distribution

Changing µ shifts the
distribution left or right.

Changing σ increases or
decreases the spread.


Sigma Rules! (from last lesson)
The beauty of the normal curve:
No matter what µ and σ are
• the area between µ-σ and µ+σ is about 68%;
• the area between µ-2σ and µ+2σ is about 95%;
• the area between µ-3σ and µ+3σ is about 99.7%. Almost all
values fall within 3 standard deviations from the mean.
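
The 68%, 95% and 99.7% figures can be reproduced numerically. The sketch below relies on the standard identity that, for any normal variable, the probability of falling within k standard deviations of the mean equals erf(k / √2); the function name is hypothetical.

```python
from math import erf, sqrt

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X, whatever mu and sigma are."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on µ or σ, which is exactly why the rule works for every normal curve.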

How good is this rule for real data?

Check with some data example:
• n = 120 runners
• mean of the weight of the runners = 127.8 lbs
• standard deviation (SD) = 15.5 lbs

Assuming normality: 68% of 120 = 0.68 * 120 = ~82 runners
In fact, 79 runners fall within 1 SD (15.5 lbs) of the mean.

Assuming normality: 95% of 120 = 0.95 * 120 = ~114 runners
In fact, 115 runners fall within 2 SDs of the mean.

The Normal Distribution: example

Suppose SAT scores roughly follow a normal
distribution in the U.S. population of college-bound students (with range restricted to 200-800), and the average math SAT is 500 with a
standard deviation of 50, then:
68% of students will have scores between 450 and 550
95% will be between 400 and 600
99.7% will be between 350 and 650

The Normal Distribution: example

What if you wanted to know the math SAT score,
say x90, corresponding to the 90th percentile (=
90% of students have a math score lower than or equal to
x90)?

Random variable X = math SAT scores
P(X ≤ x90) = 0.90 … what is x90?
Difficult to solve!!! … Yikes! … see later!

The Standard Normal Distribution (Z)

The standard normal random variable:
µ = 0, σ = 1,
Z ~ N(0, 1)

The Standard Normal Distribution (pdf):
\[ f(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2} \]

The Standard Normal Distribution (Z)
All normal distributions can be converted into
the standard normal curve by subtracting the
mean and dividing by the standard deviation:
\[ Z = \frac{X - \mu}{\sigma} \]

Somebody calculated all the integrals for the standard
normal and put them in a table! So we never have to
integrate!
Even better, computers now do all the integration.

Standard normal distribution: example

What's the probability of getting a math SAT score of 575 or
less, if µ = 500 and σ = 50?
X = math SAT score, x = 575, P(X ≤ x) = P(X ≤ 575) = ?

\[ z = \frac{575 - 500}{50} = 1.5 \]
i.e., a score of 575 is 1.5 standard deviations above the mean.

From x = 575, we find z = 1.5 (the corresponding value
in the Standard Normal curve).
Look up z = 1.5 in the Standard Normal chart and find
the corresponding probability P!
That is the solution, since
P(X ≤ x) = P(Z ≤ z) and P(X ≤ 575) = P(Z ≤ 1.5) = 0.9332
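
Where the slide says "computers now do all the integration", one possible tool is Python's standard-library `statistics.NormalDist`. The sketch below uses the slide's µ = 500 and σ = 50; it also answers the earlier 90th-percentile question with the inverse of the cumulative distribution function.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=50)

# Standardize: z = (x - mu) / sigma
z = (575 - sat.mean) / sat.stdev
p_direct = sat.cdf(575)            # P(X <= 575) on the original scale
p_via_z = NormalDist().cdf(z)      # P(Z <= 1.5) on the standard normal scale

# 90th percentile: the score x90 with P(X <= x90) = 0.90
x90 = sat.inv_cdf(0.90)

print(z)                   # 1.5
print(round(p_direct, 4))  # 0.9332
print(round(p_via_z, 4))   # 0.9332
print(round(x90))          # 564
```

Both routes give the same probability, which illustrates P(X ≤ x) = P(Z ≤ z).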


Comparing X and Z units

[Figure: the same normal curve on two scales, X (µ = 100, σ = 50) and Z (µ = 0, σ = 1)]


Z units and cumulative distribution function

If birth weights in a population are normally distributed with a
mean of 109 oz (ounces) and a standard deviation of 13 oz,
• What is the chance of obtaining a birth weight of 141 oz or
heavier when sampling birth records at random?
• What is the chance of obtaining a birth weight of 120 oz or
lighter?

Z units and cumulative distribution function

What is the chance of obtaining a birth weight of
120 oz or lighter?
\[ z = \frac{120 - 109}{13} = 0.846 \]
From the chart or computer software, find P for z = 0.846!

z = 0.846 is positive
Find P(Z ≤ 0.846) = ? This P corresponds to the area under the
Standard normal curve for values lower than z = 0.846.
P(Z ≤ 0.846) ≈ 0.801 (80.1%); the chart value 0.8023 corresponds to z rounded to 0.85.

What is the chance of obtaining a birth weight of
141 oz or heavier when sampling birth records at
random?
\[ z = \frac{141 - 109}{13} = 2.46 \]
From the chart or computer software, find P for z = 2.46!
z = 2.46 is positive
Find P(Z ≥ 2.46) = ? This P corresponds to an area, the right tail of the
Standard normal curve for values greater than z = 2.46.
P(Z ≥ 2.46) = 1 - 0.9931 = 0.0069 (0.69%)
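
Both birth-weight probabilities can be checked with software instead of the chart. This is a sketch using Python's `statistics.NormalDist` with the stated mean of 109 oz and standard deviation of 13 oz; small differences from chart values come from rounding z.

```python
from statistics import NormalDist

birth_weight = NormalDist(mu=109, sigma=13)

# Left area: P(X <= 120)
p_120_or_lighter = birth_weight.cdf(120)

# Right tail: P(X >= 141) = 1 - P(X <= 141)
p_141_or_heavier = 1 - birth_weight.cdf(141)

print(round(p_120_or_lighter, 3))  # about 0.801
print(round(p_141_or_heavier, 4))  # about 0.0069
```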



Are my data “normal”?
Not all continuous random variables are
normally distributed!!
It is important to evaluate how well the data
are approximated by a normal distribution

Are my data normally distributed?




• Look at the histogram! Does it appear bell shaped?
• Compute descriptive summary measures: are
mean, median, and mode similar?
• Do 2/3 of observations lie within 1 std dev from
the mean? Do 95% of observations lie within 2 std
dev from the mean?
• Look at a normal probability plot! Is it
approximately linear?
• Run tests of normality (such as Kolmogorov-Smirnov). But be cautious: these tests are highly influenced by
sample size!

Data from one class…

[Figure: normal probability plot, Math Love]
Not a straight line! Concave down (indicates left-skewed) and bimodal.
Not a normal distribution.

Data from one class…

[Figure: normal probability plot, Coffee]
Not a straight line! (concave up)
Not a normal distribution.

Data from one class…

[Figure: normal probability plot, Clinton]
Closer to normal, but slight left skew.

[Figure: normal probability plot, Sleep]
Closest to a straight line…
It is reasonable to assume a Normal distribution.
The Normal Probability Plot

Normal probability plot
• Order the data.
• Find corresponding standardized normal quantile values.
• Plot the observed data values against normal quantile values.
• Evaluate the plot for evidence of linearity.

If straight line, assume data come from a Normal distribution.
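
The steps for building a normal probability plot can be sketched in code, stopping just before the actual drawing. The function name, the made-up data and the plotting position (i - 0.5)/n are assumptions for this illustration; other plotting-position conventions exist.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (normal quantile, observed value) pairs for a normal probability plot."""
    ordered = sorted(data)                       # step 1: order the data
    n = len(ordered)
    std_normal = NormalDist()                    # Z ~ N(0, 1)
    # step 2: standardized normal quantile for each rank,
    # using the plotting position (i - 0.5) / n (one common convention)
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # step 3: pair each observed value with its normal quantile
    return list(zip(quantiles, ordered))

sleep_hours = [7.5, 6.0, 8.0, 7.0, 6.5, 9.0, 7.0, 5.5]  # made-up data
points = normal_plot_points(sleep_hours)
for q, x in points:
    print(round(q, 2), x)
```

Plotting these pairs (quantile on one axis, observed value on the other) and judging how close they lie to a straight line is the last step.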

Formal tests for normality

Results for the data from one class:
• Math Love: Strong evidence of non-normality (p < 0.01)
• Coffee: Strong evidence of non-normality (p < 0.005)
• Clinton: Moderate evidence of non-normality (p = 0.02)
• Sleep: No evidence of non-normality (p > 0.25)

Normal approximation of the binomial
When you have a binomial distribution where n is large
and p is middle-of-the-road (not too small, not too big, closer to 0.5),
then the binomial starts to look like a normal distribution!
Rule of thumb: mean = np > 5.

Normal approximation to the binomial

If the probability of being a smoker among a group of cases with
lung cancer is 0.6, what's the probability that in a group of 8 cases with lung
cancer you have fewer than 2 smokers?

[Figure: binomial distribution of the number of smokers, n = 8, p = 0.6]

Starting to have a normal
shape even with fairly small n.
You can imagine that if n got
larger, the bars would get
thinner and thinner and this
would look more and more
like a continuous function,
with a bell curve shape. Here
np = 8 * 0.6 = 4.8.

Normal approximation to binomial

What is the probability of fewer than 2 smokers?

Exact binomial probability (from before) = 0.00066 + 0.00786 = 0.00852

Normal approximation probability:
µ = np = 4.8; σ² = np(1 - p) = 1.92; σ = 1.39
\[ z = \frac{2 - 4.8}{1.39} = -2.014 \]
P(X < 2) = P(Z < -2.014) = 0.022

From the standard normal chart!

Normal approximation to binomial

You are performing a cohort study. If the probability of developing
disease in the exposed group is 0.25 for the study duration, then if
you sample (randomly) 500 exposed people, what's the probability
that at most 120 people develop the disease?

By hand (yikes!):
P(X ≤ 120) = P(X=0) + P(X=1) + P(X=2) + … + P(X=120)

OR, use normal approximation:
µ = np = 500(0.25) = 125; σ² = np(1 - p) = 93.75; σ = 9.68
\[ z = \frac{120 - 125}{9.68} = -0.52 \]
P(Z < -0.52) = 0.3015

From the standard normal chart!

Stats for proportions

The binomial distribution forms the basis of
statistics for proportions.
A proportion is just a binomial count divided by n.
For example, if we sample 200 cases and find 60
smokers, X = 60 but the observed proportion = 60/200 = 0.30.

Statistics for proportions are similar to binomial
counts, but differ by a factor of n.

P-hat stands for "sample proportion."

It all comes back to Z…

Statistics for proportions are based on a
normal distribution, because the binomial
can be approximated as normal if np > 5.
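
For the cohort question with n = 500 and p = 0.25, the normal approximation can be compared against the exact binomial sum. This is a sketch: the variable names are made up, and the approximation is computed without a continuity correction, as in the slide.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, cutoff = 500, 0.25, 120

mu = n * p                        # 125
sigma = sqrt(n * p * (1 - p))     # about 9.68

# Normal approximation to P(X <= 120)
approx = NormalDist(mu, sigma).cdf(cutoff)

# Exact binomial: P(X = 0) + P(X = 1) + ... + P(X = 120)
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(cutoff + 1))

print(round(mu, 2), round(sigma, 2))  # 125.0 9.68
print(round(approx, 3))               # about 0.30
print(round(exact, 3))                # the exact value, for comparison
```

The approximation differs slightly from the chart value 0.3015 because the chart lookup rounds z to -0.52.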
