Statistical Inference
• Statistical inference is the process of making judgments about a population based on properties of the sample.
    • Intuition only goes so far towards making decisions of this nature.
    • Experts can offer conflicting opinions using the same data.
Methods of Statistical Inference
• Estimation
    • Predict the value of an unknown parameter with specified confidence.
• Decision Making
    • Decide between opposing statements about the population (parameter).
Estimation
• Estimating a Population Mean (μ)
    • Point Estimate
        • Mean, median, mode, etc.
        • Easy to calculate and use
        • Random in value (changes from sample to sample)
    • Interval Estimate
        • Range of values containing the parameter
        • Unknown accuracy within the range
    • Confidence Interval
        • Interval with a known probability of containing the true value
Estimating μ (σ known)
• (1−α) Confidence Interval for μ (n ≥ 30):
    • The central limit theorem provides a sampling distribution for the sample mean when the sample size is sufficient (n ≥ 30). The resulting confidence interval for μ with margin of error E is:
      $\bar{x} - E < \mu < \bar{x} + E$, where $E = Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
    • An alternative form is:
      $\bar{x} \pm Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
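The interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `z_confidence_interval` and the example numbers (sample mean 50, σ = 10, n = 36) are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds for mu when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z_{alpha/2}
    margin = z * sigma / sqrt(n)                 # margin of error E
    return x_bar - margin, x_bar + margin


# Hypothetical example values, chosen only for illustration.
low, high = z_confidence_interval(x_bar=50.0, sigma=10.0, n=36)
print(round(low, 2), round(high, 2))
```

The two returned bounds correspond to $\bar{x} - E$ and $\bar{x} + E$.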





Confidence Intervals
• The level of confidence and the sample size both affect the width of the confidence interval.
    • Increasing the level of confidence results in a wider confidence interval.
    • Increasing the sample size results in a narrower confidence interval.
    • Setting the level of confidence too high results in a confidence interval that is too wide to be of any practical use.
        • For example, a 100% confidence interval runs from −∞ to ∞.
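Both effects can be checked numerically. The sketch below assumes the σ-known interval, whose full width is $2 Z_{\alpha/2}\,\sigma/\sqrt{n}$; the helper name `interval_width` and the values σ = 10 and n = 36 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(sigma, n, confidence):
    """Full width (2E) of the sigma-known confidence interval."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


# Width grows with the confidence level (n held at 36).
for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(interval_width(10.0, 36, confidence), 2))

# Width shrinks with the sample size (confidence held at 95%).
for n in (36, 144, 576):
    print(n, round(interval_width(10.0, n, 0.95), 2))
```

Because the width is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the width.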
Necessary Sample Size for Estimating the Mean (μ)
• The sample size necessary to estimate the mean (μ) with a margin of error E and (1−α) level of confidence is:
  $n = \left( \dfrac{Z_{\alpha/2}\,\sigma}{E} \right)^2$
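As a rough sketch of how this formula might be applied, the snippet below computes n and rounds up to the next whole observation. The rounding is an assumption of this example (the slide gives only the formula), and the function name and the inputs σ = 10, E = 2 are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest whole n satisfying n >= (Z_{alpha/2} * sigma / E)^2."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil((z * sigma / margin) ** 2)


# Hypothetical inputs: sigma = 10, desired margin of error E = 2.
print(sample_size_for_mean(sigma=10.0, margin=2.0))
```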
What if σ is unknown?
• William Gosset, a chemist for the Guinness Brewery in the early 1900s, discovered that substituting s for σ in the margin of error formula,
  $E = Z_{\alpha/2} \dfrac{s}{\sqrt{n}}$,
  resulted in a confidence interval that was too narrow for the desired level of confidence (1−α).
    • This resulted in an increased error rate in statistical inference.
    • The error rate is particularly noticeable for small samples.
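The shortfall can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not the original analysis: it draws samples from a standard normal population (μ = 0, σ = 1, chosen arbitrarily), builds the interval with s in place of σ, and counts how often the interval contains μ. The function name, seed, and trial count are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_interval_coverage(n, trials=10000, confidence=0.95, seed=0):
    """Fraction of z-intervals (built with s, not sigma) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    mu, sigma = 0.0, 1.0  # assumed true population values
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        if abs(x_bar - mu) <= z * s / sqrt(n):
            hits += 1
    return hits / trials


# Coverage is clearly below the nominal 95% for small n and approaches it as n grows.
print(z_interval_coverage(5), z_interval_coverage(50))
```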
Student t Distribution
• Gosset discovered that the statistic
  $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
  has a Student t distribution with n − 1 degrees of freedom. The t distribution:
    • is symmetric about 0
    • has heavier tails than the normal distribution
    • converges to the normal distribution as n → ∞.
Estimating μ (σ unknown)
• If the sample data are from a normal distribution or if the sample is of sufficient size (n ≥ 30), then the (1−α) Confidence Interval for μ is:
  $\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$
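A minimal sketch of this interval follows. Python's standard library has no t quantile function, so this example assumes the critical value $t_{\alpha/2}$ (with n − 1 degrees of freedom) is looked up in a t table and passed in. The data values and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) for mu using s and a supplied t critical value."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                 # sample standard deviation (n - 1 divisor)
    margin = t_crit * s / sqrt(n)     # t_{alpha/2} * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical measurements, n = 10, so 9 degrees of freedom.
data = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]
T_CRIT_95_DF9 = 2.262  # t_{0.025} for 9 degrees of freedom, from a standard t table
low, high = t_confidence_interval(data, T_CRIT_95_DF9)
print(round(low, 3), round(high, 3))
```

For comparison, the normal critical value at the same confidence level is about 1.96, so the t interval is wider for this small sample.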
Why settle for small sample size?
• Can’t you just collect more data?
• Samples can be expensive to obtain.
    • shuttle launch, batch run
• Samples can be difficult to obtain.
    • rare specimen, chemical process
• Samples can be time consuming to obtain.
    • cancer research, effects of time
• Ethical questions can arise.
    • medical research can't continue if initial results look bad
Estimating Population Proportion (p)
• Can be thought of as the binomial probability of success if randomly sampling from the population.
    • Let p be the proportion of the population with some characteristic of interest. The characteristic is either present or it is not present, so the number with the characteristic is binomial.
Estimating Population Proportion (p)
• The central limit theorem applies to a binomial random variable if the expected number of successes (np) and the expected number of failures (nq) are both at least 5.
    • The number of successes (X) is then approximately normally distributed with mean np and variance npq.
    • The sample proportion is also approximately normally distributed, with a mean of p and a variance of pq/n.
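The rule of thumb and the two sets of parameters can be written out as a short check. This is only a sketch: the function name, the returned dictionary layout, and the example values (n = 40, p = 0.2) are hypothetical.

```python
def normal_approximation(n, p, threshold=5):
    """Check np >= 5 and nq >= 5; return the normal-approximation parameters."""
    q = 1.0 - p
    ok = n * p >= threshold and n * q >= threshold
    return {
        "ok": ok,
        "count_mean": n * p,           # mean of X
        "count_variance": n * p * q,   # variance of X
        "prop_mean": p,                # mean of the sample proportion
        "prop_variance": p * q / n,    # variance of the sample proportion
    }


# Hypothetical values: n = 40 trials with success probability 0.2.
print(normal_approximation(40, 0.2))
```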
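(1−α) Confidence Interval for a Population Proportion (p)
• The point estimate for the proportion is:
  $\hat{p} = \dfrac{x}{n}$
• The (1−α) Confidence Interval for p is:
  $\hat{p} \pm E$, where $E = Z_{\alpha/2} \sqrt{\dfrac{pq}{n}}$ and $q = 1 - p$
  (since p is unknown, $\hat{p}$ and $\hat{q} = 1 - \hat{p}$ are used in its place).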
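A minimal sketch of the proportion interval is shown here. It assumes the usual substitution of the point estimate $\hat{p} = x/n$ for p inside the margin of error; the function name and the counts (120 successes out of 400) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence=0.95):
    """Return (lower, upper) for p from x successes in n trials."""
    p_hat = x / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sqrt(p_hat * q_hat / n)  # E, with p-hat standing in for p
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 120 of 400 respondents have the characteristic.
low, high = proportion_confidence_interval(120, 400)
print(round(low, 4), round(high, 4))
```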
Necessary Sample Size for Estimating the Proportion (p)
• The sample size necessary to estimate the proportion (p) with a margin of error E and (1−α) level of confidence, (a) with prior knowledge of p and q and (b) with no prior knowledge of p and q, is:
  a) $n = \left( \dfrac{Z_{\alpha/2}}{E} \right)^2 p^* q^*$
  b) $n = \left( \dfrac{Z_{\alpha/2}}{2E} \right)^2$
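Both cases can be sketched with one function, since formula (b) is formula (a) evaluated at p* = q* = 0.5, the value that maximizes p*q*. Rounding up to a whole observation is an assumption of this example, and the function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p_prior=None):
    """Sample size for estimating p; uses p* = 0.5 when no prior estimate is given."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    p_star = 0.5 if p_prior is None else p_prior
    q_star = 1.0 - p_star
    return ceil((z / margin) ** 2 * p_star * q_star)


# Hypothetical inputs: margin of error 0.03 at 95% confidence.
print(sample_size_for_proportion(0.03, p_prior=0.3))  # case (a), prior estimate p* = 0.3
print(sample_size_for_proportion(0.03))               # case (b), no prior knowledge
```

Case (b) never gives a smaller n than case (a), which is why it serves as the conservative choice when nothing is known about p.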