Essential Statistics: Sampling, Distributions, and Testing
1. Sampling and Basic Concepts
Population: The entire group being studied.
Sample: A subset of the population.
Example
- Population: All university students.
- Sample: 200 students surveyed.
Parameter vs. Statistic
- Parameter: A numerical value describing a population.
- Statistic: A numerical value derived from a sample.
Examples:
- p = True population proportion.
- p̂ (p-hat) = Sample proportion.
Sample Proportion Formula
p̂ = x / n
Where:
- x = Number of successes.
- n = Sample size.
Example: 48 out of 80 respondents support a policy.
p̂ = 48 / 80 = 0.60
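The same calculation can be sketched in Python. The function name sample_proportion is an illustrative choice, not part of any library.

```python
def sample_proportion(successes: int, n: int) -> float:
    """Return p-hat = x / n for x successes in a sample of size n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    return successes / n


# Worked example from the text: 48 supporters out of 80 respondents.
p_hat = sample_proportion(48, 80)
print(p_hat)  # 0.6
```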
Sampling Methods
- Simple Random Sampling: Every individual has an equal probability of selection.
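- Stratified Sampling: Divide the population into groups (strata) that share a characteristic, then randomly sample from each one.
- Cluster Sampling: Randomly choose entire groups and survey everyone in them.
- Systematic Sampling: Select every k-th individual from an ordered list, starting from a randomly chosen point.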
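The four methods can be sketched with Python's standard random module. Everything here is a made-up illustration: the population of 100 ID numbers, the two strata, the clusters of 10, and the sample sizes are all assumptions chosen to keep the example small.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical ID numbers 1..100

# Simple random sampling: every individual is equally likely to be chosen.
srs = rng.sample(population, 10)

# Stratified sampling: split into groups (strata), then sample from each one.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]

# Cluster sampling: randomly pick whole groups and keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [unit for cluster in chosen_clusters for unit in cluster]

# Systematic sampling: random starting point, then every k-th individual.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

print(len(srs), len(stratified), len(cluster_sample), len(systematic))
```

Note the difference between stratified and cluster sampling: stratified takes a few individuals from every group, while cluster takes every individual from a few groups.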
2. Normal Distribution and Z-Scores
Z-score formula: z = (x − μ) / σ
Where:
- x = Value.
- μ = Population mean.
- σ = Population standard deviation.
Meaning: The number of standard deviations a value lies above (positive z) or below (negative z) the mean.
Example
If μ = 70, σ = 10, and x = 85:
z = (85 − 70) / 10 = 1.5
Interpretation: The score is 1.5 standard deviations above the mean.
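A minimal Python sketch of the z-score formula, using the numbers from the example. The function name z_score is illustrative.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Worked example from the text: mu = 70, sigma = 10, x = 85.
z = z_score(85, 70, 10)
print(z)  # 1.5
```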
Z-Table and Probabilities
The Z-table provides P(Z < z), which is the probability to the left of z.
Right tail probability: P(Z > z) = 1 − P(Z < z)
Probability Between Two Values
To find P(a < X < b):
- Convert both numbers to z-scores.
- Find probabilities from the Z-table.
- Subtract the smaller probability from the larger one.
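The steps above can be sketched in Python, with statistics.NormalDist (standard library, Python 3.8+) standing in for the Z-table. The bounds 60 and 85 are made-up values for illustration; μ = 70 and σ = 10 are reused from the z-score example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def right_tail(z: float) -> float:
    """P(Z > z) = 1 - P(Z < z)."""
    return 1 - std_normal.cdf(z)


def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) for a normal X with mean mu and standard deviation sigma."""
    # Step 1: convert both numbers to z-scores.
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    # Step 2: look up the left-tail probabilities (the Z-table values).
    p_a = std_normal.cdf(z_a)
    p_b = std_normal.cdf(z_b)
    # Step 3: subtract the smaller probability from the larger one.
    return max(p_a, p_b) - min(p_a, p_b)


p = prob_between(60, 85, mu=70, sigma=10)  # z-scores are -1.0 and 1.5
print(round(p, 4))              # 0.7745
print(round(right_tail(1.5), 4))  # 0.0668
```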
68–95–99.7 Rule
- Within 1 standard deviation of the mean: ~68% of values
- Within 2 standard deviations of the mean: ~95% of values
- Within 3 standard deviations of the mean: ~99.7% of values
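These percentages can be checked against the standard normal distribution. The sketch below uses statistics.NormalDist from the standard library; the helper name within is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def within(k: float) -> float:
    """Probability that a normal value falls within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```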
3. Confidence Interval for a Proportion
Formula: p̂ ± z × √[p̂(1 − p̂) / n]
Standard Error (SE): SE = √[p̂(1 − p̂) / n]
Example Calculation
- p̂ = 120 / 200 = 0.60
- SE = √[(0.6)(0.4) / 200] ≈ 0.0346
- Margin of Error (95% confidence, z = 1.96): 1.96 × 0.0346 ≈ 0.068
- Confidence Interval: 0.60 ± 0.068 = [0.532, 0.668]
Interpretation: We are 95% confident that the true population proportion lies between 0.532 and 0.668.
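The example calculation can be reproduced with a short Python sketch. The function name proportion_ci is illustrative, and the critical value is taken from NormalDist().inv_cdf instead of a table, so it uses about 1.95996 where the text rounds to 1.96.

```python
import math
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Return (low, high) for the interval p-hat +/- z * sqrt(p-hat(1 - p-hat) / n)."""
    p_hat = successes / n
    # Confidence intervals use p-hat in the standard error.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return p_hat - margin, p_hat + margin


# Worked example from the text: 120 successes out of 200.
low, high = proportion_ci(120, 200)
print(round(low, 3), round(high, 3))  # 0.532 0.668
```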
4. Hypothesis Testing for a Proportion
- Hypotheses: H₀: p = p₀; Hₐ: p ≠ p₀
- Sample Proportion: p̂ = x / n
- Standard Error: Use p₀ (not p̂): SE = √[p₀(1 − p₀) / n]
- Test Statistic: z = (p̂ − p₀) / SE
- P-value: For this two-sided test, p-value = 2 × P(Z > |z|), found from the Z-table.
- Decision Rule: If p-value < α, reject H₀.
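The steps above can be sketched in Python for the two-sided test. The numbers are assumptions for illustration only: 120 successes out of 200, a null value p₀ = 0.5, and α = 0.05. The function name proportion_z_test is also illustrative.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes: int, n: int, p0: float):
    """Two-sided z-test of H0: p = p0 against Ha: p != p0. Returns (z, p_value)."""
    p_hat = successes / n
    # Hypothesis tests use p0 (not p-hat) in the standard error.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided p-value: twice the tail area beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


alpha = 0.05  # assumed significance level
z, p_value = proportion_z_test(120, 200, p0=0.5)
reject = p_value < alpha
print(round(z, 2))  # 2.83
print("reject H0" if reject else "fail to reject H0")  # reject H0
```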
Common Exam Mistakes
- Confidence Intervals: Use p̂ for the standard error, not p₀.
- Hypothesis Testing: Use p₀ for the standard error, not p̂.
Fast Exam Checklist
- Identify the question type.
- Compute p̂ = x / n.
- Compute the standard error.
- Calculate the z-score or confidence interval.
- Interpret the final result.
Core Formulas
- Sample proportion: p̂ = x / n
- Z-score: z = (x − μ) / σ
- Confidence interval: p̂ ± z × √[p̂(1 − p̂) / n]
- Hypothesis test: z = (p̂ − p₀) / √[p₀(1 − p₀) / n]
