Essential Statistics: Sampling, Distributions, and Testing

1. Sampling and Basic Concepts

Population: The entire group being studied.
Sample: A subset of the population.

Example

  • Population: All university students.
  • Sample: 200 students surveyed.

Parameter vs. Statistic

  • Parameter: A numerical value describing a population.
  • Statistic: A numerical value derived from a sample.

Examples:

  • p = True population proportion.
  • p̂ (p-hat) = Sample proportion.

Sample Proportion Formula

p̂ = x / n

Where:

  • x = Number of successes.
  • n = Sample size.

Example: 48 support a policy out of 80.
p̂ = 48 / 80 = 0.60
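
As a quick check of the formula, here is a minimal Python sketch. The function name sample_proportion is a hypothetical helper chosen for illustration, not part of any library.

```python
def sample_proportion(successes: int, n: int) -> float:
    """Return p-hat = x / n for x successes in a sample of size n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    return successes / n


# The example above: 48 supporters out of 80 people surveyed.
p_hat = sample_proportion(48, 80)
print(p_hat)  # 0.6
```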


Sampling Methods

  • Simple Random Sampling: Every individual has an equal probability of selection.
  • Stratified Sampling: Divide the population into non-overlapping groups (strata) and take a random sample from each.
  • Cluster Sampling: Randomly choose entire groups and survey everyone in them.
  • Systematic Sampling: Select every k-th individual from an ordered list.
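
The four methods can be sketched with Python's random module. Everything concrete here is an assumption made for illustration: the population of 100 numbered IDs, the two strata, the clusters of 10, and the random starting point for the systematic sample (a common convention, not something these notes specify).

```python
import random

rng = random.Random(0)            # fixed seed so the draws are repeatable
population = list(range(1, 101))  # hypothetical population: IDs 1..100

# Simple random sampling: every individual is equally likely to be chosen.
srs = rng.sample(population, 10)

# Stratified sampling: split into groups (two made-up strata here),
# then draw a random sample from each group.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]

# Cluster sampling: randomly choose whole groups and keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [unit for cluster in chosen_clusters for unit in cluster]

# Systematic sampling: every k-th individual, starting from a random offset.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

print(len(srs), len(stratified), len(cluster_sample), len(systematic))
```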

2. Normal Distribution and Z-Scores

Z-score formula: z = (x − μ) / σ

Where:

  • x = Value.
  • μ = Population mean.
  • σ = Population standard deviation.

Meaning: The number of standard deviations a value is from the mean.


Example

If μ = 70, σ = 10, and x = 85:
z = (85 − 70) / 10 = 1.5
Interpretation: The score is 1.5 standard deviations above the mean.
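
The same calculation as a small Python sketch. The helper name z_score is hypothetical and used only for illustration.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# The example above: mu = 70, sigma = 10, x = 85.
z = z_score(85, 70, 10)
print(z)  # 1.5
```

A negative result would mean the value lies below the mean.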


Z-Table and Probabilities

The Z-table provides P(Z < z), which is the probability to the left of z.

Right tail probability: P(Z > z) = 1 − P(Z < z)
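
In place of a printed Z-table, the standard normal CDF can be evaluated in Python. This sketch assumes Python 3.8 or later, where statistics.NormalDist is available; z = 1.5 is reused from the earlier example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z = 1.5
left_tail = std_normal.cdf(z)  # P(Z < 1.5), the value a Z-table lists
right_tail = 1 - left_tail     # P(Z > 1.5)

print(round(left_tail, 4), round(right_tail, 4))  # 0.9332 0.0668
```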


Probability Between Two Values

To find P(a < X < b):

  1. Convert a and b to z-scores: z_a = (a − μ) / σ and z_b = (b − μ) / σ.
  2. Look up P(Z < z_a) and P(Z < z_b) in the Z-table.
  3. Subtract the smaller probability from the larger one: P(a < X < b) = P(Z < z_b) − P(Z < z_a).
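
The three steps can be sketched in Python as follows. The bounds a = 60 and b = 85 are made-up values for illustration; μ = 70 and σ = 10 are reused from the z-score example. The helper name prob_between is hypothetical.

```python
from statistics import NormalDist


def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """Return P(a < X < b) for X normally distributed with mean mu, sd sigma."""
    std_normal = NormalDist()
    # Step 1: convert both bounds to z-scores.
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    # Step 2: left-tail probabilities (what the Z-table lists).
    p_a = std_normal.cdf(z_a)
    p_b = std_normal.cdf(z_b)
    # Step 3: subtract the smaller probability from the larger one.
    return abs(p_b - p_a)


prob = prob_between(60, 85, mu=70, sigma=10)
print(round(prob, 4))  # 0.7745
```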

68–95–99.7 Rule

  • Within 1 standard deviation of the mean: ~68% of values
  • Within 2 standard deviations of the mean: ~95% of values
  • Within 3 standard deviations of the mean: ~99.7% of values
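
The rule can be checked numerically against the standard normal CDF. This is a short sketch assuming Python 3.8 or later for statistics.NormalDist.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(k, round(p, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```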

3. Confidence Interval for a Proportion

Formula: p̂ ± z × √[p̂(1 − p̂) / n]

Standard Error (SE): SE = √[p̂(1 − p̂) / n]


Example Calculation

  1. p̂ = 120 / 200 = 0.60
  2. SE = √[(0.6)(0.4) / 200] ≈ 0.0346
  3. Margin of Error (95% confidence, z = 1.96): 1.96 × 0.0346 ≈ 0.068
  4. Confidence Interval: 0.60 ± 0.068 = [0.532, 0.668]

Interpretation: We are 95% confident that the true population proportion lies between 0.532 and 0.668.
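
The example calculation can be reproduced with a short Python sketch. The function name proportion_ci is a hypothetical helper; the critical value is computed from the normal distribution instead of being typed in as 1.96, which assumes Python 3.8 or later for statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Return (lower, upper) for p-hat ± z × sqrt(p-hat(1 − p-hat) / n)."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error uses p-hat
    margin = z * se
    return p_hat - margin, p_hat + margin


# The example above: 120 successes out of 200.
low, high = proportion_ci(120, 200)
print(round(low, 3), round(high, 3))  # 0.532 0.668
```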


4. Hypothesis Testing for a Proportion

  1. Hypotheses: H₀: p = p₀; Hₐ: p ≠ p₀
  2. Sample Proportion: p̂ = x / n
  3. Standard Error: Use p₀ (not p̂): SE = √[p₀(1 − p₀) / n]
  4. Test Statistic: z = (p̂ − p₀) / SE
  5. P-value: Determine from the Z-table. For the two-sided Hₐ above, p-value = 2 × P(Z > |z|).
  6. Decision Rule: If p-value < α, reject H₀.
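
The six steps can be sketched in Python as below. The data are assumptions chosen for illustration: 120 successes out of 200 (reused from the confidence interval example), a made-up null value p₀ = 0.50, and α = 0.05. The function name two_sided_proportion_test is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_proportion_test(successes: int, n: int, p0: float):
    """Return (z, p_value) for H0: p = p0 against Ha: p != p0."""
    p_hat = successes / n                      # step 2: sample proportion
    se = sqrt(p0 * (1 - p0) / n)               # step 3: standard error uses p0
    z = (p_hat - p0) / se                      # step 4: test statistic
    # Step 5: two-sided p-value, 2 × P(Z > |z|).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


alpha = 0.05  # assumed significance level
z, p_value = two_sided_proportion_test(120, 200, p0=0.50)
reject_h0 = p_value < alpha                    # step 6: decision rule

print(round(z, 3), reject_h0)  # 2.828 True
```

With these assumed inputs the p-value falls below α, so H₀ would be rejected.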

Common Exam Mistakes

  • Confidence Intervals: Use p̂ for the standard error (there is no hypothesized value to use).
  • Hypothesis Testing: Use p₀ for the standard error (the test assumes H₀ is true).

Fast Exam Checklist

  1. Identify the question type.
  2. Compute p̂ = x / n.
  3. Compute the standard error.
  4. Calculate the z-score or confidence interval.
  5. Interpret the final result.

Core Formulas

  • Sample proportion: p̂ = x / n
  • Z-score: z = (x − μ) / σ
  • Confidence interval: p̂ ± z × √[p̂(1 − p̂) / n]
  • Hypothesis test: z = (p̂ − p₀) / √[p₀(1 − p₀) / n]