Statistical Inference Solutions: Estimation and Hypothesis Testing
Q1 True/False (each needs a quick justification)
Q1(a) Binomial(100, 0.2) approximated by Normal N(20,16)
Steps:
Identify: X ~ Bin(n=100, p=0.2).
Compute mean: μ = np = 100(0.2) = 20.
Compute variance: σ² = np(1−p) = 100(0.2)(0.8) = 16.
Rule: For large n, Binomial ≈ Normal with same mean/variance (CLT-ish approximation).
Conclusion: Yes, approximate with N(20,16).
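The approximation above can be checked with a few lines of standard-library Python; the probability P(X ≤ 24) below is just an illustrative use of N(20,16), not part of the problem:

```python
import math

# Binomial(100, 0.2): matching Normal approximation N(20, 16).
n, p = 100, 0.2
mu = n * p                 # 20
var = n * p * (1 - p)      # 16
sigma = math.sqrt(var)     # 4

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Illustrative use: P(X <= 24) ≈ Φ((24 - 20)/4) = Φ(1).
approx_tail = phi((24 - mu) / sigma)
print(mu, var, round(approx_tail, 4))
```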
Q1(b) MAP estimator equals argmax log posterior
Steps:
MAP definition: θ̂MAP = argmaxθ f(θ|x).
Taking log does not move the maximizer because log is increasing.
So θ̂MAP = argmaxθ log f(θ|x). True.
Q1(c) p-value = 0.0001 means we can ALWAYS reject H0
Steps:
Decision requires comparing p-value to the chosen significance level α.
If α = 0.01: p < α → reject.
If α = 0.00001: p > α → do not reject.
Since the decision depends on the chosen α, we cannot say “always reject”; the statement is false.
Q1(d) Xi ~ Geometric(0.2), n=80, P(X̄ > 6) ≈ 0.023
Steps:
For Geometric(p) on {1,2,3,…}:
μ = 1/p = 5.
σ² = (1−p)/p² = 0.8/0.04 = 20.
For the sample mean X̄ with n=80:
μX̄ = μ = 5.
Var(X̄) = σ²/n = 20/80 = 0.25.
SD(X̄) = √0.25 = 0.5.
Standardize: z = (6 − 5)/0.5 = 2.
Tail probability: P(X̄>6)=P(Z>2)=1−Φ(2)=0.023 (given table).
True.
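The standardization steps above can be reproduced directly (standard library only):

```python
import math

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p, n = 0.2, 80
mu = 1 / p                      # 5.0
var = (1 - p) / p ** 2          # 20.0
sd_mean = math.sqrt(var / n)    # 0.5

z = (6 - mu) / sd_mean          # 2.0
prob = 1 - phi(z)               # ≈ 0.0228, matching the table value 0.023
print(round(z, 2), round(prob, 4))
```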
Q2 “Should you come to class?” (CI + 2-population hypothesis test)
Given:
Group 1: n1=53, x̄1=46.48, S1=7.03
Group 2: n2=102, x̄2=38.15, S2=8.19
Q2(a) 99% CI for θ1 and θ2 using Normal (large N)
Steps:
Confidence level 99% → α=0.01 → z_(1−α/2)=z_0.995 = 2.576.
Compute standard errors:
SE1 = S1/√n1 = 7.03/√53
SE2 = S2/√n2 = 8.19/√102
CI formula: x̄ ± z·SE.
Group 1 margin: 2.576·7.03/√53 ≈ 2.49 → CI ≈ [46.48−2.49, 46.48+2.49] = [43.99, 48.97]
Group 2 margin: 2.576·8.19/√102 ≈ 2.09 → CI ≈ [38.15−2.09, 38.15+2.09] = [36.06, 40.24]
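Both intervals can be recomputed in a few lines (the helper `normal_ci` is just a convenience name):

```python
import math

z = 2.576  # z_{0.995} for a 99% confidence level

def normal_ci(n, xbar, s):
    # Large-sample CI: xbar ± z * s / sqrt(n)
    m = z * s / math.sqrt(n)
    return xbar - m, xbar + m

ci1 = normal_ci(53, 46.48, 7.03)    # group 1
ci2 = normal_ci(102, 38.15, 8.19)   # group 2
print([round(v, 2) for v in ci1 + ci2])
```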
Q2(b) Hypothesis test: do attendees score higher?
Steps:
Write hypotheses:
H0: θ1 = θ2
H1: θ1 > θ2 (right-tailed)
Compute difference in sample means: Δ = x̄1 − x̄2 = 46.48 − 38.15 = 8.33
Compute SE of the difference (independent samples):
SEΔ = √( S1²/n1 + S2²/n2 )
Compute each term:
S1²/n1 = 7.03²/53 ≈ 0.932
S2²/n2 = 8.19²/102 ≈ 0.657
SEΔ = √(0.932+0.657) = √1.589 ≈ 1.26
Test statistic: z = Δ / SEΔ = 8.33/1.26 ≈ 6.61
P-value (right tail): p = P(Z>6.61) ≈ 2×10⁻¹⁰ (given)
Decision: compare to α=0.01 → p << α → reject H0
Conclusion sentence: strong evidence that attending class leads to higher average scores.
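As a numeric check of the two-sample z statistic:

```python
import math

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n1, x1, s1 = 53, 46.48, 7.03
n2, x2, s2 = 102, 38.15, 8.19

delta = x1 - x2                              # 8.33
se = math.sqrt(s1**2 / n1 + s2**2 / n2)      # ≈ 1.26
z = delta / se                               # ≈ 6.61
p_value = 1 - phi(z)                         # far below α = 0.01
print(round(delta, 2), round(se, 2), round(z, 2))
```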
Q3 Poisson estimation (MLE + MAP + bias + consistency)
Given Xi i.i.d. Poisson(θ)
Q3(a) MLE θ̂ML
Steps:
Write pmf for one sample: f(xi;θ)= (θ^xi / xi!) e^(−θ).
Likelihood: L(θ)=∏_{i=1}^N (θ^xi / xi!) e^(−θ).
Log-likelihood:
ℓ(θ)= Σ[ xi log θ − log(xi!) − θ ]
= (Σ xi) log θ − Nθ + constant
Differentiate: dℓ/dθ = (Σ xi)/θ − N
Set to zero: (Σ xi)/θ = N → θ̂ML = (1/N) Σ xi = X̄
Check θ>0 domain: X̄ ≥ 0 so valid.
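A quick sanity check on hypothetical count data: the sample mean should give at least as high a log-likelihood as any nearby θ.

```python
import math

data = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical Poisson counts
N = len(data)
theta_ml = sum(data) / N          # the MLE is the sample mean

def loglik(theta):
    # Poisson log-likelihood, dropping the constant -sum(log(xi!))
    return sum(data) * math.log(theta) - N * theta

# x̄ should (weakly) beat any nearby theta.
beats_neighbors = all(loglik(theta_ml) >= loglik(theta_ml + d)
                      for d in (-0.5, -0.1, 0.1, 0.5))
print(theta_ml, beats_neighbors)
```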
Q3(b) MAP with Θ ~ Exponential(3)
Steps:
Prior: fΘ(θ)=3 e^(−3θ), θ>0
Log prior: log fΘ(θ)=log 3 − 3θ
Log posterior ∝ log L + log prior:
= (Σ xi) log θ − Nθ − 3θ + constant
= (Σ xi) log θ − (N+3)θ + constant
Differentiate: (Σ xi)/θ − (N+3)
Set to zero: θ̂MAP = (Σ xi)/(N+3)
Q3(c) Unbiased? Consistent?
Steps (unbiasedness):
Use E[ΣXi]=Nθ (Poisson mean = θ).
E[θ̂MAP] = E[ΣXi]/(N+3) = (Nθ)/(N+3)
Compare to θ: (N/(N+3))θ ≠ θ → biased.
Direction: N/(N+3) < 1 → underestimates on average.
Steps (consistency):
As N→∞, N/(N+3) → 1.
Also (1/N)ΣXi → θ by LLN.
So θ̂MAP → θ, consistent.
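The bias and its vanishing can be tabulated exactly from E[θ̂MAP] = Nθ/(N+3) (θ = 2 here is a hypothetical true value):

```python
theta = 2.0   # hypothetical true Poisson rate
expectations = {}
for N in (10, 100, 10_000):
    # E[theta_MAP] = E[sum Xi]/(N+3) = N*theta/(N+3): below theta, but -> theta
    expectations[N] = N * theta / (N + 3)
print(expectations)
```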
Q4 Random processes + t-interval
Q4(a) A~N(2,3), B~N(0,1) independent, X(t)=At+B
Mean μ(t) steps:
μX(t)=E[X(t)]=E[At+B]
Linearity: = tE[A] + E[B]
E[A]=2, E[B]=0 → μX(t)=2t
Autocorrelation RX(t1,t2) steps:
RX(t1,t2)=E[X(t1)X(t2)]
Substitute X(t)=At+B:
X(t1)X(t2) = (At1+B)(At2+B)
Expand: = A² t1 t2 + AB t1 + AB t2 + B²
Take expectation term-by-term:
E[A²] t1 t2 + E[AB](t1 + t2) + E[B²]
Independence: E[AB]=E[A]E[B]=2·0=0
Compute second moments:
E[A²]=Var(A)+E[A]²=3+4=7
E[B²]=Var(B)+E[B]²=1+0=1
Final: RX(t1,t2)=7 t1 t2 + 1
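A Monte Carlo sketch can confirm RX(t1,t2) = 7 t1 t2 + 1 (the time points t1=1, t2=2 are arbitrary choices):

```python
import random

random.seed(0)

t1, t2 = 1.0, 2.0
M = 200_000
acc = 0.0
for _ in range(M):
    a = random.gauss(2, 3 ** 0.5)   # A ~ N(2, 3)
    b = random.gauss(0, 1)          # B ~ N(0, 1), independent of A
    acc += (a * t1 + b) * (a * t2 + b)

mc_estimate = acc / M
theory = 7 * t1 * t2 + 1            # 15.0
print(round(mc_estimate, 2), theory)
```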
Q4(b) 95% CI for mean with 4 samples: −1,0,1,2 (σ unknown → t)
Steps:
n=4, df=n−1=3
Sample mean: x̄ = (−1+0+1+2)/4 = 0.5
Compute deviations from x̄:
−1−0.5=−1.5 → square 2.25
0−0.5=−0.5 → square 0.25
1−0.5=0.5 → square 0.25
2−0.5=1.5 → square 2.25
Sum of squares = 2.25+0.25+0.25+2.25 = 5
Sample variance: S² = (1/(n−1))·5 = 5/3
Sample SD: S = √(5/3) ≈ 1.29
95% CI uses t_{0.975,3} = 3.18 (given)
Standard error: SE = S/√n = 1.29/2 = 0.645
Margin: m = t·SE = 3.18·0.645 ≈ 2.05
CI: [x̄−m, x̄+m] = [0.5−2.05, 0.5+2.05] = [−1.55, 2.55]
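The whole t-interval computation fits in a few lines:

```python
import math

data = [-1, 0, 1, 2]
n = len(data)
xbar = sum(data) / n                                  # 0.5
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)     # 5/3
t = 3.18                                              # t_{0.975, 3} from the table
m = t * math.sqrt(s2 / n)                             # margin ≈ 2.05
ci = (xbar - m, xbar + m)
print(round(ci[0], 2), round(ci[1], 2))
```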
FALL 2024 QUIZ 4 (Solutions-based steps)
Q1 True/False
Q1(a) Xi i.i.d. Poisson(λ), Y=(X1+…+X5)/5 is Gaussian?
Steps:
Use property: sum of independent Poisson is Poisson.
S = ΣXi ~ Poisson(5λ)
Y = S/5 is a scaled Poisson random variable, not Gaussian.
CLT requires large n; n=5 is too small.
False.
Q1(b) Xi i.i.d. Bernoulli(Θ), with Θ~Uniform[0,1]. Is θ̂ML = θ̂MAP?
Steps:
MAP maximizes L(θ)·prior(θ).
Uniform[0,1] prior means prior(θ) is constant for all feasible θ.
Multiplying by a constant doesn’t change the argmax.
Therefore MAP = ML. True.
Q1(c) Estimator Θ̂ = (X1+…+XN+100)/N is consistent?
Steps:
Rewrite: Θ̂ = (1/N)ΣXi + 100/N
LLN: (1/N)ΣXi → E[Xi]=θ
100/N → 0
So Θ̂ → θ. True.
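The vanishing 100/N offset can be tabulated directly (θ = 0.4 is a hypothetical true mean):

```python
theta = 0.4                  # hypothetical E[Xi]
# Expectation of the estimator is theta + 100/N; the offset vanishes as N grows.
exp_hat = {N: theta + 100 / N for N in (10, 1_000, 100_000)}
print(exp_hat)
```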
Q1(d) CI is always centered at the true θ
Steps:
A confidence interval is computed from data (random), so it’s centered at an estimator.
Typical form: [θ̂−ε, θ̂+ε], not [θ−ε, θ+ε].
False.
Q2 Medical Trial (Bernoulli MLE + variance + CI + hypothesis test)
Given: n=10, successes=7. Model Xi ~ Bernoulli(θD).
Q2(a) MLE for θD
Steps:
For Bernoulli, likelihood: L(θ)=θ^{#success}(1−θ)^{#fail}
Here #success=7, #fail=3 → L(θ)=θ^7(1−θ)^3
Known result (or by derivative): θ̂ML = (#success)/n
θ̂D = 7/10 = 0.7
Q2(b) Estimate sample variance S² from Bernoulli data
Steps:
Treat data as 7 ones and 3 zeros.
Sample mean x̄ = 0.7
Sample variance formula: S²=(1/(n−1)) Σ(xi−x̄)²
Compute squared deviations:
For a “1”: (1−0.7)² = 0.3² = 0.09 (happens 7 times → 0.63)
For a “0”: (0−0.7)² = 0.7² = 0.49 (happens 3 times → 1.47)
Sum of squares = 0.63 + 1.47 = 2.10
Divide by (n−1)=9: S² = 2.10/9 ≈ 0.233
S = √0.233 if needed later.
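The variance computation above, as code:

```python
data = [1] * 7 + [0] * 3    # the 10 Bernoulli outcomes
n = len(data)
xbar = sum(data) / n                                  # 0.7
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)     # 2.10/9 ≈ 0.233
print(xbar, round(s2, 3))
```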
Q2(c) 99% confidence interval for θD (unknown variance → t)
Steps:
Confidence 99% → α=0.01
n=10 → df=9
Use t_{0.995,9} given in table (Ψ9^{-1}(0.995)=3.25 in the provided sheet).
Compute SE = S/√n = √0.233/√10
Margin m = t * SE = 3.25 * √0.233 / √10
CI = [0.7 − m, 0.7 + m]
If upper bound > 1, clip to 1 because probability can’t exceed 1.
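Carrying out those steps numerically (the raw upper bound does exceed 1, so it gets clipped):

```python
import math

xbar, s2, n = 0.7, 2.1 / 9, 10
t = 3.25                            # t_{0.995, 9} from the provided sheet
m = t * math.sqrt(s2 / n)           # margin ≈ 0.50
lo = max(0.0, xbar - m)
hi = min(1.0, xbar + m)             # raw upper bound ≈ 1.20 -> clip to 1
print(round(lo, 3), hi)
```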
Q2(d) Hypothesis test: drug better than 0.3 at α=0.05 (one-sided)
Steps:
Set hypotheses:
H0: θD = 0.3
H1: θD > 0.3 (right-tailed)
Compute the test statistic using the estimated SD (Normal approx):
z = (θ̂ − 0.3)/(S/√n)
Plug in: θ̂=0.7, S²≈0.233, n=10
z = 0.4 / √(0.233/10) ≈ 2.62
Compute p-value: p = P(Z > z) = 1 − Φ(z)
Compare to α=0.05:
if p < 0.05, reject H0; else fail to reject.
Write the conclusion in words.
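The full test, carried out numerically:

```python
import math

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

theta_hat, theta0, n = 0.7, 0.3, 10
s = math.sqrt(2.1 / 9)                          # S from part (b)
z = (theta_hat - theta0) / (s / math.sqrt(n))   # ≈ 2.62
p_value = 1 - phi(z)                            # small right-tail probability
reject = p_value < 0.05
print(round(z, 2), round(p_value, 4), reject)
```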
Q3 Gaussian estimation (MLE + MAP with Exponential prior)
Given Xi i.i.d. N(θ,1)
Q3(a) MLE for θ
Steps:
Write likelihood: L(θ)=∏ exp(−(xi−θ)²/2) (ignoring constants)
Take log: ℓ(θ)= Σ [ −(xi−θ)²/2 ]
Differentiate wrt θ:
dℓ/dθ = Σ (xi − θ)
Set to zero: Σ(xi−θ)=0 → Σxi − Nθ = 0
Solve: θ̂ML = (1/N)Σxi = x̄
Q3(b) MAP when Θ~Exponential(1), θ≥0
Steps:
Prior: fΘ(θ)=e^{−θ}, θ≥0
Log prior: log fΘ(θ)=−θ (plus constant)
Log posterior: log L + log prior
= [−1/2 Σ(xi−θ)²] − θ + constant
Differentiate:
d/dθ = Σ(xi−θ) − 1 = Σxi − Nθ − 1
Set to zero: Σxi − Nθ − 1 = 0 → θ = (Σxi − 1)/N = x̄ − 1/N
Enforce domain θ≥0:
θ̂MAP = max(0, x̄ − 1/N)
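The formula, including the clipping at 0, on two hypothetical data sets:

```python
# Hypothetical N(theta, 1) samples with positive mean: interior solution.
data = [0.2, -0.5, 1.3, 0.1]
N = len(data)
theta_map = max(0.0, sum(data) / N - 1 / N)     # xbar - 1/N = 0.025

# Hypothetical samples with negative mean: the estimate clips to 0.
data2 = [-1.0, -0.2, -0.6, 0.0]
theta_map2 = max(0.0, sum(data2) / len(data2) - 1 / len(data2))
print(theta_map, theta_map2)
```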
Q4 Back to basics (autocorrelation + CLT probability)
Q4(a) A~Uniform[0,1], X(t)=sin(At). Find R_X(t1,t2)=E[X(t1)X(t2)]
Steps:
Write definition: R = E[ sin(At1) sin(At2) ]
Since A uniform on [0,1]:
E[g(A)] = ∫_0^1 g(a) da
So R = ∫_0^1 sin(at1) sin(at2) da
Use trig identity: 2 sin X sin Y = cos(X−Y) − cos(X+Y)
Then:
sin(at1) sin(at2) = 1/2 [ cos(a(t1−t2)) − cos(a(t1+t2)) ]
Integrate term-by-term:
∫_0^1 cos(aC) da = sin(C)/C (for C ≠ 0)
Apply to both terms:
∫_0^1 cos(a(t1−t2)) da = sin(t1−t2)/(t1−t2)
∫_0^1 cos(a(t1+t2)) da = sin(t1+t2)/(t1+t2)
Multiply by 1/2 and subtract:
R_X(t1,t2) = 1/2 [ sin(t1−t2)/(t1−t2) − sin(t1+t2)/(t1+t2) ]
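A Monte Carlo sketch at arbitrary time points (t1=2, t2=0.5 chosen for illustration) agrees with the closed form:

```python
import math
import random

random.seed(1)

t1, t2 = 2.0, 0.5
M = 400_000
acc = 0.0
for _ in range(M):
    a = random.random()                       # A ~ Uniform[0, 1]
    acc += math.sin(a * t1) * math.sin(a * t2)

mc_estimate = acc / M
theory = 0.5 * (math.sin(t1 - t2) / (t1 - t2)
                - math.sin(t1 + t2) / (t1 + t2))
print(round(mc_estimate, 3), round(theory, 3))
```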
Q4(b) Xi~Exponential(2), n=25, compute P(X̄ < 0.3) using CLT
Steps:
For Exponential(λ):
mean μ = 1/λ = 1/2 = 0.5
variance σ² = 1/λ² = 1/4 = 0.25
For the sample mean X̄:
μX̄ = 0.5
Var(X̄)=σ²/n = 0.25/25 = 0.01
SD(X̄)=√0.01=0.1
CLT approximation: X̄ ≈ Normal(0.5, 0.01)
Standardize:
z = (0.3 − 0.5)/0.1 = −2
Probability:
P(X̄<0.3)=Φ(−2)=0.0228 (≈2.3%)
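The CLT calculation, end to end:

```python
import math

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

lam, n = 2, 25
mu = 1 / lam                           # 0.5
sd_mean = math.sqrt(1 / lam**2 / n)    # 0.1
z = (0.3 - mu) / sd_mean               # -2.0
prob = phi(z)                          # ≈ 0.0228
print(z, round(prob, 4))
```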
