Statistical Hypothesis Testing and JMP Analysis

1. Background, Problem Statement & Goals

This section aligns with the 4M framework: Motivation → Method → Mechanics → Message.

A. Define the Business Problem

Clearly explain the question you are trying to answer. Examples:

  • How can we estimate home prices?
  • Which marketing channel generates the highest sales?
  • Does employee experience affect salary?

B. Identify the Response Variable (Dependent Variable)

The response variable is what you are predicting or explaining (e.g., Sales Revenue, Home Price, GPA, Customer Satisfaction). Explain why this variable matters and how understanding its variation helps business decisions.

C. Identify Explanatory Variables (Independent Variables)

These variables help explain changes in the response variable (e.g., Advertising budget, Years of experience, Square footage, Customer age).

2. Hypotheses

A statistical hypothesis is a claim about a population parameter.

A. Null Hypothesis (H₀)

Represents the default assumption or “no effect” (e.g., no relationship exists, mean sales are equal, regression slope equals zero).

B. Alternative Hypothesis (Hₐ)

Represents the claim you are seeking evidence for (e.g., a relationship exists, means are different, slope is positive).

C. Decide the Type of Test

  • One-Sided Test: Used when testing for “greater than” or “less than.”
  • Two-Sided Test: Used when checking for “any difference.”
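The choice between a one-sided and a two-sided test changes the p-value for the same test statistic. The sketch below assumes a large-sample z statistic, so the standard normal distribution applies; the function name and the value z = 1.8 are illustrative only.

```python
from statistics import NormalDist

def z_p_value(z_stat, alternative="two-sided"):
    """p-value for a z statistic under a standard normal null distribution."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z_stat)))   # both tails
    if alternative == "greater":
        return 1 - cdf(z_stat)              # upper tail only
    if alternative == "less":
        return cdf(z_stat)                  # lower tail only
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

z = 1.8                                  # illustrative test statistic
p_two = z_p_value(z)                     # about 0.072
p_greater = z_p_value(z, "greater")      # about 0.036
```

With this statistic, the one-sided test rejects H₀ at the 0.05 level while the two-sided test does not, which is why the direction must be decided before looking at the data.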

3. Hypothesis Writing Examples

A. Regression Example

Question: Does advertising spending increase sales?

  • H₀: β₁ = 0 (Advertising has no effect)
  • Hₐ: β₁ > 0 (Increasing advertising increases sales)

B. Two-Sample t-Test Example

Question: Do male and female employees earn different salaries?

  • H₀: μ₁ = μ₂ (Average salaries are equal)
  • Hₐ: μ₁ ≠ μ₂ (Average salaries are different)
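As a rough illustration of what sits behind this test, the sketch below computes a two-sample t statistic by hand. It assumes the unequal-variance (Welch) form; the salary figures and the function name are made up for the example.

```python
import math
import statistics

def welch_t(sample_1, sample_2):
    """Return (t statistic, approximate degrees of freedom) for H0: mu1 = mu2."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = statistics.variance(sample_1) / n1   # squared standard error, group 1
    v2 = statistics.variance(sample_2) / n2   # squared standard error, group 2
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical salaries in thousands of dollars
group_1 = [52, 58, 61, 55, 60, 57]
group_2 = [50, 54, 53, 49, 56, 52]
t_stat, df = welch_t(group_1, group_2)
```

A large |t| relative to the t distribution with the computed degrees of freedom is evidence against H₀.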

C. One-Sided t-Test Example

Question: Does the new training program improve productivity?

  • H₀: μ_new ≤ μ_old (The new program is not better)
  • Hₐ: μ_new > μ_old (The new program improves productivity)

D. ANOVA Example

Question: Do different marketing channels produce different average sales?

  • H₀: μ₁ = μ₂ = μ₃ (All group means are equal)
  • Hₐ: At least one mean differs
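The ANOVA F statistic compares variation between group means to variation within groups. A minimal sketch, using made-up sales figures for three channels (the function name is illustrative):

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical sales for three marketing channels
channels = [[10, 12, 11], [14, 15, 16], [20, 19, 21]]
f_stat, df_b, df_w = one_way_anova_f(channels)
```

For this toy data the group means (11, 15, 20) are far apart relative to the within-group spread, so F is large (about 61).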

E. Chi-Square Example

Question: Is purchase type related to customer income level?

  • H₀: Variables are independent
  • Hₐ: Variables are associated
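The chi-square statistic compares observed counts with the counts expected under independence. A minimal sketch with a made-up 2 × 3 table (rows: purchase type, columns: income level); the closed-form p-value used at the end is valid only for 2 degrees of freedom.

```python
import math

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical counts: rows = purchase type, columns = low / mid / high income
observed = [[30, 40, 30],
            [20, 40, 40]]
chi2, df = chi_square_independence(observed)

# Special case: with df = 2 the upper-tail probability is exp(-chi2 / 2)
p_value = math.exp(-chi2 / 2) if df == 2 else None
```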

F. Correlation Example

Question: Is there a linear relationship between study hours and exam scores?

  • H₀: ρ = 0 (No linear relationship)
  • Hₐ: ρ ≠ 0 (A linear relationship exists)
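The test of ρ = 0 is usually carried out with a t statistic on n − 2 degrees of freedom. The sketch below assumes a sample correlation of 0.60 from 27 students (both values invented for illustration) and takes the critical value from a standard t table.

```python
import math

def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and sample size n."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.60, 27                  # hypothetical sample correlation and sample size
t_stat = correlation_t_stat(r, n)
t_critical = 2.060               # t table: two-sided, alpha = 0.05, df = 25
reject_h0 = abs(t_stat) > t_critical
```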

4. JMP Data Preparation & Setup

  1. Import Data: Load your dataset into JMP.
  2. Review Variable Types: In Column Info, confirm that each column's modeling type is correct: Continuous (numerical), Nominal (unordered categories), or Ordinal (ranked categories).
  3. Clean Data: Check for missing values, duplicate rows, outliers, and incorrect entries.
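The cleaning checks in step 3 can be sketched outside JMP as follows. The rows, column names, and the 1.5 × IQR outlier rule are assumptions made for the example; quartile definitions differ between tools, so the fences may not match JMP exactly.

```python
import statistics

# Hypothetical home-price rows (price in thousands)
rows = [
    {"id": 1, "price": 250.0},
    {"id": 2, "price": None},     # missing value
    {"id": 3, "price": 245.0},
    {"id": 3, "price": 245.0},    # duplicate row
    {"id": 4, "price": 255.0},
    {"id": 5, "price": 260.0},
    {"id": 6, "price": 900.0},    # suspiciously large
    {"id": 7, "price": 248.0},
    {"id": 8, "price": 252.0},
]

# 1. Missing values
missing_ids = [r["id"] for r in rows if r["price"] is None]

# 2. Duplicate rows (same id and same price)
seen, duplicate_ids, unique_rows = set(), [], []
for r in rows:
    key = (r["id"], r["price"])
    if key in seen:
        duplicate_ids.append(r["id"])
    else:
        seen.add(key)
        unique_rows.append(r)

# 3. Outliers by the 1.5 * IQR rule on the cleaned prices
prices = [r["price"] for r in unique_rows if r["price"] is not None]
q1, _, q3 = statistics.quantiles(prices, n=4, method="inclusive")
iqr = q3 - q1
outliers = [p for p in prices if p < q1 - 1.5 * iqr or p > q3 + 1.5 * iqr]
```

Flagged values should be investigated, not deleted automatically; an outlier may be a data-entry error or a genuine observation.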

5. Descriptive Statistics

A. Continuous Variables

Use Analyze → Distribution to calculate Mean, Median, Standard Deviation, and Min/Max. Interpret center, spread, skewness, and outliers.
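For reference, the same summary statistics can be computed by hand. The sketch below uses invented values with one large observation to show how an outlier pulls the mean above the median, a common sign of right skew.

```python
import statistics

values = [12, 15, 15, 18, 20, 22, 95]   # hypothetical data; 95 is an outlier

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "std_dev": statistics.stdev(values),   # sample standard deviation
    "min": min(values),
    "max": max(values),
}

# Mean well above the median suggests a right-skewed distribution
right_skew_hint = summary["mean"] > summary["median"]
```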

B. Categorical Variables

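Use Analyze → Distribution to review frequency tables and percentages. To compare conditional distributions across two categorical variables, use Analyze → Fit Y by X, which produces a contingency table.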

6. Visualizations

  • Histogram: Used for continuous variables (Analyze → Distribution).
  • Boxplot: Shows median, quartiles, and outliers (Red triangle → Display Options → Box Plot).
  • Bar/Pie Charts: Used for categorical variables (Graph → Graph Builder).

7. Correlation & Scatterplots

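Use Analyze → Fit Y by X to plot the response against an explanatory variable. A correlation near +1 or -1 indicates a strong linear relationship; a value near 0 indicates a weak linear relationship, although a curved pattern may still be present, so always inspect the scatterplot.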
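Correlation measures linear association only. The sketch below, using made-up data, computes Pearson's r for a perfectly linear pattern and for a perfectly curved one to show why the scatterplot should always accompany the number.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # exact straight line
y_curved = [v ** 2 for v in x]      # exact parabola

r_linear = pearson_r(x, y_linear)   # essentially 1
r_curved = pearson_r(x, y_curved)   # essentially 0, despite a perfect relationship
```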

8. Regression Models

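  • Simple Linear Regression (SRM): ŷ = b₀ + b₁x. In Analyze → Fit Y by X, use Fit Line (red triangle menu) to find the equation, R², and the p-value for the slope.
  • Multiple Regression (MRM): ŷ = b₀ + b₁x₁ + b₂x₂ + … + bₖxₖ. Use Analyze → Fit Model to interpret partial slopes (the effect of each variable with the others held constant) and adjusted R².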
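To show where b₀, b₁, and R² come from in the simple regression case, here is a minimal least-squares sketch. The advertising and sales figures are invented, and the function name is illustrative.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx                  # slope
    b0 = mean_y - b1 * mean_x       # intercept
    fitted = [b0 + b1 * a for a in x]
    ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))   # unexplained variation
    ss_tot = sum((b - mean_y) ** 2 for b in y)              # total variation
    return b0, b1, 1 - ss_res / ss_tot

ad_spend = [1, 2, 3, 4, 5]      # hypothetical advertising budget
sales = [3, 5, 7, 9, 12]        # hypothetical sales
b0, b1, r_squared = fit_line(ad_spend, sales)
```

Here the fitted line is approximately ŷ = 0.6 + 2.2x: each additional unit of advertising is associated with about 2.2 more units of sales.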

9. Regression Assumptions

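  • Linearity: A plot of residuals against predicted values should show no curved pattern.
  • Independent Errors: Plot residuals in time or row order and look for patterns; for time-ordered data, use the Durbin-Watson statistic.
  • Homoscedasticity: Residual spread should remain roughly constant across predicted values (no fan shape).
  • Normal Residuals: Points in the Normal Quantile Plot of the residuals should fall close to the straight line.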
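The Durbin-Watson statistic used for the independence check can be computed directly from time-ordered residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The residuals below are invented to show both extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

trending = [-2, -1, 0, 1, 2]        # neighbors are similar: positive autocorrelation
alternating = [1, -1, 1, -1, 1]     # neighbors flip sign: negative autocorrelation

dw_trending = durbin_watson(trending)         # 0.4
dw_alternating = durbin_watson(alternating)   # 3.2
```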

10. Final Conclusion Rule

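Always report the p-value and state whether you reject or fail to reject H₀ at the chosen significance level α (commonly 0.05). Decision Rule: Reject H₀ if p < α; fail to reject if p ≥ α. Note: We never say “accept H₀,” because failing to find evidence against H₀ does not prove that it is true.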
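The decision rule can be written as a small helper. This is only a sketch; the function name is illustrative, and 0.05 is used as the default significance level.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule at significance level alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    # Strict inequality: a p-value exactly equal to alpha does not reject
    return "reject H0" if p_value < alpha else "fail to reject H0"

decide(0.03)   # "reject H0"
decide(0.05)   # "fail to reject H0"
decide(0.20)   # "fail to reject H0"
```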

11. Advanced Decision Methods

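A. Critical Value Method

Compare the test statistic to the critical value for the chosen significance level. If the statistic falls in the rejection region, reject H₀.

B. Confidence Interval Method

For a two-sided test, if the hypothesized value (e.g., 0 for a slope) is NOT inside the confidence interval at the matching confidence level (e.g., 95% for α = 0.05), reject H₀.

C. Visual Method

Sketch the sampling distribution under H₀ and shade the rejection region in the tail(s). If the test statistic falls in the shaded area, reject H₀.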
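For a two-sided test, the critical value method and the confidence interval method lead to the same decision. The sketch below checks both for a regression slope; the slope, its standard error, and the function name are hypothetical, and the critical value is taken from a standard t table.

```python
def slope_decisions(b1, se_b1, t_critical, hypothesized=0.0):
    """Compare the critical value and confidence interval methods for a slope."""
    t_stat = (b1 - hypothesized) / se_b1
    # Critical value method: is the statistic in the rejection region?
    reject_by_critical_value = abs(t_stat) > t_critical
    # Confidence interval method: is the hypothesized value outside the interval?
    ci = (b1 - t_critical * se_b1, b1 + t_critical * se_b1)
    reject_by_interval = not (ci[0] <= hypothesized <= ci[1])
    return t_stat, ci, reject_by_critical_value, reject_by_interval

t_critical = 3.182   # t table: two-sided, alpha = 0.05, df = 3

# Hypothetical slope estimates with the same standard error
strong = slope_decisions(b1=2.2, se_b1=0.1155, t_critical=t_critical)
weak = slope_decisions(b1=0.2, se_b1=0.1155, t_critical=t_critical)
```

For the first slope, the 95% interval lies entirely above 0 and |t| exceeds the critical value, so both methods reject H₀. For the second, the interval contains 0 and |t| is below the critical value, so both fail to reject.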