Statistics Essentials: Mean, Regression, Events & Sampling

Posted on Jan 8, 2026 in Statistics

Measures of Central Tendency

Explain measures of central tendency.

Mean: The average value, calculated by summing all values and dividing by the number of observations.
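Median: The middle value when the data are arranged in order (the average of the two middle values when the number of observations is even); it is less sensitive to extreme values than the mean, which makes it useful for skewed distributions.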
Mode: The most frequently occurring value in the dataset.
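
As a rough illustration, the three measures can be computed with Python's standard statistics module. The list below is a made-up sample (not data from this post) with one large value, chosen to show how an outlier pulls the mean away from the median.

```python
import statistics

# Hypothetical sample, invented for illustration; 40 is a deliberate outlier.
values = [2, 3, 3, 5, 7, 10, 40]

mean_value = statistics.mean(values)      # sum of values / number of values
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequently occurring value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Here the mean (10) is twice the median (5) because of the single large value, which is the situation where the median tends to describe a "typical" observation better.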

Regression and Regression Equations

Describe regression and types of regression equations

Regression models the relationship between a dependent variable (y) and one or more independent variables (x). It helps predict the value of y based on the values of x.

Types of Regression Equations:

Simple Linear Regression: y = a + b x
- y: dependent variable
- x: independent variable
- a: intercept
- b: slope
Multiple Linear Regression: y = a + b1 x1 + b2 x2 + … + bn xn
- y: dependent variable
- x1, x2, …, xn: independent variables
- a: intercept
- b1, b2, …, bn: coefficients
Polynomial Regression: y = a + b1 x + b2 x^2 + … + bn x^n
- y: dependent variable
- x: independent variable
- a: intercept
- b1, b2, …, bn: coefficients
Logistic Regression: p = 1 / (1 + e^(-z))
- p: probability of an event (always between 0 and 1)
- z: linear combination of independent variables, for example z = a + b1 x1 + b2 x2 + … + bn xn
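
A minimal sketch of two of the equations above, assuming ordinary least squares as the fitting method (a common choice, though the notes here do not name one). The function names fit_simple_linear and logistic, and the data points, are invented for illustration.

```python
import math

def fit_simple_linear(xs, ys):
    """Estimate (a, b) in y = a + b x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b

def logistic(z):
    """Map a linear combination z to a probability p = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_simple_linear(xs, ys)

print("intercept a:", a)
print("slope b:", b)
print("prediction at x = 6:", a + b * 6)
print("logistic(0):", logistic(0))
```

With real, noisy data the fitted line would not pass through every point; the same formulas then give the line that minimizes the sum of squared vertical distances.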

Events and Their Types

Explain event and its types

Simple Event: A single outcome, for example rolling a 6 on a die.
Compound Event: Two or more outcomes combined, for example rolling an even number (2, 4, or 6) on a die.
Certain Event: Guaranteed to happen, for example any roll of a standard die results in 1–6.
Impossible Event: Cannot happen, for example rolling a 7 on a standard six-sided die.
Independent Event: Not influenced by other events, for example flipping a coin twice where outcomes do not affect each other.
Dependent Event: Affected by previous events, for example drawing cards without replacement.
Mutually Exclusive Events: Cannot occur together, for example heads or tails in a single coin flip.
Exhaustive Events: Cover all possible outcomes, for example heads or tails in a coin flip.
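
The event types above can be checked by treating each event as a set of outcomes. This is only a sketch under the assumption that all outcomes are equally likely (a fair die, a fair coin); the helper name probability is made up for the example.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

die = {1, 2, 3, 4, 5, 6}          # sample space of one roll
rolling_six = {6}                 # simple event
even = {2, 4, 6}                  # compound event
odd = {1, 3, 5}
any_face = set(die)               # certain event
rolling_seven = {7}               # impossible event

print(probability(rolling_six, die))
print(probability(even, die))
print(probability(any_face, die))
print(probability(rolling_seven, die))

# Mutually exclusive: the events share no outcome.
mutually_exclusive = even.isdisjoint(odd)
# Exhaustive: together the events cover the whole sample space.
exhaustive = (even | odd) == die

# Independence for two coin flips: P(A and B) equals P(A) * P(B).
two_flips = set(product("HT", repeat=2))
first_heads = {flip for flip in two_flips if flip[0] == "H"}
second_heads = {flip for flip in two_flips if flip[1] == "H"}
independent = (
    probability(first_heads & second_heads, two_flips)
    == probability(first_heads, two_flips) * probability(second_heads, two_flips)
)
print(mutually_exclusive, exhaustive, independent)
```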

Simple Bar Diagram

Explain simple bar diagram

A simple bar diagram, also known as a bar chart or bar graph, is a chart that presents categorical data with rectangular bars. The bars can be horizontal or vertical, and their lengths are proportional to the values they represent.

Key components:

Bars: The rectangular blocks that represent the data values.
X-axis: The horizontal axis that displays the categories or labels.
Y-axis: The vertical axis that displays the scale or values.

How it works:

Choose categories (e.g., months, products, cities).
Assign a value to each category (e.g., sales, temperature, population).
Draw a bar for each category, with the length proportional to the value.
Use the x-axis for categories and the y-axis for values.
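
The steps above can be sketched without any plotting library by printing a text chart. This version draws horizontal bars (categories down the side, values along the bar) purely because that is easier in a terminal; the function name text_bar_chart and the sales figures are invented, and the sketch assumes all values are positive.

```python
def text_bar_chart(data, width=40, symbol="#"):
    """Return the lines of a horizontal bar chart.

    Each bar's length is proportional to its value, scaled so that the
    largest value fills `width` characters. Assumes positive values.
    """
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar_length = round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {symbol * bar_length} {value}")
    return lines

# Hypothetical categories (months) and values (units sold).
monthly_sales = {"Jan": 120, "Feb": 90, "Mar": 150}
chart = text_bar_chart(monthly_sales, width=30)
print("\n".join(chart))
```

A plotting library would handle axes and scaling automatically; the point here is only that bar length is the value divided by a common scale.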

Scatter Diagram (Scatter Plot)

Explain scatter diagram

A scatter diagram, also known as a scatter plot, is a graph that shows the relationship between two variables. It is a visual representation of how two sets of data relate to each other.

Key components:

X-axis: The horizontal axis represents one variable (e.g., temperature).
Y-axis: The vertical axis represents the other variable (e.g., ice cream sales).
Points: Each point on the graph represents a pair of data values (e.g., temperature and ice cream sales for a particular day).

How it works:

Collect data on two variables (e.g., temperature and ice cream sales).
Plot each data point on the graph, with the x-axis value and y-axis value determining its position.
Look for patterns, trends, or correlations in the data.

Types of relationships:

Positive correlation: As one variable increases, the other tends to increase (e.g., temperature and ice cream sales).
Negative correlation: As one variable increases, the other tends to decrease (e.g., temperature and winter coat sales).
No correlation: No apparent relationship between the variables.
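
One common way to put a number on these relationships is the Pearson correlation coefficient r, which ranges from -1 to 1. The notes above do not name a specific measure, so treat this as one possible choice; the function name pearson_r and all data values are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

# Hypothetical daily observations.
temperature = [18, 22, 25, 29, 33]
ice_cream_sales = [110, 150, 170, 220, 260]
coat_sales = [90, 70, 55, 30, 10]

r_ice_cream = pearson_r(temperature, ice_cream_sales)  # close to +1
r_coats = pearson_r(temperature, coat_sales)           # close to -1

print("temperature vs ice cream:", round(r_ice_cream, 3))
print("temperature vs coats:", round(r_coats, 3))
```

A value near 0 would correspond to the "no correlation" case, although r only captures straight-line relationships, so a scatter plot is still worth looking at.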

Data in Brief

Describe data in brief

Data refers to the facts and figures collected for analysis and interpretation.

Types of Data:

Quantitative Data: Numerical values (e.g., height, weight).
Qualitative Data: Descriptive information (e.g., colors, opinions).

Data Characteristics:

Discrete: Quantitative values that can be counted in whole units (e.g., number of students).
Continuous: Quantitative values that are measured and can take any value within a range (e.g., height, temperature).

Data Sources:

Primary Data: Original data collected firsthand.
Secondary Data: Existing data from external sources.

Data Analysis:

Descriptive Statistics: Summarizing and describing data.
Inferential Statistics: Drawing conclusions and making predictions about a population from sample data.

Sampling

Explain sampling

Sampling in statistics is like taking a representative slice of the whole pie to understand its flavor.

What is Sampling?

Sampling is the process of selecting a subset of individuals or data points from a larger population to make inferences about the whole population.

Reasons for Sampling:

Practicality: Studying the entire population is often impractical or impossible.
Cost-effective: Sampling reduces costs and time.
Accuracy: A well-designed sample can provide accurate estimates.

Types of Sampling:

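Simple Random Sampling: Every individual has an equal chance of being selected.
Stratified Sampling: Divide the population into subgroups (strata) that share a characteristic and sample from each.
Systematic Sampling: Select every nth individual from an ordered list, usually after a random starting point.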
Cluster Sampling: Divide the population into clusters, randomly select some of the clusters, and include all (or a sample of) the individuals in the selected clusters.
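
The four approaches can be sketched with Python's standard random module. Everything below is illustrative: the population is just the numbers 1 to 100, the strata and clusters are arbitrary splits of it, and the cluster version keeps every member of the chosen clusters (one-stage cluster sampling), which is only one of several variants.

```python
import random

def simple_random_sample(population, k, rng):
    """Pick k individuals, each equally likely to be chosen."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick every `step`-th individual after a random starting point."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Pick k individuals at random from every stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep all of their members."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))
strata = {"first_half": population[:50], "second_half": population[50:]}
clusters = {f"group_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)

print(len(srs), len(systematic), len(stratified), len(clustered))
```

Note the contrast between the last two: stratified sampling takes a few individuals from every subgroup, while cluster sampling takes everyone from a few subgroups.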

Sampling Methods:

Probability Sampling: Random selection, allowing for statistical inference.
Non-Probability Sampling: Non-random selection, often used for exploratory research.

Key Concepts:

Sample Size: The number of individuals in the sample.
Sampling Frame: The list of individuals from which the sample is drawn.
Sampling Error: The difference between the sample estimate and the true population value.
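
To make the key concepts concrete, here is a small simulation. It assumes a made-up population of 10,000 heights drawn from a normal distribution (the list itself plays the role of the sampling frame) and measures sampling error as the sample mean minus the population mean. The function name sampling_error is hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in centimetres.
population = [rng.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_error(sample_size):
    """Sample mean minus the true population mean for one random sample."""
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample) - population_mean

for size in (10, 100, 1000):
    print(f"sample size {size}: error {sampling_error(size):+.3f}")

# Sampling the whole population leaves no sampling error.
full_census_error = sampling_error(len(population))
print("full population error:", full_census_error)
```

Individual runs vary, but larger samples tend to produce smaller errors on average, which is why sample size matters when planning a study.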