Statistical Foundations for Data Analysis
PPDAC Cycle: Data Problem-Solving
Problem: Clearly define your research question.
Plan: Choose a sampling method and variables.
Data: Collect and clean data (e.g., remove errors, handle missing values).
Analysis: Use EDA (plots & statistics) and model relationships (e.g., regression).
Conclusion: Answer your research question. Be cautious about generalizing!
Essential Sampling Methods
| Method | Description | Pros | Cons |
|---|---|---|---|
| Simple Random | Each unit has equal chance (like a lucky draw) | Unbiased | May need full list of population |
| Systematic | Pick |
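As a minimal sketch of the simple random method in the table, the snippet below draws a sample in which every unit has the same chance of selection. The population list, the sample size, and the helper name `simple_random_sample` are hypothetical and chosen only for illustration. The code needs the full list of the population up front, which is the drawback the table mentions.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units, each with an equal chance of being picked."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)  # sampling without replacement


# Hypothetical population: units numbered 1 to 100 (assumed for illustration).
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```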
Data Analysis & Measurement in Psychology: Scientific Method Foundations
Data Analysis and Measurement in Psychology: The Scientific Method
The scientific method aims to conduct research through procedures that are systematic (following established steps) and verifiable (producing data that any researcher can replicate or refute). However, the scientific method is just one component of the scientific research process, which consists of three levels (Arnaud):
Theoretical and Conceptual Level
1. Defining the problem and hypotheses
2. Deduction of testable predictions
Theoretical-
Data Analysis Fundamentals: Central Tendency & Variability
Descriptive Statistics: Central Tendency & Dispersion
Measures of Central Tendency
Understanding the Mean
The mean of the weights is the sum of all the weights in the table divided by the number of weights.
Remarks on the Mean
- Very easy to compute.
- Takes into consideration all values in the dataset.
- Highly sensitive to extreme values among the data (outliers).
There are some variations of the mean (harmonic mean, geometric mean…) which we will not study in this course.
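The remark about outliers can be checked with a short sketch. The weights below are hypothetical values (the original table is not reproduced here), and the single extreme value of 150 is an assumption added to show how far one outlier can pull the mean.

```python
from statistics import mean

# Hypothetical weights in kg (assumed for illustration only).
weights = [60, 62, 65, 68, 70]
mean_without_outlier = mean(weights)  # (60 + 62 + 65 + 68 + 70) / 5

# Add one extreme value and recompute the mean.
weights_with_outlier = weights + [150]
mean_with_outlier = mean(weights_with_outlier)  # 475 / 6

print(mean_without_outlier, mean_with_outlier)
```

Here a single extreme value moves the mean from 65 to roughly 79, even though five of the six weights are 70 or below.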
Understanding the Median
The median is the number in the middle
Mastering Data Analytics Fundamentals: Concepts & Excel Techniques
Descriptive Analytics Fundamentals
Descriptive analytics helps us understand what has happened using past data.
Key Use Cases for Descriptive Analytics
- Sales trends analysis
- Customer behavior patterns
- Web traffic analysis
The Data Science Process
- Define the Problem: Clearly articulate the question to be answered.
- Data Collection:
  - Primary: Gather new data (e.g., surveys, experiments).
  - Secondary: Utilize existing data (e.g., public databases, internal records).
- Data Cleaning: Address missing or outlier data,
Mastering Statistics: Variables, Spread, and Data Insights
Understanding Data Types and Variables
1. Identifying True Statements about Variables
Select all the true statements:
- a. Classification of children in a daycare center (infant, toddler, preschool) is a categorical variable. (This variable has labels, and each child has one of those labels.)
- b. Eye color is a discrete variable. (Incorrect: Eye color is a categorical variable.)
- c. Number of bicycles sold by a large sporting goods store is a continuous variable. (Incorrect: This is a discrete variable,
Essential Statistical Concepts and Tests
Simple Linear Regression
Purpose: Predict a numerical outcome (dependent variable Y) from a numerical predictor (independent variable X).
Equation: Y = a + bX
a (intercept): Predicted Y when X = 0
b (slope): For each 1-unit increase in X, predicted Y changes by b units (an increase if b is positive, a decrease if b is negative).
Example: Income = 20000 + 3000 × YearsOfEducation → Each extra year of education predicts $3,000 more income.
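The intercept and slope above can be estimated from data. The sketch below assumes ordinary least squares as the fitting method, which the text does not spell out, and uses hypothetical data points that lie exactly on the example line. The helper name `fit_line` is likewise an assumption.

```python
def fit_line(xs, ys):
    """Return (a, b) for Y = a + bX using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data generated from Income = 20000 + 3000 * YearsOfEducation.
years = [8, 10, 12, 14, 16]
income = [20000 + 3000 * y for y in years]

a, b = fit_line(years, income)
predicted_income = a + b * 12  # predicted income for 12 years of education
print(a, b, predicted_income)
```

Because the hypothetical points sit exactly on the line, the fit recovers the intercept 20000 and the slope 3000. Real data would scatter around the line and give estimates that only approximate the underlying relationship.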
R² (Coefficient of Determination): The proportion of the variation in Y that is explained by X. Ranges from 0 (none of the variation explained) to 1 (all of it explained).
Interpretation:
