Key Statistical Concepts: Non-Parametric Tests and Time Series Analysis
Non-Parametric Methods
Non-parametric methods are statistical tests that do not assume the data follows a specific distribution, such as the normal distribution. They are often used when the assumptions of parametric tests are violated.
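- Mann-Whitney U Test (Wilcoxon Rank-Sum Test): Used to compare two independent groups or samples to determine whether values in one group tend to be larger than in the other. When the two distributions have the same shape, this amounts to a comparison of medians. It is a non-parametric alternative to the independent samples t-test.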
- Wilcoxon Signed-Rank Test: Used to compare the medians
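As a minimal sketch of how the Mann-Whitney U statistic is built from ranks, the following pools the two samples, ranks them (ties receive the average rank), and derives U for each group. The function names and the sample values are illustrative assumptions, not taken from the original notes, and no p-value is computed here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples."""
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    # U_a counts pairs (a, b) with a > b, with ties counted as one half.
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical data: every value in group_a is below every value in group_b.
group_a = [1.1, 2.3, 2.9, 3.4]
group_b = [3.8, 4.1, 5.0, 5.6]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

A U value near 0 for one group (and near n_a × n_b for the other) indicates almost complete separation of the two samples; values near n_a × n_b / 2 indicate heavy overlap.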
Statistical Fundamentals and Key Concepts Reference
Hypothesis Testing and P-Values
P-Value Definition
The p-value is the probability of observing a test statistic at least as extreme as the one computed from the sample, assuming the null hypothesis (H₀) is true.
Interpretation
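- Large p-value: The data are consistent with H₀ (Null Hypothesis), so there is insufficient evidence to reject it. This is not proof that H₀ is true.
- Small p-value: Evidence against H₀, in favor of Hₐ (Alternative Hypothesis).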
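To make the definition concrete, here is a minimal sketch of an exact two-sided p-value for a coin-flip experiment under H₀: the coin is fair. The function name, the choice of a binomial test, and the example counts are assumptions made for illustration. "At least as extreme" is taken here to mean outcomes that are no more probable under H₀ than the observed one.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided p-value for a binomial count under H0: p = p_null."""
    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Sum the null probabilities of all outcomes no more likely than the
    # observed one; the small tolerance guards against floating-point ties.
    return sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9)
    )


# Hypothetical experiment: 9 heads in 10 flips of a supposedly fair coin.
p_value = binomial_two_sided_p(9, 10)
print(round(p_value, 4))  # 22/1024, about 0.0215
```

With 9 heads in 10 flips the p-value is small, so the data count against the fair-coin hypothesis; with 5 heads in 10 flips the same function returns 1.0, which shows only that the data are compatible with H₀.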
Types of Errors
- Type I Error (probability α): Rejecting H₀ when H₀ is true.
- Type II Error (probability β): Failing to reject H₀ when H₀ is false.
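The two error probabilities can be computed exactly for a simple decision rule. The sketch below assumes a hypothetical test of H₀: p = 0.5 on 10 coin flips that rejects H₀ when the number of heads is at most 1 or at least 9, and it evaluates β against one assumed alternative, p = 0.8. The rule, the alternative, and the function names are illustrative choices, not part of the original notes.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n, p_true, lower, upper):
    """P(count <= lower or count >= upper) when the true rate is p_true."""
    return sum(
        binom_pmf(k, n, p_true)
        for k in range(n + 1)
        if k <= lower or k >= upper
    )


n = 10
# Type I error rate: probability of rejecting H0 when H0 (p = 0.5) is true.
alpha = rejection_probability(n, 0.5, lower=1, upper=9)
# Type II error rate: probability of NOT rejecting H0 when p is really 0.8.
power = rejection_probability(n, 0.8, lower=1, upper=9)
beta = 1 - power
print(alpha, beta)
```

Widening the rejection region lowers β but raises α, which is the usual trade-off between the two error types for a fixed sample size.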
Study Design Fundamentals
- Sample:
Python Fundamentals Quiz: 99 Essential Concepts
Section 1: Basics, Variables, and Data Types (Q1–Q16)
Q1 Who created Python? Yeongin Kim / Bill Gates / Justin Martin / Guido van Rossum
Q2 What does the Garbage Collector do? Hides memory / Deletes object permanently, frees memory / Renames object / Creates new object
Q3 What type of language is Python? Compiled / Dismissed / Interpreted / Associated
Q4 A Syntax Error occurs when misusing: Keywords / Parentheses / Punctuation / One of the above
Q5 A Logical Error results in: Unexpected output / Crash
Statistical Forecasting Methods and Time Series Analysis
Regression Analysis: Modeling Relationships
Regression analysis is a statistical technique used to model and analyze the relationship between a dependent variable (outcome) and one or more independent variables (predictors). It helps in:
- Understanding relationships between variables.
- Making predictions based on past data.
- Identifying key factors influencing an outcome.
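For the simplest case of one predictor, the prediction use described above can be sketched with an ordinary least squares fit. The function name and the data are made up for illustration (the points lie exactly on y = 1 + 2x so the result is easy to check); this is a sketch of the standard closed-form estimates, not a full regression workflow.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical past data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
intercept, slope = fit_simple_linear_regression(x, y)

# Predict the outcome for a new predictor value.
predicted = intercept + slope * 6
print(intercept, slope, predicted)  # 1.0 2.0 13.0
```

The slope estimates how much the dependent variable changes per unit change in the predictor, which is how a fitted model helps identify influential factors.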
Types of Regression
A. Linear Regression
Linear Regression models a relationship between the dependent variable (Y) and independent variable(
Essential Statistical Concepts: Data Analysis and Modeling
Statistics: techniques for collecting, organizing, analysing, and interpreting data.
Data may be:
- quantitative (values expressed numerically)
- qualitative (characteristics being tabulated)
Descriptive statistics: techniques that summarize and describe numerical data for easier interpretation. They can be graphical or involve computational analysis.
Inferential statistics: techniques by which decisions about a statistical population or process are made based only on an observed sample, using probability concepts.
VARIABLES:
Time Series Analysis and Regression Modeling in R
R Setup and Initial Data Handling
Setting the working directory:
setwd("/Users/hajdumarcell/Downloads/Öko. II. R Jegyzet")
Data inspection and preparation:
str(Titanic)
PS4$Date <- as.Date(PS4$Date)
Basic visualization using ggplot2:
library(ggplot2)
ggplot(PS4, aes(x=Date, y=Google_PS4)) + geom_line()
Regression Modeling with Dummy Variables
The general regression model structure, including trend ($t$) and quarterly dummy variables ($DQ$):
$$Y_t = \beta_0 + \beta_1 \times t + \beta_2 \times DQ_1 + \beta_3 \times DQ_