Statistics and ggplot2: Quick Reference
Central Tendency
- Mean: The average of values, affected by outliers.
- Formula: \(\bar{x} = \frac{\Sigma x_i}{n}\)
- Median: The middle value, robust to outliers.
- Mode: The most frequent value in a dataset.
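The three measures above can be compared on a small dataset. The following is a minimal sketch using Python's standard `statistics` module; the values are hypothetical and include one outlier to show how it pulls the mean but not the median.

```python
import statistics

# Hypothetical data: the value 100 acts as an outlier.
values = [2, 3, 3, 5, 7, 100]

mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # average of the two middle values when n is even
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)
```

Here the outlier lifts the mean well above most of the data, while the median stays near the center of the remaining values.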
Variability Metrics
- Range: \(\text{Max} - \text{Min}\)
- Population Variance: \(\sigma^2 = \frac{\Sigma (x_i - \mu)^2}{N}\)
- Sample Variance: \(s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n-1}\) (Bessel’s correction).
- Standard Deviation (SD): The square root of variance.
- Formula: \(\sigma = \sqrt{\sigma^2}\) (population), \(s = \sqrt{s^2}\) (sample)
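As a rough illustration of the variability metrics above, the sketch below computes the range, both variances, and the sample standard deviation by hand, and can be cross-checked against the standard library. The data values are hypothetical.

```python
import math
import statistics

data = [4, 8, 6, 2]  # hypothetical values

n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

value_range = max(data) - min(data)
population_variance = squared_deviations / n    # divide by N
sample_variance = squared_deviations / (n - 1)  # divide by n - 1 (Bessel's correction)
sample_sd = math.sqrt(sample_variance)          # SD is the square root of variance

# The standard library offers the same quantities directly.
print(statistics.pvariance(data), statistics.variance(data), statistics.stdev(data))
```

Dividing by n - 1 instead of N always gives the larger of the two variances for the same data.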
Statistical Inference: Z-Distribution, T-Distribution, and Regression
Chapter 6: Standard Error (SE)
The standard error (SE) is the standard deviation of the sampling distribution of a statistic. It measures the precision of the sample statistic as an estimate of the population parameter. The z-distribution is the standard normal distribution, with a mean of 0 and a standard deviation of 1. It is used to test hypotheses about a single population mean or proportion when σ is known. The t-distribution is a family of distributions that are similar to the normal distribution
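To make the standard error and the z-distribution concrete, here is a minimal sketch of a one-sample z-test when σ is known. It assumes the usual formulas SE = σ/√n and z = (x̄ − μ₀)/SE, which the excerpt above does not spell out; all numbers are hypothetical.

```python
import math
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
sigma = 10.0         # known population standard deviation
n = 25               # sample size
sample_mean = 104.0  # observed sample mean
mu_0 = 100.0         # population mean under the null hypothesis

standard_error = sigma / math.sqrt(n)      # SE of the sample mean
z = (sample_mean - mu_0) / standard_error  # z statistic

# Two-sided p-value from the standard normal (mean 0, SD 1).
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(standard_error, z, p_two_sided)
```

A larger sample shrinks the standard error, so the same difference between the sample mean and μ₀ yields a larger z statistic.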
Vehicle Tax Analysis: Price, Age, and Regression Insights
1. Interpreting the Slope in the Simple Linear Regression Model
A 1% increase in price is associated with a 0.8% increase in taxes. Given that the increase is less than 1%, the vehicle tax is regressive, not progressive, meaning that more expensive cars pay proportionally less tax.
rate = exp(b1) * exp(0.8161 * log_price) = exp(b1) * (exp(log_price))^0.8161 = exp(b1) * (price)^0.8161
Hence, a 1% increase in price multiplies the rate by (1.01)^0.8161 ≈ 1.00815, that is, an increase
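The arithmetic in this log-log interpretation can be checked numerically. The sketch below uses the slope 0.8161 from the text; the 10% case is an extra illustration, not a figure from the original analysis.

```python
elasticity = 0.8161  # slope on log(price), taken from the text

# Factor by which the tax rate changes when price rises by 1%.
factor_1pct = 1.01 ** elasticity
pct_increase = (factor_1pct - 1) * 100

# Extra illustration: a 10% price rise also raises the tax by less than 10%.
factor_10pct = 1.10 ** elasticity

print(round(factor_1pct, 5), round(pct_increase, 3))
```

Because the exponent is below 1, the tax always grows by a smaller percentage than the price.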
Mastering Math: Exponents, Roots, Polynomials, and More
Powers
\(\text{Power} = \text{Base}^{\text{exponent}}\)
Properties of Powers
- 1st
- 2nd
- 3rd
- 4th
- 5th
- 6th
- 7th
- 8th
- 9th
Scientific Notation
A) A positive exponent M indicates the number of zeros to the right.
B) A negative exponent M indicates decimal places (tenths, hundredths, and so on). For a power of ten with a positive M, append as many zeros to the number as M indicates.
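In scientific notation a number is written as a coefficient times a power of ten, and the sign of the exponent decides which way the decimal point moves. Python's float literals use the same idea, so a brief sketch with hypothetical values can illustrate it:

```python
# Positive exponent: move the decimal point right, padding with zeros.
large = 3.2e4   # 3.2 x 10^4

# Negative exponent: move the decimal point left.
small = 3.2e-3  # 3.2 x 10^-3

# Format an ordinary number back into scientific notation.
formatted = f"{32000:.1e}"

print(large, small, formatted)
```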
Roots
A) Numeric values of a radical. If the radicand is a positive number and the index is odd, the solution is a unique positive root.
B) If the radicand is negative and the index is odd, the solution is a negative root.
C) If the radicand is positive and the index is even, there is
Hypothesis Theory: Types, Characteristics, and Formulation
Hypothesis Theory
A hypothesis is an assumption that establishes a relationship between two or more variables, expressed as facts, events, or factors. It must be tested to be accepted as valid.
Role of a Hypothesis
- Guides and directs an investigation.
- Assumptions should be deduced from the problem and aims of the study.
- Must be consistent with the theoretical framework.
- Determines the type of study and design methodology.
Characteristics of a Hypothesis
- Must refer to a real situation with a defined context.
Key Concepts in Statistics: Data Analysis and Metrics
Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data.
Population vs. Sample
- Population: All items of interest (e.g., all cars bought in Ontario last year). Note: It is often impossible to collect all these data points.
- Sample: Items randomly selected from the population (e.g., 1000 cars bought in Ontario last year).
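The distinction can be illustrated with a small simulation. The sketch below builds a synthetic population (the prices are invented, not real Ontario data), draws a simple random sample of 1,000 items, and compares the two means.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: synthetic prices for 10,000 cars.
population = [random.gauss(30000, 8000) for _ in range(10_000)]

# Simple random sample of 1,000 items, drawn without replacement.
sample = random.sample(population, 1000)

population_mean = statistics.mean(population)  # describes the whole population
sample_mean = statistics.mean(sample)          # describes only the sample

print(round(population_mean), round(sample_mean))
```

In practice only the sample is observed, and its mean serves as an estimate of the population mean.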
Parameter vs. Statistic
- Parameter: A numerical description of the population.
- Statistic: A numerical description