Statistics and ggplot2: Quick Reference


Central Tendency

  • Mean: The average of values, affected by outliers.
    • Formula: \(\bar{x} = \frac{\Sigma x_i}{n}\)
  • Median: The middle value, robust to outliers.
  • Mode: The most frequent value in a dataset.
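As a minimal sketch of the three measures above, the following uses Python's standard `statistics` module. The data values are hypothetical and chosen so that one outlier (95) pulls the mean well away from the median.

```python
import statistics

# Hypothetical data: the value 95 acts as an outlier.
values = [2, 3, 3, 5, 7, 95]

mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # middle value (average of the two middle values when n is even)
mode = statistics.mode(values)      # most frequent value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With these numbers the mean is far larger than the median, which illustrates why the median is described as robust to outliers.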

Variability Metrics

  • Range: \(\text{Max} - \text{Min}\)
  • Population Variance: \(\sigma^2 = \frac{\Sigma (x_i - \mu)^2}{N}\)
  • Sample Variance: \(s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n-1}\) (Bessel’s correction).
  • Standard Deviation (SD): The square root of variance.
    • Formula: \(\sigma = \sqrt{\sigma^2}\) (population), \(s = \sqrt{s^2}\) (sample)
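The variability metrics listed above can be sketched with Python's standard library. The data set is hypothetical; `statistics.pvariance` divides by N (population variance) and `statistics.variance` divides by n-1 (sample variance with Bessel's correction).

```python
import math
import statistics

# Hypothetical data set, used only for illustration.
data = [4, 8, 6, 5, 3, 7]

data_range = max(data) - min(data)    # Max - Min
pop_var = statistics.pvariance(data)  # sum of squared deviations divided by N
samp_var = statistics.variance(data)  # sum of squared deviations divided by n - 1
pop_sd = math.sqrt(pop_var)           # SD is the square root of the variance
samp_sd = math.sqrt(samp_var)

print("range:", data_range)
print("population variance:", pop_var, "SD:", pop_sd)
print("sample variance:", samp_var, "SD:", samp_sd)
```

The sample variance is always at least as large as the population variance computed on the same data, because the divisor n-1 is smaller than N.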

Statistical Inference: Z-Distribution, T-Distribution, and Regression

Chapter 6: Standard Error (SE)

The standard error (SE) is the standard deviation of the sampling distribution of a statistic. It measures the precision of the sample statistic as an estimate of the population parameter.

The z-distribution is the standard normal distribution, with a mean of 0 and a standard deviation of 1. It is used for testing hypotheses about a single population mean or proportion when σ is known.

The t-distribution is a family of distributions that are similar to the normal distribution
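A minimal sketch of these ideas for the sample mean, using only Python's standard library. It assumes the usual formula for the standard error of the mean (standard deviation divided by √n). The sample values, the hypothesized mean `mu0`, and the known `sigma` are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of n = 10 observations.
sample = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]
n = len(sample)

sample_mean = statistics.mean(sample)
s = statistics.stdev(sample)  # sample SD (n - 1 divisor)
se = s / math.sqrt(n)         # estimated standard error of the mean

# z-test for a single mean, assuming the population sigma is known.
mu0 = 50.0    # hypothesized population mean (assumption)
sigma = 4.0   # known population SD (assumption)
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Two-sided p-value from the standard normal (mean 0, SD 1).
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print("SE:", se)
print("z:", z)
print("two-sided p-value:", p_two_sided)
```

When σ is not known and is replaced by the sample SD, the test statistic is usually compared against a t-distribution instead; that requires a library beyond the standard one and is not shown here.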


Vehicle Tax Analysis: Price, Age, and Regression Insights

1. Interpreting the Slope in the Simple Linear Regression Model

A 1% increase in price is associated with a 0.8% increase in taxes. Given that the increase is less than 1%, the vehicle tax is regressive, not progressive, meaning that more expensive cars pay proportionally less tax.

rate = exp(b1) * exp(0.8161 * log_price) = exp(b1) * (exp(log_price))^0.8161 = exp(b1) * (price)^0.8161

Hence, a 1% increase in price multiplies the rate by (1.01)^0.8161 ≈ 1.00815, that is, an increase
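The arithmetic above can be checked with a short Python sketch. The slope 0.8161 comes from the text; the function name `rate_multiplier` is hypothetical and introduced only for illustration.

```python
SLOPE = 0.8161  # coefficient on log_price, taken from the text


def rate_multiplier(price_increase_pct: float, slope: float = SLOPE) -> float:
    """Factor by which the tax rate changes when price rises by the given percent.

    Follows from rate = exp(b1) * price**slope: the intercept term cancels
    when taking the ratio of the new rate to the old rate.
    """
    return (1 + price_increase_pct / 100) ** slope


one_pct = rate_multiplier(1.0)
doubling = rate_multiplier(100.0)

print(round(one_pct, 5))  # about 1.00815, i.e. roughly a 0.8% increase
print(doubling)           # less than 2: tax grows more slowly than price
```

Because the exponent is below 1, doubling the price less than doubles the tax, which is the sense in which the text calls the tax regressive.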


Mastering Math: Exponents, Roots, Polynomials, and More

Powers

Power = (Base)^exponent

Properties of Powers

  • 1st
  • 2nd
  • 3rd
  • 4th
  • 5th
  • 6th
  • 7th
  • 8th
  • 9th

Scientific Notation

A) A positive exponent M indicates the number of zeros to the right.

B) A negative exponent M indicates decimal places (tenths, hundredths, and so on). If 10 is raised to M and M is positive, add as many zeros to the number as M indicates.

Roots

A) Numeric values of a radical. If the radicand is a positive number, the solution is a unique positive root.

B) If the radicand is negative and the index is odd, the solution is a negative root.

C) With a positive radicand and an even index, there is


Hypothesis Theory: Types, Characteristics, and Formulation

Hypothesis Theory

A hypothesis is an assumption that establishes a relationship between two or more variables, expressed as facts, events, or factors. It must be tested to be accepted as valid.

Role of a Hypothesis

  • Guides and directs an investigation.
  • Assumptions should be deduced from the problem and aims of the study.
  • Must be consistent with the theoretical framework.
  • Determines the type of study and design methodology.

Characteristics of a Hypothesis

  • Must refer to a real situation with a defined context.

Key Concepts in Statistics: Data Analysis and Metrics

Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data.

Population vs. Sample

  • Population: All items of interest (e.g., all cars bought in Ontario last year). Note: It is often impossible to collect all these data points.
  • Sample: Items randomly selected from the population (e.g., 1000 cars bought in Ontario last year).
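To make the distinction concrete, here is a minimal sketch that draws a random sample of 1000 items from a synthetic population. The population itself (50,000 simulated car prices with an assumed mean and spread) is entirely hypothetical; in practice the full population is usually not available.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 50,000 simulated car prices (synthetic data).
population = [random.gauss(30_000, 8_000) for _ in range(50_000)]

# Sample: 1000 items selected at random, without replacement.
sample = random.sample(population, 1000)

pop_mean = statistics.mean(population)  # describes the whole population
sample_mean = statistics.mean(sample)   # describes only the sample

print("population mean:", pop_mean)
print("sample mean:", sample_mean)
```

The sample mean will generally be close to, but not equal to, the population mean; the gap shrinks as the sample size grows.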

Parameter vs. Statistic

  • Parameter: A numerical description of the population.
  • Statistic: A numerical description