Mastering Statistics: Variables, Spread, and Data Insights
Understanding Data Types and Variables
1. Identifying True Statements about Variables
Select all the true statements:
- a. Classification of children in a daycare center (infant, toddler, preschool) is a categorical variable. (This variable has labels, and each child has one of those labels.)
- b. Eye color is a discrete variable. (Incorrect: Eye color is a categorical variable.)
- c. Number of bicycles sold by a large sporting goods store is a continuous variable. (Incorrect: This is a discrete variable, as it involves counting distinct values.)
- d. Time it takes to mow a lawn is a numerical and continuous variable. (Time is measured rather than counted, so any positive value, including fractions, is possible.)
- e. Weight of a cat is a numerical variable. (Weight is measured, so any positive value is possible; it is also continuous.)
- f. Number of weekly movies watched on Netflix is a numerical and discrete variable. (This is a count, so it takes only distinct whole-number values.)
2. Inappropriate Summaries for Qualitative Variables
Select all that are not appropriate summaries for qualitative (categorical) variables:
- A. Bar Graph
- B. Pie Chart
- C. Pareto Chart
- D. Two-way (Pivot) Table
- E. Histogram (Histograms are used for numerical data.)
3. Selecting an Appropriate Standard Deviation
The first exam in this course will be graded on a scale from 0 to 100. Assume that the mean score on the exam is 80. Which of these values would make the most sense as the standard deviation of these exam scores? (Select one)
- a) 0 (Unlikely: a standard deviation of 0 would mean every student earned exactly the same grade, which is very rare.)
- b) 10 (Most reasonable: a typical grade would fall about 10 points below the mean (80 - 10 = 70) or above it (80 + 10 = 90), a plausible spread for exam scores.)
- c) 50 (Too large: a typical grade cannot sit 50 points above the mean, since 80 + 50 = 130 exceeds the maximum score of 100.)
- d) -5 (Standard deviation cannot be negative.)
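The reasoning above can be checked numerically. The sketch below relies on a known bound: for values restricted to an interval with a fixed mean, the population standard deviation is largest when every value sits at one of the two endpoints, which gives sqrt((mean - low) * (high - mean)). The function name and the list of scores are hypothetical and only serve as an illustration.

```python
import math
import statistics


def max_population_sd(mean, low, high):
    """Largest population SD possible for values in [low, high] with this mean."""
    return math.sqrt((mean - low) * (high - mean))


# Exam setting from the question: scores in 0..100, mean 80.
bound = max_population_sd(80, 0, 100)

# A candidate is feasible only if it is non-negative and within the bound.
candidates = [0, 10, 50, -5]
feasible = [s for s in candidates if 0 <= s <= bound]

# Hypothetical scores (not from the course) with mean 80, for comparison.
scores = [70, 90, 80, 65, 95, 80, 70, 90]
example_sd = statistics.pstdev(scores)

print(bound)       # upper limit on the population SD
print(feasible)    # candidates that are mathematically possible
print(example_sd)  # SD of the hypothetical scores
```

Under these assumptions, 50 and -5 are ruled out as impossible, while 0 is possible but implausible, which leaves 10 as the sensible choice.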
4. Identifying Highest Standard Deviation in Histograms
The following histograms show the results of 3 different samples, each with a sample size of 100. Which graph has the highest standard deviation? Explain why.
Standard deviation describes the average spread of data from the mean. All three graphs have the same mean, which is 5.
- Histogram a has some of the data away from the mean (somewhat consistent, so a moderate standard deviation).
- Histogram b has nearly all of the data very close to the mean (most consistent, so the lowest standard deviation).
- Histogram c has most of the data far from the mean (least consistent, so the highest standard deviation).
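Since the histograms themselves are not reproduced here, the comparison can be illustrated with made-up data. The three samples below are hypothetical: each has 100 values and a mean of 5, and they are shaped to mimic the descriptions above (a: moderately spread, b: tightly clustered, c: mostly far from the mean).

```python
import statistics

# Hypothetical samples of size 100, each symmetric around 5.
samples = {
    "a": [3] * 15 + [4] * 20 + [5] * 30 + [6] * 20 + [7] * 15,
    "b": [4] * 10 + [5] * 80 + [6] * 10,
    "c": [1] * 40 + [5] * 20 + [9] * 40,
}

means = {name: statistics.mean(values) for name, values in samples.items()}
sds = {name: statistics.stdev(values) for name, values in samples.items()}

# The sample whose values lie farthest from the mean has the largest SD.
highest = max(sds, key=sds.get)

for name in samples:
    print(name, means[name], round(sds[name], 2))
print("highest standard deviation:", highest)
```

With the means held equal, the ranking of the standard deviations depends only on how far the values sit from 5.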
5. Analyzing Pet Ownership Data
The following graph shows the number of people who own different pets, based on a random sample of people in a given city. Select all the true statements about this data.
- a) Most people own dogs. (Incorrect: The tallest bar, indicating the mode, is for cats, not dogs.)
- b) More people own goldfish than hamsters. (The bar for goldfish is taller than the bar for hamsters.)
- c) It is more appropriate to use a Pareto chart for this data. (Incorrect: A Pareto chart, which orders categories by frequency, could be used here, but it is not more appropriate; a standard bar chart suits this nominal data equally well.)
- d) The least frequent pet is the rabbit. (The shortest bar, and so the lowest frequency, is for rabbits.)
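To show what a Pareto ordering adds to a plain bar chart, here is a minimal sketch. The counts are hypothetical, since the graph's actual values are not given; they were chosen only to agree with the statements above (cats most frequent, goldfish ahead of hamsters, rabbits least frequent).

```python
# Hypothetical counts; the real values from the graph are not available.
pet_counts = {"Dogs": 30, "Cats": 35, "Goldfish": 15, "Hamsters": 12, "Rabbits": 8}
total = sum(pet_counts.values())

# A Pareto chart sorts categories from most to least frequent
# and tracks the cumulative percentage alongside the bars.
ordered = sorted(pet_counts.items(), key=lambda item: item[1], reverse=True)

pareto_rows = []
cumulative = 0
for pet, count in ordered:
    cumulative += count
    pareto_rows.append((pet, count, round(100 * cumulative / total, 1)))

mode_pet = ordered[0][0]    # category with the tallest bar
least_pet = ordered[-1][0]  # category with the shortest bar

for row in pareto_rows:
    print(row)
```

The same counts can be drawn as an ordinary bar chart in any category order, which is why neither chart is strictly "more appropriate" for nominal data.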
6. Comparing Stock Price Descriptive Statistics
The following Excel output shows the descriptive statistics of the stock prices for Nike and Google during the same six-month period:
Statistic | Nike | Google |
---|---|---|
Mean | 49.77 | 719.62 |
Standard Error | 1.51 | 19.58 |
Median | 48.39 | 717.31 |
Mode | #N/A | #N/A |
Standard Deviation | 3.70 | 47.96 |
Sample Variance | 13.68 | 2299.92 |
Kurtosis | -1.29 | 1.23 |
Skewness | 0.61 | 0.05 |
Range | 9.53 | 145.63 |
Minimum | 45.42 | 647.26 |
Maximum | 54.95 | 792.89 |
Sum | 298.6 | 4317.7 |
Count | 6 | 6 |
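Several rows of this output are linked by standard identities: mean = sum / count, standard error = standard deviation / sqrt(count), sample variance = standard deviation squared, and range = maximum - minimum. The sketch below re-derives those rows from the table values as a reading aid. The dictionary keys and the 1% tolerance (to absorb rounding in the printed figures) are assumptions made for this illustration.

```python
import math

# Values copied from the table above.
table = {
    "Nike": {"mean": 49.77, "se": 1.51, "sd": 3.70, "var": 13.68,
             "range": 9.53, "min": 45.42, "max": 54.95, "sum": 298.6, "count": 6},
    "Google": {"mean": 719.62, "se": 19.58, "sd": 47.96, "var": 2299.92,
               "range": 145.63, "min": 647.26, "max": 792.89, "sum": 4317.7, "count": 6},
}


def check_identities(s, rel_tol=0.01):
    """Return which identities hold, allowing for rounding in the printed output."""
    return {
        "mean = sum / count": math.isclose(s["mean"], s["sum"] / s["count"], rel_tol=rel_tol),
        "se = sd / sqrt(count)": math.isclose(s["se"], s["sd"] / math.sqrt(s["count"]), rel_tol=rel_tol),
        "var = sd ** 2": math.isclose(s["var"], s["sd"] ** 2, rel_tol=rel_tol),
        "range = max - min": math.isclose(s["range"], s["max"] - s["min"], rel_tol=rel_tol),
    }


results = {stock: check_identities(s) for stock, s in table.items()}

for stock, checks in results.items():
    for identity, ok in checks.items():
        print(stock, identity, ok)
```

Knowing these relationships makes it easier to judge statements about the output, because a single row such as Count or Standard Deviation can be cross-checked against the others.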
Select all the true statements, and correct or rewrite the false statements:
- a) There are 6 stock prices for Nike and 6 stock prices for Google in the sample. (Confirmed by the ‘Count’ statistic.)
- b) The average stock price for Nike is much smaller than the average stock price for Google. (Mean stock price for Nike is $49.77 and for Google is $719.62.)
- c) The average spread of stock prices from the mean is much higher for Google than Nike. (Standard deviation is higher for Google (47.96) than Nike (3.70).)
- d) Stock prices for Nike are more consistent than the stock prices for Google. (Incorrect: Consistency is often measured by the Coefficient of Variation (CV = Standard Deviation / Mean).
- CV for Nike = 3.70 / 49.77 ≈ 0.0743