Lesson Notes By Weeks and Term v5 - Grade 11

Statistics – Week 2 focus

Subject: Mathematics

Class: Grade 11

Term: Term 4

Week: 2

Theme: General lesson support

Performance objectives

Calculate range, IQR, and semi-IQR
Calculate variance and standard deviation for ungrouped data
Interpret the meaning of these measures
Compare the spread of different datasets

Lesson summary

This week we build on the foundational concepts from Week 1 and focus on measures of dispersion: range, interquartile range (IQR), semi-interquartile range, variance, and standard deviation. These measures let us describe not only the central tendency of a dataset (such as the mean) but also how spread out the data is, which is essential for recognising patterns and making informed decisions. Why does this matter to you as South African learners?

Lesson notes

2.1 Measures of Dispersion

Measures of dispersion, also known as measures of variability or spread, describe how scattered or clustered a set of data is. They tell us how much the individual data points deviate from the average (or central) value. A high dispersion indicates that the data points are widely spread out, while a low dispersion indicates that they are clustered closely around the average.

2.2 Range

The range is the simplest measure of dispersion. It is the difference between the highest and lowest values in a dataset.

Formula: Range = Maximum Value - Minimum Value

Example 1: Consider the following set of scores on a mathematics test: 55, 60, 72, 85, 90.

Range = 90 - 55 = 35

Interpretation: The scores are spread over a range of 35 marks.
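The range calculation from Example 1 can be sketched in a few lines of Python. The function name `data_range` and the extra low score of 5 are illustrative assumptions, not part of the lesson data; the second call simply shows how much a single extreme value can change the result.

```python
def data_range(values):
    """Return the range: maximum value minus minimum value."""
    if not values:
        raise ValueError("range is undefined for an empty dataset")
    return max(values) - min(values)


scores = [55, 60, 72, 85, 90]            # test scores from Example 1
print(data_range(scores))                # 35

# Hypothetical: one very low score (an outlier) joins the dataset.
scores_with_outlier = scores + [5]
print(data_range(scores_with_outlier))   # 85
```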

Limitations: The range is highly sensitive to outliers (extreme values) and doesn't consider the distribution of the data between the maximum and minimum values.

2.3 Interquartile Range (IQR) and Semi-Interquartile Range

To address the limitations of the range, we use the Interquartile Range (IQR). Quartiles divide a dataset into four equal parts. The first quartile (Q1) is the value below which 25% of the data falls, the second quartile (Q2) is the median (50%), and the third quartile (Q3) is the value below which 75% of the data falls.

Interquartile Range (IQR): The difference between the third quartile (Q3) and the first quartile (Q1). IQR = Q3 - Q1. The IQR represents the range of the middle 50% of the data.

Semi-Interquartile Range: Half of the interquartile range. Semi-IQR = (Q3 - Q1) / 2. It provides a more refined measure of spread, representing the average distance of the upper and lower quartiles from the median.
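The description of the semi-IQR as an average distance can be checked with a short derivation. Writing Q_2 for the median, the distances of the two quartiles from the median are Q_3 - Q_2 and Q_2 - Q_1, and their mean simplifies to the semi-IQR:

```latex
\[
\frac{(Q_3 - Q_2) + (Q_2 - Q_1)}{2} = \frac{Q_3 - Q_1}{2} = \text{Semi-IQR}
\]
```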

Finding Quartiles:

Ungrouped Data: Arrange the data in ascending order. Find the median (Q2). The median of the lower half of the data is Q1, and the median of the upper half is Q3. If there is an odd number of data points, do not include the median in the calculations of Q1 and Q3.

Grouped Data: Use cumulative frequency and interpolation. Q1 position ≈ (n+1)/4 and Q3 position ≈ 3(n+1)/4. Locate the class interval containing the required cumulative frequency and interpolate.
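The ungrouped-data procedure described above can be sketched in Python as follows. This is a minimal illustration, assuming the median-of-halves rule from these notes; the function name `quartiles` and the eight-value dataset are hypothetical and chosen only to show the even-count case.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    For an odd number of data points the median itself is left out
    of both halves, as described in the notes.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = data[:half]             # values below the median position
    upper = data[half + (n % 2):]   # skip the middle value when n is odd
    return median(lower), median(data), median(upper)


# Hypothetical dataset with an even number of values.
data = [3, 5, 7, 8, 12, 13, 14, 18]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
semi_iqr = iqr / 2
print(q1, q2, q3)      # 6.0 10.0 13.5
print(iqr, semi_iqr)   # 7.5 3.75
```

Calculators, spreadsheets, and statistics libraries may use other quartile conventions, so their results can differ slightly from this method on small datasets.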

Example 2: Consider the following data set of household incomes in a small rural community (in Rand thousands): 10, 12, 15, 18, 20, 22, 25, 28, 30, 35, 40

The data is already sorted. n = 11.

Q2 (Median) = 22
Lower half: 10, 12, 15, 18, 20. Q1 = 15
Upper half: 25, 28, 30, 35, 40. Q3 = 30
IQR = 30 - 15 = 15
Semi-IQR = 15/2 = 7.5

Interpretation: The middle 50% of household incomes are spread over a range of R15,000. The average distance of the upper and lower quartiles from the median is R7,500.

2.4 Variance and Standard Deviation

Variance and standard deviation are the most commonly used measures of dispersion because they consider all data points in the dataset.

Variance: The average of the squared differences between each data point and the mean. Squaring the differences ensures that both positive and negative deviations contribute positively to the measure.

Standard Deviation: The square root of the variance. It provides a measure of spread in the same units as the original data, making it easier to interpret.

Formulas for Ungrouped Data: Variance (Population): σ² = Σ(xi - μ)² / N, where xi is each data point, μ is the population mean, and N is the population size.

Variance (Sample): s² = Σ(xi - x̄)² / (n - 1), where xi is each data point, x̄ is the sample mean, and n is the sample size. We use (n-1) in the denominator for sample variance to provide an unbiased estimate of the population variance.

Standard Deviation (Population): σ = √(σ²)

Standard Deviation (Sample): s = √(s²)
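The two variance formulas differ only in the denominator, which a short Python sketch makes concrete. The function names and the `books` dataset are hypothetical, chosen for illustration; the last line cross-checks the hand-written functions against Python's built-in `statistics` module.

```python
import math
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    if len(values) < 2:
        raise ValueError("sample variance needs at least two data points")
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


# Hypothetical data: number of books read by six learners in a term.
books = [2, 4, 4, 5, 7, 8]

print(population_variance(books))              # 4.0
print(sample_variance(books))                  # 4.8
print(math.sqrt(population_variance(books)))   # 2.0
print(round(math.sqrt(sample_variance(books)), 2))   # 2.19

# Cross-check against the standard library (same two definitions).
print(statistics.pvariance(books), statistics.variance(books))
```

Because the sample formula divides by a smaller number, the sample variance is always a little larger than the population variance computed from the same values.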

Example 3: Consider the following sample of waiting times (in minutes) at a clinic: 5, 7, 9, 11, 13

Calculate the sample mean (x̄): x̄ = (5 + 7 + 9 + 11 + 13) / 5 = 45 / 5 = 9

Calculate the squared differences from the mean:
(5 - 9)² = 16
(7 - 9)² = 4
(9 - 9)² = 0
(11 - 9)² = 4
(13 - 9)² = 16

Calculate the sum of the squared differences: Σ(xi - x̄)² = 16 + 4 + 0 + 4 + 16 = 40

Calculate the sample variance (s²): s² = 40 / (5 - 1) = 40 / 4 = 10

Calculate the sample standard deviation (s): s = √10 ≈ 3.16

Interpretation: The waiting times at the clinic have a standard deviation of approximately 3.16 minutes. This means that waiting times typically differ from the mean (9 minutes) by about 3 minutes.

2.5 Interpreting Standard Deviation

A small standard deviation indicates that the data points are clustered closely around the mean, implying less variability. A large standard deviation indicates that the data points are spread out over a wider range, implying greater variability.

2.6 Comparing Datasets

When comparing the spread of two or more datasets, it's crucial to use the same measure of dispersion for all datasets. For instance, comparing the standard deviation of one dataset with the IQR of another is meaningless.
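As an illustration of comparing spread with a single measure, the sketch below applies the sample standard deviation to two hypothetical classes. The scores are invented for this example and are not taken from the lesson; both classes have the same mean, so only the spread tells them apart.

```python
import statistics

# Hypothetical test scores for two classes of five learners each.
class_a = [68, 70, 72, 74, 76]
class_b = [50, 60, 72, 84, 94]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)   # sample standard deviation (n - 1)
    print(f"{name}: mean = {mean}, standard deviation = {sd:.2f}")

# Class A: mean = 72, standard deviation = 3.16
# Class B: mean = 72, standard deviation = 17.72
```

Both classes average 72, but Class B's much larger standard deviation shows that its scores are far more spread out around that mean.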