Data handling: summarising and interpreting data – Week 3 focus
Subject: Mathematical Literacy
Class: Grade 11
Term: Term 4
Week: 3
Theme: General lesson support
This week, we delve deeper into summarising and interpreting data, a crucial skill for understanding the world around us, especially within the South African context. Data is everywhere – from crime statistics in your local newspaper to unemployment rates affecting your family and community. Understanding how to analyse and interpret this data allows us to make informed decisions, identify trends, and advocate for change. For instance, understanding the implications of electricity price increases (often presented as data) allows us to budget better or even challenge the price hikes using factual data.
Measures of Central Tendency: These values represent the "center" or typical value of a dataset.
Mean (Average): The sum of all the values divided by the number of values.
Formula (Ungrouped Data): Mean = (Sum of all values) / (Number of values)
Formula (Grouped Data): Mean ≈ (Sum of (Midpoint of each interval × Frequency of each interval)) / (Total Frequency)
Example (Ungrouped): The monthly water bills (in Rand) for 5 households in a township are: R150, R200, R180, R220, R170. The mean water bill is (R150 + R200 + R180 + R220 + R170) / 5 = R920 / 5 = R184.
Example (Grouped): The following table shows the number of learners in each class at a school:

| Class Size | Frequency (Number of Classes) |
| ---------- | ----------------------------- |
| 20-24 | 5 |
| 25-29 | 8 |
| 30-34 | 6 |
| 35-39 | 3 |

To estimate the mean class size:
Find the midpoint of each interval: (20+24)/2 = 22; (25+29)/2 = 27; (30+34)/2 = 32; (35+39)/2 = 37
Multiply each midpoint by its frequency: 22 × 5 = 110; 27 × 8 = 216; 32 × 6 = 192; 37 × 3 = 111
Sum these products: 110 + 216 + 192 + 111 = 629
Divide by the total frequency: 629 / (5+8+6+3) = 629 / 22 ≈ 28.6
Therefore, the estimated mean class size is approximately 28.6 learners.
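The two mean calculations above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the grouped result is an estimate because each class interval is represented by its midpoint.

```python
# Ungrouped mean: sum of the values divided by how many values there are.
water_bills = [150, 200, 180, 220, 170]  # Rand, from the water bill example
ungrouped_mean = sum(water_bills) / len(water_bills)

# Grouped mean (an estimate): each interval is represented by its midpoint.
# Each entry is (lower limit, upper limit, frequency), from the class size table.
class_sizes = [(20, 24, 5), (25, 29, 8), (30, 34, 6), (35, 39, 3)]

total_of_products = 0
total_frequency = 0
for lower, upper, frequency in class_sizes:
    midpoint = (lower + upper) / 2
    total_of_products += midpoint * frequency
    total_frequency += frequency

grouped_mean = total_of_products / total_frequency

print(f"Mean water bill: R{ungrouped_mean:.2f}")
print(f"Estimated mean class size: {grouped_mean:.1f} learners")
```

Running it should reproduce R184 for the water bills and roughly 28.6 learners for the class sizes.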
Median: The middle value when the data is arranged in ascending order. If there are an even number of values, the median is the average of the two middle values.
Example (Ungrouped, Odd Number of Values): The ages of 7 children are: 5, 7, 3, 9, 6, 4, 8. Arranging in ascending order: 3, 4, 5, 6, 7, 8, 9. The median age is 6.
Example (Ungrouped, Even Number of Values): The test scores of 6 students are: 70, 80, 65, 75, 90, 85. Arranging in ascending order: 65, 70, 75, 80, 85, 90. The median score is (75+80)/2 = 77.5.
Mode: The value that appears most frequently in the dataset. There can be more than one mode (bimodal, trimodal, etc.), or no mode at all.
Example: The number of loaves of bread bought daily at a local bakery over a week were: 20, 22, 20, 25, 20, 28, 22. The mode is 20 (appears 3 times).
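The median and mode rules above can be sketched in Python as follows. The function names `median` and `modes` are illustrative, and the sketch assumes the list is not empty. Returning an empty list when no value repeats is one way to represent "no mode at all".

```python
from collections import Counter

def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def modes(values):
    """All values that share the highest frequency; empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)

ages = [5, 7, 3, 9, 6, 4, 8]
scores = [70, 80, 65, 75, 90, 85]
loaves = [20, 22, 20, 25, 20, 28, 22]

print(median(ages))    # 6
print(median(scores))  # 77.5
print(modes(loaves))   # [20]
```

A bimodal data set such as 1, 1, 2, 2 would give `[1, 2]`.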
Measures of Spread (Dispersion): These values describe how spread out the data is.
Range: The difference between the highest and lowest values in the dataset.
Example: Using the ages of the 7 children from above (3, 4, 5, 6, 7, 8, 9), the range is 9 - 3 = 6 years.
Interquartile Range (IQR): The difference between the upper quartile (Q3) and the lower quartile (Q1). The quartiles divide the data into four equal parts. Q1 is the median of the lower half of the data, and Q3 is the median of the upper half of the data.
Steps: Order the data from least to greatest. Find the median (Q2). Find the median of the lower half of the data (Q1). Find the median of the upper half of the data (Q3). IQR = Q3 - Q1
Example: Consider the following data set representing the number of minutes spent on social media per day by 11 students: 20, 30, 40, 45, 50, 55, 60, 65, 70, 75, 80.
Ordered Data: 20, 30, 40, 45, 50, 55, 60, 65, 70, 75, 80
Median (Q2): 55
Lower Half: 20, 30, 40, 45, 50. Q1 = 40
Upper Half: 60, 65, 70, 75, 80. Q3 = 70
IQR = 70 - 40 = 30
Box and Whisker Plots: A visual representation of the data that displays the minimum value, Q1, median (Q2), Q3, and maximum value. It helps to visualize the distribution and spread of the data.
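The five values a box-and-whisker plot displays can be computed with the same median-of-halves steps used for the IQR. Below is a minimal Python sketch using the social media data. The function names are illustrative, and it assumes the middle value is left out of both halves when the number of values is odd, as in the worked example. Other textbooks and software may use slightly different quartile conventions.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]          # values before the median position
    upper_half = ordered[half + n % 2:]  # skips the middle value when n is odd
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

minutes = [20, 30, 40, 45, 50, 55, 60, 65, 70, 75, 80]
minimum, q1, q2, q3, maximum = five_number_summary(minutes)
iqr = q3 - q1

print(minimum, q1, q2, q3, maximum)  # 20 40 55 70 80
print("IQR =", iqr)                  # IQR = 30
```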
Construction: Draw a number line covering the range of the data. Mark the minimum, Q1, median, Q3, and maximum values. Draw a box from Q1 to Q3, with a line inside the box marking the median. Draw "whiskers" extending from the box to the minimum and maximum values.
Interpretation: The length of the box represents the IQR. Longer boxes or whiskers indicate greater variability in that portion of the data. The position of the median within the box indicates the skewness of the data.
Outliers: Values that are significantly different from other values in the dataset. They can affect the measures of central tendency and spread. One common rule of thumb is to define outliers as values that are less than Q1 - 1.5 × IQR or greater than Q3 + 1.5 × IQR. Outliers should be investigated to determine if they are errors or legitimate data points.
Guided Practice (With Solutions)
Question 1: The monthly salaries (in Rand) of 8 employees at a small spaza shop are: R3000, R3500, R3200, R4000, R3300, R3100, R12000, R3400. Calculate the mean, median, and mode of these salaries. Which measure of central tendency best represents the "typical" salary?
Solution:
Mean: (R3000 + R3500 + R3200 + R4000 + R3300 + R3100 + R12000 + R3400) / 8 = R35500 / 8 = R4437.50
Median: Arranging in ascending order: R3000, R3100, R3200, R3300, R3400, R3500, R4000, R12000. The median is (R3300 + R3400) / 2 = R3350.
Mode: No salary appears more than once, so there is no mode.
Interpretation: The mean (R4437.50) is significantly higher than the median (R3350) due to the outlier (R12000). Seven of the eight employees earn less than the mean. The median is a better representation of the "typical" salary as it is less affected by the outlier.
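The claim that R12000 is an outlier can be checked against the 1.5 × IQR rule of thumb. The sketch below is an illustration rather than part of the original solution. The function names are hypothetical, and the quartiles are found with the median-of-halves method described earlier in this lesson.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def outlier_fences(values, k=1.5):
    """Return (lower fence, upper fence) = (Q1 - k * IQR, Q3 + k * IQR)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + n % 2:])  # skips the middle value when n is odd
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

salaries = [3000, 3500, 3200, 4000, 3300, 3100, 12000, 3400]  # Rand
low, high = outlier_fences(salaries)
outliers = [s for s in salaries if s < low or s > high]

print(low, high)  # 2250.0 4650.0
print(outliers)   # [12000]
```

Here Q1 = R3150 and Q3 = R3750, so the IQR is R600 and the fences are R2250 and R4650. Only the R12000 salary falls outside them.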