Measures of dispersion
Subject: Further Mathematics
Class: Senior Secondary 1
Term: 3rd Term
Week: 2
Theme: Statistics
This topic introduces learners to measures of dispersion, which describe the spread or variability of a data set. While measures of central tendency (like mean, median, mode) describe the central value, measures of dispersion provide crucial information about how much individual data points differ from the central value or from each other. Understanding dispersion is vital in various fields, such as economics (e.g., price stability of goods), health (e.g., consistency of drug effects), and quality control in industries.
Measures of dispersion quantify the extent to which data values are spread out or clustered together. A small dispersion indicates data points are close to each other, while a large dispersion indicates data points are widely spread.

Example 2.7 (Variance & Standard Deviation - Grouped Data): Using the age data from Example 2.5. We found $\bar{x} = 34.5$. N = 50.

| Age (years) | f | Midpoint (x) | fx | $(x - \bar{x})$ | $(x - \bar{x})^2$ | $f(x - \bar{x})^2$ | $fx^2$ |
| :---------- | :- | :----------- | :--- | :-------------- | :---------------- | :---------------- | :--------- |
| 10-19 | 6 | 14.5 | 87 | -20 | 400 | 2400 | 6 × 14.5^2 = 1261.5 |
| 20-29 | 12 | 24.5 | 294 | -10 | 100 | 1200 | 12 × 24.5^2 = 7203 |
| 30-39 | 15 | 34.5 | 517.5 | 0 | 0 | 0 | 15 × 34.5^2 = 17853.75 |
| 40-49 | 10 | 44.5 | 445 | 10 | 100 | 1000 | 10 × 44.5^2 = 19802.5 |
| 50-59 | 7 | 54.5 | 381.5 | 20 | 400 | 2800 | 7 × 54.5^2 = 20791.75 |
| Total | 50 | | 1725 | | | 7400 | 66912.5 |

1. Calculate Variance using $\frac{\sum f(x - \bar{x})^2}{\sum f}$: $\sigma^2 = \frac{7400}{50} = 148$
2. Calculate Standard Deviation: $\sigma = \sqrt{148} \approx 12.17$ years

(Using computational formula $\frac{\sum fx^2}{\sum f} - (\bar{x})^2$) $\sigma^2 = \frac{66912.5}{50} - (34.5)^2 = 1338.25 - 1190.25 = 148$, which agrees with the first method, so $\sigma = \sqrt{148} \approx 12.17$ years.
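The grouped variance in Example 2.7 can be checked with a few lines of Python. This is only an illustrative sketch: the variable names (`midpoints`, `freqs`, `var_def`, `var_comp`) are chosen here and are not part of the lesson. It computes the variance by the definitional formula and by the computational formula so the two routes can be compared.

```python
import math

# Age data: class midpoints and frequencies (names are illustrative).
midpoints = [14.5, 24.5, 34.5, 44.5, 54.5]
freqs = [6, 12, 15, 10, 7]

n = sum(freqs)                                            # total frequency
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n   # sum(fx) / sum(f)

# Definitional formula: sum of f(x - mean)^2 divided by sum of f
var_def = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n

# Computational formula: sum of f x^2 divided by sum of f, minus mean^2
var_comp = sum(f * x ** 2 for f, x in zip(freqs, midpoints)) / n - mean ** 2

sd = math.sqrt(var_def)

print("N =", n, "mean =", mean)
print("variance (definition)    =", var_def)
print("variance (computational) =", var_comp)
print("standard deviation       =", round(sd, 2))
```

Because both formulas are algebraically the same quantity, any gap between the two printed variances would point to an arithmetic slip in a hand-built table rather than a real difference.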
A. Range
Definition: The simplest measure of dispersion, calculated as the difference between the highest (maximum) and lowest (minimum) values in a data set.
Formula: Range = Maximum Value - Minimum Value
Characteristics: Easy to calculate but highly sensitive to outliers (extreme values). It does not consider the distribution of data points between the maximum and minimum.
Example 2.1 (Range - Ungrouped Data): The daily prices (in Naira) of a tuber of yam recorded in a local market over 7 days were: N850, N900, N820, N1100, N880, N950, N870.
Maximum Value: N1100
Minimum Value: N820
Range: N1100 - N820 = N280
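As a quick check of Example 2.1, the sketch below computes the range of the yam prices and then repeats the calculation with the single highest price left out. The second step is an added illustration (not part of the original example) of how strongly one extreme value affects the range; the variable names are assumptions made for this sketch.

```python
# Daily yam prices in Naira from Example 2.1.
prices = [850, 900, 820, 1100, 880, 950, 870]

price_range = max(prices) - min(prices)

# Illustration only: drop the single highest price and recompute.
without_peak = [p for p in prices if p != max(prices)]
range_without_peak = max(without_peak) - min(without_peak)

print("Range of all prices:", price_range)
print("Range without the highest price:", range_without_peak)
```

Removing one value out of seven cuts the range by more than half, which is why the range alone can give a misleading picture of spread.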
B. Quartiles and Interquartile Range (IQR)
Quartiles: Values that divide a data set, ordered from lowest to highest, into four equal parts.
First Quartile (Q1) / Lower Quartile: The value below which 25% of the data falls. It is the median of the lower half of the data.
Second Quartile (Q2) / Median: The value below which 50% of the data falls.
Third Quartile (Q3) / Upper Quartile: The value below which 75% of the data falls. It is the median of the upper half of the data.
Interquartile Range (IQR): The range of the middle 50% of the data. It is the difference between the third quartile (Q3) and the first quartile (Q1).
Formula: IQR = Q3 - Q1
Characteristics: Less sensitive to outliers than the range because it ignores the lowest 25% and the highest 25% of the data. It gives a better picture of the spread of the central portion of the data.
Finding Quartiles for Ungrouped Data:
1. Arrange the data in ascending order.
2. Determine the position of the quartiles, where N is the total number of data points:
Position of Q1 = (N + 1) / 4
Position of Q2 = (N + 1) / 2 (or 2 × Position of Q1)
Position of Q3 = 3(N + 1) / 4
3. If the position is an integer, the quartile is the value at that position.
4. If the position is not an integer (e.g., 2.5), interpolate between the values at the surrounding integer positions. For example, if the position is 2.5, the quartile is the average of the 2nd and 3rd values.
Example 2.2 (Quartiles & IQR - Ungrouped Data): The scores of 11 students in a Further Mathematics test are: 35, 42, 50, 61, 65, 70, 72, 75, 80, 85, 90.
1. Ordered Data: 35, 42, 50, 61, 65, 70, 72, 75, 80, 85, 90 (N=11)
2. Position of Q1: (11 + 1) / 4 = 12 / 4 = 3rd position. Q1: 50
3. Position of Q2 (Median): (11 + 1) / 2 = 12 / 2 = 6th position. Q2: 70
4. Position of Q3: 3 × (11 + 1) / 4 = 3 × 3 = 9th position. Q3: 80
5. IQR: Q3 - Q1 = 80 - 50 = 30
Finding Quartiles for Grouped Data (Frequency Distribution):
1. Construct a cumulative frequency table.
2. Determine the position of the quartiles using the total frequency, N (i.e., $\sum f$): Position of Q1 = N / 4 Position of Q2 = N / 2 Position of Q3 = 3N / 4
3. Locate the class interval (quartile class) where the cumulative frequency first exceeds the quartile position.
4. Use the formula for quartiles for grouped data (similar to median formula): $Q_k = L + \left( \frac{\frac{kN}{4} - CF_b}{f_k} \right) c$
Where:
$Q_k$ = k-th quartile (k=1 for Q1, k=2 for Q2, k=3 for Q3)
$L$ = Lower class boundary of the quartile class
$N$ = Total frequency ($\sum f$)
$CF_b$ = Cumulative frequency of the class before the quartile class
$f_k$ = Frequency of the quartile class
$c$ = Class width of the quartile class
Example 2.3 (Quartiles & IQR - Grouped Data): A survey on the ages (in years) of 50 residents in a community in Katsina state is shown below:

| Age (years) | Frequency (f) |
| :---------- | :------------ |
| 10-19 | 6 |
| 20-29 | 12 |
| 30-39 | 15 |
| 40-49 | 10 |
| 50-59 | 7 |
1. Cumulative Frequency Table:

| Age (years) | Frequency (f) | Class Boundaries | Cumulative Frequency (CF) |
| :---------- | :------------ | :--------------- | :------------------------ |
| 10-19 | 6 | 9.5 - 19.5 | 6 |
| 20-29 | 12 | 19.5 - 29.5 | 18 |
| 30-39 | 15 | 29.5 - 39.5 | 33 |
| 40-49 | 10 | 39.5 - 49.5 | 43 |
| 50-59 | 7 | 49.5 - 59.5 | 50 |

N = 50, Class width (c) = 10 (e.g., 19.5 - 9.5)
2. Calculate Q1: Position of Q1 = N/4 = 50/4 = 12.5
Q1 class: 20-29 (CF is 18, which is the first to exceed 12.5)
L = 19.5, $CF_b$ = 6, $f_1$ = 12, c = 10
$Q_1 = 19.5 + \left( \frac{12.5 - 6}{12} \right) 10 = 19.5 + \left( \frac{6.5}{12} \right) 10 = 19.5 + 5.42 = 24.92$ years
3. Calculate Q3: Position of Q3 = 3N/4 = 3 × 50 / 4 = 37.5
Q3 class: 40-49 (CF is 43, which is the first to exceed 37.5)
L = 39.5, $CF_b$ = 33, $f_3$ = 10, c = 10
$Q_3 = 39.5 + \left( \frac{37.5 - 33}{10} \right) 10 = 39.5 + \left( \frac{4.5}{10} \right) 10 = 39.5 + 4.5 = 44.0$ years
4. IQR: Q3 - Q1 = 44.0 - 24.92 = 19.08 years
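Both quartile procedures in this section can be sketched in Python. The function names `quartile_ungrouped` and `quartile_grouped` are hypothetical helpers written for this note, not library functions. The ungrouped version assumes straight-line interpolation for any fractional position (the text only spells out the 2.5 case), and the grouped version applies the $Q_k$ formula to the first class whose cumulative frequency exceeds $kN/4$.

```python
def quartile_ungrouped(data, k):
    """k-th quartile (k = 1, 2, 3) using the k(N + 1)/4 position rule.

    Assumption: fractional positions are handled by linear interpolation
    between the two neighbouring ordered values.
    """
    values = sorted(data)
    n = len(values)
    if n < 3:
        raise ValueError("the position rule needs at least 3 values")
    pos = k * (n + 1) / 4          # 1-based position in the ordered data
    whole = int(pos)               # position of the value just below
    frac = pos - whole             # how far to move towards the next value
    if frac == 0:
        return values[whole - 1]
    return values[whole - 1] + frac * (values[whole] - values[whole - 1])


def quartile_grouped(boundaries, freqs, k):
    """k-th quartile from class boundaries [(lower, upper), ...] and frequencies."""
    n = sum(freqs)
    target = k * n / 4             # quartile position kN/4
    cf_before = 0                  # cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cf_before + f > target: # first class whose CF exceeds the position
            width = upper - lower
            return lower + (target - cf_before) / f * width
        cf_before += f
    raise ValueError("quartile class not found")


# Example 2.2: test scores
scores = [35, 42, 50, 61, 65, 70, 72, 75, 80, 85, 90]
q1_u = quartile_ungrouped(scores, 1)
q3_u = quartile_ungrouped(scores, 3)
print("Ungrouped: Q1 =", q1_u, "Q3 =", q3_u, "IQR =", q3_u - q1_u)

# Example 2.3: ages of 50 residents
bounds = [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5), (49.5, 59.5)]
freqs = [6, 12, 15, 10, 7]
q1_g = quartile_grouped(bounds, freqs, 1)
q3_g = quartile_grouped(bounds, freqs, 3)
print("Grouped: Q1 =", round(q1_g, 2), "Q3 =", round(q3_g, 2),
      "IQR =", round(q3_g - q1_g, 2))
```

Statistical software often uses slightly different quartile conventions, so results from other tools may not match the (N + 1)/4 rule exactly on small data sets.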
C. Mean Deviation (MD)
Definition: The average of the absolute differences between each data point and the mean of the data set. It measures the average distance of each data point from the mean.
Formula (Ungrouped Data): $MD = \frac{\sum |x - \bar{x}|}{N}$
Where:
$x$ = individual data point
$\bar{x}$ = mean of the data
$|x - \bar{x}|$ = absolute difference between x and $\bar{x}$
$N$ = number of data points
Formula (Grouped Data): $MD = \frac{\sum f|x - \bar{x}|}{\sum f}$
Where:
$x$ = midpoint of each class interval
$\bar{x}$ = mean of the grouped data
$f$ = frequency of each class
$\sum f$ = total frequency
Characteristics: Relatively easy to understand. However, the use of absolute values makes it less suitable for further mathematical operations compared to variance and standard deviation.
Example 2.4 (Mean Deviation - Ungrouped Data): The number of sachet water bottles sold by a vendor in 5 hours are: 15, 20, 10, 25, 30.
1. Calculate the Mean ($\bar{x}$): $\bar{x} = \frac{15 + 20 + 10 + 25 + 30}{5} = \frac{100}{5} = 20$
2. Calculate absolute deviations from the mean: $|15 - 20| = 5$ $|20 - 20| = 0$ $|10 - 20| = 10$ $|25 - 20| = 5$ $|30 - 20| = 10$ $\sum |x - \bar{x}| = 5 + 0 + 10 + 5 + 10 = 30$
3. Calculate Mean Deviation: $MD = \frac{30}{5} = 6$ bottles
Example 2.5 (Mean Deviation - Grouped Data): Using the age data from Example 2.3. First, calculate the mean.

| Age (years) | f | Midpoint (x) | fx | $\lvert x - \bar{x} \rvert$ | $f\lvert x - \bar{x} \rvert$ |
| :---------- | :- | :----------- | :-- | :--------------- | :--------------- |
| 10-19 | 6 | 14.5 | 87 | $\lvert 14.5 - 34.5 \rvert = 20$ | 6 × 20 = 120 |
| 20-29 | 12 | 24.5 | 294 | $\lvert 24.5 - 34.5 \rvert = 10$ | 12 × 10 = 120 |
| 30-39 | 15 | 34.5 | 517.5 | $\lvert 34.5 - 34.5 \rvert = 0$ | 15 × 0 = 0 |
| 40-49 | 10 | 44.5 | 445 | $\lvert 44.5 - 34.5 \rvert = 10$ | 10 × 10 = 100 |
| 50-59 | 7 | 54.5 | 381.5 | $\lvert 54.5 - 34.5 \rvert = 20$ | 7 × 20 = 140 |
| Total | 50 | | 1725 | | 480 |
1. Calculate Mean ($\bar{x}$): $\bar{x} = \frac{\sum fx}{\sum f} = \frac{1725}{50} = 34.5$ years
2. Calculate Mean Deviation: $MD = \frac{\sum f|x - \bar{x}|}{\sum f} = \frac{480}{50} = 9.6$ years
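The two mean deviation formulas differ only in whether each value carries a frequency, so one small function can cover both. The helper below, `mean_deviation`, is a hypothetical name used for this sketch; when no frequencies are given it treats every value as occurring once, which reduces the grouped formula to the ungrouped one.

```python
def mean_deviation(values, freqs=None):
    """Mean deviation about the mean.

    values: raw data points, or class midpoints for grouped data.
    freqs:  class frequencies; if omitted, each value counts once.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    return sum(f * abs(x - mean) for f, x in zip(freqs, values)) / n


# Example 2.4: sachet water sales (ungrouped)
md_sales = mean_deviation([15, 20, 10, 25, 30])

# Example 2.5: ages of residents (grouped, using class midpoints)
md_ages = mean_deviation([14.5, 24.5, 34.5, 44.5, 54.5], [6, 12, 15, 10, 7])

print("Mean deviation, sales:", md_sales, "bottles")
print("Mean deviation, ages: ", md_ages, "years")
```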
D. Variance ($\sigma^2$ or $s^2$)
Definition: The average of the squared differences between each data point and the mean. It measures how far, on average, each value in the data set is from the mean. Squaring the differences ensures positive values and penalizes larger deviations more heavily.
Formula (Ungrouped Data): $\sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$ or $\sigma^2 = \frac{\sum x^2}{N} - (\bar{x})^2$ (Computational formula)
Where:
$x$ = individual data point
$\bar{x}$ = mean of the data
$N$ = number of data points
Formula (Grouped Data): $\sigma^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$ or $\sigma^2 = \frac{\sum fx^2}{\sum f} - (\bar{x})^2$ (Computational formula)
Where:
$x$ = midpoint of each class interval
$\bar{x}$ = mean of the grouped data
$f$ = frequency of each class
$\sum f$ = total frequency
Characteristics: Widely used in statistical inference. The unit of variance is the square of the original data units, which can be hard to interpret.
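The computational formula is not a separate definition; it follows from expanding the square in the definitional formula and using $\bar{x} = \frac{\sum x}{N}$. A short derivation for the ungrouped case, written with the same symbols as above:

```latex
\begin{align*}
\sigma^2 &= \frac{\sum (x - \bar{x})^2}{N}
          = \frac{\sum x^2 - 2\bar{x}\sum x + N\bar{x}^2}{N} \\
         &= \frac{\sum x^2}{N} - 2\bar{x}\cdot\frac{\sum x}{N} + \bar{x}^2
          = \frac{\sum x^2}{N} - 2\bar{x}^2 + \bar{x}^2 \\
         &= \frac{\sum x^2}{N} - (\bar{x})^2
\end{align*}
```

The grouped version follows the same steps with $\sum f x^2$, $\sum f x$ and $\sum f$ in place of $\sum x^2$, $\sum x$ and $N$.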
E. Standard Deviation ($\sigma$ or $s$)
Definition: The square root of the variance. It is the most commonly used measure of dispersion as it expresses the spread in the same units as the original data, making it easier to interpret.
Formula (Ungrouped Data): $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum x^2}{N} - (\bar{x})^2}$
Formula (Grouped Data): $\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$ or $\sigma = \sqrt{\frac{\sum fx^2}{\sum f} - (\bar{x})^2}$
Characteristics: Provides a 'typical' distance of data points from the mean. Larger standard deviation indicates greater spread, while smaller standard deviation indicates data points are closer to the mean.
Example 2.6 (Variance & Standard Deviation - Ungrouped Data): Using the sachet water sales data from Example 2.4: 15, 20, 10, 25, 30. We found $\bar{x} = 20$. N = 5.

| x | $(x - \bar{x})$ | $(x - \bar{x})^2$ | $x^2$ |
| :- | :-------------- | :---------------- | :---- |
| 15 | $15 - 20 = -5$ | 25 | 225 |
| 20 | $20 - 20 = 0$ | 0 | 400 |
| 10 | $10 - 20 = -10$ | 100 | 100 |
| 25 | $25 - 20 = 5$ | 25 | 625 |
| 30 | $30 - 20 = 10$ | 100 | 900 |
| Total | 0 | 250 | 2250 |

1. Calculate Variance using $\frac{\sum (x - \bar{x})^2}{N}$: $\sigma^2 = \frac{250}{5} = 50$
2. Calculate Standard Deviation: $\sigma = \sqrt{50} \approx 7.07$ bottles

(Using computational formula $\frac{\sum x^2}{N} - (\bar{x})^2$) $\sigma^2 = \frac{2250}{5} - (20)^2 = 450 - 400 = 50$, so $\sigma = \sqrt{50} \approx 7.07$ bottles.
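The sachet water figures from Example 2.6 can be verified in Python. The sketch below computes the variance by hand with the divide-by-N formula used in this lesson, then compares it with `pvariance` and `pstdev` from Python's standard `statistics` module, which also divide by N. The variable names are illustrative. Note that `statistics.variance` and `statistics.stdev` divide by N - 1 instead and would give different answers.

```python
import math
import statistics

# Sachet water bottles sold in each of 5 hours.
sales = [15, 20, 10, 25, 30]

n = len(sales)
mean = sum(sales) / n

# Definitional formula with divisor N, as used in this lesson.
variance_manual = sum((x - mean) ** 2 for x in sales) / n
sd_manual = math.sqrt(variance_manual)

# Standard-library equivalents (population variance and standard deviation).
variance_lib = statistics.pvariance(sales)
sd_lib = statistics.pstdev(sales)

print("mean =", mean)
print("variance: manual =", variance_manual, " library =", variance_lib)
print("std dev:  manual =", round(sd_manual, 2), " library =", round(sd_lib, 2))
```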
Agriculture and Food Security: Application: A farmer wants to choose between two varieties of maize seeds (Hybrid A vs. Hybrid B). Over several planting seasons, Hybrid A yielded: 1.5, 1.8, 1.6, 1.7, 1.9 tonnes per hectare, while Hybrid B yielded: 1.0, 2.5, 1.7, 1.2, 2.1 tonnes per hectare.
Integration: By calculating the standard deviation for each variety, the farmer can determine which variety offers more consistent yields. A lower standard deviation indicates a more reliable yield, which is crucial for food security and planning, especially in unpredictable weather conditions in Nigeria. This helps farmers make informed decisions to minimize risks.
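The comparison described above can be carried out directly on the yield figures. This is a sketch only: it uses `pstdev` from Python's standard `statistics` module (divisor N, matching the formulas in this lesson), and the variable names are chosen for illustration.

```python
import statistics

# Yields in tonnes per hectare over five seasons.
hybrid_a = [1.5, 1.8, 1.6, 1.7, 1.9]
hybrid_b = [1.0, 2.5, 1.7, 1.2, 2.1]

mean_a = statistics.mean(hybrid_a)
mean_b = statistics.mean(hybrid_b)

# Population standard deviation (divides by N).
sd_a = statistics.pstdev(hybrid_a)
sd_b = statistics.pstdev(hybrid_b)

print("Hybrid A: mean =", round(mean_a, 2), "sd =", round(sd_a, 3))
print("Hybrid B: mean =", round(mean_b, 2), "sd =", round(sd_b, 3))

more_consistent = "Hybrid A" if sd_a < sd_b else "Hybrid B"
print("More consistent yield:", more_consistent)
```

Both varieties average the same yield, so the mean alone cannot separate them; only the spread shows that Hybrid A is the steadier choice.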
Financial Markets and Investment: Application: A Nigerian investor is considering investing in shares of two companies listed on the Nigerian Stock Exchange (NSE), Company X and Company Y. Historical data shows that the monthly returns for Company X had a standard deviation of 5%, while Company Y had a standard deviation of 12%. Both companies had an average return of 8%.
Integration: The standard deviation here represents volatility or risk. Company X, with a lower standard deviation, has more stable and predictable returns, making it the less risky investment. Company Y, with a higher standard deviation, has returns that fluctuate more widely around the same 8% average, so any single month may be far better or far worse than average. Investors use this information to align investment choices with their risk tolerance.
Public Health and Disease Control: Application: Public health officials are monitoring the number of malaria cases reported weekly in two different local government areas (LGAs) in a state. LGA A records weekly cases with a small standard deviation, while LGA B shows a large standard deviation in weekly cases, even if both have similar average cases.
Integration: A small standard deviation for LGA A suggests a consistent number of cases, possibly indicating endemic malaria with stable transmission patterns requiring continuous, routine interventions. A large standard deviation for LGA B suggests fluctuating cases, perhaps indicating sporadic outbreaks or highly seasonal transmission, which would require more dynamic and responsive intervention strategies. This helps in allocating resources and designing targeted public health campaigns effectively.