From the course: Statistics Foundations 1: The Basics

Basic data sets

- [Instructor] So how do you find the mean, median, and mode of a data set, and how do the three of them together help you better understand your data set? Let's begin with a small and simple data set. I'm in exercise file 02_02_Begin. It only has seven data points.

First, let's find the median. The median is the middle value. To find the position of the middle value, you can use this formula: the quantity n + 1 over 2, where n is the number of values in our data set. Our data set has only seven values, so n is equal to 7. And then we can solve: 7 + 1 divided by 2 is 4. So what we're looking for is our fourth value. What we then need to do is organize our values from smallest to largest or largest to smallest, so we'll put these in ascending order. And now we're looking for the fourth value. One, two, three, four, this is our fourth value. That is our median. If you want, you can use a formula. The median of these values is 40.

Next, we can calculate the mean, which is often called the average. The average is the sum of all the data points divided by n. We know that n is equal to 7, so we sum up all of our data points, which gives 490, and 490 divided by 7 gives us 70. You can also use a formula for this: the average of all of these values over here. And again, the average is 70.

Finally, let's identify the mode. The mode is the most common number in our data set. In this data set, the mode is 20. It shows up three times; all the other values in this data set only appear once. Again, you can use a formula.

Using all three of these numbers together, we can now better understand our data set. Here's our complete data set: 20, 20, 20, 40, 50, 140, 200. The median is 40, the mean is 70, the mode is 20. Because this data set is so small, we can see that while the middle value or median is 40, it makes sense that the mean is larger since we have two data points much larger than 40.
Five of the numbers are 50 or below, but the two largest values, 140 and 200, greatly increase the average. Together, the median of 40 and the mean of 70 tell me that while about half the data points fall below 40, the numbers above the median must be really large if they pushed the average to 70.

As for the mode, in a data set with only seven data points, we have three data points of 20. It represents all of the data points below the median and it definitely pulled down the average. The mode is begging us to investigate why it's so common in such a small data set.

One last thing: in this video, we used a tiny data set so it's easy to both find and then understand the mode, median, and mean. Now when you run up against a giant data set, you'll know that you can trust the mode, median, and mean to help you better understand your data.
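
If you'd like to check these numbers outside the exercise file, here is a minimal sketch in Python using the standard library's statistics module. This is a stand-in for illustration, not the formula approach used in the video. The variable names are made up, and the (n + 1) / 2 position shortcut is applied here only because n is odd.

```python
from statistics import median, mode

# The seven data points from the walkthrough.
data = [20, 20, 20, 40, 50, 140, 200]
n = len(data)

# Median by hand: sort, then take the value at position (n + 1) / 2.
# Positions are counted from 1, so subtract 1 for Python's 0-based index.
ordered = sorted(data)
middle_position = (n + 1) // 2                      # 4 when n = 7
median_by_position = ordered[middle_position - 1]   # 40

# The same three measures computed directly.
data_median = median(data)   # 40
data_mean = sum(data) / n    # 490 / 7 = 70.0
data_mode = mode(data)       # 20, which appears three times

# How far do the two largest values pull the mean?
# Average only the five values that are 50 or below.
lower_five = [x for x in data if x <= 50]
mean_without_largest = sum(lower_five) / len(lower_five)  # 150 / 5 = 30.0

print(data_median, data_mean, data_mode, mean_without_largest)
```

Without 140 and 200, the mean of the remaining five values is 30, which sits below the median of 40. With them, the mean jumps to 70. That gap is the pull from the two large values described above.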
