Section 2.2: Organizing Quantitative Data: The Popular Displays
Objectives
By the end of this section, you will be able to...
 organize quantitative data into tables
 construct histograms for discrete and continuous data
 draw stem-and-leaf plots
 draw dot plots
 identify the shape of a distribution
Like qualitative data in the last section, quantitative data can (and should) be organized into tables. We'll break this page up into two parts: discrete and continuous.
Organizing Discrete Data into Tables
If you recall from Section 1.2,
A discrete variable is a quantitative variable that has either a finite number of possible values or a countable number of values. (Countable means that the values result from counting: 0, 1, 2, 3, ...)
Since we can list all the possible values (that's essentially what countable means), one way to make a table is just to list the values along with their corresponding frequency.
Example 1
Here are some data I collected from students in a previous Mth120 course. They refer to the number of children in each student's family (including the student).
2  2  2  4  5  3  3  3  3 
2  1  2  3  5  3  4  3  1 
2  3  5  3  2  1  3  2 
An easy way to compile the data would then be to make a frequency or relative frequency table as we did before.
children  frequency  relative frequency 
1  3  3/26 ≈ 0.12 
2  8  8/26 ≈ 0.31 
3  10  10/26 ≈ 0.38 
4  2  2/26 ≈ 0.08 
5  3  3/26 ≈ 0.12 
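If you'd like to check a table like this with a little code, here's a minimal sketch. Python is my choice for illustration (the course itself uses StatCrunch), and the names `children` and `frequency_table` are made up for this example.

```python
from collections import Counter

# Number of children per family, copied from Example 1.
children = [
    2, 2, 2, 4, 5, 3, 3, 3, 3,
    2, 1, 2, 3, 5, 3, 4, 3, 1,
    2, 3, 5, 3, 2, 1, 3, 2,
]

def frequency_table(data):
    """Return (value, frequency, relative frequency) rows, sorted by value."""
    counts = Counter(data)
    n = len(data)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]

table = frequency_table(children)
print("children  frequency  relative frequency")
for value, freq, rel in table:
    print(f"{value}  {freq}  {rel:.2f}")
```

The relative frequencies are just each frequency divided by the total number of observations (26 here), so they should add up to 1 apart from rounding.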
Sometimes, however, we have too many values to make a row for each one. In that case, we'll need to group several values together.
Example 2
A good example might be the scores on an exam, ranging from 1-100. Here are some data from a past Mth120 class.
62 
87  67  58  95  94  91  69  52 
76  82  85  91  60  77  72  83  79 
63  88  79  88  70  75  87 
In this case, we'll have to set up intervals of numbers called classes. Each class has a lower class limit and an upper class limit, along with a class width. The class width is the difference between successive lower class limits.
To be consistent, the class width should be the same for each class. One good option might look something like this:
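As a sketch of one such option, the code below groups the exam scores into classes of width 10. The first lower class limit of 50 is my own assumption (it is simply the nearest multiple of 10 below the smallest score), and the Python names are hypothetical.

```python
from collections import Counter

# Exam scores copied from Example 2.
scores = [
    62,
    87, 67, 58, 95, 94, 91, 69, 52,
    76, 82, 85, 91, 60, 77, 72, 83, 79,
    63, 88, 79, 88, 70, 75, 87,
]

def class_table(data, first_lower_limit, width):
    """Group whole-number data into equal-width classes.

    Returns (lower class limit, upper class limit, frequency) rows.
    Assumes every value is at least first_lower_limit.
    """
    counts = Counter((x - first_lower_limit) // width for x in data)
    rows = []
    for k in range(max(counts) + 1):
        lower = first_lower_limit + k * width
        upper = lower + width - 1  # upper limit for whole-number data
        rows.append((lower, upper, counts[k]))
    return rows

exam_classes = class_table(scores, first_lower_limit=50, width=10)
for lower, upper, freq in exam_classes:
    print(f"{lower}-{upper}  {freq}")
```

Notice that the class width (10) is the difference between successive lower class limits (50, 60, 70, ...), not the upper limit minus the lower limit.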
Organizing Continuous Data into Tables
Organizing continuous data is similar to organizing multi-valued discrete data. We have to form classes which don't overlap. I usually try to design a class width that's either logical (e.g., 10 points for the grades above) or chosen so that I have 5-8 classes when complete.
Example 3
For this example, let's consider the average commute for each of the 50 states. The data below show the average daily commute of a random sample of 15 states.
23.1  18.3  23.2  19.9  26.6 
24.8  23.1  23.2  22.7  29.4 
22.3  30.0  25.8  21.9  16.7 
Source: US Census 
Do you know why this is a continuous random variable and not discrete? (Hint: It's not because of the decimal.)
This is continuous because the variable we're measuring, time, can take any value in an interval rather than only a countable list of values. When, say, a marketing agent measures her commute time, she actually rounds to the nearest minute. If she reports 32 minutes, it's not exactly 32 minutes; it's 32 minutes to the nearest minute. In reality, it might be 32.15323623245134... (you get the idea).
To make a frequency or relative frequency table for continuous data, we use the same strategy we'd use for multi-valued discrete data.
average commute  frequency  relative frequency 
16-17.9  1  1/15 ≈ 0.07
18-19.9  2  2/15 ≈ 0.13
20-21.9  1  1/15 ≈ 0.07
22-23.9  6  6/15 = 0.40
24-25.9  2  2/15 ≈ 0.13
26-27.9  1  1/15 ≈ 0.07
28-29.9  1  1/15 ≈ 0.07
30-31.9  1  1/15 ≈ 0.07
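Here's a minimal sketch of how the same table could be built in code, using a first lower class limit of 16 and a class width of 2 as in the table above. Python and the function name `continuous_classes` are my own choices for illustration.

```python
import math

# Average commute times for the 15 sampled states, copied from Example 3.
commutes = [
    23.1, 18.3, 23.2, 19.9, 26.6,
    24.8, 23.1, 23.2, 22.7, 29.4,
    22.3, 30.0, 25.8, 21.9, 16.7,
]

def continuous_classes(data, first_lower_limit, width):
    """Return (lower class limit, frequency, relative frequency) rows.

    Assumes every value is at least first_lower_limit.
    """
    n_classes = math.floor((max(data) - first_lower_limit) / width) + 1
    freqs = [0] * n_classes
    for x in data:
        # Index of the class whose lower limit is at or just below x.
        freqs[math.floor((x - first_lower_limit) / width)] += 1
    n = len(data)
    return [(first_lower_limit + k * width, f, f / n) for k, f in enumerate(freqs)]

commute_table = continuous_classes(commutes, first_lower_limit=16, width=2)
for lower, freq, rel in commute_table:
    # Data are recorded to one decimal place, so the upper limit is 0.1 below the next lower limit.
    print(f"{lower}-{lower + 2 - 0.1:.1f}  {freq}  {rel:.2f}")
```

Each value lands in exactly one class, which is what we mean when we say the classes don't overlap.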
Once we have these tables, we'll need to learn how to create some charts to display the information, which is what the next few pages are about.
Technology
Here's a quick overview of how to create frequency and relative frequency tables for quantitative data in StatCrunch.
Discrete Data
Continuous or Multi-valued Discrete Data:
* Note that these classes seem to overlap, but that the class "0k" does not include Mk.
Creating a relative frequency table from a frequency table
If you are given a frequency table and need to create a relative frequency table, use the following steps, assuming that "Frequency" is the label of the column containing the frequencies (edit as needed).

Single-valued Histograms
To display quantitative data, we need a new type of chart, called a histogram. Histograms look similar to bar graphs, but they have some distinct differences, and for good reason.
A histogram is constructed by drawing rectangles for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles touch each other.
The rectangles need to touch in a histogram because we want to imply that the classes are adjacent. In a bar graph, a favorite color of "blue" isn't really adjacent to "red", even though we might put it that way in a bar graph. For quantitative data like the data used in Example 1 earlier in this section, the value 2 really is next to the value 3.
Let's take a closer look at that example.
Example 4
children  frequency  relative frequency 
1  3  3/26 ≈ 0.12 
2  8  8/26 ≈ 0.31 
3  10  10/26 ≈ 0.38 
4  2  2/26 ≈ 0.08 
5  3  3/26 ≈ 0.12 
To make a histogram, we make what looks like a bar graph with a couple of key differences:
 rectangles must touch
 class labels are underneath the rectangle
Here's what they'd look like for our example data:
Technology
Here's a quick overview of how to create histograms for single-valued discrete data using StatCrunch.

Histograms for Multi-valued and Continuous Data
Multi-valued and continuous histograms are probably where the most errors occur. There are some key differences between these and single-valued histograms. In this case, each rectangle doesn't represent a single value, but rather a range of values. Because of that, we don't label the class on the horizontal axis. Instead, we label the lower class limits at the left edge of each rectangle.
Let's demonstrate using an example:
Example 5
average commute  frequency  relative frequency 
16-17.9  1  1/15 ≈ 0.07
18-19.9  2  2/15 ≈ 0.13
20-21.9  1  1/15 ≈ 0.07
22-23.9  6  6/15 = 0.40
24-25.9  2  2/15 ≈ 0.13
26-27.9  1  1/15 ≈ 0.07
28-29.9  1  1/15 ≈ 0.07
30-31.9  1  1/15 ≈ 0.07
Here's what a frequency histogram would look like for these data:
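To make the geometry concrete, here's a small sketch that turns the frequency table into one rectangle per class: the left edge is the lower class limit, the width is the class width, and the height is the frequency. This is only an illustration in Python (my choice, not the course's tool), and `histogram_bars` is a hypothetical helper.

```python
# (lower class limit, frequency) pairs from the Example 5 table.
classes = [(16, 1), (18, 2), (20, 1), (22, 6), (24, 2), (26, 1), (28, 1), (30, 1)]

def histogram_bars(table):
    """Return one (left_edge, width, height) triple per class.

    Assumes at least two classes, all with the same class width.
    """
    lowers = [lower for lower, _ in table]
    width = lowers[1] - lowers[0]  # difference between successive lower class limits
    return [(lower, width, freq) for lower, freq in table]

bars = histogram_bars(classes)

# The rectangles touch: each one ends exactly where the next one begins.
touching = all(
    left + width == bars[i + 1][0]
    for i, (left, width, _) in enumerate(bars[:-1])
)

for left, width, height in bars:
    print(f"left edge {left}, width {width}, height {height}")
print("rectangles touch:", touching)
```

If you happen to have a plotting library such as matplotlib installed (an assumption on my part), passing these left edges, heights, and the width to `matplotlib.pyplot.bar` with `align="edge"` should draw the touching rectangles with the lower class limits at their left edges.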
Technology
Here's a quick overview of how to create histograms for multi-valued discrete data or continuous data in StatCrunch.

One final note about histograms: Because they show us such nice information about the distribution of a set of data, we'll be using them frequently throughout the rest of the semester. Be sure you spend plenty of time familiarizing yourself with the technology, so you're able to create histograms with ease.
Stem-and-Leaf Plots
Stem-and-leaf plots are another way to represent quantitative data. They give more detail because they show the actual data. The idea is to split each data value into two parts: a stem and a leaf. The stem is everything to the left of the rightmost digit, and the leaf is that rightmost digit. Here's an example, using the data from earlier in this section regarding exam scores from a previous Mth120 class.
Example 6
62 
87  67  58  95  94  91  69  52 
76  82  85  91  60  77  72  83  79 
63  88  79  88  70  75  87 
With these data, the stems are the first digits: 5, 6, 7, 8, and 9. The leaves are all the second digits, 0, 1, ... , 9. The full stem-and-leaf plot lists the stems down the left side, a vertical bar between, and then lists the leaves in order to the right. Something like this:
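Here's a minimal sketch that builds the plot from the exam scores by splitting each value into a stem (everything but the last digit) and a leaf (the last digit). Python and the name `stem_and_leaf` are my own choices for illustration.

```python
from collections import defaultdict

# Exam scores copied from Example 6.
scores = [
    62,
    87, 67, 58, 95, 94, 91, 69, 52,
    76, 82, 85, 91, 60, 77, 72, 83, 79,
    63, 88, 79, 88, 70, 75, 87,
]

def stem_and_leaf(data):
    """Map each stem to the sorted list of its leaves (whole-number data)."""
    plot = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)  # e.g. 87 -> stem 8, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(scores)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

For these scores, the printed plot is:

```
5 | 2 8
6 | 0 2 3 7 9
7 | 0 2 5 6 7 9 9
8 | 2 3 5 7 7 8 8
9 | 1 1 4 5
```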
It's interesting that this plot looks very similar to a histogram turned on its side, only it gives us the actual data.
There are some limitations to stem-and-leaf plots. In particular, we're limited to small data sets. Can you imagine the leaves if we had 1,000 test scores? Also, the range in the data needs to be fairly small.
By that, I mean if the data values range from 1-100, our stems can be 0, 10, 20, ... , 90, as they were in this example. On the other hand, if the values range from 1-10,000, the stems would have to be 0, 10, 20, ... , 9,980, 9,990. That's a lot of rows!
Technology
Here's a quick overview of how to create stem-and-leaf plots in StatCrunch.

Dot Plots
Dot plots are similar to single-valued histograms, but rather than placing rectangles above each particular value, a dot plot just places the required number of dots above each value. Looking at our example again with the number of children, the plot would look something like this:
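As a rough sketch, the code below draws a text version of the dot plot for the number-of-children data, stacking one "o" above a value for each time it occurs. Python, the function name `dot_plot`, and the use of "o" for a dot are all my own choices; the layout assumes single-digit values.

```python
from collections import Counter

# Number of children per family, copied from Example 1.
children = [
    2, 2, 2, 4, 5, 3, 3, 3, 3,
    2, 1, 2, 3, 5, 3, 4, 3, 1,
    2, 3, 5, 3, 2, 1, 3, 2,
]

def dot_plot(data):
    """Return the lines of a text dot plot, top row first, axis labels last."""
    counts = Counter(data)
    values = list(range(min(counts), max(counts) + 1))
    rows = []
    # Work down from the tallest stack so the plot prints top to bottom.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))
    return rows

rows = dot_plot(children)
print("\n".join(rows))
```

The tallest stack sits above 3, which matches the frequency of 10 in the table from Example 1.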
Technology
Here's a quick overview of how to create dot plots in StatCrunch.

Distribution Shape
A good way to describe a distribution is by its shape. In general, we describe a distribution's shape in one of four ways (though there are others):
 uniform: frequencies are evenly spread out among all values of the variable
 symmetric (bell-shaped): the peak (highest frequency) is in the middle, with frequencies tailing off to the right and left
 left (negative) skewed: the peak is on the right, with a longer left "tail"
 right (positive) skewed: the peak is on the left, with a longer right "tail"
[Figure: example histograms showing uniform, symmetric (bell-shaped), left (negative) skewed, and right (positive) skewed distributions]