Section 2.1: Organizing Qualitative Data

Objectives

By the end of this section, you will be able to...

  1. organize qualitative data in tables
  2. construct bar graphs
  3. construct pie charts

Frequency and Relative Frequency Tables

Let's suppose you give a survey concerning favorite color, and the data you collect looks something like the table below.

blue
red blue orange blue yellow green red pink
blue green blue purple blue blue green yellow pink
blue red pink green blue yellow green blue  

Clearly, we need a better way to summarize the data. The most obvious thing to do would be to make a table with the list of favorite colors and the frequency for each.

favorite color   frequency
blue             10
red              3
orange           1
yellow           3
green            5
pink             3
purple           1

Officially, we call this a frequency distribution.

A frequency distribution lists each category of data and the number of occurrences for each category.
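If you'd rather let a computer do the tallying, here is a minimal Python sketch that builds the same frequency distribution from the raw survey responses. The variable names (`responses`, `frequency`) are just illustrative choices, not part of any required method.

```python
from collections import Counter

# Survey responses copied from the favorite-color data above.
responses = (
    "blue "
    "red blue orange blue yellow green red pink "
    "blue green blue purple blue blue green yellow pink "
    "blue red pink green blue yellow green blue"
).split()

# Counter tallies the number of occurrences of each category.
frequency = Counter(responses)

# Categories are listed in the order they first appear in the data.
for color, count in frequency.items():
    print(f"{color:<8}{count:>3}")
```

The counts should match the table above, and they add up to the 26 responses collected.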

Sometimes, we really want to know the frequency of a particular category in reference to the total. We can do this just by finding the total, and dividing the frequency for each category by that total.

The relative frequency is the proportion (or percent) of observations within a category and is found using the formula

$$\text{relative frequency} = \frac{\text{frequency}}{\text{sum of all frequencies}}$$

A relative frequency distribution lists each category of data together with the relative frequency of each category.

favorite color   relative frequency
blue             10/26 ≈ 0.38
red              3/26 ≈ 0.12
orange           1/26 ≈ 0.04
yellow           3/26 ≈ 0.12
green            5/26 ≈ 0.19
pink             3/26 ≈ 0.12
purple           1/26 ≈ 0.04
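The same calculation can be sketched in a few lines of Python: divide each frequency by the sum of all the frequencies. The dictionary below just restates the frequency table, and the names `total` and `relative_frequency` are my own choices for illustration.

```python
# Frequencies from the favorite-color table above.
frequency = {
    "blue": 10, "red": 3, "orange": 1, "yellow": 3,
    "green": 5, "pink": 3, "purple": 1,
}

# Sum of all frequencies (the number of survey responses).
total = sum(frequency.values())

# relative frequency = frequency / sum of all frequencies
relative_frequency = {color: count / total for color, count in frequency.items()}

for color, rel in relative_frequency.items():
    print(f"{color:<8}{rel:.2f}")
```

A handy check: the unrounded relative frequencies always add up to 1 (the rounded ones may be off slightly).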

Technology

Here's a quick overview of how to create frequency and relative frequency tables in StatCrunch.

  1. Enter or import the data.
  2. Select Stat > Tables > Frequency.
  3. Select the column(s) you want to summarize and click Next.
  4. Add any modifications for an "Other" category and how to order the categories.
  5. Click Calculate and another window with these numbers calculated will pop up.
  6. You can then choose Options > Copy to copy the output for use elsewhere.

 

Bar Graphs

Bar graphs are probably the most commonly used graphs, and ones you're already familiar with. I won't say much more here, except to state a couple of key points:

  1. heights can be frequency or relative frequency
  2. bars must not touch

Using the data from our previous color example,

favorite color   frequency   relative frequency
blue             10          10/26 ≈ 0.38
red              3           3/26 ≈ 0.12
orange           1           1/26 ≈ 0.04
yellow           3           3/26 ≈ 0.12
green            5           5/26 ≈ 0.19
pink             3           3/26 ≈ 0.12
purple           1           1/26 ≈ 0.04

we could then make both frequency and relative frequency bar graphs.

frequency bar graph

relative frequency bar graph
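To see that the two graphs have the same shape and differ only in the scale of the axis, here is a rough text-only sketch in Python. The helper `text_bar_graph` is a made-up function for illustration (it is not how StatCrunch draws graphs), and the bars run horizontally to keep the code short.

```python
# Frequencies from the favorite-color table above.
frequency = {
    "blue": 10, "red": 3, "orange": 1, "yellow": 3,
    "green": 5, "pink": 3, "purple": 1,
}
total = sum(frequency.values())
relative_frequency = {color: count / total for color, count in frequency.items()}


def text_bar_graph(values, scale=1.0):
    """Return one text line per category, with a bar of round(value * scale) marks."""
    return [
        f"{category:<8}|{'#' * round(value * scale)}"
        for category, value in values.items()
    ]


# One mark per response when graphing frequencies.
frequency_graph = text_bar_graph(frequency)
# Rescaling by the total gives the relative frequencies the same bar lengths.
relative_graph = text_bar_graph(relative_frequency, scale=total)

print("\n".join(frequency_graph))
```

Each category gets its own line, so the bars never run together, which is the text version of "bars must not touch."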

 

Technology

Here's a quick overview of how to create bar graphs in StatCrunch.

  1. Enter or import the data.
  2. Select Graphics > Bar Graph, then choose with data or with summary.
  3. If you chose with data, select the column(s) you wish to use and click Next. If you chose with summary, set the columns containing the categories and counts and click Next.
  4. Choose the type (Frequency or Relative Frequency) and click Next.
  5. Enter any modifications and/or color schemes and click Create Graph!
  6. You can then choose Options > Copy to copy the bar graph for use elsewhere.

 

Pareto Charts

A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.

You see Pareto charts fairly often in newspapers, because an article is often trying to show that one particular category is the highest or lowest. The image below, for example, is from the Chicago Tribune. You can see clearly from the graph that it's attempting to show that the local BP refinery in Whiting, Indiana is the highest-capacity refinery that is considering expansion.

oil refineries Pareto chart

If you don't remember the issue, you can read up about BP's plan to expand its refinery in this article from CBS2 Chicago.

Here's another one, using the favorite color data from the last section:

Pareto chart
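The only extra step a Pareto chart needs is the ordering. As a small sketch (the name `pareto_order` is just for illustration), we can sort the favorite-color categories by decreasing frequency before graphing them:

```python
# Frequencies from the favorite-color table.
frequency = {
    "blue": 10, "red": 3, "orange": 1, "yellow": 3,
    "green": 5, "pink": 3, "purple": 1,
}

# Sort categories from highest to lowest frequency.
# Python's sort is stable, so tied categories keep their original order.
pareto_order = sorted(frequency.items(), key=lambda item: item[1], reverse=True)

for color, count in pareto_order:
    print(f"{color:<8}|{'#' * count}")
```

Sorting by relative frequency instead would give exactly the same order, since every frequency is divided by the same total.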

Side-by-Side Bar Graphs

Side-by-side bar graphs are used when you want to compare two different populations. The key with side-by-side bar graphs is that you must use relative frequencies. Do you know why?

I think so. But just in case...

Look at it this way: Let's suppose we want to compare the poverty levels for different counties in Illinois. If we used frequencies only, Cook County dominates, with almost 800,000 people in poverty where no other county has over 50,000. On the other hand, if we looked at relative frequency, Cook County still has the highest rate (15%), but other counties such as Kane are much closer, with rates around 8%.

Source: 2007 Illinois Poverty Summit
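Here is a tiny sketch of the same idea in Python. The numbers are hypothetical, invented purely for illustration (they are not the Illinois figures), and the names `regions` and `rates` are my own.

```python
# Hypothetical counts, invented purely for illustration (not the Illinois data):
# name -> (people in the category of interest, total population)
regions = {
    "large region": (800, 5000),
    "small region": (45, 300),
}

# Relative frequency puts populations of different sizes on the same scale.
rates = {name: count / population for name, (count, population) in regions.items()}

for name, (count, population) in regions.items():
    print(f"{name}: frequency = {count}, relative frequency = {rates[name]:.2f}")
```

The frequencies (800 versus 45) make the large region look wildly different, while the relative frequencies show the two regions are nearly the same.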

Here's a good example of a side-by-side chart, from the Associated Press.

side-by-side bar graph

What's shown isn't quite a relative frequency as we've defined it: it's the number per 100,000, whereas ours, as a percent, is the number per 100. The rate per 100,000 is used here because the percents would all be less than 1% and difficult to read. Still, if frequency were used instead, the "White" category would be the largest, simply because that's the largest segment of the U.S. population.
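A percent and a rate per 100,000 are the same proportion on two different scales. As a rough sketch, with a made-up helper `rate_per` and hypothetical numbers chosen only to illustrate the scaling (they are not taken from the chart):

```python
def rate_per(count, population, base=100):
    """Scale a proportion to 'number per base' (base=100 gives a percent)."""
    return count / population * base


# Hypothetical numbers, chosen only to illustrate the scaling.
count, population = 412, 1_000_000

percent = rate_per(count, population)              # number per 100
per_100k = rate_per(count, population, 100_000)    # number per 100,000

print(f"{percent:.4f}% is the same rate as {per_100k:.1f} per 100,000")
```

The rate per 100,000 is simply the percent multiplied by 1,000, which turns a hard-to-read value well under 1% into a number that's easy to label on a graph.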

Technology

Here's a quick overview of how to create side-by-side bar graphs in StatCrunch.

  1. Enter or import the data.
  2. Select Graphics > Chart > Columns.
  3. Select the columns you'll be using.
  4. Select the location of the labels (Row labels in).
  5. If desired, choose an order.
  6. Choose the plot type (vertical bars for a side-by-side bar graph) and click Next.
  7. Enter any modifications and/or color schemes and click Create Graph!
  8. You can then choose Options > Copy to copy the graph for use elsewhere.

Pie Charts

Like bar graphs, pie charts are very common, and you're probably already familiar with them as well. I'll just include a couple of comments:

  1. should always include the relative frequency
  2. also should include labels, either directly or as a legend

Using the data from our previous color example,

favorite color   frequency   relative frequency
blue             10          10/26 ≈ 0.38
red              3           3/26 ≈ 0.12
orange           1           1/26 ≈ 0.04
yellow           3           3/26 ≈ 0.12
green            5           5/26 ≈ 0.19
pink             3           3/26 ≈ 0.12
purple           1           1/26 ≈ 0.04

we get this pie chart:

pie chart
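If you ever need to draw a pie chart by hand, each slice's central angle is its relative frequency times 360 degrees. Here's a minimal Python sketch of that arithmetic; the name `slice_degrees` is just an illustrative choice.

```python
# Frequencies from the favorite-color table above.
frequency = {
    "blue": 10, "red": 3, "orange": 1, "yellow": 3,
    "green": 5, "pink": 3, "purple": 1,
}
total = sum(frequency.values())

# Each slice's angle is its relative frequency times the full 360 degrees.
slice_degrees = {color: count / total * 360 for color, count in frequency.items()}

# Print the label, the relative frequency as a percent, and the slice angle.
for color, degrees in slice_degrees.items():
    print(f"{color:<8}{frequency[color] / total:>5.0%}{degrees:>8.1f} degrees")
```

The angles add up to a full circle, just as the relative frequencies add up to 1.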

Technology

Here's a quick overview of how to create pie charts in StatCrunch.

  1. Enter or import the data.
  2. Select Graphics > Pie Chart, then choose with data or with summary.
  3. If you chose with data, select the column(s) you wish to use and click Next. If you chose with summary, set the columns containing the categories and counts and click Next.
  4. Enter any modifications (labels, title, color scheme, etc.) and click Create Graph!
  5. You can then choose Options > Copy to copy the pie chart for use elsewhere.

 

