NICAR Courses
1 ©2014 National Institute for Computer-Assisted Reporting & Investigative Reporters and Editors, Inc.
Filter and PivotTables in Excel
FILTERING
With filters in Excel you can quickly collapse your spreadsheet to find records meeting specific
criteria. A lot of reporters use filters to cut their data down to size. Here are some examples:
• A reporter gets a spreadsheet with data covering her entire state. By filtering she can
quickly pick out only the counties in her coverage area.
• When working with school test scores a reporter wants to look only at schools with
enrollment greater than 500 but less than 1,000. He can do that by filtering using numbers
and ranges.
• A political reporter can use filters with campaign contribution data to find all records that
came from donors in a particular city or zip code.
• A crime reporter got a spreadsheet of incidents from the police department. She heard
that there was an increase in crime on one particular street. She can find all of the
incidents occurring on that street by filtering the address column to find any records
containing that street name.
To turn on the filters simply click anywhere in the data and then look under the
“Data” tab for the “Filter” button. The icon looks like a funnel. Click on it and
dropdown arrows should appear next to each column header.
Figure 1
If you click on one of those dropdowns, you should find a unique list of all of the values
contained in that column (which is a great way to get to know what each column contains). You
can click on the box next to “Select all” to uncheck all items. Then select only what you want.
The example below shows a filter to get only the records for the National League in a
spreadsheet of baseball players. Once you’re happy with your selections, close the box.
Figure 2
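For readers who also work in code, the checkbox filter can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the Excel steps, and the player rows are made up for the example.

```python
# Hypothetical sample rows; the names and leagues are made up for illustration.
players = [
    {"player": "Player A", "league": "National"},
    {"player": "Player B", "league": "American"},
    {"player": "Player C", "league": "National"},
    {"player": "Player D", "league": "American"},
]

# The dropdown lists each distinct value in the column once.
unique_leagues = sorted({row["league"] for row in players})
print(unique_leagues)  # ['American', 'National']

# Checking only "National" keeps the rows whose value is in the selection.
selected = {"National"}
national_only = [row for row in players if row["league"] in selected]
print([row["player"] for row in national_only])  # ['Player A', 'Player C']
```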
You can tell the filter is working in three ways:
1) The dropdown next to the heading of the column you’re filtering on will become a
funnel.
2) All of the row numbers turn blue. You now see only the records meeting your filter
criteria. None of the other records have gone away; they are just hidden until you either
change your filter or turn it off.
3) At the bottom left-hand corner of your screen you’ll see a record count. This tells you
how many records have been selected out of the total number of records, which can be very
useful during an analysis.
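As a rough sketch of points 2 and 3, the Python below treats a filter as a view of the data rather than a deletion, and builds a record count in the same "selected out of total" form. The incident rows and the `apply_filter` helper are hypothetical, invented for this example.

```python
# Hypothetical incident rows, made up for illustration.
incidents = [
    {"id": 1, "street": "Main St"},
    {"id": 2, "street": "Oak Ave"},
    {"id": 3, "street": "Main St"},
    {"id": 4, "street": "Elm Rd"},
    {"id": 5, "street": "Main St"},
]

def apply_filter(rows, column, wanted):
    """Return the rows that stay visible; the input list is never modified."""
    return [row for row in rows if row[column] == wanted]

visible = apply_filter(incidents, "street", "Main St")
status = f"{len(visible)} of {len(incidents)} records found"
print(status)  # 3 of 5 records found

# Turning the filter off simply means going back to the full list.
print(len(incidents))  # 5
```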
If you want all of the records to come back you have two options. You can click the
dropdown next to the heading of the column you are currently filtering and click on “Select all.”
Or you can just turn off the filters, which is often the easiest way to go. Think of the filter
as a light switch: flip the switch and the filters appear; flip it again and they turn off.
To turn off the filters just click on Filter (the funnel icon) under the Data tab. It will clear all of
your filters, bringing back every record.
Now that we’ve covered the basics of filtering, explore filters on your own to see all that they
offer. Here are some highlights:
• Put filters on multiple columns at the same time.
• When working with number columns explore the drop-down number filters under
“Choose One.” Options here include “Greater Than,” “Between,” “Less Than” and “Top
10.”
Figure 3
• Text columns also offer some useful filtering options. Find records that begin with, end with
or contain certain characters. The “Contains” option is especially useful if you have to deal
with information that is commonly misspelled or spelled inconsistently. For example, St.
Louis can be written many different ways, including “St. Louis,” “St Louis” and “Saint
Louis.” None of these is wrong, but Excel will see them as different items. They all
have “Louis” in common so you could use the “Contains” or “Ends With” options to grab
all of the records at once.
Figure 4
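The number and text filters above can be approximated in Python. This is a sketch under stated assumptions: the school rows are made up, the helper names (`in_range`, `contains`, `top_n`) are hypothetical, and `in_range` uses strict bounds to match the earlier "greater than 500 but less than 1,000" example rather than any particular Excel option.

```python
# Hypothetical school rows, made up for illustration.
schools = [
    {"school": "Adams Elementary", "city": "St. Louis", "enrollment": 450},
    {"school": "Baker Middle", "city": "St Louis", "enrollment": 720},
    {"school": "Carver High", "city": "Saint Louis", "enrollment": 1350},
    {"school": "Dunbar Elementary", "city": "Kansas City", "enrollment": 610},
    {"school": "Eliot Academy", "city": "St. Louis", "enrollment": 980},
]

def in_range(rows, column, low, high):
    """Keep rows whose value is greater than low and less than high."""
    return [row for row in rows if low < row[column] < high]

def contains(rows, column, text):
    """Keep rows whose value contains text, ignoring upper/lower case."""
    return [row for row in rows if text.lower() in row[column].lower()]

def top_n(rows, column, n):
    """Keep the n rows with the largest values in column."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[:n]

mid_sized = in_range(schools, "enrollment", 500, 1000)
print([r["school"] for r in mid_sized])
# ['Baker Middle', 'Dunbar Elementary', 'Eliot Academy']

# "Louis" matches all three spellings of the city name.
st_louis = contains(schools, "city", "Louis")
print(len(st_louis))  # 4

# Filters on two columns at once: apply one filter to the result of the other.
both = contains(mid_sized, "city", "Louis")
print([r["school"] for r in both])  # ['Baker Middle', 'Eliot Academy']

largest = top_n(schools, "enrollment", 2)
print([r["school"] for r in largest])  # ['Carver High', 'Eliot Academy']
```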
GROUPING
We’re journalists, so we’re often concerned with answering questions like: Who is paying the
most? Which county had the most? How many? What is the most common? When you want to ask
your data questions like these and a simple sort won’t do, you’ll need to use pivot tables.
One of the main concepts to grasp with pivot tables is grouping. Imagine I took a standard
deck of 52 cards, shuffled it, pulled two cards out and put them in my pocket. You’d know I
had 50 cards remaining, but if I asked you which cards were missing, what would you do? You
wouldn’t be able to just look at the deck and tell me that I had the Queen of Hearts and the
8 of Spades in my pocket. You’d put the cards into piles, each containing one type of card,
such as Kings, Aces, 7s, etc. After you’d gone through the 50 cards you’d be able to see
which piles had fewer than four cards and discover which ones are missing.
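The card-pile idea can be written out as a short Python sketch. It is only an illustration of grouping and counting; the variable names are chosen for this example.

```python
import random
from collections import Counter

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]

# Build and shuffle a standard 52-card deck.
deck = [(rank, suit) for rank in RANKS for suit in SUITS]
random.shuffle(deck)

# Pocket the two cards from the example.
pocket = {("Q", "Hearts"), ("8", "Spades")}
remaining = [card for card in deck if card not in pocket]
print(len(remaining))  # 50

# Put the cards into piles by rank and count each pile.
pile_sizes = Counter(rank for rank, _ in remaining)
short_piles = sorted(rank for rank in RANKS if pile_sizes[rank] < 4)
print(short_piles)  # ['8', 'Q']

# Within each short pile, the absent suit identifies the missing card.
missing = sorted(
    (rank, suit)
    for rank in short_piles
    for suit in SUITS
    if (rank, suit) not in remaining
)
print(missing)  # [('8', 'Spades'), ('Q', 'Hearts')]
```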
Excel is going to do the same thing for us with a pivot table. Before learning this skill many
journalists have simply used a piece of paper and a pencil to tally up things and report on them.
Take, for example, this listing of individuals and firms barred from doing business with the
World Bank (See Figure 5).
One reporter wanted to write a story comparing the number of debarments for his country
with others. He went through the list of more than 500 records of companies and
individuals and kept a tally of the number of records for each country. The story isn’t impossible
without Excel, but analyzing the same information in a spreadsheet drastically cuts down on
the time and improves accuracy because Excel does the math for you. Follow the steps below to
walk through a basic pivot table using WorldBank.xlsx.
The data contain the name, address and country of the debarred individuals or firms as well as
the ineligibility dates and grounds for debarment.
Figure 5
Figure 6
To find out the total debarments for each country you’ll need to put the countries into “groups”
using a pivot table.
BUILDING A PIVOTTABLE
First, highlight all your data: select A1 and hold down Shift + Command, then hit the right arrow
(which should highlight all the headers) and the down arrow (which will highlight all the rows).
Next, go to the Data tab and look all the way to the left. You should see “PivotTable.” Click the
small down arrow and choose “Create Manual PivotTable.”
Figure 7
The Create PivotTable window should open. It has two pieces to it:
1) You’re asked to select the data you’d like to analyze with your PivotTable. This is why
you should select your data in advance. The “Table/Range:” information in Figure 8 is
showing us exactly what we selected. It looks funny, but really it’s just saying that we
selected cells A1 all the way through F608 in the sheet called “WorldBank” found in this
workbook.
2) Excel wants us to tell it where we’d like to put the PivotTable. By default it selects “New
Worksheet.” This is good because we don’t want the PivotTable to just appear right on
top of our data.
Figure 8
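To make the “Table/Range:” text less mysterious, here is a small Python sketch that takes a reference apart. Figure 8 is not reproduced here, so the exact string `WorldBank!$A$1:$F$608` is an assumption based on the description above (cells A1 through F608 on the sheet “WorldBank”), and both helper functions are hypothetical.

```python
import re

def column_number(letters):
    """Convert a column label such as 'A' or 'AB' to its 1-based position."""
    number = 0
    for letter in letters.upper():
        number = number * 26 + (ord(letter) - ord("A") + 1)
    return number

def describe_range(reference):
    """Split a 'Sheet!$A$1:$F$608' style reference into its parts."""
    pattern = r"(.+)!\$?([A-Za-z]+)\$?(\d+):\$?([A-Za-z]+)\$?(\d+)"
    match = re.fullmatch(pattern, reference)
    if match is None:
        raise ValueError(f"not a recognized range reference: {reference!r}")
    sheet, first_col, first_row, last_col, last_row = match.groups()
    return {
        "sheet": sheet,
        "columns": column_number(last_col) - column_number(first_col) + 1,
        "rows": int(last_row) - int(first_row) + 1,
    }

# Assumed reference string; see the note above.
info = describe_range("WorldBank!$A$1:$F$608")
print(info)  # {'sheet': 'WorldBank', 'columns': 6, 'rows': 608}

# Assuming row 1 holds the headers, the rest of the rows are records.
print(info["rows"] - 1)  # 607
```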
If you follow our steps you should always be able to simply click “OK” in this window, but it’s
still good to understand exactly what Excel is doing.
After you click “OK,” Excel pops you into a new sheet with all of the tools you’ll need to build
your summary. There are two pieces: the various boxes on the left and the “PivotTable Builder”
on the right. The boxes on the left are where your summary or chart will appear; they change each
time you do something in the task pane on the right. See Figure 10 for more information on this
task pane.
Figure 9
Figure 10: PivotTable Field List Task Pane
Field name list: We call them column “headings,” but Excel calls them “fields.” This is a list
of all of the column headings, or fields, in the data.
Other boxes: These boxes allow you to do some more complicated things in your PivotTable. We
won’t use them in the introduction to the concepts, but we’ll explore them in some of the
in-class and homework exercises.
Row Labels: This is where you put your “groups.” Think of this as the “labels” for your chart.
It’s where you’d put counties, team names, cities, etc. – anything that will be the group. We
typically start PivotTables in this area. Just drag a column heading from the field list here
and the PivotTable box will automatically contain a list of unique values found in the column
you chose.
Values: Think of this as the place where math happens. Anytime you want to find the sum,
average or total number of records for a category you’ll work in this area.
Just like we’ve done with other datasets, frame your analysis with a question. In this situation we
want to know which country has the most firms and/or individuals on the debarred list. To
answer that question you’ll move “Country” from the Field name list to the Row Labels box. As
soon as you drop “Country” under Row Labels you should see a list of country names appear in
the PivotTable box. This list is alphabetical and each country name should be listed only once.
Figure 11
Next, we’ll want to count the number of debarments for each country. Remember that each row
in the spreadsheet represents one firm or individual debarred. To count up the totals by country,
drag “Country” into the Values box as well. Because “Country” contains text rather than numbers,
Excel counts the records in each group. See Figure 12.
Figure 12
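The two drags, “Country” to Row Labels and “Country” to Values, amount to grouping and counting. A minimal Python sketch of the same idea follows; the rows are hypothetical placeholders and are not taken from WorldBank.xlsx.

```python
from collections import Counter

# Hypothetical rows; the firm names and countries are placeholders,
# not data from WorldBank.xlsx.
debarred = [
    {"name": "Firm 1", "country": "Country B"},
    {"name": "Firm 2", "country": "Country A"},
    {"name": "Person 3", "country": "Country C"},
    {"name": "Firm 4", "country": "Country A"},
    {"name": "Firm 5", "country": "Country B"},
    {"name": "Person 6", "country": "Country A"},
]

# Row Labels: one group per distinct country, listed alphabetically.
groups = sorted({row["country"] for row in debarred})
print(groups)  # ['Country A', 'Country B', 'Country C']

# Values: count how many rows fall into each group.
counts = Counter(row["country"] for row in debarred)
for country in groups:
    print(country, counts[country])
```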
The last step is to get the country with the most debarments to the top of the list. For this we’ll
need to sort. Sorting works differently in a PivotTable than in a regular sheet. Here, all you need
to do is click on any number next to a country and use the dropdown on the sort icon to
select “Descending” (Figure 13). You can also sort the records alphabetically by clicking on any
one of the country names, selecting the sort icon, then choosing whichever option you prefer.
Figure 13
Sorting by the number of debarments brings Canada to the top of the list, followed by the United
States, Indonesia and the United Kingdom. Notice that Excel creates a Grand Total row at the
bottom. This total should equal the number of records in your original spreadsheet.
Figure 14
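The descending sort and the Grand Total check can also be sketched in Python. The country values are hypothetical placeholders, not the real counts from the World Bank list.

```python
from collections import Counter

# Hypothetical country column, one entry per record (placeholders only).
countries = [
    "Country B", "Country A", "Country C",
    "Country A", "Country B", "Country A",
]
counts = Counter(countries)

# Sort by count, largest first; ties fall back to alphabetical order.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
print(ranked)  # [('Country A', 3), ('Country B', 2), ('Country C', 1)]

# The grand total should equal the number of records you started with.
grand_total = sum(counts.values())
print(grand_total == len(countries))  # True
```

If the grand total does not match the number of records in the original sheet, that is a sign some rows were left out of the selection.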