The data journalist checklist michael j. berens - las vegas news train - oct. 10-11, 2014

DESCRIPTION

Handout for Data Journalism 101, presented by Michael Berens at the Las Vegas Newstrain October 10-11, 2014. Once a potential watchdog story is identified, discover time-saving techniques to drill through mountains of information -- from paper records to electronic databases -- and extract the critical information that turns routine stories into must-read enterprise. This session provides simple methods and innovative reporting tools to mold raw data into hard-hitting stories.

The Data Journalist Checklist

Michael J. Berens - The Seattle Times | @MJBerens1 | [email protected]

It’s the small things that bite you in this profession. That’s especially true with computer‐assisted reporting. Here are some of my simple philosophies:

Learn only what you need to know to do the job at hand

Toss aside manuals. You don’t have to master the software to do amazing things. Begin with a handful of core commands – import, sort and filter. Build a repertory of new skills based on what you need. Case example: State health officials claimed that sexual misconduct violations were rare among licensed health practitioners. I obtained a state disciplinary database, imported it into Excel and counted the violations. I found more than 500 violators. The result: License to Harm, which was a Pulitzer Prize finalist. Just a handful of Excel skills can create powerful stories.
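If you work in code rather than Excel, the same three core commands might look like the minimal Python sketch below. The column names and rows are made up for illustration; they are not taken from the state database described above.

```python
import csv
import io

# Made-up sample rows standing in for a disciplinary database export.
# Column names and values are hypothetical.
SAMPLE_CSV = """license_no,profession,violation
1001,nurse,sexual misconduct
1002,physician,record keeping
1003,physician,sexual misconduct
1001,nurse,sexual misconduct
"""


def load_rows(text):
    """Import: parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, violation):
    """Filter: keep only the rows whose violation matches."""
    return [r for r in rows if r["violation"].strip().lower() == violation]


def sort_rows(rows, key):
    """Sort: order whole rows by one column so every field stays with its row."""
    return sorted(rows, key=lambda r: r[key])


rows = load_rows(SAMPLE_CSV)
matches = sort_rows(filter_rows(rows, "sexual misconduct"), "license_no")

violations = len(matches)                              # every matching record
violators = len({r["license_no"] for r in matches})    # distinct license holders
print(violations, "violations by", violators, "license holders")
```

Counting records and counting distinct people are different questions, so the sketch keeps them as two separate numbers.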

No detail is too small

Avoid the temptation to leave out data that appears inconsequential – it may later be a critical component. Case example: My mission was to examine the local drug war. Using Excel, we computerized drug search warrants – address, how many arrests, quantity of drugs found, etc. An editor also asked what day of the week was the most popular for drug raids. I thought it was a stupid question. But I did it. The result: cops conducted drug raids only on weekdays. Drug dealers knew this. In our interviews, dealers told us they only sold on weekends. Great sidebar? You bet.
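The day-of-the-week question is a quick one to answer once the dates are captured. Here is a rough Python sketch, assuming the warrant dates are stored as YYYY-MM-DD text; the dates listed are invented for illustration.

```python
from collections import Counter
from datetime import date

# Fixed English names so the output does not depend on the computer's locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical warrant dates, made up for this example.
raid_dates = ["2014-10-06", "2014-10-07", "2014-10-07", "2014-10-10"]


def weekday_counts(iso_dates):
    """Count how many dates fall on each day of the week."""
    counts = Counter()
    for text in iso_dates:
        day_index = date.fromisoformat(text).weekday()  # Monday is 0
        counts[DAY_NAMES[day_index]] += 1
    return counts


counts = weekday_counts(raid_dates)
for name in DAY_NAMES:
    print(name, counts[name])
```

Printing all seven days, including the zeros, is the point: the empty weekend rows are what made the sidebar.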

Know the rules of the data

Every database has a file layout or manual that defines the order and definitions of the information. NEVER work with data without the manual. It’s important to know what is captured (or counted) and what is excluded. Case example: To track fatal police chases, I obtained a federal database (Fatality Analysis Reporting System). A data field designated whether the crash was linked to a police pursuit. The problem (as the manual noted): the government doesn’t count crashes when officers intentionally ram cars. Under their logic, this wasn’t an accident. And the data doesn’t include crash deaths that occur on private property.

Segregation is good

Every unique piece of information should go into its own “cell” on a spreadsheet program. Separate numbers and letters (words) from each other, like addresses. Create columns for yes/no results for easier counting. Case example: I obtained hundreds of pages of inspection reports for adult family homes. Report by report, I divided the findings into columns. Did this case involve physical abuse – yes/no? Was there a lack of food? Was there night staffing? I uncovered more than 50 categories; each went into its own column. That’s how I could count how many elderly patients were roped into wheelchairs or how many choked to death on food.
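To show why one-fact-per-column pays off, here is a small Python sketch that counts the "yes" answers in each column and splits an address into number and street. The three category columns and all the rows are hypothetical stand-ins, not the real inspection data.

```python
import csv
import io
import re
from collections import Counter

# Made-up inspection rows; the column names are assumptions for illustration.
SAMPLE_CSV = """report_id,address,physical_abuse,lack_of_food,night_staffing
A1,12 Oak St,yes,no,yes
A2,400 Pine Ave,no,no,no
A3,7 Elm Rd,yes,yes,yes
"""

FLAG_COLUMNS = ["physical_abuse", "lack_of_food", "night_staffing"]


def split_address(address):
    """Split '12 Oak St' into the house number and the street name."""
    match = re.match(r"\s*(\d+)\s+(.+)", address)
    if match is None:
        return "", address.strip()  # no leading number found
    return match.group(1), match.group(2).strip()


def count_yes(rows, columns):
    """Count the 'yes' answers in each yes/no column."""
    totals = Counter()
    for row in rows:
        for column in columns:
            if row[column].strip().lower() == "yes":
                totals[column] += 1
    return totals


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
totals = count_yes(rows, FLAG_COLUMNS)
for column in FLAG_COLUMNS:
    print(column, totals[column])
print(split_address(rows[0]["address"]))
```

Because each finding sits in its own column, adding a 51st category means adding one column name to the list, not rewriting the count.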

Complex is good

The more variables, the better the story. This goes hand-in-hand with segregation and no piece of information is too small. Think of the data as a source. The more questions you can ask the source, the more answers you might get for a story. Avoid the temptation to skip the seemingly insignificant. Case example: While tracking rates of faulty medical devices, I created a spreadsheet that detailed everything about the device except who was using it and when. Once I realized these were important factors, I had to go back to the first report and recapture the information. I learned that many so-called device errors are human error, and that the most dangerous time for a hospital patient is on the thinly staffed weekend shift when low-paid aides outnumber highly trained nurses.

You’re one keystroke away from career-ending errors

Always remember that your data results can be wildly skewed if you forget a step, like failing to sort all your data or misinterpreting the data. Never guess. Never rush. Double-check everything. Case example: I can’t count how many errors I’ve made with data reporting. I’ve imported data wrong, skewing information three characters to the left (which can look normal in coded data). I’ve failed to sort all the data at one time. See the next three checklist items to protect yourself.

Keep a log

Every time I start a new database, I grab a legal pad and write down every step and every data result. If I screw up, I have a map to retrace my work. This also creates an index of query file names.
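A legal pad works fine. If you would rather keep the log next to the data, one possible digital equivalent is a plain text file with one timestamped line per step. This is only a sketch: the file name, step descriptions and results below are invented.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def log_step(log_path, step, result):
    """Append one timestamped line recording a step and what it produced."""
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(f"{stamp}\t{step}\t{result}\n")


# Demonstration in a temporary folder; the entries are made up.
log_file = Path(tempfile.mkdtemp()) / "analysis_log.txt"
log_step(log_file, "imported sample.csv", "4 rows")
log_step(log_file, "filtered to weekday raids, saved as query_01", "3 rows")

print(log_file.read_text(encoding="utf-8"))
```

The file is opened in append mode, so earlier entries are never overwritten, and the saved query names double as the index.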

Create a master file backup

Make a back-up copy of your original database and never touch it. Likewise, invest in a portable hard drive and back up all your data and files at least once a week. Don’t trust a company computer. At the very least, store your data on a network drive that is backed up.
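One way to enforce the "never touch the master" rule in code is to do all the work on a copy and confirm the copy is identical before starting. The sketch below assumes a single data file; the folder and file names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 fingerprint of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(master, work_dir):
    """Copy the master file into work_dir and verify the copy matches."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    copy_path = work_dir / Path(master).name
    shutil.copy2(master, copy_path)  # copies contents and timestamps
    if sha256_of(master) != sha256_of(copy_path):
        raise IOError("working copy does not match the master file")
    return copy_path


# Demonstration with a throwaway file in a temporary folder.
base = Path(tempfile.mkdtemp())
master = base / "master" / "warrants.csv"
master.parent.mkdir()
master.write_text("address,arrests\n", encoding="utf-8")

working = make_working_copy(master, base / "work")
print("work on:", working.name, "| master untouched:", master.exists())
```

The fingerprint check also gives you something to write in the log: if the master's fingerprint ever changes, someone touched it.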

Share everything

I share raw data findings and methodology before publication with every relevant person in the story. I want to know if I’ve made a mistake. I want to hear from the story’s harshest critic. Case example: Using a court database, I had tracked prosecution rates of domestic violence cases and uncovered a phenomenal finding: prosecutors dismissed 90 percent of cases. But when I checked the finding with prosecutors, it was determined that there had been a data entry error in the clerk’s office.

Force your editor to understand

It’s surprising how many editors will weigh every word in a story but have no idea how data results were crafted. Sit down with your editors and make sure they understand the data process. Their questions will surely make for a better story.

A final note: Where do you learn about computer-assisted reporting? One of the best and most comprehensive sources is Investigative Reporters and Editors (www.ire.org). Get a membership and obtain access to thousands of handouts and tutorials – or attend one of their workshops. Invest in yourself.