Yvonne Leow's Knight Project Proposal

Yvonne Leow's 2014-2015 Innovation Proposal for the John S. Knight Fellowship at Stanford.

YVONNE LEOW

- INNOVATION PROPOSAL -

Local newspapers have been documenting our lives for centuries, but in an increasingly data-driven world, they are failing to realize their potential. While other industries are embracing data science to make smarter business decisions, newspapers are not. Instead, decades of institutional knowledge are squandered in outdated archival systems, or buried in the minds of journalists who eventually retire or leave the industry. This matters because in a marketplace where virality and pageviews are valued over substance, newspapers are losing their identity. They report on towns that are rarely covered by national media; meanwhile their newsrooms continue to shrink. It is time for that to change.

French political thinker Alexis de Tocqueville once said, “When the past no longer illuminates the future, the spirit walks in darkness.” I would like to develop a tool that analyzes and visualizes information from past stories to better inform a newspaper’s editorial and operational strategies. It will save a tremendous amount of time and resources, and more importantly, enable journalists to better serve their communities.

The Problem with Search

The majority of newspaper companies, such as Gannett and the Tribune Company, have digital archives, but they are all limited to a basic search engine. When reporters type in keywords, like “Maricopa County government,” they have to dig through hundreds if not thousands of results to find what they are looking for. Rather than poring over pages and pages of headlines, journalists could generate data visualizations to quickly inform their decision-making. This would improve local journalism on two fronts:

It could help reporters:

- Connect the dots. Historical trends and patterns are easier to spot when they are displayed in a chart, map or graph. Imagine if a tool could illustrate how neighborhood property development has expanded or stalled in the past decade. Or if it could scrape names and titles from articles to illustrate the relationships among public officials, religious leaders and small business owners. (See Figure A below.)

- Discover story ideas. In addition to rookie reporters, local newsrooms are filled with people who have covered a beat for ages. This tool could help them rethink their approach. They could see how frequently they were quoting the same sources or possibly neglecting a community in their coverage.

- Fact-check sources. Reporters often rely on archive searches to fact-check political figures. Instead of looking for similar keywords and phrases, this tool could aggregate past quotes to verify whether the mayor’s political rhetoric changed during his or her administration.

(Figure A)
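
The relationship mapping described under "Connect the dots" could start from something as simple as counting how often two people appear in the same story. The sketch below is only an illustration, not the proposed tool: it assumes an upstream step has already reduced each article to the set of people it names, and every name in the sample data is invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each article reduced to the set of people it names.
# In practice these sets would come from an entity-extraction step.
articles = [
    {"Mayor Ana Ruiz", "Pastor Lee Grant", "Dana Cole"},
    {"Mayor Ana Ruiz", "Dana Cole"},
    {"Pastor Lee Grant", "Dana Cole"},
    {"Mayor Ana Ruiz", "Dana Cole"},
]


def co_mentions(articles):
    """Count how many articles mention each pair of names together."""
    pairs = Counter()
    for names in articles:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for a, b in combinations(sorted(names), 2):
            pairs[(a, b)] += 1
    return pairs


edges = co_mentions(articles)
strongest_pair, shared_articles = edges.most_common(1)[0]
print(strongest_pair, shared_articles)
```

Each pair and its count can be read as a weighted edge, which is the raw material a network chart like Figure A would need.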

With limited staff and shrinking budgets, editors have to be smarter about how they direct their coverage.

Newsroom managers could also use this tool to:

- Identify coverage gaps. Editors could visualize how many stories have been written about school districts in upper-class versus working-class neighborhoods in the past five years. They could use that kind of information to guide their beat reporters.

- Enforce accountability. In addition to quantifying facts and figures, editors could use sentiment analysis to measure the breadth of their coverage. It could break down, for instance, how many negative stories have been published about predominantly Hispanic and Asian immigrant communities compared to white middle-class communities. (See Figure B below.) Hidden biases may exist, but newsroom leaders could use this tool to diversify reporting.

- Create a source of revenue. Most small nonprofits, businesses and government agencies share the same challenges as newspapers. They do not have the resources to integrate data analytics into their operations. It is very likely they would be interested in the data or a way to visualize years of information about their cities. If newspapers continue to add reporting into the database, they could potentially sell this tool or the technology powering it to local organizations.
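
The coverage-gap and accountability checks listed above come down to tallies over tagged stories. As a rough sketch only, the code below assumes each archived story already carries a community tag and a tone label produced by some upstream sentiment classifier. The field names, community names and records are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: one per archived story. The "tone" label is assumed
# to come from a separate sentiment-analysis step that is not shown here.
stories = [
    {"community": "Eastside", "tone": "negative"},
    {"community": "Eastside", "tone": "negative"},
    {"community": "Eastside", "tone": "negative"},
    {"community": "Eastside", "tone": "positive"},
    {"community": "Northgate", "tone": "positive"},
    {"community": "Northgate", "tone": "neutral"},
    {"community": "Northgate", "tone": "negative"},
    {"community": "Northgate", "tone": "positive"},
]


def coverage_summary(stories):
    """Return {community: (story_count, share_of_negative_stories)}."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for story in stories:
        totals[story["community"]] += 1
        if story["tone"] == "negative":
            negatives[story["community"]] += 1
    return {name: (count, negatives[name] / count) for name, count in totals.items()}


summary = coverage_summary(stories)
for name, (count, negative_share) in sorted(summary.items()):
    print(f"{name}: {count} stories, {negative_share:.0%} negative")
```

Comparing the story counts exposes coverage gaps, while comparing the negative shares is the kind of breakdown Figure B refers to.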

The problem newsrooms face is that their data is meaningless without context. By analyzing and visualizing decades of historical coverage, this tool would supply that context.

(Figure B)

The Era of Big Data

Organizations of all kinds use natural language processing (NLP), machine learning and data visualization techniques to address industry challenges. Palantir Technologies is a notable data analytics company that tackles issues like climate change and cyber attacks. A professor at Columbia University recently created MedLEE (Medical Language Extraction and Encoding System), which extracts medical information from past patient reports. A Stanford Ph.D. graduate founded a startup called Ayasdi, which uses topological data analysis and machine learning to visualize massive data sets without writing algorithms or queries. On the visualization side, several private software companies and open-source platforms help organizations create charts and infographics. Tableau, Visual.ly, CartoDB, Google Fusion Tables and Timeline.js are just a few examples. Even data journalism is not new. Knight-funded projects like DocumentCloud and Overview process thousands of PDFs and government document dumps. Data analysis, however, has primarily been a way to tell a story. While the underlying technology exists, it has not been commonly applied to help reporters cover their communities, streamline internal operations and generate new revenue.

A Year at Stanford

If awarded a Knight fellowship, I would create a functional prototype of my proposal by the end of the year. Here is my plan of action:

- Months 0-3: I want to be fluent in the latest NLP, machine learning and sentiment analysis techniques. I would immerse myself in Stanford’s computer science department, particularly with the nationally recognized NLP group led by Christopher Manning. Given my role at Digital First Media and my relationship with managing editors in the Bay Area News Group, I am confident I can work with local newspapers to access their archive data. I would gather a three-year sample of archive news stories, and then create a graph database where I would assign relationships between names, locations, organizations and other relevant terms.

- Months 3-6: Design would be the next step. I hope to enroll in Stanford d.school courses to learn how best to visualize information and to study user experience design. I will also take advantage of the university’s proximity to Silicon Valley, particularly Palo Alto-based companies like Palantir and Ayasdi, to interview data scientists and product managers. The objective is to begin visualizing the article data and wireframing a user interface for the prototype. I will gather feedback from Stanford professors and local journalists to refine the design.

- Months 6-10: During these last few months, I hope to enroll in entrepreneurship courses and develop a working web application that visualizes the sample data. I hope to convince editors that it is not only feasible, but also imperative to integrate data science into local journalism. After demonstrating the project to Knight colleagues, Stanford professors and industry peers, I would like to present the idea to newspaper groups and other potential funders.
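
To make the graph database mentioned under Months 0-3 more concrete, here is a minimal in-memory stand-in. It is a sketch under stated assumptions, not a design for the prototype: the class name, the type labels, the relation labels and the sample entities are all hypothetical, and a real graph database would replace this class entirely.

```python
from collections import defaultdict


class ArchiveGraph:
    """Tiny in-memory stand-in for a graph database of archive entities."""

    def __init__(self):
        self.node_types = {}           # entity name -> type label
        self.edges = defaultdict(set)  # entity name -> {(relation, other)}

    def add_entity(self, name, entity_type):
        self.node_types[name] = entity_type

    def relate(self, source, relation, target):
        # Stored in both directions so lookups work from either entity;
        # edge direction is deliberately ignored in this sketch.
        self.edges[source].add((relation, target))
        self.edges[target].add((relation, source))

    def neighbors(self, name, entity_type=None):
        """Entities linked to `name`, optionally filtered by type label."""
        linked = self.edges.get(name, set())
        return sorted(
            other
            for _, other in linked
            if entity_type is None or self.node_types.get(other) == entity_type
        )


graph = ArchiveGraph()
graph.add_entity("Dana Cole", "PERSON")
graph.add_entity("Riverbend Council", "ORGANIZATION")
graph.add_entity("Riverbend", "LOCATION")
graph.relate("Dana Cole", "MEMBER_OF", "Riverbend Council")
graph.relate("Riverbend Council", "LOCATED_IN", "Riverbend")

print(graph.neighbors("Riverbend Council"))
print(graph.neighbors("Riverbend Council", "PERSON"))
```

Typing each node as a name, location or organization is what lets a query ask for only the people, or only the places, connected to a given entity.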

The Power of Legacy

At the end of the day, it is not all about the data. This proposal is about changing local newsroom culture so that journalists cover their communities more intelligently and effectively. In an era when newspapers are contracting and rapidly falling behind, this is an effort to help them evolve. To innovate is to make changes in something established by introducing new methods, ideas or products. With the support of the Knight Fellowship, I want to help newspapers realize that their legacy does not have to be a burden; it can be a competitive advantage. The key to their future is simply buried in their past.