HCIL Symposium Talk – Ira Chinoy – May 2011
My talk is a bit of a departure from what you have heard and what you will hear at
this symposium. I have been a journalism educator for 10 years, and before that I
was a journalist for 25. I come to you not as a creator of new tools but as a user.
And in particular, I want to talk to you today about what happens at the
intersection of technology, information, and our collective rights to know about the
affairs of government.
SLIDE (Start of series): Watchdog journalism
There has long been a close connection in the United States between the watchdog
function of journalism …
SLIDE: Public records
… and public records. By public records I mean those records of government that the
public has a right to see.
SLIDE: Data as news
I want to talk with you about how journalists use a particular kind of public record –
those maintained as databases.
SLIDE: Not on the web
And I am further focused here on databases that are NOT posted on the Web but
must be obtained directly from the agencies that keep them. I will provide
examples of the ways that reporters have used these databases, …
SLIDE: Obstacles
…I also want to call your attention to the frequent obstacles that reporters face in
obtaining them – obstacles resulting as much from human as technological
considerations. [1:01 to here]
SLIDE: Daily drama, chilling effect
The net result of this daily drama over digital records is a chilling effect on reporters’
willingness to do this work, and that leaves a long list of potentially important
stories that are never done. The barriers and refusals end up sabotaging the ideal of
transparency that informs our basic notions of democratic government.
SLIDE: The takeaway: A challenge to the HCI community
So I am going to conclude my talk with a takeaway that is actually a challenge to the
HCI community. I’d like to see if I can interest you in finding alternatives to these
case-by-case battles. [1:28 to here]
SLIDE: Computer-Assisted Reporting
The practice of obtaining and analyzing databases for news has been around for
decades. About two decades ago, it picked up a label – “computer-assisted
reporting.” The idea is not that an entire story is produced by sitting at a computer.
Rather, tools of analysis – such as relational databases – are combined with
traditional reporting methods.
I’ll start by sharing some classic examples. HOLD HERE – DO NOT ADVANCE
SEGUE: The holy grail of computer-assisted reporting may be the linkage of
databases that were never meant to go together. In 1990, reporters at the St. Louis
Post-Dispatch were investigating voter fraud in East St. Louis, Illinois. They
compared the database of registered voters, including those who had actually cast
votes, with a database of people who were dead.
What do you think they found? …
SLIDE: Dead voters
… Here’s what. It’s a story headlined, “Dead or Alive: City’s Ineligible Voters
Number In Thousands.” Next to it is a graphic that says “More Voters than Adults in
East St. Louis.” Here’s the lead:
“A man named Admiral Wherry, an army veteran who owned a barbecue pit
and a tire repair shop in East St. Louis, died more than two years ago. But that
didn’t stop him from voting in the Illinois Democratic Primary on March 20.”
The story revealed how a failure to cull the voter registration list on a regular basis
had opened the door to fraud. [2:41 to here]
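For readers who want to see what linking two such databases can look like, here is a minimal sketch using Python's built-in sqlite3 module. It is not the Post-Dispatch's method or data: the table names, column names, and rows are all invented for illustration, and matching on name plus birth date is an assumed rule. A match from a query like this is a lead to check with traditional reporting, not proof.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voters (name TEXT, birth_date TEXT, last_voted TEXT);
    CREATE TABLE deaths (name TEXT, birth_date TEXT, death_date TEXT);
""")
conn.executemany("INSERT INTO voters VALUES (?, ?, ?)", [
    ("A. EXAMPLE", "1920-01-05", "1990-03-20"),
    ("B. SAMPLE", "1948-07-11", "1990-03-20"),
    ("C. PLACEHOLDER", "1931-09-30", "1986-11-04"),
])
conn.executemany("INSERT INTO deaths VALUES (?, ?, ?)", [
    ("A. EXAMPLE", "1920-01-05", "1988-02-14"),
    ("C. PLACEHOLDER", "1931-09-30", "1989-06-01"),
])

# Assumed matching rule: same name and same birth date.
# Keep only ballots dated after the recorded death.
# ISO-format dates stored as text compare correctly as strings.
rows = conn.execute("""
    SELECT v.name, d.death_date, v.last_voted
    FROM voters AS v
    JOIN deaths AS d
      ON v.name = d.name AND v.birth_date = d.birth_date
    WHERE v.last_voted > d.death_date
    ORDER BY v.name
""").fetchall()

for name, death_date, last_voted in rows:
    print(f"{name}: died {death_date}, ballot recorded {last_voted}")
```

In this made-up data, only the first voter is flagged. The third is on both lists but has no ballot recorded after the date of death.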
SLIDE: School Bus Drivers 1 - joins
In 1986, the Providence Journal responded to a series of fatal school bus accidents
by looking for links between databases of school bus drivers, traffic court cases, and
criminal convictions.
SLIDE: School bus drivers 2 - story
Here’s what THEY found. Basically, scary people driving the children of Providence
Journal readers to school, and large gaps in the regulatory system. [3:00 to here]
SLIDE: Hurricane Andrew 1 – NOAA IMAGE
In 1992, Hurricane Andrew blew across south Florida, leaving an enormous amount
of destruction in its wake.
SLIDE: Hurricane Andrew 2 – Wind speeds
The Miami Herald linked data showing the highest wind at any given location – the
highest being in orange here…
SLIDE: Hurricane Andrew 3 – Damage
… with data showing properties that were rendered uninhabitable, the colored
areas here. You can see that the winds and the degree of damage are not much of a
match.
SLIDE: Hurricane Andrew 4 – Pulitzer
Much of the damage, in fact, followed a building boom. The Herald’s investigation
revealed, as the Pulitzer Prize board put it, “how lax zoning, inspection and building
codes had contributed to the destruction.” [3:32 to here]
SLIDE: Black lung 1: Dust, deception & death
Sometimes the story is less about what is in the data than what is missing. The
Louisville Courier-Journal wanted to know why miners are still contracting and
dying of black lung.
SLIDE: Black Lung 2: Kills 1,500 a year
As part of earlier reforms, systems were put in place to record the amount of coal
dust in the mines. Readings end up in a database maintained by the U.S. Mine
Safety and Health Administration. The newspaper analysis showed a remarkable
pattern in the test results, with many mines almost dust-free – impossibly so.
SLIDE: Black Lung 3: 2 teaspoons of dust
As this excerpt shows, many had readings that were, quotes, “the equivalent of two
teaspoons of dust spread across a warehouse the size of a football field…”
How could that be?
SLIDE: Black Lung 4: 234 miners interviewed
The reporters tracked down and interviewed 234 miners. They heard about rampant
cheating to confound the surveillance, with miners themselves participating –
essentially putting their health at risk to save their livelihood. [4:24 to here]
SLIDE: CNS 1: Capital News Service frontpage
I teach computer-assisted reporting here at the University of Maryland, and the
students practice what they learn. The College of Journalism operates a newswire,
Capital News Service, or CNS. It is staffed by students, with veteran instructors as
editors. Stories are picked up by news organizations in Maryland and Washington,
D.C. These stories have become increasingly important here as traditional news
organizations face cutbacks. Students who obtain databases as part of their work in
my class tend to use them to do stories in the next semester at CNS.
SLIDE: CNS 2: Elevators
Even simple databases can tell important stories. From a database that included
inspection dates for elevators in Maryland, one student discovered a pattern of
chronic under-inspection, the result, as reported here, of two decades of
inadequate staffing.
SLIDE: CNS Stories using databases of public records [do not read these]
CNS stories using databases of public records:
Consumer complaints / Prison violence / Boating safety enforcement
Train accidents / Amusement rides / Subprime loans
Unsolved homicides / Leaking underground oil storage tanks
Campaign finance / Lawyer discipline
You can see in this slide just a few examples of the topics of other stories CNS
students have done. A couple of them have won national citations from a
professional journalism organization, Investigative Reporters and Editors.
SLIDE: Animal complaints 1
Not all stories done with databases of public records have to be exposés. A former
student working at the City Paper in Baltimore got the database of calls to the city’s
311 complaint line. Here’s an item on animal complaints. You’ll see that in addition
to the usual city-dwelling wildlife … dogs, cats, bats …
SLIDE: Animal complaints 2
… Chris found complaints about “snapping turtles, sheep, a hyena, an iguana, a
coyote, a swarm of bees, and a baby tiger” – though he notes, “that probably
wasn’t a hyena.” [5:46 to here]
SLIDE: Public Information Act
All this is well and good. Now for the dark side.
For decades, laws have recognized the public’s right to know about the affairs of
government. At the federal level, the Freedom of Information Act begins with the
presumption that all federal records are public as long as they do not fall into one of
several categories of exemption, such as national security.
States have their own laws. In Maryland, this law is called the Public Information
Act. These laws set out the limits on the time agencies have to respond and the fees
they may charge.
In 1996, FOIA was amended to allow requestors the right – explicitly – to get
databases of government records as databases – that is, rather than as printouts…
SLIDE: Senate Bill 740
… It has taken Maryland 15 years to catch up, which it did this year.
Unfortunately, this is not going to resolve, in my opinion, some of the more deeply
rooted issues that prevent journalists in many cases from getting these records in a
timely or affordable way – or at all. [6:42 to here]
I want to give you some examples of what these obstacles look like:
SLIDE: Florida teachers 1: Herald-Tribune (Sarasota)
Reporters at the Herald-Tribune in Sarasota were looking into teacher certification
practices in Florida.
SLIDE: Florida teachers 2: How did story
They eventually obtained several databases, including a database of teacher
certification test scores.
SLIDE: Florida teachers 3: State fights
The data was readily available, but the state put up a fight, and the paper had to
sue. The reporters eventually prevailed …
SLIDE: Florida teachers 4: 9 months
… but the state took a total of 9 months to provide the data.
SLIDE: Florida teachers 5: Horne condemns
Along the way, a state education commissioner publicly condemned the reporters
and their request, …
SLIDE: Florida teachers 6: Time spent ...
… saying, “The time spent on such tasks is time taken away from increasing student
achievement among all of Florida’s students.”
SLIDE: Florida teachers 7: Took one hour
When the reporters finally got the data, the irony was that it took them about an
hour to do the basic analysis.
SLIDE: Florida teachers 8: Teachers who fail - lead
Of course, they did much more. Here’s the lead:
“More than half a million Florida students sat in classrooms last year in front of teachers who failed the state’s basic skills tests for teachers.
“Many of those students got teachers who struggled to solve high school math problems or whose English skills were so poor, they flunked reading tests designed to measure the very same skills students must master before they can graduate.”
The story went on to report that students in the neediest areas were more likely to
have teachers who had done poorly on these tests.
Certainly, this is an example of the kind of story that more often than not may just
stay in the dark in the face of determined resistance from gatekeepers.
[7:59 to here]
SLIDE: 38 Excuses 1: Title page, 1994
In 1994, 17 years ago, a group of journalists got together to compare notes on the
resistance they were getting. They prepared a now infamous list, “the 38 excuses.”
We still hear them. These are some we hear most often:
READ THESE SLIDES AS THEY COME UP
SLIDE: We've never done that before.
SLIDE: We don't know how to do that.
SLIDE: It takes too long.
SLIDE: It costs too much money for us to do it.
SLIDE: There are confidential records mixed in.
SLIDE: We don't think you'll understand the data / technology, you'll mess it up.
SLIDE: We'd love to give it to you but it violates our contract with the software
company.
When I show these to my students, they can’t believe it. How hard can this be, I can
see them thinking to themselves – and admitting later. By the end of the semester,
we always – ALWAYS – hit at least half of the 38. And by then, the students no
longer think it’s funny.
Sad to say, this far along in the computer era, requests for
records in digital form are still treated as strange and even
threatening.
I have seen all of these, many times over, as the initial response to requests that are
ultimately successful. I have also seen them many times as responses to requests
that are not successful. [9:07 to here]
SLIDE: A Maryland case
I want to talk to you about one case in a student project here as an example. This
student was looking for the database maintained by a government agency that
served as a resource for worker complaints over non-payment of wages and
commissions.
SLIDE: Documentation
We always ask for database documentation – record layouts, relationships
between tables, and so forth. He was not given this initially …
SLIDE: Fields
… and was told he could have five fields of data. When he actually succeeded in
getting the documentation, he learned that there were 115 fields.
SLIDE: Format
And although the database was already a Microsoft Access file, he was told that it
would first be converted to a PDF format – virtually useless for analysis…
SLIDE: Cost
… and he was told he would be charged $450 for the work to produce the records in
that PDF format… Or he could pay even more – $600 – for some kind of spreadsheet
described as, quotes, “non-alterable.”
SLIDE: Time
Months went by. The semester ended, but the student wanted to keep going.
SLIDE: Luck
Then a stroke of luck. A person moved into the agency’s communications director
slot who was willing to ask INSIDE her agency why the database couldn’t just be
handed over as a database. When she was told it would be a lot of work, she said
“Show me.” The time needed to do the work turned out to be measurable in
minutes, as I understand it, not hours or days, …
SLIDE: Database and a story:
… and the database was on its way to the student, with no fees necessary.
SLIDE: Maryland workers file thousands …
And the student had his story, which reported that workers had recouped millions
of dollars through the program, but it was being suspended as a cost-cutting move.
[10:33 to here]
SLIDE: The fear factor …
We could spend a whole day talking about what accounts for these perpetual
obstacles. One useful way to think about the responses we get is that each exists
somewhere along a continuum of fear.
SLIDE: … a continuum:
On one end of the continuum [gesture to left], there are well-founded fears. Let me
ask you – have you ever seen a news story that had an error in it? Of course you
have. Does a public official have justification to be concerned when a reporter
comes looking for a story? Sure.
Over on the other end [gesture to the right], I have seen the situation in my own
reporting career in which an agency head knows that the release of the data will
reveal malfeasance.
And then somewhere in the vast middle are those who fear all those things in the
list of 38 excuses. For example, that it will take a lot of work.
And they may fear this because they think that copying a simple database with
10,000 records will take a really long time – perhaps because they are in the habit of
thinking about paper records. And they do not understand that, with a few
keystrokes, their staff can redact the part of a database that may not be legally
disclosed, even if that database contains 100,000 records.
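To make the "few keystrokes" point concrete, here is a minimal sketch, again using Python's built-in sqlite3 module. The table, its columns, and its 100,000 rows are all invented for illustration, and it assumes that redaction means leaving out one confidential column. Real cases can also require dropping or masking particular rows, which is a similar one-line filter.

```python
import sqlite3

# Hypothetical agency table; every name, column, and row is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims "
    "(claim_id INTEGER, confidential_id TEXT, employer TEXT, amount_owed REAL)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [(i, f"PRIVATE-{i}", f"Employer {i % 50}", 100.0 + i) for i in range(100_000)],
)

# The redaction step: copy every row, but only the columns that may be
# disclosed. The confidential column never reaches the public copy.
conn.execute(
    "CREATE TABLE claims_public AS "
    "SELECT claim_id, employer, amount_owed FROM claims"
)

# Check what the public copy contains.
public_columns = [row[1] for row in conn.execute("PRAGMA table_info(claims_public)")]
public_count = conn.execute("SELECT COUNT(*) FROM claims_public").fetchone()[0]
print(public_columns, public_count)
```

The single CREATE TABLE ... AS SELECT statement does the same work whether the table holds 10,000 records or 100,000, which is the contrast with redacting paper records page by page.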
Databases are just a mystery to many people in positions of authority when it
comes to acting on these requests.
Or maybe they know what the data will show, and they fear the reporter will not
see it the way they do. Or their agency hasn’t analyzed the data and they have no
idea what it might show – and they operate from the premise, in this case, that no
news is good news.
[11:56 to here]
SLIDE: Other factors: [DO NOT READ THESE]
Administrators, public information officers, and attorneys with inadequate training, experience, and support for handling database requests.
An agency culture in which requests for records are treated as an intrusion.
Inadequate budgeting to deal with public records requests.
Limited repercussions for the “just say no” approach to requests.
I do not mean to suggest that fear is the only factor affecting an agency’s stance in
responding to requests for digital public records. But thinking in terms of fear is
certainly a useful heuristic when trying to understand the resistance.
There are plenty of other factors. Here are a few.
Summary remarks for this slide:
We often hear that the request for a copy of a database is a burden and takes
resources away from the REAL work of government. But providing access to records
about the workings of government IS part of the real work of government, and the
law and our national democratic ethos say so. [12:29 to here]
SLIDE: Tactics … [DO NOT READ THESE]
Information has context. Finding out about it first can help you understand what data they keep, why they keep it, how they keep it.
Ask to speak directly to the staff who manage the database. Avoid second-hand exchanges of information.
Know the law.
Look for common ground.
Explore the logic – and illogic – of baffling excuses.
Understand the fears … and address them.
Right now, we deal with the problems we encounter at what might be called a
tactical level. There are some examples here, but I’m not going to recite them
except to say that while they can be helpful in anticipating and responding to
resistance on a case-by-case basis, they do not get at core issues which transcend
these case-by-case encounters.
So, I have been thinking about this since I got involved with the HCI Lab here in the
fall, wondering whether there might be a third way besides perpetually fighting –
which is exhausting – or caving in. [13:00 to here]
SLIDE: STRATEGY; Is there something that can be done well upstream? At the level
of broad public policy?
Is there something at the level of broader strategy, a higher level of action,
perhaps? Here is an example.
SLIDE: Public Records Assessment or Information Impact Statement
We have all sorts of imperatives now built into our understanding of how
government programs and government-regulated endeavors should be conducted.
Environmental impact statements … Fiscal notes …. Set-asides and diversity
checklists…
So: why not require this question to be considered when a government agency
establishes a new record-keeping system: How will those seeking the public
records maintained within that system be able to get them? Something we might
call a public records assessment or an information impact statement.
[13:33 to here]
SLIDE: Challenge and opportunity
I will end here and throw a challenge out to this community – what can we do that
transcends the case-by-case approach? This is classically one of those challenges
that is also an opportunity, especially for a community such as HCI, which is used to
thinking in innovative and interdisciplinary ways.
The alternative, not trying anything new, leaves us where we are now, and that is
not a particularly pretty picture when we compare what goes on in requests for
digital public records to our ideals about transparency.
It is wonderful that more government agencies are posting data on the web. But we
should not be deluded into thinking that this is anything close to the bulk of data
that these and other agencies maintain about issues of great public interest and
importance.
SLIDE: End – email address and web site
So I am eager to hear your observations, your ideas and your questions, and you
have my email and I hope the conversation might continue. [14:21 to here]
Databases obtained (mostly) by students: