Searching Political Data by Strategy

Presentation on the Exposé demonstrator, to enable search of Dutch parliamentary proceedings (the Political Mashup data collection).

Roberto Cornacchia, Jaap Kamps, Wouter Alink

Arjen P. de Vries

info@spinque.com

Search by Strategy

An iterative 2-stage search process:

Express domain knowledge as high-level search strategies

Generate a search engine from the strategy: a dynamic REST API, with UI controls for unspecified parameters

Separate search strategy definition (the how) from actual searching and browsing of data collections (the what)

https://devel.spinque.com/ExPoSeApp-20130116/?config=demo#dashboard/demo04:/p/topic/Mokken
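The demo URLs in this deck appear to carry the strategy's open parameters in the fragment as repeated /p/name/value segments. The sketch below reads them back into a dictionary. That reading of the URL scheme is an assumption based only on the URLs shown here, and the function name parse_strategy_fragment is hypothetical, not part of any Spinque API.

```python
from urllib.parse import urlsplit


def parse_strategy_fragment(url: str) -> tuple[str, dict[str, str]]:
    """Split a demo URL into (view, parameters).

    Assumption: the fragment looks like "<view>:/p/<name>/<value>/p/<name>/<value>...".
    """
    fragment = urlsplit(url).fragment
    view, _, param_path = fragment.partition(":")
    parts = [p for p in param_path.split("/") if p]

    params: dict[str, str] = {}
    # Walk the path in ("p", name, value) chunks; ignore anything malformed.
    for i in range(0, len(parts), 3):
        chunk = parts[i:i + 3]
        if len(chunk) == 3 and chunk[0] == "p":
            params[chunk[1]] = chunk[2]
    return view, params


view, params = parse_strategy_fragment(
    "https://devel.spinque.com/ExPoSeApp-20130116/?config=demo"
    "#dashboard/demo05:/p/topic/mokken/p/emphasis_stemming/0"
)
print(view, params)
```

Each parameter left unspecified in a strategy would then correspond to one such name/value pair, which is also what a generated UI control would set.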

Search by Strategy captures:

Arbitrary retrieval unit types (not just documents), e.g., expert finding, entity search

“Semantic” search: the building blocks operate on scored triples

Semi-structured search: data objects may be structured in hierarchies

Exploratory search: use facets as preferences

Exposé

Searching the parliamentary proceedings of the Dutch parliament:

Complete transcripts of everything said in parliament

Organized by parliamentary session

Detailing who says what, in what role and context

Exposé

The original data is PDF, transformed into XML by the award-winning Political Mashup project: http://politicalmashup.nl/

In Politics…

The essence is not only what is said, but also by whom, to whom, and why

Concrete example: Wilders says “knettergek” (Dutch for “stark raving mad”) in parliament (in 2007). Is this remarkable?

“Knettergek” case

The word “knettergek” has been used many times in parliament…

… but never to address a member of the government

Varying result types

Utterances

Person / Party / …

Flexibility

Concrete case: Maarten: “I cannot find Prof. Mokken, who I know has been spoken about in parliament multiple times!”

Flexibility

Default indexing uses stemming and normalization

But searching for people’s names (and, for that matter, much other domain-specific terminology) can be negatively affected by stemming

“Mokken” transformed into “mok”, leading us to geographic locations “Mook” and “De Mok”, but not to the famous professor!

https://devel.spinque.com/ExPoSeApp-20130116/?config=demo#dashboard/demo05:/p/topic/mokken/p/emphasis_stemming/0
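To make the “Mokken” problem concrete, here is a minimal sketch of a search with stemming switched on and off. The toy_stem rules are a crude stand-in invented for this illustration, not the stemmer used by the demonstrator, and the three example texts are made up. Only the behaviour they illustrate (“Mokken” collapsing to “mok” and colliding with “Mook” and “De Mok”) comes from the slide.

```python
import re


def toy_stem(token: str) -> str:
    """Toy normalizer standing in for a real Dutch stemmer (assumption)."""
    t = token.lower()
    if len(t) > 4 and t.endswith("en"):
        t = t[:-2]                            # strip a trailing "-en"
    if len(t) > 2 and t[-1] == t[-2] and t[-1] not in "aeiou":
        t = t[:-1]                            # undouble a final consonant
    return re.sub(r"([aeiou])\1", r"\1", t)   # collapse doubled vowels


def search(query: str, docs: list[str], stemming: bool = True) -> list[str]:
    """Return the docs containing the query term after normalization."""
    norm = toy_stem if stemming else str.lower
    q = norm(query)
    return [d for d in docs
            if q in {norm(tok) for tok in re.findall(r"\w+", d)}]


# Invented example texts, for illustration only.
DOCS = [
    "Prof. Mokken was mentioned in the debate",
    "A motion concerning Mook",
    "A question about De Mok",
]

print(search("Mokken", DOCS, stemming=True))   # all three texts match
print(search("Mokken", DOCS, stemming=False))  # only the text naming Mokken
```

With stemming on, the name is indistinguishable from the two place names. Turning it off, as the emphasis_stemming parameter in the URL above seems to do, leaves only the exact match.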

Joins to the rescue!

Which house speakers from the Rotterdam harbour say what about “Amsterdam”?

biographies

describes

person

utterance

Semantic Search
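The join path above (biographies, describes, person, utterance) can be sketched over scored triples as follows. Everything here is illustrative: the identifiers, the “spoke” predicate, the example texts, and the choice to combine scores by multiplication (keeping the best path) are assumptions, not the demonstrator's actual building blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    score: float


def text_match(texts: dict[str, str], term: str) -> dict[str, float]:
    """Toy ranking: score 1.0 for every text containing the term."""
    return {k: 1.0 for k, v in texts.items() if term.lower() in v.lower()}


def follow(scored: dict[str, float], triples: list[Triple],
           predicate: str) -> dict[str, float]:
    """Join scored subjects with triples; propagate score to the objects."""
    out: dict[str, float] = {}
    for t in triples:
        if t.predicate == predicate and t.subject in scored:
            s = scored[t.subject] * t.score
            out[t.obj] = max(out.get(t.obj, 0.0), s)  # keep the best path
    return out


def intersect(a: dict[str, float], b: dict[str, float]) -> dict[str, float]:
    """Keep items present in both result sets, multiplying their scores."""
    return {k: a[k] * b[k] for k in a.keys() & b.keys()}


# Invented data, for illustration only.
BIOS = {"b1": "Worked in the Rotterdam harbour", "b2": "Teacher in Utrecht"}
UTTERANCES = {"u1": "Amsterdam needs better rail links",
              "u2": "On the budget", "u3": "Amsterdam is growing"}
TRIPLES = [
    Triple("b1", "describes", "p1", 0.9),
    Triple("b2", "describes", "p2", 1.0),
    Triple("p1", "spoke", "u1", 1.0),
    Triple("p1", "spoke", "u2", 1.0),
    Triple("p2", "spoke", "u3", 1.0),
]

persons = follow(text_match(BIOS, "Rotterdam harbour"), TRIPLES, "describes")
by_those_persons = follow(persons, TRIPLES, "spoke")
result = intersect(by_those_persons, text_match(UTTERANCES, "Amsterdam"))
print(result)
```

Because every step returns scored items, the same blocks could stop one join earlier and return persons instead of utterances, which is how varying result types fall out of the strategy.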

Advantages

Define and execute custom-built search strategies, specialized to the task, or even to the search at hand

Search multiple data sources at once

Explore and refine results interactively

“Search provenance”: complete transparency on how search results were obtained

Position Statement

Search professionals think in terms of search strategies already

Let them design their own strategies, and thereby tailor their search engines

So they learn to trust what we claim to be the effective information retrieval techniques!
