Parli-N-Grams
Giuseppe Sollazzo (@puntofisso)
Accountability Hack 2014 Redux
Portcullis House, 15 January 2015
The language used in politics points to the
political, cultural, social background of MPs, parties, and their time
A great search lets you
discover knowledge by just searching
Parli-N-Grams
A search and analysis tool for Hansard
An N-Gram is a sequence of N words extracted from a longer sentence
N-Grams?
You can split any word sequence into lists of n-grams
e.g. “The quick brown fox”
“The quick brown fox” | 1-grams
● the
● quick
● brown
● fox
“The quick brown fox” | 2-grams
● the quick
● quick brown
● brown fox
“The quick brown fox” | 3-grams
● the quick brown
● quick brown fox
“The quick brown fox” | 4-grams
● the quick brown fox
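
The splitting shown on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the Parli-N-Grams implementation: the function name `ngrams` is hypothetical, and tokenisation is assumed to be plain lowercasing plus whitespace splitting, which ignores punctuation.

```python
def ngrams(text, n):
    """Return the n-grams of `text` as a list of space-joined strings.

    Assumption: words are separated by whitespace and compared in lowercase.
    """
    words = text.lower().split()
    # Slide a window of n words across the sequence, one word at a time.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = "The quick brown fox"
for n in range(1, 5):
    print(f"{n}-grams:", ngrams(sentence, n))
```

A sequence of W words yields W - N + 1 n-grams, so the four-word example gives four 1-grams, three 2-grams, two 3-grams and one 4-gram.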
30-second demo
What can we find out?
Related Ideas
1. Create a Markov Chain by analysing text
2. Generate text by using the Markov Chain
Markov Text Generation
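
The two steps above could look roughly like the following sketch. It assumes a first-order, word-level chain built from whitespace-separated tokens; the names `build_chain` and `generate` are hypothetical, and the three-sentence corpus is a toy stand-in rather than real speech data.

```python
import random
from collections import defaultdict


def build_chain(sentences, order=1):
    """Step 1: map each run of `order` words to the words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, start, max_words=20, rng=random):
    """Step 2: walk the chain from `start` until a dead end or `max_words`."""
    words = list(start)
    order = len(start)
    while len(words) < max_words:
        followers = chain.get(tuple(words[-order:]))
        if not followers:
            break  # no word was ever seen after this state
        words.append(rng.choice(followers))
    return " ".join(words)


# Toy corpus for illustration only (an assumption, not the data used in the talk).
corpus = [
    "My Government will work with the Afghan government.",
    "My Government will introduce fixed term Parliaments.",
    "Legislation will be introduced.",
]
chain = build_chain(corpus)
print(generate(chain, ("My",), rng=random.Random(0)))
```

Because repeated followers are stored as duplicates in each list, `rng.choice` picks the next word in proportion to how often it followed the current state in the corpus. A higher `order` tends to give more plausible but less surprising output.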
photo CC BY-NC by UK Parliament: https://www.flickr.com/photos/uk_parliament/8719647747
What happens if we apply Markov analysis and generation to the
Queen’s Speeches?
My Government will be devolved.
FALSE
My Government will work with the Afghan government.
TRUE
My Government looks forward to an enhanced partnership with India.
FALSE
And more...
My Government will be reduced.
My Government will be strengthened in July.
My Government will be strengthened in Afghanistan.
People will be reviewed.
My Government will introduce fixed term Parliaments of unnecessary laws.
My Government will introduce fixed term Parliaments of children.
Legislation will be introduced for a viable Palestinian state existing in Northern Ireland.
My Government will support the abolition of Wales.
My Government looks forward to the blessing of Identity Cards and political parties.
The Almighty God may rest upon the management of the Commons.
Perception: real but implausible
Can we analyse empathy in language?
Can we analyse emotional reactions to speeches?
Do speeches by MPs elicit the same reaction?
A language uncanny valley?
Hubris Syndrome?
P Garrard, D Owen, et al., “Linguistic Biomarkers of Hubris Syndrome”, Cortex, Elsevier, 2013
COPYRIGHT BY THE ARTICLE’S AUTHORS
SEE FOOTNOTE FOR DETAILS
Stylistic similarity in literature
J Hughes et al., “Quantitative patterns of stylistic influence in the evolution of literature”, PNAS, 2012
COPYRIGHT BY THE ARTICLE’S AUTHORS
SEE FOOTNOTE FOR DETAILS
Vocabulary Size
M Daniels, “The largest vocabulary in Hip-Hop”, http://rappers.mdaniels.com.s3-website-us-east-1.amazonaws.com/
COPYRIGHT BY THE ARTICLE’S AUTHORS
SEE FOOTNOTE FOR DETAILS
Reading level
Fox et al., “Who was America’s most well-spoken President?” http://www.vocativ.com/interactive/usa/us-politics/presidential-readability/
COPYRIGHT BY THE ARTICLE’S AUTHORS
SEE FOOTNOTE FOR DETAILS
What lies ahead
Vocabulary Size
Ongoing Work
Run away from PHP -> Python + NLTK
Make it stable; auto-harvest
Search on N-Grams, N>1
Search by MP and political party
-> build each MP’s political history
Build a base of analyses
UI for the public + researchers
Questions?
Parli-N-Grams
http://parli-n-grams.puntofisso.net
http://github.com/puntofisso/AccHack14
http://www.slideshare.net/puntofisso