Talk given at Digital Humanities 2011 (DH2011) in Stanford, USA on 21 June 2011.
Web site: http://www.scottishcorpus.ac.uk/corpus/bnc/compair.php
Abstract: https://dh2011.stanford.edu/wp-content/uploads/2011/05/DH2011_BookOfAbs.pdf
This paper will demonstrate ComPair, a new tool to investigate and compare word usage, encouraging new ways to explore language variation. While remaining focussed on usability and ease of navigation, the tool is an evolutionary step forward from the author’s previous award-winning visualisation applications. The paper will introduce the methods and technologies at its core, demonstrate the tool and discuss opportunities for further collaboration.
ComPair: Compare and Visualise the Usage of Language
David Beavan, University of Glasgow, @DavidBeavan
‘You shall know a word by the company it keeps’
Firth, John R., 1957. Modes of meaning. Oxford: Oxford University Press.
Collocation
• Words which go together
• More than by chance, they show an association
• Take a corpus
• Search for a term (node word)
• Examine words in a window (e.g. 5) either side of node
• Aggregate these co-occurring words
• Rank (e.g. by frequency or collocational strength)
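The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind any of the tools shown: the function name `collocates`, the toy corpus and the whitespace tokenisation are all assumptions, and ranking here is by raw frequency only.

```python
from collections import Counter

def collocates(tokens, node, window=5):
    """Count the words found within `window` tokens either side of each hit of `node`."""
    tokens = [t.lower() for t in tokens]
    node = node.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)                 # left edge of the window
        hi = min(len(tokens), i + window + 1)   # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                          # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Toy corpus (hypothetical), split on whitespace.
corpus = ("stanford university is near palo alto and "
          "stanford university hosts the conference").split()

counts = collocates(corpus, "stanford", window=2)
ranked = counts.most_common()  # aggregate, then rank by frequency
```

A real corpus would need proper tokenisation and, usually, ranking by a strength measure rather than frequency alone.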
‘Stanford’ collocate search via Davies, Mark. (2004-) BYU-BNC: The British National Corpus. Available online at http://corpus.byu.edu/bnc.
Collocates
Collocate Cloud
‘Stanford’ search via Beavan, David. (2008-) BNC Collocate Cloud. Available online at http://www.scottishcorpus.ac.uk/corpus/bnc/collocatecloud.php
Collocate Cloud properties
• 100 most frequent collocates listed alphabetically
• Font size shows frequency of word
• Brightness shows collocational strength of word
• Interactively create new clouds
• Best New Idea for Improving a Current Web-Based Tool, 2008 TADA Research Evaluation eXchange (T-REX)
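The display properties listed for the cloud (top collocates by frequency, alphabetical order, font size from frequency, brightness from strength) could be mapped roughly as follows. This is a sketch under assumptions: the function `cloud_styles`, the linear scaling, the point-size range and the sample figures are all hypothetical and are not taken from the Collocate Cloud itself.

```python
def cloud_styles(collocates, top_n=100, min_pt=10, max_pt=40):
    """Map {word: (frequency, strength)} to (word, font size, brightness) tuples.

    Keeps the `top_n` most frequent words and returns them alphabetically.
    Font size scales linearly with frequency, brightness (0..1) with strength.
    """
    top = sorted(collocates.items(), key=lambda kv: kv[1][0], reverse=True)[:top_n]
    if not top:
        return []
    freqs = [f for _, (f, _) in top]
    strengths = [s for _, (_, s) in top]

    def scale(value, lo, hi):
        # Linear rescale to 0..1; a flat range maps everything to 0.
        return 0.0 if hi == lo else (value - lo) / (hi - lo)

    styled = []
    for word, (freq, strength) in sorted(top):  # alphabetical order
        size = min_pt + scale(freq, min(freqs), max(freqs)) * (max_pt - min_pt)
        brightness = scale(strength, min(strengths), max(strengths))
        styled.append((word, round(size), brightness))
    return styled

# Hypothetical (frequency, strength) figures for illustration only.
sample = {"university": (40, 5.0), "the": (100, 1.0), "prison": (10, 9.0)}
styled = cloud_styles(sample)
```

In this toy input the very frequent "the" gets the largest font but the lowest brightness, while the rarer "prison" is small but bright.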
Comparison
• Investigate and compare word usage
  – Expose attitudes and cultures
  – Investigate degrees of synonymy
• Semantic prosody
  – How synonymous words can actually take on positive or negative connotations
• Applications for language learning
  – Examine real-world usage of words
ComPair properties
• Visualise usage of two node words
• Distribute 150+ collocates on a continuum
• Colour shows attraction to node
• Brightness shows degree of collocational attraction
• Currently uses British National Corpus
• Can be applied to any corpus or dataset (in progress)
ComPair how-to
• Take two collocate word lists
  – Same corpus, different node words
  – Different corpora, same node word
• Calculate collocational strength towards each node
  – Mutual Information etc.
• Place collocates on continuum between node words
  – Those with attraction to a single node appear near that node
  – Those with little attraction to either node appear central and dim
  – Those with attraction to both nodes appear central and bright
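One way to read the how-to above is sketched below. The Mutual Information score uses a common corpus-linguistics formulation (log2 of observed over expected co-occurrence); the placement rule in `place` is purely an assumption chosen to reproduce the three behaviours listed, and is not ComPair's published algorithm. All function names and figures are hypothetical.

```python
import math

def mutual_information(pair_freq, node_freq, colloc_freq, corpus_size):
    """MI = log2(observed co-occurrence / co-occurrence expected by chance)."""
    if pair_freq == 0:
        return 0.0  # simplification: treat "never co-occurs" as no attraction
    return math.log2(pair_freq * corpus_size / (node_freq * colloc_freq))

def place(strength_a, strength_b, max_strength):
    """Return (position, brightness) for one collocate.

    position: 0.0 = at node A, 1.0 = at node B, 0.5 = centre.
    brightness: 0.0 (dim) to 1.0 (bright), from the stronger attraction.
    Assumed rule, for illustration only.
    """
    a = max(strength_a, 0.0)  # ignore negative (repelled) scores
    b = max(strength_b, 0.0)
    total = a + b
    position = 0.5 if total == 0 else b / total
    brightness = min(max(a, b) / max_strength, 1.0)
    return position, brightness

# Hypothetical counts: pair seen 8 times, each word 100 times, 1.28M tokens.
mi = mutual_information(8, 100, 100, 1_280_000)

only_a = place(6.0, 0.0, max_strength=8.0)   # attracted to node A only
neither = place(0.0, 0.0, max_strength=8.0)  # central and dim
both = place(8.0, 8.0, max_strength=8.0)     # central and bright
```

Other strength measures (the "etc." on the slide) could be swapped in for `mutual_information` without changing the placement step.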
ComPair: http://www.scottishcorpus.ac.uk/corpus/bnc/compair.php