www.sas.ac.uk
Professor Lorna M. Hughes
School of Advanced Study, University of London
@lornamhughes
Digital Humanities, Big Data, and New Research Methods
Digital Music Lab: Analysing Big Music Data, Final workshop
British Library, March 13th 2015
‘We are all digital humanists now…’
• The content we use is increasingly ‘digital by default’
• We produce, curate and manage vast quantities of data, and are getting better at data management
• We publish digital resources, and digital outputs, that increasingly include data
• Our content is re-used for unforeseen purposes
Core elements of Digital Humanities
Digital Content
• Digital collections, and projects with digital outputs
Methods
• ‘Scholarly primitives’ to gain new knowledge:
Discovering, annotating, comparing, referring,
sampling, illustrating, and representing digital content
Tools
• For processing and analysis
Researchers in the humanities are creating, managing, and using data
• To enable existing research processes to be conducted better and/or faster
• To enable researchers to ask, and answer, completely new research questions
Rhyfel Byd a’r profiad Cymreig / Welsh experience of the First World War: http://Cymru1914.org
• Unified digital archive of 220,000 pages of text, image, audio, film
• Collaborative development between Libraries & academics
• Exposing content for widest harvesting
• Incorporated use and re-use of content into development
• Fully bilingual and accessible user interface
Variants on “Belgian refugees” in Welsh and English, 1914-19 (cymru1914.org)
[Chart; horizontal axis: years 1914 to 1919]
Re-using digital newspaper content
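A count of term variants per year, of the kind charted on this slide, could be sketched roughly as follows. This is a minimal illustration only, not the project's actual method: the variant list, the function name `count_variants_by_year`, and the sample records are all hypothetical, and the sample text is invented rather than taken from the archive.

```python
import re
from collections import Counter

# Hypothetical search variants (an assumption, not the project's query list).
VARIANTS = ["belgian refugees", "belgian refugee", "ffoaduriaid belgaidd"]


def count_variants_by_year(articles, variants=VARIANTS):
    """Count case-insensitive matches of any variant, grouped by year.

    `articles` is an iterable of (year, text) pairs.
    """
    # Try longer variants first so "belgian refugees" is not cut short
    # by the shorter "belgian refugee".
    ordered = sorted(variants, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(v) for v in ordered), re.IGNORECASE)
    counts = Counter()
    for year, text in articles:
        counts[year] += len(pattern.findall(text))
    return dict(sorted(counts.items()))


# Invented sample records, for illustration only.
SAMPLE = [
    (1914, "Belgian refugees arrived in the town. A Belgian refugee spoke."),
    (1915, "Croesawyd y ffoaduriaid Belgaidd i'r capel."),
    (1915, "Concert in aid of the Belgian Refugees."),
    (1916, "No mention here."),
]

result = count_variants_by_year(SAMPLE)
print(result)
```

Years with no matches are kept with a count of zero, so that gaps in coverage stay visible when the counts are plotted.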
Macroscopic analysis
• Distant reading methodologies to work with datasets
• Kyffin Williams Online
• Lloyd Roderick, Aberystwyth University and National Library of Wales
Visualising Data
• Welsh Traditional Music
• Integration of sources to map traditional music and its cultural reception
• Andrew Cusworth: Open University and National Library of Wales
Digital methods in the humanities highlight challenges of Big Data
1. The underlying data and metadata
2. Linking datasets from disparate collections
3. The human infrastructure: data sharing, rights management, open data and open access…
4. Invisibility of digital methods in scholarly outputs: we do not ‘show our workings’
5. Bringing together research questions, data, methods, and tools…
2. Better linking of digital content
“We hoped to be able to send all these people to Glasgow at Easter…”
19th April, 1916: War Refugees Committee
cymru1914.org
W.D. Roberts manuscripts, NLW MS 9982E
4. Transparency of Method: ‘Showing our workings’
Debate about sentiment analysis and the Syuzhet package: Annie Swafford and Matt Jockers
Addressing the challenges
• Better collaborations with the cultural heritage sector
• Better partnerships around data creation and management
• Pay more attention to the human infrastructure: the scholarly ecosystem around digital research
• Develop new approaches to documenting and describing digital methods within traditional publications
Conclusions
• Humanities research questions build an enquiry-led understanding of the essential elements of data
• The key to big data is its unpredictability and unstructured nature: moving beyond scaling up, into the realm of the Known Unknowns
• Understanding the complexity of data is transferable across disciplines and genres
• From small things, big things one day come…