Motivation: a question
How do you find sources for research in the humanities?
How do you find literature for research in the “hard” sciences?
Motivation: the differences between humanities and “hard” sciences
• Primary and secondary sources
• Citation history (e.g. Google Scholar)
• Citation semantics
Motivation: primary and secondary sources
Approximately half of the citations in the humanities are to primary sources [Wiberley (2009)].
The use of primary sources has hardly ever been studied with citation-analytic methods.
“For scholarship in the humanities there are three kinds of literature: primary literature that contains the evidence on which humanists base their scholarship, secondary literature in which humanists write up their scholarship, and access services that describe and index the publications written by humanists.” (Wiberley, 2009)
Motivation: citation history
Lack of data [Sula and Miller (2014)]. Why?
• Sparse and local sub-fields
• Nationality (language and schools)
• Proliferation of editorial practices
Motivation: citation semantics
• Humanists are less prone to credit each other than scientists [Heinzkill, 1980; Swales, 1990; Hellqvist, 2010]
• They are less prone to work together: Linmans (2010) found an average of 1.06 authors per publication
• They use citations with a great variety of meanings: agreement, disagreement, full association, minor reference, etc. [Harwood (2008), Cano (1989)]
Examples:
• Strongly negative: “Professor Epstein’s comment presents no new findings and ignores the theoretical issues I raise.”, citing Epstein (2008). From Ogilvie (2008).
• Association: “non basta ridimensionare gli aspetti strutturali del declino economico, che per Venezia fu comunque solo “relativo”, ..” (roughly: “it is not enough to downplay the structural aspects of the economic decline, which for Venice was in any case only ‘relative’, ..”), citing Rapp (1979). From Trivellato (2000).
Motivation: our answer
Citation analysis for the humanities is an almost non-existent field, yet the results could be very rich.
We cannot simply use traditional citation analysis methods on humanities data. We need new questions and methods.
The project: goals
• Digitise all the historiography on Venice that we can (for now, limited to history).
• Extract all citations and populate a database.
• Analyse the history of the history of Venice and develop a framework for citation analysis for the humanities.
• Publish an open-access search engine for scholars and the general public.
The project: goals
“Side effects”: we have the full text of most publications on Venice, and we are also digitising documents at the Archive. This enables:
• Indexes of keywords (e.g. named entities)
• Direct links between publications and sources
• Topic modelling and fine-grained classification of publications (currently Dewey subjects at most)
• Enhanced library catalogue
The project: partners and materials
Partnership with the Ca’ Foscari Library System (humanities library) and discussions with major Venetian libraries.
Digitisation goal: all secondary literature on Venice from the last 200 years (monographs, journals, editions, etc.). The current estimate is circa 5000 items (there are many more). Digitisation is ongoing (1513 done as of last Friday).
Methods I: data extraction
The steps:
• OCR
• Citation detection
• Citation parsing
• Model and populate the db (ontologies for citations)
Basic tools:
• Active annotation for supervised learning (minimise training data to annotate)
• Conditional Random Fields for parsing
• RDF and triple stores as database
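To make the last step concrete, here is a minimal sketch of how one parsed citation might be stored as subject-predicate-object triples, the data model behind RDF. It uses plain Python rather than a real triple store, and the identifiers (`pub:A`, `doc:X`) and predicate names (`cites`, `hasSourceType`) are hypothetical placeholders, not the project's actual ontology.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def citation_to_triples(citing_id: str, cited_id: str, source_type: str) -> List[Triple]:
    """Turn one parsed citation into triples.

    Predicate names are illustrative assumptions, not a real ontology.
    """
    if source_type not in {"primary", "secondary"}:
        raise ValueError(f"unknown source type: {source_type}")
    return [
        Triple(citing_id, "cites", cited_id),
        Triple(cited_id, "hasSourceType", source_type),
    ]


def objects(store: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted({t.obj for t in store if t.subject == subject and t.predicate == predicate})


# Hypothetical data: publication A cites a secondary work and an archival document.
store = citation_to_triples("pub:A", "pub:B", "secondary")
store += citation_to_triples("pub:A", "doc:X", "primary")
print(objects(store, "pub:A", "cites"))
```

Keeping the source type as its own triple means primary and secondary sources can be separated later with a simple query, without changing how citations themselves are stored.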
Methods II: citation analysis, networks
Network-based models. Recall the distinction between primary and secondary sources: how many graphs can we build?
Bibliographic coupling and co-citation
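As a rough illustration of these two measures: the bibliographic coupling strength of two publications is the number of references they share, and the co-citation count of two works is the number of publications that cite both. The sketch below computes both from a list of (citing, cited) pairs. The function name and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def coupling_and_cocitation(citations):
    """Compute both measures from (citing, cited) pairs.

    Returns two dicts keyed by alphabetically ordered pairs:
    coupling[(a, b)]   = number of references shared by publications a and b
    cocitation[(x, y)] = number of publications citing both x and y
    """
    refs = defaultdict(set)
    for citing, cited in citations:
        refs[citing].add(cited)

    coupling = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            coupling[(a, b)] = shared

    cocitation = defaultdict(int)
    for cited_set in refs.values():
        for x, y in combinations(sorted(cited_set), 2):
            cocitation[(x, y)] += 1

    return coupling, dict(cocitation)


# Toy data: A and B both cite X and Y; C cites only Y.
pairs = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y"), ("C", "Y")]
coupling, cocitation = coupling_and_cocitation(pairs)
print(coupling)
print(cocitation)
```

Restricting the cited side to primary sources, to secondary sources, or to both gives a different graph each time, which is one way to read the question of how many graphs can be built.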
Methods II: citation analysis
Network-based models:
• Global analysis
• Local analysis (communities and nodes)
• Temporal analysis
• Publication classification and analysis
Big questions:
• Key works, authors, sources
• Disciplinary segmentations
• Measure intellectual influence and schools of thought
• Map scholarly debates