Text complexity in and for literary studies: foundations



complexity – a definition

“Complexity is generally used to characterize something with many parts where those parts interact with each other in multiple ways.” (Wikipedia)

“the only consensus among researchers is that there is no agreement about the specific definition of complexity” (Wikipedia)

organized complexity
• non-random interaction between the parts of a system
• these correlated parts create a differentiated structure
• the system manifests emergent properties (i.e. properties not reducible to parts of the system)

text complexity and readability
• readability research describes a large number of stylistic features
• how strongly these features correlate with readability is not entirely clear

Vocabulary
• Type-Token Ratio
• Root Type-Token Ratio
• Corrected Type-Token Ratio
• Bilogarithmic Type-Token Ratio
• Uber Index = log(Typ²) / log(Tok/Typ)
• Measure of Textual Lexical Diversity (McCarthy, 2005)
• Lexical Density = TokLex / Tok
• Lexical Word Variation = TypLex / TokLex
• Noun Variation = TypNoun / TokLex
• Adjective Variation, Adverb Variation
• Modifier Variation = (TypAdj + TypAdv) / TokLex
• Verb Variation 1 = TypVerb / TokVerb
• Verb Variation 2 = TypVerb / TokLex
• Squared Verb Variation 1 = TypVerb² / TokVerb
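Several of the vocabulary measures above can be sketched in a few lines of Python. This is only an illustration, not the feature extraction used in the experiments. The slides do not spell out the formulas for the Root, Corrected and Bilogarithmic Type-Token Ratio, so the commonly cited definitions are assumed here (Typ/√Tok, Typ/√(2·Tok), and log Typ / log Tok). The function names, the coarse tag set (`NOUN`, `VERB`, `ADJ`, `ADV` counted as lexical words) and the choice to count a lexical type as a (word, tag) pair are likewise assumptions made for the example.

```python
import math

# Assumption: these four coarse tags mark lexical (content) words.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def _ratio(numerator, denominator):
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def type_token_measures(tokens):
    """Type-token family of measures for a non-empty list of word tokens."""
    tok = len(tokens)
    typ = len(set(tokens))
    if tok == 0:
        raise ValueError("need at least one token")
    return {
        "ttr": typ / tok,
        "root_ttr": typ / math.sqrt(tok),           # assumed definition
        "corrected_ttr": typ / math.sqrt(2 * tok),  # assumed definition
        # log(Tok) is zero for a one-token text, hence the guarded division
        "bilog_ttr": _ratio(math.log(typ), math.log(tok)),
    }


def lexical_measures(tagged):
    """Lexical density and variation measures for (word, tag) pairs."""
    tok = len(tagged)
    lexical = [(word, tag) for word, tag in tagged if tag in LEXICAL_TAGS]
    tok_lex = len(lexical)
    typ_lex = len(set(lexical))  # a type is a distinct (word, tag) pair

    def typ(tag):
        return len({word for word, t in lexical if t == tag})

    def count(tag):
        return sum(1 for _, t in lexical if t == tag)

    typ_verb, tok_verb = typ("VERB"), count("VERB")
    return {
        "lexical_density": _ratio(tok_lex, tok),
        "lexical_word_variation": _ratio(typ_lex, tok_lex),
        "noun_variation": _ratio(typ("NOUN"), tok_lex),
        "modifier_variation": _ratio(typ("ADJ") + typ("ADV"), tok_lex),
        "verb_variation_1": _ratio(typ_verb, tok_verb),
        "verb_variation_2": _ratio(typ_verb, tok_lex),
        "squared_verb_variation_1": _ratio(typ_verb ** 2, tok_verb),
    }


# Toy input, invented for illustration only.
example_tagged = [
    ("the", "DET"), ("cat", "NOUN"), ("saw", "VERB"), ("the", "DET"),
    ("dog", "NOUN"), ("and", "CCONJ"), ("the", "DET"), ("dog", "NOUN"),
    ("saw", "VERB"), ("the", "DET"), ("cat", "NOUN"),
]
example_tokens = [word for word, _ in example_tagged]
ttr_scores = type_token_measures(example_tokens)
lexical_scores = lexical_measures(example_tagged)
```

In practice the tags would come from a part-of-speech tagger for the language of the corpus. Because the plain Type-Token Ratio falls as texts get longer, comparing novels of different length would likely require equal-sized samples or one of the length-corrected variants.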

Syntax

Language Model

Morphology

Most predictive features

text complexity in literary studies
• style
  • syntax
  • vocabulary
  • registers
  • figurative language
• aesthetics
• form and content
• depicted world
• symbolic elements / aspects of the fictional world
• intertextuality
• polyvalence: inexhaustible for interpretation

readability and text complexity
• cognitive load vs. interpretability

gold standard
• agreement is very difficult to achieve
• very trivial literature in particular may be hard to obtain
• difficult to establish across different periods and languages
• three levels: highbrow, middlebrow, lowbrow

baby steps
some experiments

corpus - novels 1908-1932

highbrow

• C. Einstein: Bebuquin

• R. M. Rilke: Malte Laurids Brigge

• F. Kafka: Der Prozess

• R. Müller: Tropen

• R. Musil: Der Mann ohne Eigenschaften

• P. Scheerbart: Lesabéndio

• R. Walser: Der Gehülfe

middlebrow

• Baum: Menschen im Hotel

• Bettauer: Stadt ohne Juden

• Fallada: Kleiner Mann – was nun?

• Kellermann: Der Tunnel

• Perutz: Der Meister des Jüngsten Tages

• Wassermann: Das Gänsemännchen

• Zobeltitz: Aus tiefem Schacht

lowbrow

• H. Courths-Mahler

• R. Huch: Der Fall Deruga

• N. Jacques: Dr. Mabuse, der Spieler

• R. Kraft: Nobody’s Erlebnisse

• E. Wallace: Die toten Augen von London

Literature
• Julia Hancke, Sowmya Vajjala, Detmar Meurers: Readability Classification for German using lexical, syntactic, and morphological features. In: Proceedings of COLING 2012: Technical Papers, pages 1063–1080.
