serge-potemkin
DESCRIPTION
Visualizing the story plot

Visual representations of linguistic data, such as parse trees, semantic relationships and statistical diagrams, are widely used by linguists because they are useful and flexible [Chris Culy]. Text as a whole, however, poses a number of challenges for visualization, because it differs significantly from other types of linguistic data. This paper presents a novel approach to the visual representation of fiction and non-fiction texts based on detecting the events within them. Events in the narrative are detected formally from the text itself, without considering how a reader perceives it.

Our hypothesis is as follows: a change of state is signalled by the number of antonym pairs in which one word occurs to the left of the sentence under consideration and the other occurs to its right. The article examines this hypothesis and the results obtained.

The results are presented as a graph on the X-Y plane. Each point has coordinates (x, y), where x = Ns is the serial number of the sentence counted from the beginning of the story, and y is the number of antonym pairs in the whole story that have one member to the left of sentence Ns and the other to its right. The resulting graph of antonyms is smoothed and then interpreted. The user can mark on the graph a sentence at which the number of antonyms changes significantly, read the sentence itself, and move the marker along the curve.

The results of the calculations for Charles Darwin’s AUTOBIOGRAPHY [Darwin] (22000 words) are presented below. A number of marked events can be seen on the graph (Figure 1). The first maximum of the number of antonyms is the event when the boy left school. The second is what C. Darwin mentioned as one of his best achievements. The third concerns his disagreement with some scholars. The fourth is his satisfaction with the appreciation of his work. The fifth, the highest one, is connected with his life credo.

Fig. 1 Graph of antonyms for the AUTOBIOGRAPHY by Charles Darwin [Darwin].

Compared with a pure fiction text, the main differences are as follows. A fiction story should comprise Exposition, Rising action, Crisis, Climax, Falling action and Denouement. A non-fiction story, in contrast, has a very short exposition (or the whole story is a kind of exposition), a number of Climaxes connected to the important events of the life, and no Denouement (the denouement is the death of the character, which is impossible in an autobiography). However, a large fiction story such as a novel also shows a number of maxima on the graph of antonyms.

References
[Chris Culy] Some Challenges and Directions for the Visualization of Language and Linguistic Data http://avml-meeting.com/keynote-speakers/
[Darwin] The Autobiography of Charles Darwin from The Life and Letters of Charles Darwin http://manybooks.net/t
Citation preview
Detection of events by analyzing antonyms
Potemkin S.
Philological Faculty, Moscow State University
[email protected]@philol.msu.ru
www.slideshare.net/sergeprim/looking-foreevents-in-narrative
Plot, sujet, fabula
The sequence of events that makes up a story (occurring around the main conflict).
Five stages of plot development
[Diagram of plot development; surviving labels: Exposition, Development, Culmination, Outcome]
Question
How can one measure the plot development?
What values should be plotted on the Y axis when the X axis is time (or the number of the sentence counted from the beginning of the story)?
Result: the answer (?)
In a real story the curve has a number of maxima
Event
“Character moving across the border of the semantic field” (Lotman)
An event is the transition from one internally consistent state (of the world and the character) to another internally consistent state
Formal identification of events
Each state is described by a set of semantic features = a point in the semantic space
The transition can be specified by a set of antonyms (rich – poor, healthy – ill, nice – ugly, young – old, etc.)
The number of pairs of antonyms within a state is small; between states it is large
Counting antonyms
For each sentence of the story we count the number of antonym pairs with one member before this sentence and the other after it.
The number of antonym pairs is plotted along the Y axis as a function of the sentence number (X axis).
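The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy lexicon `ANTONYMS`, the `\w+` tokenizer and the function names are hypothetical, and the sketch counts each lexicon pair at most once per sentence, which the slides do not specify.

```python
import re

# Hypothetical toy lexicon; the slides do not say which antonym dictionary was used.
ANTONYMS = [("rich", "poor"), ("healthy", "ill"), ("nice", "ugly"), ("young", "old")]

def tokenize(sentence):
    """Lower-case a sentence and return its set of word forms (no lemmatization)."""
    return set(re.findall(r"\w+", sentence.lower()))

def antonym_curve(sentences, antonyms=ANTONYMS):
    """For each sentence, count lexicon pairs with one member before it and one after it."""
    words = [tokenize(s) for s in sentences]
    curve = []
    for i in range(len(sentences)):
        left = set().union(*words[:i])        # all words before sentence i
        right = set().union(*words[i + 1:])   # all words after sentence i
        count = sum(
            1 for a, b in antonyms
            if (a in left and b in right) or (b in left and a in right)
        )
        curve.append(count)
    return curve

story = ["She was young and rich.", "Then the war came.", "Now she is old and poor."]
print(antonym_curve(story))
```

Rebuilding the left and right word sets for every sentence is quadratic in the length of the story; for a long text, running prefix and suffix sets would avoid the repeated work.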
Events in «Душечка» ("The Darling")
Analyzing the curve
Find local max.
Read the corresponding sentence
Is this really an event?
Move left or right finding the event
Try to adjust the window width
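The first step of this procedure, finding local maxima on a smoothed curve, might look like the sketch below. The slides name neither the smoothing method nor the meaning of the window width; a centered moving average whose width is that window is assumed here, and `smooth` and `local_maxima` are hypothetical helper names.

```python
def smooth(curve, width=3):
    """Centered moving average (assumed smoother); the window shrinks at both ends."""
    half = width // 2
    out = []
    for i in range(len(curve)):
        lo, hi = max(0, i - half), min(len(curve), i + half + 1)
        out.append(sum(curve[lo:hi]) / (hi - lo))
    return out

def local_maxima(curve):
    """Indices whose value exceeds the left neighbour and is not below the right one."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i - 1] < curve[i] >= curve[i + 1]]

raw = [0, 1, 3, 1, 0, 2, 5, 2, 0]      # made-up antonym counts per sentence
candidates = local_maxima(smooth(raw, width=3))
print(candidates)                       # sentence indices to read and check by hand
```

Each returned index is only a candidate: as the slide says, the sentence still has to be read, and the event may lie a few sentences to the left or right. Widening `width` merges nearby peaks, narrowing it separates them.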
Events in “The Darling”
Seems similar to the original text
Waves and between
Each wave maximum is a main event (positive)
A minimum and its neighbourhood is a negative situation
A maximum and its neighbourhood is a positive situation
It is interesting to compare with folk tales
Events in «Herzchen»
http://www.slideshare.net/sergeprim/herzchen
3D Picture
We have a small set of German antonyms, so we tried to make a 3D picture
X is sentence number
Y is window width
Z is antonyms count
4 waves corresponding to 4 main events in Herzchen’s life
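A surface of this kind can be computed as a grid of counts, one row per window width. The slides do not define the window width; one possible reading, assumed in this sketch, is that it bounds how many sentences on each side of the current sentence are searched for antonyms. The lexicon and the function names are hypothetical.

```python
import re

# Hypothetical toy lexicon, standing in for the small German antonym set.
ANTONYMS = [("rich", "poor"), ("young", "old")]

def windowed_count(words, i, width, antonyms):
    """Count pairs split across sentence i, looking at most `width` sentences each way."""
    left = set().union(*words[max(0, i - width):i])
    right = set().union(*words[i + 1:i + 1 + width])
    return sum(1 for a, b in antonyms
               if (a in left and b in right) or (b in left and a in right))

def antonym_surface(sentences, widths, antonyms=ANTONYMS):
    """Return Z as a list of rows: Z[k][x] is the count at sentence x for widths[k]."""
    words = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    return [[windowed_count(words, i, w, antonyms) for i in range(len(words))]
            for w in widths]

story = ["He was young.", "He was rich.", "Years passed.", "He grew poor.", "He grew old."]
for width, row in zip([1, 2, 3], antonym_surface(story, [1, 2, 3])):
    print(width, row)
```

The rows can be passed to any 3D surface plotter with X as the sentence index and Y as the width. With a sparse lexicon, narrow windows give mostly zeros, and the waves only appear as the width grows.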
Events in Darwin’s biography
Darwin’s life
The first maximum of the antonyms curve: the boy has left school.
The second is his best achievement.
The third: he disagrees with scholars.
The fourth is the appreciation of his work.
The fifth, the highest one, is connected with his life credo.
Conclusion
Detecting the events in the narrative via counting antonyms
Depicting the curve of antonyms as a function of sentence number
Analyzing the graph of antonyms
The event is identified as a local max on the graph
Thank you for your attention