Comparing cohesion and coherence of two texts using Coh-Metrix

Bertoli Giacomo, Condorelli Damiano

Cognizione e linguaggio, Maria Teresa Guasti

Academic year 2014/2015

Goals

• Does Italian-to-English translation influence cohesion and coherence?

• Is it possible to get statistically significant results from a comparison of two texts?

Asimov vs Fallaci


Asimov I., Of Time and Space and Other Things, Doubleday, 1965.

Literary genre: scientific essay

Fallaci O., The Force of the Reason, Rizzoli, 2004.

Literary genre: essay

Asimov’s Text

“On July 20, 1963 there was a total eclipse of the Sun, visible in parts of Maine, but not quite visible in its total aspect from my house. In order to see the total eclipse I would have had to drive two hundred miles, take a chance on clouds, then drive back two hundred miles, braving the traffic congestion produced by thousands of other New Englanders with the same notion.

I decided not to (as it happened, clouds interfered with seeing, so it was just as well) and caught fugitive glimpses of an eclipse that was only 95 per cent total, from my backyard. However, the difference between a 95 per cent eclipse and a 100 per cent eclipse is the difference between a notion of water and an ocean of water, so I did not feel very overwhelmed by what I saw.

What makes a total eclipse so remarkable is the sheer astronomical accident that the Moon fits so snugly over the Sun. The Moon is just large enough to cover the Sun completely (at times) so that a temporary night falls and the stars spring out. And it is just small enough so that during the Sun's obscuration, the corona, especially the brighter parts near the body of the Sun, is completely visible.”

Fallaci’s Text

“My only regret is to have said less than I should. To have called simply "cicadas" those that today I call collaborationists, traitors. Then I add that the rage and the pride have married each other and produced a sturdy son: the disdain. And disdain has intensified my cogitation, reinvigorated my reason. Reason has brought into focus the truths that feelings had not focused and that now I can express without half-measures, without restraints.

For instance, by asking: what kind of democracy is a democracy that forbids dissent, that punishes it, that turns it into a crime? What kind of democracy is a democracy that instead of listening to its children silences them, hands them to the enemy, abandons them to abuse and bullying? What kind of democracy is a democracy that favours theocracy, that re-establishes heresy, that tortures and burns the free minds on the stake? What kind of democracy is a democracy where the minority counts for more than the majority and where, counting for more than the majority, it swaggers and blackmails?!?

A non-democracy, I say. A deceit, a lie. And what kind of freedom is a freedom which prevents us from thinking, from speaking, from going against the wind, from rebelling, from opposing those who invade us and muzzle us?”

Eleven Groups of Coh-Metrix 3.0 Indices

1. Descriptive

2. Text Easability Principal Component Scores

3. Referential Cohesion

4. Latent Semantic Analysis

5. Lexical Diversity

6. Connectives

7. Situation Model

8. Syntactic Complexity

9. Syntactic Pattern Density

10. Word Information

11. Readability

Text Easability Principal Component Scores

• Text Easability Principal Component Scores have been developed specifically for Coh-Metrix. They provide metrics of text characteristics on multiple levels of language and discourse, aligned with theories of text and discourse comprehension.

• Examples: Narrativity, Syntactic Simplicity, Referential Cohesion, Temporality.

Coh-Metrix Analysis

Descriptive Indices

Index                                  Asimov   Fallaci
Paragraphs                             3        3
Sentences                              8        12
Words                                  216      214
Sentences in a paragraph, mean         2.667    4
Number of words in a sentence, mean    27       17.833
Number of syllables in a word, mean    1.407    1.579
Number of letters in a word, mean      4.264    4.762
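The first two means in the table follow directly from the raw counts. A minimal sketch of that arithmetic is below; the function name and dictionary keys are illustrative assumptions, not Coh-Metrix identifiers.

```python
def descriptive_means(paragraphs: int, sentences: int, words: int) -> dict[str, float]:
    """Derive per-paragraph and per-sentence means from raw counts.

    The key names are hypothetical labels chosen for this sketch.
    """
    return {
        "sentences_per_paragraph": sentences / paragraphs,
        "words_per_sentence": words / sentences,
    }


# Counts taken from the Descriptive Indices table.
asimov = descriptive_means(paragraphs=3, sentences=8, words=216)
fallaci = descriptive_means(paragraphs=3, sentences=12, words=214)

for name, means in (("Asimov", asimov), ("Fallaci", fallaci)):
    print(name, {key: round(value, 3) for key, value in means.items()})
```

The rounded results agree with the table values for both texts.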

Flesch Reading Ease

Index                 Asimov   Fallaci
Flesch Reading Ease   60.398   55.151

Scores

Over 60    Readable
50-60      Average readability
Under 40   Low readability

Formula

206.835 - (1.015 * number of words / number of sentences) - (84.6 * number of syllables / number of words)
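As a check, the formula above can be applied to the means reported in the Descriptive Indices table. This is a minimal sketch: the function name is an assumption, and the inputs are the rounded table values (words per sentence and syllables per word) rather than raw counts.

```python
def flesch_reading_ease(words_per_sentence: float, syllables_per_word: float) -> float:
    """Flesch Reading Ease, using the constants given in the formula above."""
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Inputs are the rounded means from the Descriptive Indices table.
asimov_score = flesch_reading_ease(words_per_sentence=27, syllables_per_word=1.407)
fallaci_score = flesch_reading_ease(words_per_sentence=17.833, syllables_per_word=1.579)

print(f"Asimov:  {asimov_score:.3f}")   # 60.398
print(f"Fallaci: {fallaci_score:.3f}")  # 55.151
```

Both results reproduce the scores in the table to three decimal places.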

Syntactic Indices

• Connectives – Words creating cohesive links and clues about text organization

• Argument overlap – Do arguments of verbs stay the same in different sentences?

• Syntactic simplicity – Sentences with simple, familiar syntactic structures

• Temporality – Cues about temporal order (tense, aspect of verbs)

Syntactic Indices

Comparison of analyzed syntactic indices suggests similar levels of cohesion and coherence.

Index                  Asimov   Fallaci
Connectives            92.593   74.766
Argument overlap       0.857    0.636
Syntactic simplicity   7.35     16.35
Temporality            50.08    71.57

Semantic Indices

• Familiarity – Words used often, every day

• Concreteness – Words related to things you can see, hear, touch, etc.

• Lexical Diversity – Does the same word occur several times?

• Given/new – How much information is added or just repeated?

• Narrativity – Texts telling stories, with characters, events and places

Semantic Indices

Comparison of analyzed semantic indices suggests that Asimov’s text is more cohesive and coherent.

Index               Asimov    Fallaci
Familiarity         580.197   556.783
Concreteness        373.12    338.894
Lexical Diversity   0.678     0.808
Given/New           0.304     0.216
Narrativity         78.81     62.93
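Lexical diversity is commonly measured as a type-token ratio: the number of distinct words divided by the total number of words. The sketch below assumes that definition and a simple regex tokenizer; the slides do not say which Coh-Metrix lexical diversity index produced the values in the table, so this is an illustration of the idea and is not expected to reproduce them.

```python
import re


def type_token_ratio(text: str) -> float:
    """Distinct word forms divided by total word count (0.0 for empty text).

    Tokenization is a simplifying assumption: lowercase runs of letters
    and apostrophes count as words.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# "a" occurs twice, so 3 distinct words out of 4 tokens.
print(type_token_ratio("A deceit, a lie."))  # 0.75
```

A ratio close to 1 means few words are repeated; a lower ratio means more repetition.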

Referential Cohesion

Referential cohesion refers to overlap in content words between local sentences, or co-reference. Co-reference is a linguistic cue that can help readers make connections between propositions and sentences in their textbase understanding.

Index                  Asimov   Fallaci
Referential cohesion   83.71    47.61
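The idea of overlap between local sentences can be illustrated with a small sketch that checks, for each pair of adjacent sentences, whether they share at least one content word. This is not the Coh-Metrix implementation: the stopword list, the tokenizer, and the function names are assumptions made for illustration, and the three sentences are shortened from Asimov's text.

```python
import re

# Hypothetical, deliberately tiny stopword list for this sketch only.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "so", "that", "and", "it", "just", "over"}


def content_words(sentence: str) -> set[str]:
    """Lowercased words of a sentence, minus the stopwords above."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def adjacent_overlap(sentences: list[str]) -> float:
    """Share of adjacent sentence pairs with at least one content word in common."""
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for first, second in pairs if content_words(first) & content_words(second))
    return hits / len(pairs)


sentences = [
    "What makes a total eclipse so remarkable is the sheer astronomical accident "
    "that the Moon fits so snugly over the Sun.",
    "The Moon is just large enough to cover the Sun completely.",
    "A temporary night falls and the stars spring out.",
]
# The first pair shares "moon" and "sun"; the second pair shares nothing.
print(adjacent_overlap(sentences))  # 0.5
```

As a side observation, Asimov's 8 sentences form 7 adjacent pairs and Fallaci's 12 sentences form 11, and 6/7 ≈ 0.857 and 7/11 ≈ 0.636 match the argument overlap values reported earlier. That is consistent with a binary adjacent-pair measure of this kind, though the slides do not confirm how the index is computed.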

Construction-Integration

• Propositions form the textbase, the second mental representation theorized in the Construction-Integration Model

• Integration phase: argument overlap allows one proposition to connect to another

Cohesion vs Coherence

• Coh-Metrix indices can be used to investigate the cohesion of the explicit text and the coherence of the mental representation of the text.

• Cohesion consists of characteristics of the explicit text that play some role in helping the reader mentally connect ideas in the text.

• Coherence is not easy to define. It approximately coincides with cohesion when texts are analyzed.

Conclusions

• Does Italian-to-English translation influence cohesion and coherence?

• A comparison of two short texts does not provide sufficient data to assert that translation affects the cohesion and coherence of a text.

• Is it possible to get statistically significant results from a comparison of two texts?

• Regardless of translation, a comparison of two texts is insufficient to get statistically significant results.

References

• Asimov I., Of Time and Space and Other Things, Doubleday, 1965.

• Fallaci O., The Force of the Reason, Rizzoli, 2004.

• http://cohmetrix.com/

• http://141.225.42.101/CohMetrixHome/documentation_indices.html

Thanks for your attention