Upload
lamtuong
View
216
Download
0
Embed Size (px)
Citation preview
1
Characters in 19th Century Novels Display Distinctive Voices as Seen by Stylometric Analysis [email protected],UnitedStatesofAmericaLarryBassistlarrybassist@comcast.netBrighamYoungUniversity,UnitedStatesofAmericaMattRopermatt_roper@byu.eduBrighamYoungUniversity,UnitedStatesofAmerica
Abstract Cannovelistscreatecharacterswithdistinctvoices
as reflected in each character’s wordprint? We ex-ploredtheliteraryskillsoffourprominentnineteenthcenturyauthorswhoaregenerallyconsideredtohavebeenexpertsincreatingcharacterswithintheirnov-els:
• Jane Austen –Pride and Prejudice andSenseandSensibility.
• CharlesDickens–OliverTwistandGreatExpectations,
• JamesFennimoreCooper–TheLastoftheMohicansandTheDeerslayer,
• Mark Twain – The Adventures of TomSawyer and The Adventures of Huckle-berryFinn.
Applyingstylometricanalysisusingnon-contextualwordsandprinciplecomponentsanalysis,wefound:
Thevoiceofthenarratorineachnovelwasdiffer-entthanthecharactersinthenovel,andthenarrator’svoicedidnotmatchtheauthor’sownwordprintvoice.
• Each author’s characterswere distinctivelydifferent amongst themselves and also dif-ferentfromotherauthors’characters.
• Theauthorsdisplayedvaryingabilitytocre-ate distinctive characters, with Dickens’characters being the most distinctive fol-lowedbyTwain’s,Austens’andCooper’s.
Weconcludethattalentedauthorscancreatechar-acterswith distinctive voices and that authors havedifferingabilitytodoso.Introduction
Sinceisitwidelyacceptedthatauthorstendtohavetheir own unique writing style, and since it is alsowidelyacceptedthatauthorsarenotabletodisguisetheiroverallwritingstyle,itisreasonabletoaskifnov-elistscancreatedifferentvoicesforthecharactersintheirbooks.Theappropriatenullandalternativehy-pothesesforthisresearchquestionare:
• NullHypothesis: Anovelist’scharactersdonothavedistinctivevoices.
• Alternative Hypothesis: At least some of anovelist’scharactershavedistinctivevoices.
TimHiattandJohnHilton(1990and1993)consid-eredthequestionofcharactervoiceforWilliamFaulk-ner,JamesJoyce,MarkTwainandRobertHeinleinandconcludedthatalthoughanauthorcouldcreatechar-actervoices,oftheauthortheytestedFaulkneralonewasuniquelyabletocreatecharacterswithdifferingvoices. However, we noted that their analyses weresimplisticanddidnotusethemultivariatestatisticaltechniques commonly used current in stylometricanalysis.Therefore,wechosetotestthenullhypothe-sisofnon-distinctivecharactervoicesbasedonnon-contextualwordfrequenciesandusingprinciplecom-ponentsanalysis(PCA).
Method WeappliedPCAtonovelswrittenbyJaneAusten,
CharlesDickens, JamesFennimoreCooper andMarkTwain (Samuel Clements). We selected two novelsfromeachauthorandseparatedthewordsquotedbyeach character.We only included characters whosequoted words exceeded 500 words. Austen createdsixteencharactersinPrideandPrejudiceandfourteencharactersinSenseandSensibilitywhometthemini-mumnumberofquotedwords,whileDickenscreatedtwenty-threecharacters inOliverTwistand fourteeninGreatExpectations.Similarly,CoopercreatedtwelvecharactersinTheLastoftheMohicansandteninTheDeerslayer,andTwaincreatedninecharactersTheAd-ventures of Tom Sawyer and fourteen in The Adven-turesofHuckleberryFinn.We thenspliteachcharac-ter’squotedwordsinto200wordblocksforanalysis.Sinceeachbookalsohadanarrator,wealsosplitthenarrator’swordsinto2000-wordblocks.
Results and Discussion
2
Figure1showsplotsofthefirsttwoprinciplecom-ponentsforthefourauthorswhereeachstardotrep-resentsa2000-wordblockforthenarratorandeachcircledotrepresentsa200-wordblockforeachchar-acter. Eachnovel’s narrator clearly stands out fromthecharactersinthebook.ItcanalsobeseeninFigure1 that the narrator’s voices are similar for Austen’s,Dickens’, and Cooper’s novels, while the narratorvoicesinTwain’snovelsaredistinguishable.
Althoughwedonotshowtheplotshere,asimilaranalysis as in Figure 1 comparing the author’s ownwordprint voice based on non-contextual word fre-quenciesintheauthor’snon-novelwritingtothenar-rator’svoiceineachnovelshowedthatthenarrators’wordprintsdidnotmatchtheauthor’sownwordprintforallfournovelists
Figure 1: Principle components plots for the 19th century authors. The narrator’s voice is clearly distinguishable from
the characters’ voices within each novel, and there is obvious diversity of voices among each author’s characters.
Since thenarrators foreachauthorareobviouslydifferentinfunctionwordfrequenciesthanthechar-acters in each book, we only used the non-narratorwords to compare the distinctiveness of the voiceseachauthorcreatedforhisorhercharacters.Againus-ingPCAweappliedmultivariateanalysisofvariance(MANOVA) and constructed 95% confidence ellip-soids around the characters in each author’s books.Figure2showstheellipsoidsusingthefirsttwoprin-ciplecomponents.Eachdotintheplotsrepresentsone
character.Eachellipsoidshowsthediversityofchar-actervoicesforonenovel.
Figure 2: Principle components plots with 95% confidence
ellipsoids for each author’s characters by novel. The character voices are distinguishable between each novel for
all four authors.
ItcanbeseeninFigure2thatthecharactervoicesin onenovel are distinguishable from the charactervoicesintheothernovelforeachauthor,eventhoughthereissomeoverlapoftheellipsoids.
Thevolumesoftheellipsoidsforeachauthorpro-videameasureof thatauthor’scharacterdiversity–thelargerthevolume,thegreaterthediversityofchar-actervoiceswithinanovel.Figure3showsacompari-sonacrossauthorsofthetotalvolumeofa95%confi-dence ellipsoid encompassing both of each author’snovels.
Figure 3: The volume of a 95% confidence ellipsoid
encompassing the character voices of both novels for each author compared across all four authors. The log10 is shown when the ellipsoid includes from two to fifteen
principle components. For eight principle components and
3
above, the diversity of character voices for Dickens is clearly greater than for the other three authors.
ItcanbeseeninFigure3thatDickensandTwainshow greater character diversity than Austen andCooper. For example, Dickens’ character diversity(measuredbyellipsoidvolumewith fifteenprinciplecomponents)isaboutonehundredtimesbiggerthanAusten’s.Thisshowsthatevenamonggreatauthors,there is a spectrum of ability in creating charactervoices.
Conclusions Weexaminedthediversityofcharactervoicescre-
atedbyfourofthemostrespectednineteenthcenturynovelistsasshownintheircharacters’functionalwordfrequencies.Usingstylometricanalysesweconsideredwhetherornotanauthorcancreatecharacterswithdifferentwordprintvoicesandfoundpersuasiveevi-dencethattheycan.Wealsofoundthatthenarratorineachnovelspeakswithadifferentvoicethanthechar-actersandadifferentvoicethantheauthor’sownnon-novel voice. Further, not only can a talented authorcreatecharacterswithdifferingvoicesamongcharac-terswithinanovel,thecharactersaredifferentamongnovels.Inaddition,wefoundthatauthorsexhibitvar-ying ability to create character voice diversity. Alt-houghallfourweregreatnovelistsandcouldcreateddistinctive characters, Dickens’ and Twain’s charac-tersshowedevengreatervoicediversitythanAustenandCooper.
Bibliography
Hiatt, T. and Hilton, J. (1990). "Can authors alter theirwordprints? Faulkner's narrators in As I Lay Dying."Deseret Language and Linguistic Society Symposium: Vol.16,Iss.1,Article 6.
Hiatt,T.(1993)."Canauthorsaltertheirwordprints?JamesJoyce’sUlysses."Master’sThesis,BrighamYoungUniver-sity.