Digitalized Dialect Studies: North-Western Romanian
Sheila M. Embleton, Dorin Uritescu & Eric S. Wheeler
York University, Toronto, Canada
Context
• Noul Atlas lingvistic român. Crisana
• Crisana region in north-west Romania
• Hard-copy atlas by Stan and Uritescu (1996, 2003)
• Digitize to make it more accessible
RODA: Romanian Online Dialect Atlas
Digitize and present the hard-copy atlas:
• Mostly graduate students in Canada and Romania
• Enter data from maps into text files
• When complete, it will be posted to the Internet for general use
Objective
Use Information Technology to permit a broad range of scholars to:
• access the data,
• select the data appropriately, and
• present the data clearly;
and so gain greater understanding of its significance.
Other Digital Atlases
Salzburg
• H. Goebl: phonetic dialect atlas of Dolomitic Ladinian (since 1985)
• Edgar Haimerl: ‘Visual DialectoMetry’ (VDM) (ca. 2000)
Netherlands
• Heeringa et al.; de Vriend et al.: dialectometric and cartographic software
Japan
• D. Long, others (http://nihongo.human.metro-u.ac.jp/~long/maps/perceptmaps.htm): Japanese area maps
Related endeavours
Google Earth
• Available mapping software
• Images world-wide
Dialect studies with databases
• e.g. Iran: National survey for 2009
Visualization software
• e.g. T. Pi, Atlas of Dialect Topography
• http://dialect.topography.chass.utoronto.ca/dt_atlas.php
Overall challenges:
• Digitize data
• Accessible interface to data
  • Search
  • Analyze
• Presentation of data
  • As data
  • As maps
RODA as linguistic technology
The technology allows one to:
• View the data
• Search for data and count it
• Interpret the data or the counts
• Analyze the data (e.g. MDS)
• See the results as maps
• Save the maps as .jpg pictures
• Save the results for later use
• Hear samples of the data
RODA: function
Custom-defined maps
• You select the data
• You see the result as a map
Programmable access to the whole set of digitized data
• You ask about data spread over many maps
• You can customize what you search for (not just the editor’s choice)
RODA: selection of data
Context of search becomes important
• Word-final vs non-final vs either
• Plain character vs accented character
• Character vs (superposed) alternate
Choice of fields to search
• E.g. with nouns: sg. vs pl. entries
• Variations heard by field workers
• Flags to mark special situations (e.g. hesitation)
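The search contexts listed above can be illustrated with a small sketch. It is not RODA's actual code: the record layout (map number, location number, form), the function names, and the sample entries are all assumptions made for illustration, and the handling of superposed alternates and field choice is left out.

```python
import unicodedata
from collections import Counter
from typing import Iterable, Tuple

Entry = Tuple[int, int, str]  # (map number, location number, form); hypothetical layout


def strip_accents(text: str) -> str:
    """Remove combining marks so that an accented character matches its plain base."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(form: str, target: str, position: str = "either",
            ignore_accents: bool = False) -> bool:
    """Test one form against a target in the requested context."""
    if ignore_accents:
        form, target = strip_accents(form), strip_accents(target)
    else:
        form = unicodedata.normalize("NFC", form)
        target = unicodedata.normalize("NFC", target)
    if position == "final":
        return form.endswith(target)
    if position == "non-final":
        # Any occurrence that ends before the last character of the form.
        return any(form.startswith(target, i)
                   for i in range(len(form) - len(target)))
    if position == "either":
        return target in form
    raise ValueError(f"unknown position: {position}")


def count_by_location(entries: Iterable[Entry], target: str,
                      position: str = "either",
                      ignore_accents: bool = False) -> Counter:
    """Count matching forms per location across all maps."""
    return Counter(loc for _, loc, form in entries
                   if matches(form, target, position, ignore_accents))


# Hypothetical entries built from the forms cited in the slides (not RODA data).
entries = [
    (1, 137, "cântu"), (1, 141, "cântu"), (1, 220, "cânt"),
    (2, 137, "ochiu"), (2, 141, "ochi"), (2, 220, "ochi"),
]
final_u = count_by_location(entries, "u", position="final")
```

Here `final_u` counts word-final "u" per location, and `matches("cânt", "a", ignore_accents=True)` shows the plain-versus-accented choice: with accents ignored, "â" counts as a match for "a".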
Examples from RODA
Crisana, Romania
Crisana, Romania (from RODA)
Seeing Words Change
Word-final /u/ in Latin and non-Latin words
Word-final /u/ from Latin
Latin            Romanian (standard and most dialects)   Dialectal variation (vowel present)   Dialectal variation (non-syllabic)
canto ‘I sing’   cânt                                    cântu                                 cântu
oculum ‘eye’     ochi                                    ochiu                                 ochiu
Is word-final /u/ random?
• Look for a geographic pattern over all potential occurrences.
• The maps for single examples, such as /ochi/ and others, are in the hard-copy dialect Atlas,
• but the total data for all examples is spread widely over many maps.
Word-final /u/
Data from:
• 407 maps
• Field 1
Size of cross shows the number of occurrences
• Horizontal = syllabic
• Vertical = non-syllabic
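As a rough illustration of how such cross symbols could be derived, the sketch below turns tagged occurrences into a horizontal and a vertical arm length per location. The record layout, the tag names, and the `scale` parameter are assumptions for illustration only; they are not taken from RODA.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def cross_sizes(records: Iterable[Tuple[int, str]],
                scale: float = 1.0) -> Dict[int, Tuple[float, float]]:
    """Return {location: (horizontal, vertical)} arm lengths for a cross symbol.

    Each record is (location number, kind), where kind is "syllabic"
    (horizontal arm) or "non-syllabic" (vertical arm). Both the layout
    and the tag names are hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])
    for location, kind in records:
        if kind == "syllabic":
            counts[location][0] += 1
        elif kind == "non-syllabic":
            counts[location][1] += 1
        else:
            raise ValueError(f"unknown kind: {kind}")
    # Multiply every count by the same factor so small counts stay visible.
    return {loc: (h * scale, v * scale) for loc, (h, v) in counts.items()}


# Hypothetical occurrences (not RODA data).
records = [
    (137, "syllabic"), (137, "syllabic"), (137, "non-syllabic"),
    (158, "non-syllabic"),
]
sizes = cross_sizes(records, scale=3.0)
```

A single scale factor keeps the arms comparable across locations while making sparse data easier to see on the map.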
Word-final, syllabic /u/
Data from:
• 407 maps
• Field 1
• word-final only
• (horizontal = vertical)
Locations 137, 141, 146 show most examples
Word-final, syllabic /u/
Can review the data
Word-final, syllabic /u/
Data from:
• selected maps
• Field 1
• word-final only
• removed non-vocalic /u/, def. art., some clusters + /u/
• (horizontal = vertical)
Locations 137, 141, 146 show most examples
/u/ Pattern
There is a pattern:
• Word-final /u/ is retained in central and north-eastern areas
• It is syllabic mostly in parts of the central area
• The locations with most frequent syllabic final /u/ do not form a continuous area
Raised, word-final /e/
Data from:
• 407 maps
• Field 1
• Horizontal = vertical
Raised /e/ is widespread
Raised, word-final /e/ vs schwa
Data from:
• 407 maps
• Field 1
• Raised /e/ (horizontal)
• Raised schwa (vertical)
Raised schwa is also widespread but does not always coincide with raised /e/ (cf. 158, 159)
High /e/ and schwa
Retained /u/ versus raised /e/
• Syllabic word-final /u/ (horizontal)
• Raised word-final /e/ (vertical)
• Zoom-in view of central area
137, 141, 146 have both
Retained /u/ versus raised schwa
• Syllabic word-final /u/ (horizontal)
• Raised word-final schwa (vertical)
• Zoom-in view of central area
137, 146 (not 141) have both
Conclusion
The raising of final mid vowels and the weakening of final high vowels are distinct natural lenition processes.
Non-palatalized dentals before front vowels
• Crişana: dentals before front vowels are palatalized.
• Are they restructured as palatals?
• If the process is no longer productive, there may be non-palatalized dentals before front vowels.
• If so, where, in what forms, and with what frequency?
Non-palatalized dentals before front vowels
•Examples everywhere.
•(As is well known, dentals are not palatalized in Oaş, except for 220.)
•Map shows where and how many examples.
/st/ before front vowels
/t/ but not /st/ before /e/ and /i/
• 407 maps, field 1
• /te/ (horizontal)
• /ti/ (vertical)
• all values scaled ×3 to make them more visible
/t/ but not /st/ before /e/ and /i/
Shown as an interpretive map
• 407 maps, field 1
• /te/ (red)
• /ti/ (black)
The map is drawn automatically from the previous searches
/t/ before /e/ or /i/
•See the examples that were found and counted.
•See the source map number and location number of each.
•Can delete “exceptions” from the count.
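A minimal sketch of this review step follows. It assumes each hit carries its source map number and location number, and that an exception is identified by that pair. The `Hit` record and the placeholder forms are hypothetical, not RODA's data model.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    """One example found by a search (hypothetical record layout)."""
    map_no: int
    location: int
    form: str


def recount(hits: Iterable[Hit], exceptions: Set[Tuple[int, int]]) -> Counter:
    """Count hits per location, skipping (map_no, location) pairs marked as exceptions."""
    return Counter(h.location for h in hits
                   if (h.map_no, h.location) not in exceptions)


# Placeholder hits (not RODA data).
hits = [
    Hit(12, 137, "form A"),
    Hit(40, 137, "form B"),
    Hit(40, 220, "form C"),
]
all_counts = recount(hits, exceptions=set())
reviewed = recount(hits, exceptions={(40, 137)})
```

Keeping the exceptions as a separate set leaves the original hits untouched, so the count can be repeated with or without them.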
Non-palatalized dentals before front vowels
• There are examples everywhere (not only in Oaş).
• Here we establish a result with the location and frequency of examples.
• Can view the examples that support the conclusion.
With digital data and tools, we easily discover significant patterns
• Here, we see the conservation of front vowels after velarizing consonants.
• We see frequency and areas, and the phonological context.
/e, i/ after /ts, z, s/
MDS
MDS process
• Multidimensional Scaling (MDS) uses the “linguistic distance” between N+1 locations to place them in an N-dimensional space.
• Then, the N-space is projected onto a 2-space (a map) such that the distances among the points are preserved as well as possible.
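One common way to carry out a projection like the one described above is classical (Torgerson) MDS, sketched below in plain Python. This is an illustrative assumption, not necessarily the procedure used by the authors: it double-centres the squared distances and keeps the two leading eigenvectors, found here by simple power iteration. All function names and the sample distances are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def double_center(dist: Matrix) -> Matrix:
    """Convert a symmetric distance matrix into an inner-product matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def multiply(b: Matrix, vec: List[float]) -> List[float]:
    return [sum(x * v for x, v in zip(row, vec)) for row in b]


def top_eigenpair(b: Matrix, iterations: int = 500) -> Tuple[float, List[float]]:
    """Power iteration for the dominant eigenpair of a symmetric matrix.

    The fixed start vector is an arbitrary choice; a start vector that is
    orthogonal to the dominant eigenvector would not converge to it.
    """
    vec = [1.0 / (i + 1) for i in range(len(b))]
    for _ in range(iterations):
        nxt = multiply(b, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * len(b)
        vec = [x / norm for x in nxt]
    value = sum(v * w for v, w in zip(vec, multiply(b, vec)))
    return value, vec


def classical_mds(dist: Matrix, dims: int = 2) -> Matrix:
    """Place each point in `dims` dimensions, approximating the given distances."""
    b = double_center(dist)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        value, vec = top_eigenpair(b)
        axis_scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i][k] = axis_scale * vec[i]
        # Deflate: remove the component just extracted before finding the next one.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical distances: the corners of a 3-by-4 rectangle (not RODA data).
rectangle = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]
coords = classical_mds(rectangle)
```

Because the sample distances come from a flat rectangle, two dimensions reproduce them exactly; with real dialect distances the two-dimensional map is only an approximation, and non-Euclidean distances can produce negative eigenvalues that this simple sketch does not handle.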
MDS and dialects
• Embleton and Wheeler have used an MDS process on:
  • English dialects
  • Finnish dialects
• Dialect roughly correlates with geography
Dialect groupings
• Began with a hypothesis about dialect groupings in Crisana
• Analyzed all data in 407 maps using the MDS method
  • Identity is exact match; any difference is a difference of 1.
  • Distance is sum of differences.
• We see the groupings on a map.
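The distance rule stated above (exact match counts 0, any difference counts 1, distance is the sum) can be sketched as follows. The data layout, one response per map for each location, and the sample responses are assumptions for illustration.

```python
from typing import Dict, List


def linguistic_distance(a: List[str], b: List[str]) -> int:
    """Number of maps on which two locations give different responses."""
    if len(a) != len(b):
        raise ValueError("both locations need one response per map")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(responses: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Pairwise distances between all locations."""
    locations = sorted(responses)
    return {p: {q: linguistic_distance(responses[p], responses[q])
                for q in locations}
            for p in locations}


# Hypothetical responses for three locations over four maps (not RODA data).
sample = {
    "137": ["cântu", "ochiu", "a", "b"],
    "141": ["cântu", "ochi", "a", "b"],
    "220": ["cânt", "ochi", "a", "c"],
}
matrix = distance_matrix(sample)
```

In this sample, locations 137 and 141 differ on one map, 141 and 220 on two, and 137 and 220 on three. A matrix like this is the kind of input an MDS procedure takes.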
MDS map: All groups
• South-east and South-west are distinct.
• The rest are less so.
• Suggests the dialect unity of the region
--> refine groupings
MDS map: Refined groupings
• Still, considerable overlap or closeness
• More groups could be identified, e.g.:
  • Several divisions in West
  • Two areas in Oaş
• Oaş is close to southern areas
• Still, its distinctness is clear (cf. also Uritescu 1984a).
MDS
• For large quantities of data, MDS needs RODA’s digitized data.
• MDS provides another understanding of the data.
• MDS is only one of many possible quantitative tools (e.g. factor analysis, cluster analysis).
Hear the Data
•Selected clips from source data in over 40 locations
•From map, pick location and play
•(Sound data is large; needs to be packaged separately for easy downloading)
Bigger challenge
Access to Data
In the humanities,
• Large amounts of data
• Diverse ways of selecting it
Information Technology
• Has the technology
• May not understand the needs
Need to learn how to apply IT to our discipline effectively
Development Process
Requirements gathering
• Prototypes
• Cycles of propose-and-revise
User testing
• Test versions on web
• User feedback is important
Explore technology
• Changes fast
• Much to learn
Bridging the Gap
• IT specialist: the challenge is to make IT accessible to non-IT users
• Humanist: go after the technology
  • Plan for it. It needs careful thought
  • Use it. It is powerful
• Dialectologist and Romanist: RODA
Future Directions
• Digitize future volumes (3-5)
• Create digital interpretive maps from the hard-copy Atlas
• Apply MDS
• Enhance the sound and multimedia aspects of the online atlas
• Play sound and see a transliterated text
Summary
• Data will soon be available
• You are invited to apply your techniques to the data
• Digital data and IT methods permit:
  • Widely accessible data
  • Flexible searching and custom presentation
  • Repeatable processing
Contacts
• Sheila [email protected]
• Dorin [email protected]
• Eric [email protected]
Test sites: ericwheeler.ca/test