Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
ProposalforanOriyaScriptRootZoneLabelGenerationRuleset(LGR)
LGRVersion: 3.0Date: 2019-03-06Documentversion: 2.12Authors: Neo-BrahmiGenerationPanel[NBGP]
1 GeneralInformation/Overview/Abstract
ThepurposeofthisdocumentistogiveanoverviewoftheproposedRootZoneLevelGenerationRulesfortheOriyascript.Itincludesadiscussionofrelevantfeaturesofthescript,thecommunitiesorlanguagesusingit,theprocessandmethodologyusedandinformationonthecontributors.TheformalspecificationoftheLGRcanbefoundintheaccompanyingXMLdocument:
proposal-oriya-lgr-06mar19-en.xml
Labelsfortestingcanbefoundintheaccompanyingtextdocument:
oriya-test-labels-06mar19-en.txt
2 ScriptforwhichtheLGRisproposed
ISO15924Code:Orya
ISO15924KeyN°:327
ISO15924EnglishName:Oriya(Odia)
Latintransliterationofnativescriptname:oṛiā
Nativenameofthescript:ଓଡ଼ିଆ
MaximalStartingRepertoire(MSR)version:MSR-4
3 BackgroundonScriptandPrincipalLanguagesusingit
Oriya(amendedlaterasOdia)isanEasternIndiclanguagespokenbyabout40millionpeople(3,75,21,324aspercensus2011(http://censusindia.gov.in/2011Census/Language-2011/Statement-4.pdf)mainlyintheIndianstateofOrissa,andalsoinpartsofWest
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
2
Bengal,Jharkhand,ChhattisgarhandAndhraPradesh.Oriya(Odia)isoneofthemanyofficiallanguagesofIndia.ItistheofficiallanguageofOdisha,andthesecondofficiallanguageofJharkhand.EminentLinguistslikeJohnBeames,G.A.Grierson,L.S.S.O’Malley,SunitiKumarChatterjee,S.N.Rajaguru,JohnBoultonandothersconsiderOriyaasoneofthemostancientlanguagesofIndia.InIndicfamilyoflanguages,OriyaisclosesttoSanskritandleastinfluencedbyforeignlanguages.OnlythesetwoIndiclanguages(viz.SanskritandOriya)receivedaclassicaltagduetotheirrich,uninfluencedandlongliteraryhistory.AccordingtoNationalMissionforManuscripts,afterSanskrit(11,66,743),Oriya(2,13,088)hasthelargestnumberofdocumentedmanuscriptsintheIndia(https://news.webindia123.com/news/Articles/India/20160318/2819026.html).
Oriya(Odia)hasbeenamendedtoOdia,andOrissaasOdisha;OdiaandOdishaarenowthepreferrednamesinEnglishastheyareclosertotheirnativenames:ଓଡ଼ିଆ (oṛiā)[ɔɖiaː]andଓଡ଼ିଶା(oṛiśā). Oriya(Odia)languageisalsousedbypopulationsoftheneighboringstatesofJharkhand,WestBengal,ChhattisgarhandAndhraPradesh.
ThemodernOriya(Odia)languageisformedmostlyfromPaliwordswithsignificantSanskritinfluence.About28%ofmodernOriya(Odia) wordshaveAdivasiorigins,andabout2%haveHindustani(Hindi/Urdu),Persian,orArabicorigins.(Refer:IndianLiteraturebyLanguage:OriyaLiteratureBook(ISBN1156665574,9781156665572)Theearliestwrittentextsinthelanguageareaboutthousandyearsold.ThefirstOriya(Odia)newspaper UtkalaDeepikawasfirstpublishedon4August1866.
AmongtheIndo-EuropeanlanguagesofIndia,onlyOriyaandSanskrithavebeenrecognizedasclassicallanguages;andofthesixIndianlanguagesthathavebeenconferredclassicallanguagestatus,Oriya(Odia)wasrecognizedmostrecently(in2014)1.ItformsthebasisofOdissidanceandOdissimusic.2
InOrissa,OriyascriptisusednotonlyforwritingOriyalanguagebutalsoforSanskritandatleast21otherlanguages.TheseincludeOriya,Desiya,Sadri,Sanskrit,Sambalpuri,SantaliandBodo.SeeTable4foramorecompletelistoflanguages.ThescriptmainlydiffersfromotherscriptslikeDevanagariandBanglabytheabsenceoftheShirorekhaorthelineabovethecharacterandalsoitsmoreroundedshapes.
3.1 TheEvolutionoftheScript
TheOriya(Odia)scriptdevelopedfromtheKalingascript,oneofthemanydescendantsoftheBrahmiscriptofancientIndia(Rajaguru,S.N.,OdiaLipiraKramabikash,OdiaSahitya1Criteriaforthisstatusinclude:highantiquityofitsearlytexts/recordedhistoryoveraperiodof1500–2000years;abodyofancientliterature/texts,whichisconsideredavaluableheritagebygenerationsofspeakers;aliterarytraditionthatisoriginalandnotborrowedfromanotherspeechcommunity;2ThevarietyofOriyadialectsetc.isreviewedinAppendixB.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
3
Akademi,page2).TheearliestknowninscriptionintheOriyalanguage,intheKalingascript,datesfrom1051.ItdescendsfromOdra-MagadhiPrakritsimilartoArdhaMagadhi,prevalentineasternIndiaover1,500yearsago.
ThecurvedappearanceoftheOriyascriptisaresultofthepracticeofwritingonpalmleaves,whichhaveatendencytotearifwritteninstraightlines.
ThediagrambelowshowsthemajorstagesintheevolutionofOriyaattestingitslatedivergencefromamongNorthernScriptsderivedfromBrahmiscript.
Figure1:PictorialdepictionofEvolutionofOriya
3.2 PeriodsofOdiaHistory
Oriyalanguageliterature(Odia) ଓଡ଼ିଆ ସାହିତ୍ୟ (OdiaSahitya) isthepredominantliteratureofthestateofOdishainIndia.HistorianshavedividedthehistoryoftheOriyaliteratureintofivemainstages:OldOriya(8thcenturyto1300),EarlyMiddleOriya(1300to1500),MiddleOriya(1500to1700),LateMiddleOriya(1700to1850)andModernOriya(1850topresent).Seehttp://indohistory.com/oriya.htmlformoreinformation.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
4
3.3 UseofOriyalanguageglobally
TheOriyadiasporaconstitutesasizeablenumberofspeakersinseveralcountriesaroundtheworld,withthenumberofOriyaspeakersgloballyto55million.Moredetailsaregivenhttp://censusindia.gov.in/2011Census/Language-2011/Statement-4.pdf
IthasasignificantpresenceineasterncountriessuchasBangladesh,Indonesia,mainlycarriedbythesadhaba,theancienttradersfromOdisha,whocarriedthelanguagealongwiththecultureduringtheold-daytrading,andinwesterncountriessuchastheUnitedStates,Canada,AustraliaandEnglandaswell.ThelanguagehasalsospreadtoBurma,Malaysia,Fiji,SriLankaandcountriesoftheMiddleEast.
3.4 WrittenOriya(Odia)orthestandardOriya(Odia)
Itisusedforofficialpurpose.IthaselementsfromdifferentlocalOriyadialectsbutitusuallyavoidswordsofforeignoriginsuchasArabicandPersian.IthasalsoassimilatedmanytribalwordsprevalentinOdisha.Notablefeatures
TheOriyascriptisasyllabicalphabetwrittenlefttorightinhorizontallines,inwhichallconsonantshaveaninherentvowel.Diacriticscanappearabove,below,beforeoraftertheconsonanttheybelongto.Theseareusedtochangetheinherentvowelandalsotoshowtheallophonicvariantsoftheconsonant.Whentheyappearatthebeginningofasyllable,vowelsarewrittenasindependentletters.Whencertainconsonantsoccurtogether,specialconjunctshapesareusedwhichcombinetheessentialpartsofeachletter.
TheconsonantsandvowelsinOriya(Odia)arepresentedinTable1below.ThechartbelowshowsthewayinwhichtheInternationalPhoneticAlphabet(IPA)representsOriyapronunciations.VargyaConsonants
Avargyaconsonants
IPA Oriya(Odia) IPA Oriya(Odia)b ବ j ଯ bʱ ଭ e̯ ୟ d̪ ଦ ɾ ର d̪ʱ ଧ l ଲ ɖ ଡ l̪ ଳ ɖ ɦ ଢ du ʒ ଜ w ୱୱ du ʒ ɦ ଝ s ସ ɡ ଗ ʂ ଷ ɡʱ ଘ ɕ ଶ h ହ ɦ ହ k କ ŋ,ɲ,ɳ,n,m,◌̃ ଂ
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
5
kʰ ଖ ◌̃ ଁ ɲ ଞ ɦ ଃ m ମ VowelsandMatras n ନ IPA Vowels Matras ɳ ଣ ə ଅ ŋ ଙ aː ଆ ା p ପ ɪ ଇ ି p h ଫ iː ଈ ୀ ɾ ର ʊ ଉ ୁ ɽ ଡ଼ uː ଊ ୂ ɽ ɦ ଢ଼ r̩ ଋ ୃ s ସ eː ଏ େ t̪ ତ ɛː ଐ ୈ t̪ʰ ଥ oː ଓ ୋ ʈ ଟ ɔː ଔ ୌ ʈ h ଠ t͡ʃ ଚ t͡ʃʰ ଛ
Table1:InternationalPhoneticAlphabetOriyaPronunciations
3.5 Structuredconsonants
Thestructuredconsonantsareclassifiedaccordingtotheplaceandmannerofarticulationandareclassifiedaccordinglyintofivestructuredgroups.TheseconsonantsareshownherewiththeirIAST(InternationalAlphabetofSanskritTransliteration)3transcriptions.
3TheInternationalAlphabetofSanskritTransliteration(I.A.S.T.)isatransliterationschemethatallowsthelosslessromanizationofIndicscriptsasemployedbySanskritandrelatedIndiclanguages.IASTmakesitpossibleforthereadertoreadtheIndictextunambiguously,exactlyasifitwereintheoriginalIndicscript.Example:କ0B15(ka),ଖ0B16(kha)etc.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
6
voiceless voicelessaspirate voiced voicedaspirate nasal
Velarsକ(ka) 0B15
ଖ(kha)0B16
ଗ(ga) 0B17
ଘ(gha) 0B18
ଙ(ṅa)0B19
Palatalsଚ(ca)0B1A
ଛ(cha)0B1B
ଜ(ja) 0B1C
ଝ(jha)0B1D
ଞ(ña) 0B1E
Retroflexଟ(ṭa) 0B1F
ଠ(ṭha)0B20
ଡ(ḍa)0B21
ଢ(ḍha) 0B22
ଣ(ṇa) 0B23
Dentalsତ(ta) 0B24
ଥ(tha) 0B25
ଦ(da) 0B26
ଧ(dha) 0B27
ନ(na) 0B28
Labialsପ(pa) 0B2A
ଫ(pha) 0B2B
ବ(ba) 0B2C
ଭ(bha) 0B2D
ମ(ma) 0B2E
Table2:StructuredConsonants
3.6 Unstructuredconsonants
Theunstructuredconsonantsareadditionalconsonantsthatdonotfallintoanyoftheabovecategoriesandincludethefollowing: ଯ(ja)U+0B2F,ୟ(ia)U+0B5F,ର(ra)U+0B30,ଲ(la)U+0B32,ଳ(ḷa)U+0B33,ଵଶ(sa)U+0B36,ଷ(sa)U+0B37,ସ(sa)U+0B38,ହ(ha)U+0B39.
3.7 TheimplicitvowelkillerHalant(Virama)
TheHalantisusedafteraconsonantto"strip"itofitsinherentvowel.AconsonantclustercannotendwithaHalant.Withafewexceptions,mostoftheOriyawordsaresvaranta(i.e.endingwithavowel).
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
7
Insomecases,asyllablecontainingaHalantmaynothaveavisibleHalantsign,wheretheHalantenablesthedifferentconsonantstoformaconjunct.
Thehalantformofconsonants,i.e.,theformproducedbyaddingtheHalanttothenominalshape,isusedinsyllablesthathavenoinherentvowelorinconjunctsorconsonantclusters.
InOriya,consonantclustersarewritteninmanyways:
1) Inmostcasesthesecondcomponentisinhalfform(withouttheroundedtop)orreducedformandputunderthefirstcomponente.g.ସ୍କ (U+0B38,
U+0B4D,U+0B15),ସ୍ତ (U+0B38,U+0B4D,U+0B24),ବ୍ଧ (U+0B2C,
U+0B4D,U+0B27),etc.
2) Sometimesthereversecasehappensandthefirstcomponent(esp.whenitisତ(0B24))isreducedandisputbelowthesecondasinତ୍ସ (U+0B24,
U+0B4D, U+0B38),ତ୍ମ U+0B24, U+0B4D, U+0B2E,ତ୍କତ୍ପ (U+0B24,
U+0B4D,U+0B15,U+0B24,U+0B4D,U+0B2A)etc.
3) Pre-baseformsarealsoseeninବ୍ଦଦ୍ଭ (U+0B2C,U+0B4D,U+0B26,
U+0B26,U+0B4D,U+0B2D),butduetocursivewriting,thepre-basecomponentsseemtobelikenewglyphs.Inthiscase,thecomponentsofaconjunctarerenderedsidebysidenotlikementionedabovein1and2.
Ifnodistinctshapeexists,thefullformwilldisplaywithanexplicitHalant.(sameshapeastheHalantform).
3.8 Nukta ( ଼)( ଼U+0B3C)
Thenuktasign(.଼)isusedinOriyalanguagejustlikemanyotherscriptsusedinSouthAsia.Itcanbecommonlyusedwith“ଡ”U+0B21,“ଢ”U+0B22.InthepubliccommentversionoftheOriyaScriptLGR,theNBGPhadproposedallowingtheuseofnuktawithକ0B15,ଖ0B16,ଗ0B17,ଚ0B1A,ଜ0B1C,ଫ0B2B.Butbasedonpubliccommentfeedback,suggestingthisusewasnotsufficientlycommon,andfurtheranalysisontheusageofnuktabyNBGP,nuktaisnotincludedwithକ0B15,ଖ0B16,ଗ0B17,ଚ0B1A,ଜ0B1C,ଫ0B2B.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
8
3.9 Visarga“ଃ” (U+0B03) and Avagraha “ଽ” (U+0B3D) TheVisarga(ଃ,U+0B03)isfrequentlyusedinSanskritandrepresentsasoundverycloseto/h/.Example,ଦୁଃଖ/du:kh/“sorrow”(U+0B26U+0B41U+0B03U+0B16).TheAvagraha"ଽ"(U+0B3D)createsanextrastressontheprecedingvowelandisusedinSanskrittexts.ItisrarelyusedinotherlanguagesusingOriyascript.
3.10 Candrabindu (◌ँ - U+0B01) Candrabindudenotesnasalizationoftheprecedingvowelandconsonantsasinଅଁଳା/am̐ḷā/“nameofseasonalfruit”(U+0B05U+0B01U+0B33U+0B3E).OriyauserscommonlyuseitforwritingthewordsandsoundsofSanskritlanguage.
3.11 Anusvara (ଂ - U+0B02) AnusvarareplacesaconjunctgroupofaNasalConsonant+Halant+Consonantbelongingtothatparticularvarga(plosive),representingahomorganicnasal.Beforeanon-varga(non-plosive)consonant,theAnusvararepresentsanasalsound.Forexample:ଏବଂ/ēbaṁ / (U+0B0F,U+0B2C,U+0B02)(means:and),ସଂଖ୍ୟା/saṁkhyā/(U+0B38,U+0B02,U+0B16,U+0B4D,U+0B5F,U+0B3E)(means:number),etc.3.12 Matrasign(DependentVowel)Adependentvowelsignisusedtorepresentavowelsoundthatisnotinherenttoaconsonant.Dependentvowelsarereferredtoas"matras".Theyarealwaysdepictedincombinationwithaconsonantorwithaconsonantcluster.TherulesspecifictoOriyaforcombiningmatrasarementionedinSection6(Variants)andSection7(WLERules).
Thefollowingtableexplainsthecorrelationbetweenavowelanditsmatrasign.
VowelanditsmatrasignGlyph Unicode Name Glyph Unicode Nameଅ U+0B05 ORIYALETTERA ଆ U+0B06 ORIYAVOWEL
LETTERAAା U+0B3E ORIYAVOWELSIGN
AAଇ U+0B07 ORIYAVOWEL
LETTERIି◌ U+0B3F ORIYAVOWELSIGN
Iଈ U+0B08 ORIYAVOWEL
LETTERIIୀ U+0B40 ORIYAVOWELSIGN
IIଉ U+0B09 ORIYAVOWEL
LETTERUୁ◌ U+0B41 ORIYAVOWELSIGN
U
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
9
VowelanditsmatrasignGlyph Unicode Name Glyph Unicode Nameଊ U+0B0A ORIYAVOWEL
LETTERUUୂ◌ U+0B42 ORIYAVOWELSIGN
UUଋ U+0B0B ORIYAVOWEL
LETTERVOCALICRୃ◌ U+0B43 ORIYAVOWELSIGN
VOCALICRଏ U+0B0F ORIYAVOWEL
LETTEREେ U+0B47 ORIYAVOWELSIGN
Eଐ U+0B10 ORIYAVOWEL
LETTERAIୈ U+0B48 ORIYAVOWELSIGN
AIଓ U+0B13 ORIYAVOWEL
LETTEROୋ U+0B4B ORIYAVOWELSIGN
Oଔ U+0B14 ORIYAVOWEL
LETTERAUୌ U+0B4C ORIYAVOWELSIGN
AUଌ U+0B0C ORIYALETTER
VOCALICL◌◌ୢ U+0B62 ORIYAVOWELSIGN
VOCALICLୡ U+0B61 ORIYALETTER
VOCALICLL◌◌ୣ U+0B63 ORIYAVOWELSIGN
VOCALICLLTable3:VowelanditsMatraSign
The vowels “ଌ”U+0B0C,“ୡ”U+0B61,“◌ୢ”U+0B62and ”◌ୣ”U+0B63arehardlyinuseinmoderndays.
4 OverallDevelopmentProcessandMethodology
UndertheNeo-BrahmiGenerationPanel,therearemanydifferentscriptsbelongingtoseparateUnicodeblocks.EachofthesescriptsisthebasisforaseparateLGRproposal;howeverNeo-BrahmiGPensuresthatthefundamentalphilosophybehindbuildingthesevariousscriptLGRsissameacrossthedifferentscriptsbeingconsidered.NBGPdecidedtoemploy“ExpandedGradedIntergenerationalDisruptionScale”[EGIDS],whichisdesignedtomeasurethestatusofthelanguagesoftheworldintermsofendangermentordevelopment.TheEGIDSconsistsof13levelswitheachhighernumberonthescalerepresentingagreaterlevelofdisruptiontotheintergenerationaltransmissionofthelanguage.NBGPdecidedtoaccommodateallthelanguagesbelongingtoEGIDSScale1to4foritsanalysiswhichrepresentslanguagesinoneformortheotherarestillinusage.Followingarethedescriptions4ofthosescalesandcorrespondinglanguagesrelevantfortheOriya(Odia)scriptLGRproposal.
4https://www.ethnologue.com/about/language-status
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
10
TheOriya(Odia)scriptLGRproposalwaspublishedforpubliccommenttoallowthosewhohadnotparticipatedintheNBGPtomaketheirviewsknown.TheNBGPanalyzedallcommentsreceivedtofinalizetheproposal.Theanalysisofpubliccommentscanbeaccessedonlinegivenat[106].
EGIDSScale Definition RelevantLanguagesandtheirfamily
EGIDSScale2(Provincial)
Thelanguageisusedineducation,work,massmedia,andgovernmentwithinmajoradministrativesubdivisionsofanation
Oriya(Indic)
EGIDSScale3(Widercommunication)
Thelanguageisusedinworkandmassmediawithoutofficialstatustotranscendlanguagedifferencesacrossaregion.
Desiya(Indic)
Sadri(Indic)
EGIDSScale4(Educational)
Thelanguageisinvigoroususe,withstandardizationandliteraturebeingsustainedthroughawidespreadsystemofinstitutionallysupportededucation.
Sanskrit(Indic)
Sambalpuri(Indic)
Santali(Austroasiatic)
EGIDSScale5(Developing)
Thelanguageisinvigoroususe,withliteratureinastandardizedformbeingusedbysomethoughthisisnotyetwidespreadorsustainable.
Bodoparja(Indic)
Sora(AustroAsiatic)
Ho(AustroAsiatic)
Mundari(AustroAsiatic)
Juang(AustroAsiatic)
Koya(Dravidian)
Kisan(Dravidian)
Kuvi(Dravidian)
EGIDSScale6a(Vigorous)
Thelanguageisusedforface-to-facecommunicationbyallgenerationsandthesituationissustainable.
Kudmali(Indic)
Mirgan(Mirgan)
Bondo(AustroAsiatic)
Munda(AustroAsiatic)
Pengo(Dravidian)
Duruwa(Dravidian)
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
11
EGIDSScale Definition RelevantLanguagesandtheirfamily
EGIDSScale6b(Threatened)
Thelanguageisusedforface-to-facecommunicationwithinallgenerations,butitislosingusers.
Gadababodo(AustroAsiatic)
Kui(Dravidian)
Table4:EGIDSScaleLanguagesConsideredunderOriya(Odia)LGR
Outoftheselanguages,sevenareIndicandthuscanbeeasilywrittenusingaBrahmibasedscriptlikeOriyascript.
4.1 GuidingPrinciples
TheNBGPadoptsthefollowingbroadprinciplesfortheselectionofcode-pointsintherepertoireacrosstheboardforallthescriptswithinitsscope.
4.1.1 Inclusionprinciples:
4.1.1.1 Modernusage:Everycharacterproposedshouldbeintheeverydayusageofaparticularlinguisticcommunity.CharacterswhichhavebeenencodedintheUnicodefortranscriptionpurposesonlyorforarchivalpurposeswillnotbeconsideredforinclusioninthecode-pointrepertoire.
4.1.1.2.Unambiguoususe:
Everycharacterproposedshouldhaveunambiguousunderstandingamongthelinguisticaboutitsusageinthelanguage.
4.1.2 Exclusionprinciples:
ThemainexclusionprincipleisthatofExternalLimitsonScope.Theycompriseprotocolsorstandardswhicharepre-requisitestothedefinitionofLabelGenerationRules.Allfurtherprinciplesareinfactsubsumedundertheselimitationsbuthavebeenspeltoutseparatelyforthesakeofclarity.
4.1.2.1 ExternalLimitsonScope:Thecodepointrepertoireforrootzonebeingaveryspecialcase,uptheladderintheprotocolhierarchies,thecanvasofavailablecharactersforselectionasapartoftheRootZonecodepointrepertoireisalreadyconstrainedbyvariousprotocollayersbeneathit.Thefollowingthreemainprotocols/standardsactassuccessivefilters:
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
12
i.TheUnicodeChart:
Outofallthecharactersthatareneededbythegivenscript,ifthecharacterinquestionisnotencodedinUnicode,itcannotbeincorporatedinthecodepointrepertoire.Suchcasesarequiterare,giventheelaborateandexhaustivecharacterinclusioneffortsmadebyUnicodeconsortium.
ii.IDNAProtocol:
Unicodebeingthecharacterencodingstandardforprovidingthemaximumpossiblerepresentationofagivenscript/language,ithasencodedasfaraspossibleallthepossiblecharactersneededbythescript.However,theDomainnamebeingaspecializedcase,itisgovernedbyanadditionalprotocolknownasIDNA(InternationalizedDomainNamesinApplications).TheIDNAprotocolexcludessomecharactersoutofUnicoderepertoirefrombeingpartofthedomainnames.
Example:Oriyascriptfrequentlyuses“ଡ”(U+0B21),“ଢ”(U+0B22)aswellastheirrespectiveformswithnukta“ଡ଼”,and“ଢ଼”.“ଡ଼”and“ଢ଼”asdistinctlettersarenotencodedbuttheirdecomposedformi.e.“ଡ”, “ଢ”followedbyOriyasignnukta(U+0B3C)canbeused.
iii.MaximalStartingRepertoire:
AstheRootZoneLGRisusedforcreationoftherootzoneTLDs,whichinturnareanevenmorespecializedcaseofdomainnamelabels,theRootZoneLGRprocedureintroducesadditionalexclusionsforcharactersallowedbyIDNA.
Example:OriyaSignAvagraha"ଽ"(U+0B3D)evenifallowedbyIDNAprotocol,isnotpermittedintheRootZoneRepertoireasperthe[MSR].
MaximalStartingRepertoirealsoexcludesinvisiblecharactersZeroWidthNon-Joiner(U+200C)andZeroWidthJoiner(U+200D).Thesearerequiredincertaincaseswhereatypicalvisualshapeofanaksharisdesired.
Tosumup,therestrictionsstartoffwithadmittingonlysuchcharactersasarepartofthecode-blockofthegivenscript/language.ThisisfurthernarroweddownbytheIDNAProtocolandfinallyanadditionalfilterintheformofMaximalStartingRepertoirerestrictsthecharactersetassociatedwiththegivenlanguageevenmore.
4.1.2.2 NoFractionMarks:TheTLDsbeingidentifiers,fractionmarkerspresentinBrahmibasedlanguagessuchasgivenbelowwillnotbeincluded.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
13
Figure2:FractionMarksinOriya
4.1.2.3 NoSymbolsandAbbreviations:Abbreviations,weightsandmeasuresandothersuchiconiccharacterslikeIsshar"୰"(U+0B70)willnotbeincluded.
4.1.2.4 NoRareandObsoleteCharacters:TherearecharacterswhichhavebeenaddedtoUnicodetoaccommodaterareformsespeciallylikeOriyaLETTERVOCALICRR"ୠ"(U+0B60)andOriyaLETTERVOCALIC
LL"ୡ"(U+0B61)aswellastheirMatraforms)“◌ୄ◌ୄ“(U+0B44)and"◌ୣ◌ୣ"(U+0B63).Allsuchcharacterswillbeexcluded.ThisisincompliancewiththeConservatismprincipleaslaiddownintheRootZoneLGRprocedure.
5 Repertoire
ThisSectionprovidestherelevantSectionof[MSR]applicabletotheOriyascriptonwhichOriyacodepointrepertoirefortheRootZoneLGRisbasedon.Section5.1detailsthecode-pointrepertoirethattheNeo-BrahmiGenerationPanel[NBGP]proposestobeincludedintheOriyaRootZoneLGR.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
14
5.1 OriyaSectionofMaximalStartingRepertoire[MSR]Version4
Figure3:OriyaCodePagefromMSR-4
Colorconvention:Allcharactersthatareincludedinthe[MSR]-Yellowbackground
PVALIDinIDNA2008butexcludedfromthe[MSR]-Pinkishbackground
NotPVALIDinIDNA2008-Whitebackground
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
15
5.2 CodePointRepertoire
ThetablebelowlistsallthecodepointsincludedintherepertoireforOriyascript.Foreachofthecodepoints,languagereferenceshavebeengiveninthelastcolumn.
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category References
1 0B01 ଁ ORIYASIGNCANDRABINDU
2-Oriya Candrabindu[0],[101],[102],[103],[104],[105]
2 0B02 ଂ ORIYASIGNANUSVARA 2-Oriya Anusvara[0],[101],[102],[103],[104],[105]
3 0B03 ଃ ORIYASIGNVISARGA 2-Oriya Visarga[0],[101],[102],[103],[104],[105]
4 0B05 ଅ ORIYALETTERA 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
5 0B06 ଆ ORIYALETTERAA 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
6 0B07 ଇ ORIYALETTERI 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
7 0B08 ଈ ORIYALETTERII 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
8 0B09 ଉ ORIYALETTERU 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
9 0B0A ଊ ORIYALETTERUU 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
16
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category References
10 0B0B ଋ ORIYALETTERVOCALICR 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
11 0B0F ଏ ORIYALETTERE 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
12 0B10 ଐ ORIYALETTERAI 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
13 0B13 ଓ ORIYALETTERO 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
14 0B14 ଔ ORIYALETTERAU 2-Oriya Vowel[0],[101],[102],[103],[104],[105]
15 0B15 କ ORIYALETTERKA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
16 0B16 ଖ ORIYALETTERKHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
17 0B17 ଗ ORIYALETTERGA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
18 0B18 ଘ ORIYALETTERGHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
19 0B19 ଙ ORIYALETTERNGA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
20 0B1A ଚ ORIYALETTERCA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
17
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category References
21 0B1B ଛ ORIYALETTERCHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
22 0B1C ଜ ORIYALETTERJA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
23 0B1D ଝ ORIYALETTERJHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
24 0B1E ଞ ORIYALETTERNYA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
25 0B1F ଟ ORIYALETTERTTA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
26 0B20 ଠ ORIYALETTERTTHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
27 0B21 ଡ ORIYALETTERDDA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
28 0B22 ଢ ORIYALETTERDDHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
29 0B23 ଣ ORIYALETTERNNA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
30 0B24 ତ ORIYALETTERTA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
31 0B25 ଥ ORIYALETTERTHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
18
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category References
32 0B26 ଦ ORIYALETTERDA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
33 0B27 ଧ ORIYALETTERDHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
34 0B28 ନ ORIYALETTERNA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
35 0B2A ପ ORIYALETTERPA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
36 0B2B ଫ ORIYALETTERPHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
37 0B2C ବ ORIYALETTERBA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
38 0B2D ଭ ORIYALETTERBHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
39 0B2E ମ ORIYALETTERMA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
40 0B2F ଯ ORIYALETTERYA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
41 0B30 ର ORIYALETTERRA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
42 0B32 ଲ ORIYALETTERLA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
19
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category References
43 0B33 ଳ ORIYALETTERLLA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
44 0B36 ଶ ORIYALETTERSHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
45 0B37 ଷ ORIYALETTERSSA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
46 0B38 ସ ORIYALETTERSA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
47 0B39 ହ ORIYALETTERHA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
48 0B3C ଼ ORIYASIGNNUKTA 2-Oriya Nukta[0],[101],[102],[103],[104],[105]
49 0B3E ା ORIYAVOWELSIGNAA 2-Oriya Matra[0],[101],[102],[103],[104],[105]
50 0B3F ି ORIYAVOWELSIGNI 2-Oriya Matra[0],[101],[102],[103],[104],[105]
51 0B40 ୀ ORIYAVOWELSIGNII 2-Oriya Matra[0],[101],[102],[103],[104],[105]
52 0B41 ୁ ORIYAVOWELSIGNU 2-Oriya Matra[0],[101],[102],[103],[104],[105]
53 0B42 ୂ ORIYAVOWELSIGNUU 2-Oriya Matra[0],[101],[102],[103],[104],[105]
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
20
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category References
54 0B43 ୃ ORIYAVOWELSIGNVOCALICR
2-Oriya Matra[0],[101],[102],[103],[104],[105]
55 0B47 େ ORIYAVOWELSIGNE 2-Oriya Matra[0],[101],[102],[103],[104],[105]
56 0B48 ୈ ORIYAVOWELSIGNAI 2-Oriya Matra[0],[101],[102],[103],[104],[105]
57 0B4B ୋ ORIYAVOWELSIGNO 2-Oriya Matra[0],[101],[102],[103],[104],[105]
58 0B4C ୌ ORIYAVOWELSIGNAU 2-Oriya Matra[0],[101],[102],[103],[104],[105]
59 0B4D ୍ ORIYASIGNVIRAMA 2-Oriya Halant[0],[101],[102],[103],[104],[105]
60 0B56 ୖ ORIYAAILENGTHMARK 2-Oriya Matra[2],[101],[102],[103],[104],[105]
61 0B5F ୟ ORIYALETTERYA 2-Oriya Consonant[0],[101],[102],[103],[104],[105]
62 0B71 ୱ ORIYALETTERWA 2-Oriya Consonant[6],[101],[102],[103],[104],[105]
Table5:CodePointRepertoire
5.2.1 CodePointsExcluded
TheFollowingcodepointsareexcludedfromtherepertoirebyNBGP.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
21
Sr.No.
UnicodeCodePoint
Glyph CharacterName
LanguagewithEGIDS
Category Reference
1 0B0C ଌ ORIYALETTERVOCALICL
2-Oriya Vowel[0],[101],[102],[103],[104],[105]
2 0B35 ଵ ORIYALETTERVA 2-Oriya Consonant[6],[101],[102],[103],[104],[105]
3 0B44 ◌ୄORIYAVOWELSIGNVOCALICRR
2-Oriya Matra[9],[101],[102][103][104][105]
4 0B57 ୗ ORIYAAULENTHMARK
2-Oriya Matra [0],[101],[102][103][104][105]
Table6:CodePointExcludedfromRepertoire
Sincethematraୗ(U+0B57)ORIYAAULENTHMARKisnotincurrentusebytheOriyascriptcommunity,itisdecidedbytheNBGPtoexcludeit.Also,“ଌ”U+0B0C,“ୡ”U+0B61,“◌ୢ”U+0B62and”◌ୣ”U+0B63arerarelyusedinmoderndays.
ThecharacterVa“ଵ”(0B35) isusedinformofhandwrittencharacterbyfewlinguisticsorhasrecentlybeenusedinafewwebsitesontheinternet.Butthiscodepointisnotveryfrequentlyusedinmasseducationalinstitutions,printedinmasscommunication,likenewspapersormagazines.Sometextbooksintheschoolsandcollegeexplaintheplosive"Ba"ବ(0B2C) andnon-plosive"Va"ଵ(0B35) separatelyasvarga(plosive)andnon-varga(non-plosive)respectively.
ThepubliccommentversionoftheOriyaScriptLGR,NBGPhadincludedthecharacterVa“ଵ”(0B35) intherepertoire.However,thepubliccommentfeedbacksuggestedthatthisusewasnotsufficientlycommon,andfurtheranalysisbyNBGPconcludedthat"Va"ଵ(0B35) isnotusedincommonlyprintedbooks,magazines,etc.Therefore,itisexcludedfromtherepertoire.
However,incomingyearsifthischaracterisseeninfrequentuse,NBGPmayreconsiderincludingitinalaterversionofitsLGRfortheOriyascript.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
22
Variablesinvolved
C →Consonant
M →Matra
V →Vowel
B →Anusvara
H →Halant
N →Nukta
C1 → ConsonantswithNukta{ଡ0B21,ଢ0B22}
X → Visarga
D → Candrabindu
TheVowelSequence
The Vowel Sequence begins with a vowel which is followed by either of Anusvara (B),Candrabindu(D)oraVisarga(X)butnevermorethanone.
InOriya,thusthevowelsequencecanberepresentedasV[B|D|X]
Examples:
SequenceDescription Sequence Example Constitutingcharacters
Vowel Vଅ/a/
U+0B05
Vowel+Anusvara V[B]ଅଂ/aṁ/
U+0B05U+0B02
ଅ ଂ
U+0B05U+0B02
Vowel+Candrabindu V[D]ଅଁ/aṃ/
U+0B05U+0B01
ଅଁ
U+0B05U+0B01
Vowel+Visarga V[X]ଅଃ/aḥ/
U+0B05U+0B03
ଅଃ
U+0B05U+0B03
Table7
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
23
ConsonantSequence
ConsonantsequencebeginswithaconsonantandmayormaynotbefollowedbyaNukta(N). Consonantwithorwithout theNukta (N)may again be followed by eitherofMatra(M),Anusvara(B),Candrabindu(D),Visarga(X)oraHalant(H)butnevermorethanoneof
M,B,D,X,H.AftertheN,MandH,theconsonantsequencecanbeextendedasfollows:
1.Asingleconsonant(C)
(TheconsonantshallbetreatedascoterminouswiththeConsonantalongwiththeNukta
signwhereversuchacaseispertinent.)
Examples:
SequenceDescription Sequence Example Constitutingcharacters
Consonant Cଡ/ḍa/
U+0B21
Consonant+Nukta C[N] ଡ଼/ṛa/ ଡ଼
U+0B21U+0B3C Table8
2.Aconsonantisoptionallyfollowedbyadependentvowelsign/Matra[M],Anusvara[D],Candrabindu[B],Visarga[X]orHalant[H].
C[M|B|D|X|H]
Examples:
SequenceDescription Sequence Example Constitutingcharacters
Consonant+Matra C[M] କି/ki/ କି
U+0B15U+0B3F
Consonant+Anusvara C[B] କଂ/kaṁ/ କଂ
U+0B15U+0B02
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
24
Consonant+Candrabindu C[D] କଁ/kaṃ/ କଁ
U+0B15U+0B01
Consonant+Visarga C[X] କଃ/kaḥ/ କଃ
U+0B15U+0B03
Consonant+Halant C[H] କ୍/k/(PureConsonant)
କ୍
U+0B15U+0B4D
Table9
3.ACMsequencecanbeoptionallyfollowedbyD,BorX(CM)[D|B|X]
Example:
SequenceDescription Sequence Example Constitutingcharacters
Consonant+Matra+Anusvara CM[B] କିଂ/kiṁ/ କିଂ
U+0B15U+0B3FU+0B02
Consonant+Matra+Candrabindu CM[D] କାଁ/kāṃ/ କାଁ
U+0B15U+0B3EU+0B01
Consonant+Matra+Visarga CM[X] କିଃ/kiḥ/ କିଃ
U+0B15U+0B3FU+0B03
Table10
Asequenceofconsonants(upto4)joinedbyHalant5(CH)C
Example:
SequenceDescription Sequence Example Constitutingcharacters
Consonant+Halant+Consonant+Halant+
CHCHCHC ସ୍ତ୍ର୍ୟ/strya/ସ୍ ତ୍ ର୍ ୟ
U+0B38U+0B4DU+0B24U+0B4D
5IncaseofSanskrit,itcanjoinupto5consonants.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
25
Consonant+Halant+Consonant
U+0B30U+0B4DU+0B5F
Table11
However, theWLE rulesproposed in SectionError!Reference sourcenot found. doesnotimposeanyrestrictiononthenumberofconsonantsthatcanbejoinedbyaHalant.
Subsets:
ThecombinationmaybefollowedbyM,B,DorX
Example:
SequenceDescription Sequence Example Constitutingcharacters
Consonant+Halant+Consonant+Matra CHC[M]
ସ୍କୀ/skī/ସ ୍ କ ୀ
U+0B38U+0B4DU+0B15U+0B40
Consonant+Halant+Consonant+Anusvara
CHC[B] ସ୍କଂ/skaṁ/ସ ୍ କ ଂ
U+0B38U+0B4DU+0B15U+0B02
Consonant+Halant+Consonant+Candrabindu
CHC[D] ସ୍କଁ/skaṃ/ସ ୍ କ ଁ
U+0B38U+0B4DU+0B15U+0B01
Consonant+Halant+Consonant+Visarga
CHC[X] ସ୍କଃ/skaḥ/ସ ୍ କ ୍ ଃ
U+0B38U+0B4DU+0B15U+0B4DU+0B03
Table12
(CH)CMmaybefollowedbyaB,DorX
Example:
SequenceDescription Sequence Example Constitutingcharacters
Consonant+Halant+Consonant+Matra+Anusvara
CHCM[B] ସ୍କୀଂ/skīṁ/
ସ ୍ କ ୀ ଂ
U+0B38U+0B4DU+0B15
U+0B40U+0B02
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
26
Consonant+Halant+Consonant+Matra+Candrabindu
CHCM[D] ସ୍କୀଁ/skīṃ/
ସ ୍ କ ୀ ଁ
U+0B38U+0B4DU+0B15
U+0B40U+0B01
Consonant+Halant+Consonant+Matra+Visarga
CHCM[X] ସ୍କୀଃ/skīḥ/ସ ୍ କ ୀ ଃ
U+0B38U+0B4DU+0B15U+0B40U+0B03
Table13
ThesearethebasicaksharformationrulesonwhichtheoverallOriyascriptLGRisbased.AslanguagesotherthanOriyaareconsidered,someadditionallanguage-specificcharacters
andrulesareintroduced.
6 Variants
6.1 In-ScriptVariants
InOriyascript,therearenocharacters/charactersequencesthatcanbecreatedbyusingtheOriyacharacterspermittedasperthe[MSR]andlookidentical.Therearenoin-scriptvariants.
6.2 Cross-ScriptVariants
Across-scriptvariantlabel,alsosometimesreferredtoas"WholeLabelconfusable",isthevariant case where one label in one script can be composed in such a way that it canresemble another entire label in a different script, to the point that it is effectivelyindistinguishable.
Every individual LGRdeveloped by theNBGPprovides a set of cross-script variant codepointsthatitidentifieswithmembersofotherrelatedscripts.
TheNBGPhasensuredthatnotonlytheindividualcharactersbutalsomostoftheaksharvariations are taken into consideration during the cross-script variant analysis of Oriyawith all the other scripts under NBGP. It was achieved by sharing a list ofmost of theaksharcombinationswithalltheotherscriptteams(‘most’isusedhereasallthepossibleConsonant +Halant + Consonant +… cases cannot be practically covered. Case of all theOriya“Consonant+Halant+Consonant”wasincludedintheanalysis).
Oriya scripthasa setofpossible cross-scriptvariantswith theMalayalamandMyanmarscripts. Cases listed in Table 6 are cross-script variants between Oriya, Malayalam and
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
27
Myanmar. This follows the NBGP Cross-script Variant inclusion policy available inAppendixD.
TheNBGPhasensuredthatOriyaandMalayalamLGRteamsproposethesamesetofcross-script variants by meeting face-to-face on many occasions as well as through mailcommunications.Thesamesetofcross-scriptvariants(withMalayalam)issupposedtobefoundintheMalayalamLGRdocuments.
TheNBGPhasensuredthatOriyaandMyanmarLGRteamsproposethesamesetofcross-script variants by meeting face-to-face as well as through mail communications. Thefollowing set of cross-script variants (with Malayalam and Myanmar) are commonlyagreed.
VariantSet
Oriya Malayalam Myanmar
CP Glyph CP Glyph CP Glyph
1. 0B20 ଠ 0D20 ഠ 101D ဝ
2 0B47 େ --- --- 1031 ေ◌ Table14:VariantSetbetweenOriya,MalayalamandMyanmarScript
ThecaseslistedinAppendixBarevisuallyconfusablecodepointslistedforreference,buttheyarenotdefinedasvariantcodepoints.
7 WholeLabelEvaluationRules(WLE)
ThisSectionprovidesthewholelabelevaluationrulesfortextwrittenintheOriyascript.TheruleshavebeendraftedinsuchawaythattheycanbeeasilytranslatedintotheLGRspecification.BelowarethesymbolsusedintheWLErulesforeachofthe"IndicSyllabicCategory"asmentionedinTable4:Codepointrepertoire.Additionalsymbolsdefinetheappropriatesubsetsforvariousrules.C → ConsonantM →MatraV →VowelB →Anusvara
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
28
H →Halant/ViramaN →NuktaC1 → ConsonantswithNukta{ଡ0B21DDA,ଢ0B22DDHA}X → VisargaD → CandrabinduRule1:N(଼)mustbeprecededonlybyC1Forexample:ଡ(0B21)+଼(0B3C)=ଡ଼ଢ(0B22)+଼(0B3C)=ଢ଼Rule2:B(ଂ)mustbeprecededbyV,C,NorMi) BmaybeprecededbyV(examples:ଅଂଶ,)ii) BmaybeprecededbyC,(example:ସଂସାର) iii) BmaybeprecededbyN(example:ଡଂଗା) iv) BmaybeprecededbyM,(examples:ସିଂହ, ମାଂସ, ସୁତରାଂ)Rule3:X(ଃ)mustbeprecededbyC,V,NorMi) XmaybeprecededbyC,(example:ପ୍ରାୟତଃ, କ୍ରମଶଃ)
ii) XmaybeprecededbyN(example:କାଢ଼ାଃ) iii) XmaybeprecededbyM,(examples:ଦୁଃଖ, ଦୁଃଖିତ)iv) XmaybeprecededbyV,(examples:ଅଃ, ଆଃ, ଇଃ, ଉଃ)commonlyusedwhenwriting
Sanskritorwhenthereisreligiousrequirement Rule4:D(ଁ)mustbeprecededbyV,C,NorM i) DmaybeprecededbyV(examples:କଅଁଳିଆ, ନିଆଁ, ପାଇଁ, ଯେଉଁ,)
ii) DmaybeprecededbyC(example:ମୁହଁ, ପହଁରା)
iii) DmaybeprecededbyN(example:ପୋଢ଼ୁଁଆ)iv) DmaybeprecededbyM(examples:ନାହିଁ, ନାଁ, ଗାଁରେ)
Rule5:H(୍)mustbeprecededbyCorNi) HmaybeprecededbyC,(example:ଠିକ୍, ଭୁଲ୍, ହଠାତ୍)
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
29
ii) HmaybeprecededbyN(example:ଗାର୍ଡ଼୍, ୱାର୍ଡ଼୍)
Rule6:MmustbeprecededbyCorNi) MmaybeprecededbyC(example:ମୁହଁ, ପହଁରା, ନୁହଁ)ii) MmaybeprecededbyN(example:ବଢ଼ିଆ,ଗଡ଼ୁ)
8 Contributors
ThisproposalispreparedandsubmittedbyMr.KuldeepPatnaik(Freelancer)andreviewedbyDr.DevasisaJethy,Oriyalinguisticanalyst,translatingmedicalsciencetoOdia,coauthorofabook(Odiaequivalentsofscientificterms)fromCentralInstitutionofIndianLanguages,Mysore.
FollowingNBGPmembershelpedMr.KuldeepPatnaiktotakecrucialdecisionwhileworkingtogetherfornineIndianlanguagesincludingOriya(Odia).
Position Name Organization Country LanguageExpertise
Co-Chair AjayData DataXgenTechnologies India Hindi,EnglishCo-Chair MaheshD.Kulkarni C-DAC India Marathi,HindiCo-Chair UdayaNarayana
SinghVisva-Bharati,Santiniketan,WestBengal
India Bengali,Maithili,Hindi,English
Member AkshatS.Joshi C-DAC India Hindi,MarathiMember AtiurRahmanKhan C-DAC India BanglaMember DrDebasishyaJethy Oriyalinguisticanalyst India OdiaMember JayPaudyal Consultant India HindiMember NehaGupta C-DAC India HindiMember ShanmugamR C-DAC India TamilMember VeenaSolomon (freelancer) India Malayalam
FollowingisthelistofotherNBGPmemberswiththeirlanguageexpertise.
Position Name Organization Country LanguageExpertise
Member AbhijitDutta Wikimedia India Bengali,Hindi
Member AnivarA.Aravind IndicProject India Malayalam
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
30
Member AnupamAgrawal TataConsultancyService India Hindi,Bengali
Member ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMARPANDA RegionalInstituteof
Education(NCERT)India Odia
Member BhimDhojShrestha Consultant Nepal Nepali,Newar
Member ChitritaChatterjee InternetandMobileAssociationofIndia(IAMAI)
India MultiplelanguagesrepresentedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar Consultant Nepal Nepali,NewarMember DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversity&
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra LanguageTechnologyCentre,RavenshawUniversity
India Odia
Member GurpreetSinghLehal PunjabiUniversityPatiala India PanjabiMember HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneurs'Hub
(NEHUB)Nepal Nepali,
NewarMember JijoPappachan DN.Domains India MalayalamMember K.C.Tikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeoKale Formerlyaffiliatedwith
UniversityofPuneIndia Marathi
Member KuldeepPatnaik(Editor) Freelancer India OdiaMember MukeshSaini EsselGroup India HindiMember N.DeivaSundaram NDSLingsoftSolutionsPvt
LtdIndia Tamil
Member NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India Hindi
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
31
Member PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India English,
Hindi,Marathi,Gujarati
Member RajibChakraborty SocietyforNaturalLanguageTechnologyResearch
India Bangla(Bengali)
Member RajivKumar NIXI India Member S.Maniam InternationalForumITfor
TamilSingapore Tamil
Member SanthoshThottingal Wikimediafoundation India Malayalam,Sourashtra,Tamil
Member SarojaBhate UniversityofPune India SanskritMember ShambhuKumarSingh NationalTranslation
Misson,MysoreIndia Maithili
Member ShantaramS.WardeWalawalikar
IndependentResearcher India Konkani
Member ShashiPathania P.G.D.ofDogri,UniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member SinnathambiShanmugarajah UniversityofColombo
SchoolofComputingSriLanka Tamil
Member SujithKartha Digitalkz.com India MalayalamMember SurajAdhikari Mercantile
Communications(and.npccTLD)
Nepal Nepali
Member SwarnaPrabhaChainary GuwahatiUniversity India BodoMember U.B.Pavanaja http://vishvakannada.com/ India KannadaMember UmaMaheshwarG CALTS,Univ.ofHyderabad India TeluguMember UttamShresthaRana NPNOG Nepal NepaliMember VinayMurarka Consultant;
https://मेरा.भारतIndia Hindi
Table15:ContributorsandNBGPanel
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
32
9 References
[MSR] IntegrationPanel,"MaximalStartingRepertoire—MSR-4OverviewandRationale",7February2019,https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf(Accessedon18February2019)
[NBGP]Neo-BrahmiGenerationPanel
[CodeCharts]TheUnicodeStandard10.0CharacterCodeChartshttp://www.unicode.org/charts/PDF/U0B00.pdf(Accessedon12January2018)
[0] TheUnicodeStandard1.1 AnycodepointoriginallyencodedinUnicode1.1
[6] TheUnicodeStandard4.0 AnycodepointoriginallyencodedinUnicode4.0
[9] TheUnicodeStandard5.1 AnycodepointoriginallyencodedinUnicode5.1
[101] Omniglot,"Oriya",https://www.omniglot.com/writing/oriya.htm(Accessedon12January2018)
[102] Odia(Oriya)alphabet-Wikipediahttp://www.academia.edu/9999923/The_Oriya_Script_Origin_Development_and_Sources(Accessedon07December2018)
[103] Odialanguage-http://indohistory.com/oriya.html(Accessedon07December2018))
[104] Oriya(Unicodeblock)–Unicode.org,(Unicode_block)
[105] OdishaStateGovt.PrimarySchoolGrade1e-book“HasaKhela”:byOdishaPrimaryEducationProgrammeAuthority http://opepa.odisha.gov.in/website/Download/e-Text-Book/CLass%20I/Hasa%20Khela%20Part%20II/Haso%20Khelo-II-Page-112.pdf
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
33
[106] NBGP,PublicCommentFeedbackforKannada,Telugu,OriyascriptLGRProposals,https://docs.google.com/document/d/1m9MbBfNBQZAFc9SOYpt0lgeeyM3N-DsUP173J4Vb948(Accessed13February2019)
10 AppendixA:Cross-scriptConfusableCodePoints
Oriya script has a set of possible cross-script confusable code points with the Gujarati,Bengali,Telugu,andKannadathatdonotrisetothelevelofcross-scriptvariant.
10.1 OriyaandGujarati
Thefollowingcharactersarevisuallyconfusable.TheNBGPdiscussedandconcludedthattheyaresimilarcodepointsbutshouldnotbeconsideredasvariantcodepoints.
Oriya Gujarati
ଃ(0B03) ◌ઃ(0A83)
ପ(0B2A) ઘ(0A98)
ଥ(0B25) થ(0AA5)
Table16:ConfusablecodepointsbetweentheOriyaandGujaratiscripts
10.2 OriyaandBengali
Thefollowingcharactersarevisuallyconfusable.TheNBGPdiscussedandconcludedthattheyaresimilarcodepointsandshouldnotbeconsideredasvariantcodepoints.
Bengali Oriya
ও(0993) ଓ(0B13)
Table17:ConfusablecodepointsbetweentheOriyaandBengaliscripts
ThefollowingcharacterswerediscussedandtheNBGPconcludedthattheyareneithervariantcodepointsnorconfusablecodepoints.
Bengali Oriya Resolution
ঘ(0998) ସ(0B38) Distinguishable
Table18:Other resolutionsbetweenOriyascriptandBengaliscript
10.3 OriyaandTelugu
ThefollowingcharacterswerediscussedandtheNBGPconcludedthattheyarenotvariantcodepointsnorconfusablecodepoints
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
35
Oriya Telugu Resolution
ଠ(0B20) ర(0C30) Distinguishable
ଠ(0B20) ఠ(0C20) Distinguishable
Table19:OtherresolutionsbetweentheOriyaandTeluguscripts
10.4 OriyaandKannada
ThefollowingcharacterswerediscussedandtheNBGPconcludedthattheyarenotvariantcodepointsnorconfusablecodepoints
Oriya Kannada Resolution
ଠ(0B20) ರ(0CB0) Distinguishable
ଠ(0B20) ಠ(0CA0) Distinguishable
Table20:OtherresolutionsbetweentheOriyaandKannadascripts
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
36
11 AppendixB:OriyaDialects
TherearedifferentwaysofspeakingandmeaningofwordsinlocalOriyaLanguage.Howeverthescriptremainsthesame.611.1.1.1 StandardOdiaKatakiOdiaorTheOdiaofMughalbandiregionconsideredasStandardOdiaduetoliterarytraditions.ItisspokenmainlyintheeasternhalfofthestateofOdisha,withlittlevariation,indistrictslikeKhurdha,Puri,Cuttack,Jajpur,Jagatsinghpur,Kendrapada,Dhenkanal,AngulandNayagarhdistrict.11.1.1.2 Majorforms,ordialectsMidnaporiOdia:SpokenintheundividedMidnaporeDistrictofWestBengal.SinghbhumiOdia:SpokeninEastSinghbhum,WestSinghbhumandSaraikela-KharsawandistrictofJharkhandBaleswariOdia:SpokeninBaleswar,BhadrakandMayurbhanjdistrictofOdisha.GanjamiOdia:SpokeninGanjamandGajapatidistrictsofOdishaandSrikakulamdistrictofAndhraPradesh.SambalpuriOdia:SpokeninBargarh,Bolangir,Boudh,Debagarh,Jharsuguda,Kalahandi,Nuapada,SambalpurandSubarnapurdistrictsofOdishaandbysomepeopleinRaigarh,Mahasamund,RaipurdistrictsofChhattisgarhstate.DesiyaOdia:SpokeninKoraput,Rayagada,NowrangpurandMalkangiriDistrictsofOdishaandinthehillyregionsofVishakhapatnam,VizianagaramDistrictofAndhraPradesh.Bhatri:SpokeninSouth-westernOdishaandeastern-southChhattisgarh.Halbi:SpokeninundividedBastardistrictofChhattisgarh.HalbiisamixtureofOdiaandMarathiwithinfluenceofChatishgarhitriballanguages.6ExtractedfromWikipedia,https://en.wikipedia.org/wiki/Odia_language#Major_forms_or_dialects
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
37
PhulbaniOdia:SpokeninPhulbani,PhulbaniTown,KhajuripadablockofKandhamal,andinnearbyareasborderingBoudhdistrict.ThislanguagegainedmomentumduringtheamalgamationofKandhamal(Phulbani),andBoudh,regionintoasingledistrictPhulabani,11.1.1.3 Minornon-literaryandtribalformsordialectsSundargadiOdia:VariationofOdiaSpokeninSundargarhdistrictofOdishaandinadjoiningpocketsofJharkhandandChhattisgarh.KalahandiaOdia:VariationofOdiaspokeninundividedKalahandiDistrictandneighboringdistrictsofChhattisgarh.Kurmi:SpokeninNorthernOdishaandSouthwestBengal.Sounti:SpokeninNorthernOdishaandSouthwestBengal.Bathudi:SpokeninNorthernOdishaandSouthwestBengal.Kondhan:AtribaldialectspokeninWesternOdisha..Laria:SpokeninborderingareasofChatishgarhandWesternOdisha.Aghria:SpokenmostlybytheingeniouspeopleofAghriacasteinWesternOdisha.Bhulia:TribalformspokeninWesternOdisha.Sadri:AmixtureofOdiaandHindilanguagewithmajorregionaltribalinfluence.BodoParja/Jharia:TribaldialectofOdiaspokenmostlyinKoraputdistrictofSouthernOdisha.Matia:TribaldialectofOdiaspokeninSouthernOdisha.Bhuyan:TribaldialectofOdiaspokeninSouthernOdisha.Reli:SpokeninSouthernOdishaandborderingareasofAndhraPradesh.Kupia:SpokenbyValmikicastepeopleintheIndianstateofTelanganaandAndhraPradesh,mostlyinHyderabad,Mahabubnagar,Srikakulam,Vizianagaram,EastGodavariandVisakhapatnamdistricts.
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
38
12 AppendixC:OriyaCharacters
OdishaStateGovernmentPrimarySchoolGrade1e-book“HasaKhela”[105]page112listsalltheOriyacharactersasshowninFigure4.
Figure4:OdishaStateGovt.PrimarySchoolGrade1e-book(Page112)
13 AppendixD:NBGPCross-scriptVariantInclusionPolicy
If,inanytwogivenscripts,allthepotentialcross-scriptvariantsconsistofdependent(e.g.VowelSigns,Anusvara,Visarga,Chandrabinduetc.) charactersONLY, then that entiresetcanbeignoredandnocross-scriptvariantsbeproposedbetweenthosetwoscripts.
If,inanytwogivenscripts,thereisATLEASTONEnon-dependent(e.g.Consonant,Voweletc.)cross-scriptvariantcharacter/sequencepresent,allthepotentialcross-scriptvariantsbeconsideredandproposedbetweenthetwoscripts.Thiscross-scriptanalysishasbeenrestrictedtothescriptsthathavedescendedfromthe
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
39
Brahmiasmostofthemsharesimilarusagepatterns.Byandlarge,allofthesescriptshaveacommonsetofcharactersthatexistedinBrahmiscriptandbearthesameidentities.However,asthescriptsbranchedoutfromtheBrahmi,dependingonvariousfactors,theshapesofthecharacterschanged.Thischangeintheshapewasnotuniformacrossallthecharactersandthescripts.Somecharactersshapesdidchangesignificantlywhereassomeofthemstillretainedsimilarity.Thecross-scriptsimilarityanalysisalsoaimstoidentifysuchcaseswherethesamecharacterretainedalmostthesameshapedespitebeingpartofthedifferentscripts.Thesesetofcharactersarevariantsofeachotherintruesensethanmerelyofco-incidentalvisualsimilarity.CaseofMalayalamMyanmar,andOdia(Oriya):Thisisthecaseof"ConsonantTtha"whichhappenedtoretainthesameshapedespitebeingpartofdifferentscripts,i.e.,Malayalam,MyanmarandOdia.Thesecharactersare:
ഠ - MALAYALAMLETTERTTHA(U+0D20)ଠ - ORIYALETTERTTHA(U+0B20)
Boththecharacters,lookexactlyalikeandbelongtoa"Consonant"category.Astheyareconsonants,eachofthem,eveninthesimplestformi.e.thecharactersthemselves,arevalidlabels.AspertheNBGPcross-scriptvariantinclusionpolicy,thisisavalidcaseforinclusion.Also,eveniftheyaresinglecharacters,whenthesamecharactercombines,theoreticallytheycanforminfinite7numberofcross-scriptvariantlabelsbetweenthescriptsinvolved.Herearesomesamplesofsomeofthoselabels:
Malayalam Myanmar Oriya
ഠഠഠU+0D20U+0D20U+0D20
ဝဝဝ
U+101DU+101DU+101D
ଠଠଠU+0B20U+0B20U+0B20
ഠഠഠഠU+0D20
U+0D20U+0D20U+0D20
ဝဝဝဝ
U+101DU+101DU+101DU+101D
ଠଠଠଠU+0B20
U+0B20U+0B20U+0B20
ഠഠഠഠഠU+0D20
U+0D20U+0D20U+0D20U+0D20
ဝဝဝဝဝ
U+101DU+101DU+101DU+101DU+101D
ଠଠଠଠଠU+0B20
U+0B20U+0B20U+0B20U+0B20
Since,havingsuchlabelsisarealisticpossibilityandthecorrespondinglabelslookalmostexactlyalike,NBGPhasproposedthemasblockedvariants.
7Thoughtheoreticallyinfinite,thisnumberwouldbelimitedtothenumberofsuchlabelswhoseequivalentpunycodestringwouldnotexceed63charactersincludingtheACEprefix"xn--".
ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel
40
NBGPacknowledgestheconcernthatthisshapeisquitegenericandmayhaveparallelsinother scripts not under its ambit.However, asNBGPdoes not have any exposure aboutactualusageofthosecharactersinthoseparticularscripts,NBGPdesistedfromincludingthem in the analysis.As NBGP has already considered all the related scripts under thecross-scriptvariantanalysis,thesimilarityofthecharactersbelongingtoNBGPscriptswithotherscriptsnotundertheNBGPambit,maybeofamereco-incidentalvisualnature.
Additionally,thisconcernisnotlimitedtothesetwocharactersbutforallthecharactersinall thescriptsunder the scopeof theRootLGRprocedure.Carryingout this analysis canpracticallybedoneonlywiththeGenerationPanelsthatexistwhiletheNBGPisactive.ThisstillleavesoutthosescriptswhichmaynothaveaGenerationPanelestablishedyet.Hence,carrying out this exercise in entirety is quite impracticable. This conundrum can beresolved if all the such cases are handled by the "String SimilarityAssessment Panel" ofICANN.