// Test cases for Wylie -> [other representation] -> Wylie round-trip conversion. // // Each line in the file is a comment, blank, or one test case. // // A comment line starts with //. // // Each test case is understood as EWTS; it should convert back into itself. // The following test cases are from DLC's DuffPaneTest. tsha tsa dza sha nga nag nga / bkra shis bde legs/ sgom pa'am sgom pe'am le'u'i'o la'u'i'o/la'am/pa'ang/pe'ang bras dwa gwa gyug g.yag gyag g.yas gyas 'gas gangs gnags 'byung 'byungs blags mnags gdams 'gams // Random things known to be hard // Difficult to parse deterministically brgyud // Featuritis brtswand // the only other case of superscript + root + wazur rgwug // two-letter suffix (not to be treated as n suffix + g second-suffix) dbang // plus second sufix dbangs // A common word that is not grammatical, according to some // sources, and that is also hard because br- looks like // b should be the root brlabs // "'is" is the form of kyis used after bare vowel or a chung re'u'is he'is // Bare vowels a i 'o 'a // Bare polyvowels; not clear if these are really legal, but: a'o i'u'i 'o'i // things that at one time or another tickled bugs in WW code gyurd ga*ra ha_sa gangs mngas 'gram la'i a'i od bem a ra lnga lang // A-chung as root 'a 'od 'angs 'ag 'ad // The exceptions to the three-letter root rule rags lags nags bags bangs gangs rangs langs nangs sangs babs rabs rams nams // This is my old root-letter-capitalization and pronunciation test suite. // It's should also be good for tickling conversion bugs. grwa bsgrubs bzlog brlabs sprul // this one is important because it's an exception to the rule that in a // row of three consonants, the second is almost always the root. shabs shes ldan mgon rgyud ten rigs chos dbang dbying dbyang spyang phyug byang sbyin myur po'i mri srung mkha' 'gro rta'i khwa rdzogs rdo rje zla bar ngan rnga g.yo gyo dpa' 'dag klu khyung // These are from DLC's PackageTest brtan blta blag brag bsabs zungs brtib spyoms sku'i bskyangs ag bda' dbang dga' dgra dmar gda' // Sanskrit // Tests from DLC ai heM hiM h-iM heM haiM hoM hauM hUM pad+me D+h+D+ha gaD+h+D+ha autapa parItshatsawa ke: ka: n+yadA ak+Sha Asa AT+Ta b+da d+ba d+ga d+g+ra d+gara // Other Sanskrit rak+Shasa DAkinI haM pad+ma samuts+tsaya gang+gA sid+d+hi shrI sat+t+wa sarma hU~M`! wa~M`ba~MthangumaM // Sanskrit bug ticklers ragyada 'agarma gayaradasa // R+, +W, +Y, +R paN+D+R+yau k+Sh+RUM kar+Wabi waw+Wa fuN+D+Yo yay+Yaya r+Yi // Taken from THDL Sechen text collection dashadigAMsarbatimirabi ni saranan+nAmashrIguh+yagar+b+hatatwaMsh+ts+yayastanataras+yaTa'ikAbiharatisma shrI tan+t+radzamAyadzalasamudAyArtas+yadwArAt+nish+tsitasub+haShitaMguh+yapatimukhAgamanAmabiharatisma guh+ya pa ti sa shad+h+ya laM kA ra nA ma shrA iguh+ya gar+b+ha ta twa pa in ish+tsa ya tan+t+ra s+ya pr-i t+t-i bi ha rti sma // Bare vowels: A I U -i -I // leading weird vowels aiga -Iba // two or more vowels in a row a.a // A.I.ai !!! WW is known not to be able to deal with .- ai.a I.a A.a A.A a.a // ai.-i !!! WW is known not to be able to deal with .- -I.e he.a ha.e ha.a hai.a ha.ai hU.a ha.A ha.U // fun with weird vowels; these test cases look similar, but actually mostly go through different paths // in the (wretchedly convoluted) code. hA.A hA.a hA.e hA.ai // hA.-i !!! WW is known not to be able to deal with .- hU.A hU.a hU.e hU.ai // hU.-i !!! WW is known not to be able to deal with .- hAMA hAMa hAMe hAMai hAM-i hAMu hA.Ada hA.ada hA.eda hA.aida hiMa ho~M`a // hA.-ida !!! WW is known not to be able to deal with .- // M,H on bare vowel aiM eM U~M au~M` aH aiH // bindus in the middle of a word gaMb+ha hUMhU~MhU~M` aMu~MraH raMH // Semivowel as subscript. hr-I kl-iba // Difficult because ny+d is not legal ny+dza raH hai hau // One instance of every stack in tibwn.ini, with sequential vowels attached k+Sha k+ke k+khi k+ngo k+tsu k+tA k+t+yI k+t+rU k+t+r+yai k+t+wau k+th-i k+th+y-I k+Na k+ne k+n+yi k+pho k+mu k+m+yA k+r+yI k+w+yU k+shai k+sau k+s+n-i k+s+m-I k+s+ya k+s+we kh+khi kh+no kh+lu g+gA g+g+hI g+nyU g+dai g+d+hau g+d+h+y-i g+d+h+w-I g+na g+n+ye g+pi g+b+ho g+b+h+yu g+mA g+m+yI g+r+yU g+hai g+h+g+hau g+h+ny-i g+h+n-I g+h+n+ya g+h+me g+h+li g+h+yo g+h+ru g+h+wA ng+kI ng+k+tU ng+k+t+yai ng+k+yau ng+kh-i ng+kh+y-I ng+ga ng+g+re ng+g+yi ng+g+ho ng+g+h+yu ng+g+h+rA ng+ngI ng+tU ng+nai ng+mau ng+y-i ng+l-I ng+sha ng+he ng+k+Shi ng+k+Sh+wo ng+k+Sh+yu ts+tsA ts+tshI ts+tsh+wU ts+tsh+rai ts+nyau ts+n+y-i ts+m-I ts+ya ts+re ts+li ts+h+yo tsh+thu tsh+tshA tsh+yI tsh+rU tsh+lai dz+dzau dz+dz+ny-i dz+dz+w-I dz+dz+ha dz+h+dz+he dz+nyi dz+ny+yo dz+nu dz+n+wA dz+mI dz+yU dz+rai dz+wau dz+h-i dz+h+y-I dz+h+ra dz+h+le dz+h+wi ny+tso ny+ts+mu ny+ts+yA ny+tshI ny+dzU ny+dz+yai ny+dz+hau ny+ny-i ny+p-I ny+pha ny+ye ny+ri ny+lo ny+shu T+kA T+TI T+T+hU T+nai T+pau T+m-i T+y-I T+wa T+se Th+yi Th+ro D+gu D+g+yA D+g+hI D+g+h+rU D+Dai D+D+hau D+D+h+y-i D+n-I D+ma D+ye D+ri D+wo D+hu D+h+D+hA D+h+mI D+h+yU D+h+rai D+h+wau N+T-i N+Th-I N+Da N+D+Ye N+D+ri N+D+R+yo N+D+hu N+NA N+d+rI N+mU N+yai N+wau t+k-i t+k+r-I t+k+wa t+k+se t+gi t+nyo t+Thu t+tA t+t+yI t+t+rU t+t+wai t+thau t+th+y-i t+n-I t+n+ya t+pe t+p+ri t+pho t+mu t+m+yA t+yI // t+rnU -- !!! I think that this is a bogus tibwn.ini entry, but I'm not sure t+sai t+s+thau t+s+n-i t+s+n+y-I t+s+ma t+s+m+ye t+s+yi t+s+ro t+s+wu t+r+yA t+w+yI t+k+ShU th+yai th+wau d+g-i d+g+y-I d+g+ra d+g+he d+g+h+ri d+dzo d+du d+d+yA d+d+rI d+d+wU d+d+hai d+d+h+nau d+d+h+y-i d+d+h+r-I d+d+h+wa d+ne d+bi d+b+ro d+b+hu d+b+h+yA d+b+h+rI d+mU d+yai d+r+yau d+w+y-i d+h-I d+h+na d+h+n+ye d+h+mi d+h+yo d+h+ru d+h+r+yA d+h+wI n+kU n+k+tai n+g+hau n+ng-i n+dz-I n+dz+ya n+De n+ti n+t+yo n+t+ru n+t+r+yA n+t+wI n+t+sU n+thai n+dau n+d+d-i n+d+d+r-I n+d+yA n+d+re n+d+hi n+d+h+ro n+d+h+yu n+nA n+n+yI n+pU n+p+rai n+phau n+m-i n+b+h+y-I n+tsa n+ye n+ri n+wo n+w+yu n+sA n+s+yI n+hU n+h+rai p+tau p+t+y-i p+t+r+y-I p+da p+ne p+n+yi p+po p+mu p+lA p+wI p+sU p+s+n+yai p+s+wau p+s+y-i b+g+h-I b+dza b+de b+d+dzi b+d+ho b+d+h+wu b+tA b+nI b+bU b+b+hai b+b+h+yau b+m-i b+h-I b+h+Na b+h+ne b+h+mi b+h+yo b+h+ru b+h+wA m+nyI m+NU m+nai m+n+yau m+p-i m+p+r-I m+pha m+be m+b+hi m+b+h+yo m+mu m+lA m+wI m+sU m+hai y+Yau y+r-i y+w-I y+sa r+khe r+g+hi r+g+h+yo r+ts+yu r+tshA r+dz+nyI r+dz+yU r+Tai r+Thau r+D-i r+N-I r+t+wa r+t+te r+t+si r+t+s+no r+t+s+n+yu r+thA r+th+yI r+d+d+hU r+d+d+h+yai r+d+yau r+d+h-i r+d+h+m-I r+d+h+ya r+d+h+re r+pi r+b+po r+b+bu r+b+hA r+m+mI r+YU r+Wai r+shau r+sh+y-i r+Sh-I r+Sh+Na r+Sh+N+ye r+Sh+mi r+Sh+yo r+su r+hA r+k+ShI l+g+wU l+b+yai l+mau l+y-i l+w-I l+la l+h+we w+yi w+ro w+nu w+WA sh+tsI sh+ts+yU sh+tshai sh+Nau sh+n-i sh+p-I sh+b+ya sh+me sh+yi sh+r+yo sh+lu sh+w+gA sh+w+yI sh+shU Sh+kai Sh+k+rau Sh+T-i Sh+T+y-I Sh+T+ra Sh+T+r+ye Sh+T+wi Sh+Tho Sh+Th+yu Sh+NA Sh+N+yI Sh+DU Sh+thai Sh+pau Sh+p+r-i Sh+m-I Sh+ya Sh+we Sh+Shi s+k+so s+khu s+ts+yA s+TI s+ThU s+t+yai s+t+rau s+t+w-i s+th-I s+th+ya s+n+ye s+n+wi s+pho s+ph+yu s+yA s+r+wI s+sU s+s+wai s+hau s+w+y-i h+ny-I h+Na h+te h+ni h+n+yo h+pu h+phA h+mI h+yU h+lai h+sau h+s+w-i h+w+y-I k+Sh+Na k+Sh+me k+Sh+m+yi k+Sh+yo k+Sh+Ru k+Sh+lA k+Sh+wI a+yU a+rai a+r+yau // Non-Tibetan, Non-Sanskrit words that should be convertible fava vafa vI fai vo fI/ fai/ fo/ fuM // punctuation 012345678901234 !@#$%&*()_={}:;<>? a_tsal852ja$)@#%(!Ta)0daM)%!@sa // test special handing for @# @# #@ @a# // English insert rta'i [(of horse)] mgrin // ??? legal??? pa'am'ang // is this actually illegal? pe'as pe's p'is p'e bskyUMbs bskyUMbsHgro b+g+ka xan tax qan taq shae aeb . + .y // Arguable if these are illegal g.ra r.ya +a a+ba e+ya ba+ b+ M ~M Ma Mam ~Ma ~Mam r+ra tat sbla // Skt can't end in consonant ragyad // no vowel between ' and g 'garma // stuff testing handling of ' as root b'o r'o 'yo 'wo