WylieWord/THDL phonetics test cases.txt
a1tsal 2e9ea92a3a Fixed phonetics code to strip post-suffix d (bug 800167 in
SourceForge).
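
A minimal sketch of the post-suffix d fix, not taken from the WylieWord source. The helper name is hypothetical, and it assumes the post-suffix d only ever follows the suffixes n, r and l, so a first-suffix d (as in "brgyad") is left alone.

```python
# Hypothetical helper for illustration; not the project's actual code.
# Assumption: a post-suffix d can only follow the suffixes n, r, or l.
def strip_post_suffix_d(tsheg_bar):
    """Drop a trailing post-suffix 'd' from one Wylie tsheg bar."""
    if len(tsheg_bar) > 2 and tsheg_bar.endswith(("nd", "rd", "ld")):
        return tsheg_bar[:-1]
    # Anything else (including a first-suffix d after a vowel) is unchanged.
    return tsheg_bar
```

With this sketch, "rand" is reduced to "ran" before the vowel rules run, which is consistent with the "rand > ren" test case.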

  Implemented (somewhat kludgily) option for phonetics scheme to
  replace e with é iff it is the last letter of the last tsheg bar.
  This is required by the new THDL phonetics spec.
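
The e to é rule can be sketched roughly as follows. This is an illustration only: the function name is made up, and it assumes the phonetics for each tsheg bar have already been computed and are passed in as a list.

```python
# Hypothetical sketch; the real option lives in the phonetics scheme.
# Assumption: 'syllables' holds one phonetic string per tsheg bar.
def accent_final_e(syllables):
    """Join the syllables, writing a word-final 'e' as 'é'."""
    if not syllables:
        return ""
    *head, last = syllables
    # Only the last letter of the last tsheg bar is eligible.
    if last.endswith("e"):
        last = last[:-1] + "é"
    return "".join(head) + last
```

For example, ["dor", "je"] gives "dorjé", while ["tse", "ring"] stays "tsering" because the e is not in the last tsheg bar.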

  New algorithm, per new THDL phonetics spec, for ba->wa processing.
  The heuristic is that it applies only to the last tsheg bar in
  multi-tsheg-bar words.  (Previously, ba always generated "?ba/wa?",
  which is maybe more correct but less attractive.)  This heuristic
  fails on, e.g., "tsheg bar".  Oh well.
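
A rough sketch of the heuristic, under stated assumptions. The function name is hypothetical, and the check that the last Wylie tsheg bar starts with "ba" or "bo" is only a reading of the test cases (e.g. "lha sa ba", "jo bo", "gsal bar" change, "dum bu" does not), not the wording of the spec.

```python
# Hypothetical sketch of the ba->wa heuristic; not the project's code.
# Assumption: the rule fires when the last Wylie tsheg bar begins with
# "ba" or "bo"; this is inferred from the test cases, not from the spec.
def apply_ba_wa(wylie_bars, phonetic_bars):
    """Join phonetic syllables, softening b to w in the last tsheg bar."""
    if len(wylie_bars) != len(phonetic_bars):
        raise ValueError("need one phonetic syllable per tsheg bar")
    out = list(phonetic_bars)
    is_multi = len(out) > 1  # never applies to single-tsheg-bar words
    if is_multi and wylie_bars[-1].startswith(("ba", "bo")) and out[-1].startswith("b"):
        out[-1] = "w" + out[-1][1:]
    return "".join(out)
```

The sketch reproduces the known weakness as well: apply_ba_wa(["tsheg", "bar"], ["tsek", "bar"]) yields "tsekwar".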

  Rationalized format of phonetics file: > is used as separator in exceptions
  as well as rules.  (Previously, : was used in exceptions only.)
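
For reference, a line in this format could be read with something like the sketch below. The function name is hypothetical, and treating lines that start with ";" as comments is an assumption based on the test cases file rather than on the phonetics file itself.

```python
# Hypothetical reader for "source > target" lines; illustration only.
# Assumption: blank lines and lines starting with ';' carry no data.
def parse_case(line):
    """Return (wylie, phonetics) for a data line, or None for a non-data line."""
    text = line.strip()
    if not text or text.startswith(";"):
        return None
    source, sep, target = text.partition(">")
    if not sep:
        raise ValueError("missing '>' separator: " + repr(line))
    # Whitespace around the separator is optional in this sketch.
    return source.strip(), target.strip()
```

Splitting on the first ">" and stripping both sides keeps the reader tolerant of uneven spacing around the separator.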
2004-02-20 09:37:23 +00:00

;
; These examples mostly come from the THDL Phonetics document (Jan 2004 draft)
;
dag pa > dakpa
ring po > ringpo
rin chen > rinchen
lab > lap
dum bu > dumbu
dmar po > marpo
ril bu > rilbu
sa skya pa > sakyapa
blo bzang > lozang
rnying ma pa > nyingmapa
rdo rje > dorjé
dge lugs pa > gelukpa
gzhis ka rtse > zhikatsé
mar me > marmé
dge bshes > geshé
bcu > chu
gcig pa > chikpa
nag chu > nakchu
'phag pa > pakpa
gser thang > sertang
khang tshan > khangtsen
lce > ché
rin chen bzang po > rinchenzangpo
bka' rgyud > kagyü
bsod nams > sönam
yul > yül
dus tshod > dütsö
bon po > bönpo
sde dge > degé
brgyad > gyé
dge rgan > gegen
ral pa can > relpachen
tshe ring > tsering
byes > jé
bstan 'dzin > tendzin
'jam dpal dbyangs > jampelyang
dge legs > gelek
kha btags > khatak
sngags pa > ngakpa
byang chub > jangchup
thub bstan > tupten
tabs > tap
bka' shag > kashak
sbra nag zhol > banakzhöl
thabs > tap
lha sa ba > lhasawa
jo bo > jowo
dpa' bo > pawo
gsal bar > selwar
; nga'i deb > ngé dep -- can't do this one, it depends on word segmentation
bar ba > barwa
spyan ras gzig > chenrezik
phyag > chak
sbyin bdag > jindak
smyong > nyong
dmyal ba > nyelwa
sgrol ma > drölma
rten 'brel > tendrel
'bras spungs > drepung
'phrin las > trinlé
srung ma > sungma
rdzun smra ba > dzünmawa
klad pa > lepa
glog > lok
zla ba > dawa
lha sa > lhasa
lho phyogs > lhochok
lhun grub > lhündrup
dbang > wang
dbyar kha > yarkha
dbral > rel
le'u > leu
khyi'u > khyiu
pa'ang > pang
gri'i > dri
'gro ba'i > drowé
rgyal bu'i > gyelbü
rin po che'i > rinpoché
bdag po'i > dakpö
le'u'i > leü
rta mgrin > tamdrin
g.yon > yön
phyag > chak
bkra shis > trashi
khros ma > tröma
sprul > trül
mri tam ga > mitamga
srid pa > sipa
pad ma > pema
pan chen > penchen
thun > tün
dus gsum > düsum
sbed > bé
ces > ché
btsan dbang > tsenwang
tshong khang > tsongkhang
rdzong > dzong
stabs > tap
thug pa > tukpa
debs > dep
sib sib > sipsip
lobs pa > loppa
grub > drup
kla col > lachöl
spyan snga ba > chenngawa
sems dpa'i > sempé
bon po'i > bönpö
rdzogs > dzok
; Other random tests
phreng > treng
; Test of second-suffix d removal. Made-up word because I don't know real ones.
rand > ren
; Test that we don't choke on vowel-only words, including single-letter ones.
a > a
ai > ai