2e9ea92a3a
SourceForge). Implemented (somewhat kludgily) option for phonetics scheme to replace e with é iff it is the last letter of the last tsheg bar. This is required by the new THDL phonetics spec. New algorithm, per new THDL phonetics spec, for ba->wa processing. The heuristic is that it applies only to the last tsheg bar in multi-tsheg-bar words. (Previously, ba always generated "?ba/wa?", which is maybe more correct but less attractive.) This heuristic fails on, e.g., "tsheg bar". Oh well. Rationalized format of phonetics file: > is used as separator in exceptions as well as rules. (Previously, : was used in exceptions only.)
109 lines
No EOL
2.9 KiB
Text
109 lines
No EOL
2.9 KiB
Text
; Aro Gar phonetic rules for WylieWord.
|
||
|
||
; These rules attempt to capture the phonetic transcription scheme
|
||
; used by Ngak'chang Rinpoche. The results have not, however, yet
|
||
; been reviewed by Rinpoche, and probably differ from his intention
|
||
; in some ways.
|
||
|
||
; File format: there are two sections, for rules and exceptions.
|
||
;
|
||
; There is one rule per line. Typically a rule has two parts,
|
||
; separated by a space. The first part is a sequence of letters
|
||
; that, if it appears in a Wylie transliteration, is to be replaced
|
||
; by the second part in the phonetic transcription. The second
|
||
; part can be missing, in which case the first part is simply
|
||
; deleted from transcriptions when found.
|
||
;
|
||
; A semicolon precedes a comment. Blank lines are OK.
|
||
;
|
||
; The rules are applied in the order they appear in this file.
|
||
; Each rule is applied as many times as possible, but we never
|
||
; go back to a previous rule. (This simple rewrite-rule grammar is
|
||
; not sufficient to implement all phonetic schemes, at least not
|
||
; compactly. For example, it would be difficult to capture the
|
||
; effects of preinitial consonants on tone (as in the scheme used
|
||
; in Joe Wilson's book, for instance).) Also note that not even the
|
||
; whole of the present scheme is implemented using these rules. In
|
||
; particular, the deletion of prefix and superscript consonants,
|
||
; and of wa-zur, are done in program code, not using the rules here.
|
||
|
||
; Miscellaneous prefix transformations
|
||
g. ; delete this (representing g prefix, used before root y only)
|
||
dby y ; must come before db->w, for dbyang
|
||
dbr r ; must come before db->w, for dbral
|
||
db w ; must come before by->j
|
||
|
||
; c and ch are both transcribed ch. To get this we need a kludge
|
||
; (involving x), because the rule c -> ch would apply recursively.
|
||
ch c
|
||
c x
|
||
x ch
|
||
|
||
; Bad behavior from Y
|
||
py ch
|
||
phy ch
|
||
by j
|
||
my ny
|
||
|
||
; Retroflexes
|
||
kr tr
|
||
khr thr
|
||
gr dr
|
||
pr tr
|
||
phr thr
|
||
br dr
|
||
|
||
; Other bad behavior from R
|
||
mr m
|
||
sr s
|
||
|
||
; Uniquely random case
|
||
zl d
|
||
|
||
; Vowel transformations
|
||
;
|
||
; Note: this must be done before suffix-stripping.
|
||
; Before actually doing the transformations, we "hide" the n in ng, so that ng doesn't
|
||
; induce umlaut. This is gross; if we had a real grammar engine it wouldn't
|
||
; be necessary.
|
||
ng x
|
||
; Transformations of e: short before g, m, n; long otherwise
|
||
eg <20>g
|
||
em <20>m
|
||
en <20>n
|
||
e <20>
|
||
; Umlaut of a, o, u followed by d, n, l, s
|
||
ad <20>d
|
||
an <20>n
|
||
al <20>l
|
||
as <20>
|
||
od <20>d
|
||
on <20>n
|
||
ol <20>l
|
||
os <20>
|
||
ud <20>d
|
||
un <20>n
|
||
ul <20>l
|
||
us <20>
|
||
; restore ng
|
||
x ng
|
||
|
||
<?Exceptions?>
|
||
|
||
; There is one exception per line. Each exception consists of
|
||
; the transliteration (which may be several syllables separated
|
||
; by spaces), followed by a space, a colon, a space, and the
|
||
; pronunciation (which may also contain spaces). A semicolon
|
||
; precedes a comment. Blank lines are OK.
|
||
|
||
rdo rje > dorj<72>
|
||
mkha' 'gro > khandro
|
||
sku mnye > kumny<6E>
|
||
sprul sku > tulku
|
||
mtsho rgyal > tsogy<67>l
|
||
rta mgrin> tamdrin
|
||
dga' ldan > ganden
|
||
dge 'dun > gend<6E>n
|
||
a mdo > amdo
|
||
srid pa > sipa
|
||
pad ma > p<>ma |