ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing.
parent 6f21c28d71
commit a7e6ee8802
1 changed file with 15 additions and 12 deletions
@@ -563,6 +563,21 @@
 example, {\u0040} produces the at sign, <tt>@</tt>.
 </p>
 
+<p>
+The latest <a
+href="http://iris.lib.virginia.edu/tibet/collections/langling/ewts/">Extended
+Wylie Transliteration Scheme</a> standard has assigned private-use
+area (PUA) Unicode codepoints to some TMW glyphs. ACIP
+documents that have a <a href="#escapes">Unicode escape</a> in the
+range U+F021 to U+F0FF, inclusive, are interpreted as intending
+these TMW glyphs. ACIP->Unicode produces an error for such
+an escape because it is font-dependent and not standard. Other
+tools will likely not understand such Unicode, so the converter will
+not produce it. If you want it in the output, it is there in
+the error message.
+</p>
+
+
 <p>
 Note well the <a href="#bugs">known bug</a> with regard to
 whitespace in transliteration that follows a Unicode escape.
@@ -1439,18 +1454,6 @@ Nativeness</h2>
 treated as U+0F7F; it should cause a warning in some or all
 contexts.
 </li>
-<li>
-The latest <a
-href="http://iris.lib.virginia.edu/tibet/collections/langling/ewts/">Extended
-Wylie Transliteration Scheme</a> standard has assigned private-use
-area Unicode codepoints to some TMW glyphs. ACIP documents
-that have a <a href="#escapes">Unicode escape</a> in the range
-U+F021 to U+F0FF, inclusive, should be interpreted as intending
-these TMW glyphs. ACIP->Unicode should by default produce
-errors for such things (because they are font-dependent and not
-standard); optionally, the private-use area codepoints should be
-passed along into the output.
-</li>
 <li>
 The <a href="#sub"><i>tsheg-bar</i> substitution</a> mechanism
 should be more general. The useful rule
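
Reviewer note: the behavior documented by this change can be sketched roughly as follows. This is only an illustration in Python, not the converter's actual code. The function names, the status strings, the error wording, and the assumption that an escape is always written as {\u} followed by exactly four hex digits are all hypothetical; only the range U+F021 to U+F0FF and the TMW-versus-Unicode distinction come from the documentation text in the diff.

```python
import re

# Range taken from the documentation text in this commit.
EWTS_PUA_FIRST = 0xF021
EWTS_PUA_LAST = 0xF0FF

# Assumption: an escape looks like {\uXXXX} with exactly four hex digits.
ESCAPE_RE = re.compile(r"\{\\u([0-9A-Fa-f]{4})\}")


def is_ewts_pua(codepoint):
    """True if the codepoint lies in U+F021..U+F0FF, inclusive."""
    return EWTS_PUA_FIRST <= codepoint <= EWTS_PUA_LAST


def classify_escapes(acip_text, target):
    """Classify every Unicode escape found in acip_text.

    target is "TMW" or "Unicode" (hypothetical labels for the two
    conversion modes). Returns a list of (codepoint, status, message).
    """
    results = []
    for match in ESCAPE_RE.finditer(acip_text):
        codepoint = int(match.group(1), 16)
        if not is_ewts_pua(codepoint):
            # Ordinary escape: handled the same way in both modes.
            results.append((codepoint, "ok", ""))
        elif target == "TMW":
            # ACIP->TMW: the escape is taken to mean a TMW glyph.
            results.append((codepoint, "tmw-glyph", ""))
        else:
            # ACIP->Unicode: refuse to emit it, but name the codepoint
            # in the message so the user can still recover it.
            message = "U+%04X is a font-dependent EWTS PUA codepoint" % codepoint
            results.append((codepoint, "error", message))
    return results
```

For example, `classify_escapes(r"{\uF021}", "Unicode")` yields a single "error" entry whose message names U+F021, while the same input with target "TMW" yields a "tmw-glyph" entry.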