ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing.

dchandler 2003-12-08 07:21:45 +00:00
parent 6f21c28d71
commit a7e6ee8802
1 changed file with 15 additions and 12 deletions

@@ -563,6 +563,21 @@
example, {\u0040} produces the at sign, <tt>@</tt>.
</p>
<p>
The latest <a
href="http://iris.lib.virginia.edu/tibet/collections/langling/ewts/">Extended
Wylie Transliteration Scheme</a> standard has assigned private-use
area (PUA) Unicode codepoints to some TMW glyphs.&nbsp; ACIP
documents that have a <a href="#escapes">Unicode escape</a> in the
range U+F021 to U+F0FF, inclusive, are interpreted as intending
these TMW glyphs.&nbsp; ACIP-&gt;Unicode produces an error for such
an escape because the codepoint is font-dependent and nonstandard.&nbsp;
Other tools will likely not understand such Unicode, so the converter
will not produce it.&nbsp; If you need the codepoint itself, you can
find it in the error message.
</p>
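<p>
The range test described above can be sketched as follows.&nbsp; This
is an illustration only, not the converter's actual code: the names
<tt>ESCAPE_RE</tt>, <tt>is_ewts_pua</tt>, and
<tt>classify_escapes</tt> are hypothetical, and the sketch assumes
that every escape has the four-hex-digit form of {\u0040} and
{\uF021}.
</p>

```python
import re

# Assumption: escapes look like {\uXXXX} with exactly four hex digits,
# matching the {\u0040} and {\uF021} examples in the text.
ESCAPE_RE = re.compile(r"\{\\u([0-9A-Fa-f]{4})\}")

# Range stated in the text: U+F021 to U+F0FF, inclusive.
EWTS_PUA_FIRST = 0xF021
EWTS_PUA_LAST = 0xF0FF


def is_ewts_pua(codepoint):
    """Return True if codepoint lies in the EWTS private-use range."""
    return EWTS_PUA_FIRST <= codepoint <= EWTS_PUA_LAST


def classify_escapes(acip_text):
    """Yield (codepoint, is_pua) for each Unicode escape in acip_text."""
    for match in ESCAPE_RE.finditer(acip_text):
        codepoint = int(match.group(1), 16)
        yield codepoint, is_ewts_pua(codepoint)
```

<p>
Under these assumptions, an ACIP-&gt;TMW path would map an escape
flagged as private-use to a TMW glyph, while an ACIP-&gt;Unicode path
would report an error that names the codepoint.
</p>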
<p>
Note well the <a href="#bugs">known bug</a> with regard to
whitespace in transliteration that follows a Unicode escape.&nbsp;
@@ -1439,18 +1454,6 @@ Nativeness</h2>
treated as U+0F7F; it should cause a warning in some or all
contexts.
</li>
<li>
The latest <a
href="http://iris.lib.virginia.edu/tibet/collections/langling/ewts/">Extended
Wylie Transliteration Scheme</a> standard has assigned private-use
area Unicode codepoints to some TMW glyphs.&nbsp; ACIP documents
that have a <a href="#escapes">Unicode escape</a> in the range
U+F021 to U+F0FF, inclusive, should be interpreted as intending
these TMW glyphs.&nbsp; ACIP-&gt;Unicode should by default produce
errors for such things (because they are font-dependent and not
standard); optionally, the private-use area codepoints should be
passed along into the output.
</li>
<li>
The <a href="#sub"><i>tsheg-bar</i> substitution</a> mechanism
should be more general.&nbsp; The useful rule