diff --git a/htdocs/ACIP_To_Tibetan_Converter.html b/htdocs/ACIP_To_Tibetan_Converter.html
index 13174a1..e4cb38a 100644
--- a/htdocs/ACIP_To_Tibetan_Converter.html
+++ b/htdocs/ACIP_To_Tibetan_Converter.html
@@ -563,6 +563,21 @@ example, {\u0040} produces the at sign, @.

+

+ The latest Extended
+ Wylie Transliteration Scheme standard has assigned private-use
+ area (PUA) Unicode codepoints to some TMW glyphs.  ACIP
+ documents that have a Unicode escape in the
+ range U+F021 to U+F0FF, inclusive, are interpreted as intending
+ these TMW glyphs.  ACIP->Unicode produces an error for such
+ an escape because it is font-dependent and not standard.  Other
+ tools will likely not understand such Unicode, so the converter will
+ not produce it.  If you want it in the output, it is there in
+ the error message.
+

+ +

Note well the known bug with regard to whitespace in transliteration that follows a Unicode escape.
@@ -1439,18 +1454,6 @@ Nativeness treated as U+0F7F; it should cause a warning in some or all contexts.
-

  • - The latest Extended
- Wylie Transliteration Scheme standard has assigned private-use
- area Unicode codepoints to some TMW glyphs.  ACIP documents
- that have a Unicode escape in the range
- U+F021 to U+F0FF, inclusive, should be interpreted as intending
- these TMW glyphs.  ACIP->Unicode should by default produce
- errors for such things (because they are font-dependent and not
- standard); optionally, the private-use area codepoints should be
- passed along into the output.
-
  • The tsheg-bar substitution mechanism should be more general.  The useful rule