Renamed TMW_RTF_TO_THDL_WYLIE to TibetanConverter. Updated
docs.
parent
eb7013dc39
commit
7c7d57ecd7
1 changed file with 62 additions and 22 deletions
|
@@ -193,9 +193,10 @@ The first section of text is the short "introduction" about the Theme and the va
|
|||
<p>
|
||||
In the same JAR file as Jskad, power users will find a command-line
|
||||
utility that converts a Tibetan Machine Web-encoded (TMW-encoded) Rich
|
||||
Text Format (RTF) file to either of these two output formats:
|
||||
Text Format (RTF) file to either of these three output formats:
|
||||
</p>
|
||||
<ul>
|
||||
<li>RTF files in Unicode</li>
|
||||
<li>RTF files with the appropriate THDL Extended Wylie (Wylie) used
|
||||
instead of TMW</li>
|
||||
<li>RTF files in Tibetan Machine (used in legacy systems)</li>
|
||||
|
@@ -203,16 +204,31 @@ The first section of text is the short "introduction" about the Theme and the va
|
|||
|
||||
<p>
|
||||
In addition, this converter can convert Tibetan Machine RTF files to
|
||||
Tibetan Machine RTF files, and takes precautions to ensure that only
|
||||
a 100% perfect conversion is done.<!-- FIXME: talk about the
|
||||
my-own-tibwn.ini validation exercise and the oddballs and their
|
||||
handling -->
|
||||
Tibetan Machine Web RTF files, and takes precautions to ensure that
|
||||
only a 100% perfect conversion is done in both directions
|
||||
(TM->TMW and TMW->TM). One such precaution is that two
|
||||
independent teams (Garrett and Garson, Chandler) turned the Tibetan
|
||||
Machine Web <a
|
||||
href="http://iris.lib.virginia.edu/tibet/tools/tmw.html#doc">
|
||||
documentation</a> into TM<->TMW tables. These tables
|
||||
were compared, giving full confidence that the tables are as
|
||||
accurate as the documentation (which has a <a
|
||||
href="http://sourceforge.net/tracker/index.php?func=detail&aid=746871&group_id=61934&atid=502515">
|
||||
few flaws</a> itself). That documentation has not been
|
||||
extensively verified against the actual fonts, however.
|
||||
Another precaution is that any unknown characters cause the
|
||||
conversion to fail, and the result is a document containing merely
|
||||
the unknown characters. (There are some known, illegal glyphs
|
||||
created by Tibet Doc, and the converter handles the ones it knows of
|
||||
and treats the rest as unknown.)
|
||||
</p>
|
||||
|
||||
<p>
|
||||
This converter is smart enough to solve the "curly-brace
|
||||
problem". This problem originates with certain versions
|
||||
of Microsoft Word's Rich Text Format writing capabilities.
|
||||
problem", wherein Tahoma '{', '}', and '\' characters appear
|
||||
instead of the TMW stacks they are supposed to represent. This
|
||||
problem originates with certain versions of Microsoft Word's Rich
|
||||
Text Format writing capabilities.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
@@ -230,7 +246,7 @@ The first section of text is the short "introduction" about the Theme and the va
|
|||
</p>
|
||||
<pre>
|
||||
java -cp Jskad.jar \
|
||||
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE \
|
||||
org.thdl.tib.input.TibetanConverter \
|
||||
--find-some-non-tmw \
|
||||
"Dalai Lama Fifth History 01.rtf"
|
||||
non-TMW character newline in the font Tahoma appears first at location 39
|
||||
|
@@ -245,7 +261,7 @@ non-TMW character newline in the font Times New Roman appears first at location
|
|||
Given the above output, you can be sure that a flawless conversion
|
||||
(barring the appearance of <a href="#knownbugs">known bugs</a>) will
|
||||
result when you run <tt>java -cp Jskad.jar
|
||||
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE --to-wylie "Dalai Lama
|
||||
org.thdl.tib.input.TibetanConverter --to-wylie "Dalai Lama
|
||||
Fifth History 01.rtf" > "Dalai Lama Fifth History 01 in THDL
|
||||
Extended Wylie.rtf"</tt>. This is because the only text in the
|
||||
input file besides Tibetan is whitespace and the Tahoma characters
|
||||
|
@@ -254,6 +270,15 @@ non-TMW character newline in the font Times New Roman appears first at location
|
|||
"curly-brace problem".
|
||||
</p>
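<p>
The two-step workflow above (first check for non-TMW text, then
convert) can also be scripted. The following is a minimal sketch, not
part of Jskad: the class name <tt>ConverterCommands</tt> and its
<tt>command</tt> helper are hypothetical, and only the main class name
and the <tt>--find-some-non-tmw</tt> and <tt>--to-wylie</tt> options
are taken from the examples above. It only builds and prints the
argument lists; it does not run the converter.
</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Jskad): builds the argument lists for
// the command lines shown in this section.
public class ConverterCommands {
    // Main class of the converter, as named in the documentation above.
    public static final String MAIN_CLASS = "org.thdl.tib.input.TibetanConverter";

    // Returns: java -cp <jarPath> <MAIN_CLASS> <option> <inputRtf>
    public static List<String> command(String jarPath, String option, String inputRtf) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(jarPath);
        cmd.add(MAIN_CLASS);
        cmd.add(option);
        cmd.add(inputRtf);
        return cmd;
    }

    public static void main(String[] args) {
        String input = "Dalai Lama Fifth History 01.rtf";
        // Step 1: report any non-TMW text in the input.
        System.out.println(String.join(" ", command("Jskad.jar", "--find-some-non-tmw", input)));
        // Step 2: convert to THDL Extended Wylie.
        System.out.println(String.join(" ", command("Jskad.jar", "--to-wylie", input)));
    }
}
```

<p>
Passing each list to <tt>java.lang.ProcessBuilder</tt> avoids shell
quoting problems with file names that contain spaces. Because the
example above redirects the converter's standard output to a file, a
script would likely do the same with
<tt>ProcessBuilder.redirectOutput(File)</tt>.
</p>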
|
||||
|
||||
<h3>Failed Conversions</h3>
|
||||
|
||||
<p>
|
||||
In this section, you'll learn how to tell if a conversion has
|
||||
succeeded in full, run into minor problems, or failed altogether.
|
||||
</p>
|
||||
|
||||
<h4>TMW to Wylie</h4>
|
||||
|
||||
<p>
|
||||
Note that some TMW glyphs have no transliteration in Extended
|
||||
Wylie. When you encounter such a glyph, you'll find a message
|
||||
|
@@ -278,25 +303,38 @@ non-TMW character newline in the font Times New Roman appears first at location
|
|||
Wylie by the tool, please report this as a bug.
|
||||
</p>
|
||||
|
||||
<h4>Other Conversions</h4>
|
||||
|
||||
<p>
|
||||
Note also that there is one TMW glyph (TibetanMachineWeb7, glyph 91)
|
||||
that has no Tibetan Machine equivalent. A 72-point copy of the
|
||||
Tibetan alphabet will be inserted (in TMW) before this glyph.
|
||||
Some common-but-illegal TibetanMachine input will also cause the
|
||||
alphabet to appear before the offending glyph. Please use
|
||||
Jskad to convert such documents, as it has better error checking and
|
||||
can tell you just what's wrong. If you ever encounter these
|
||||
problems, please send us mail with the error report (and the problem
|
||||
input document) so that we can improve our tools.
|
||||
The other conversions are all-or-nothing. That is, if you run
|
||||
into any trouble whatsoever, the result will be a file containing
|
||||
just the problematic glyphs. If your result is as long as your
|
||||
input, then the conversion went flawlessly.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
There is one TMW glyph (TibetanMachineWeb7, glyph 91) that has no
|
||||
Tibetan Machine equivalent. This glyph is the only TMW glyph
|
||||
that can cause a TMW->TM conversion to fail.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
You might consider using Jskad to convert documents that give
|
||||
errors, as it has better error reporting and can tell you just
|
||||
what's wrong.
|
||||
</p>
|
||||
<p>
|
||||
If you ever encounter problems in a TM->TMW conversion, please
|
||||
send us mail with the error report (along with the document that
|
||||
resulted from the problem input) so that we can improve our tools.
|
||||
</p>
|
||||
|
||||
<h3>Invoking the Converter</h3>
|
||||
|
||||
<p>
|
||||
First add Jskad.jar to your CLASSPATH. Now run the command
|
||||
<tt>java org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE</tt> from a
|
||||
command prompt. You will see usage information appear.
|
||||
Forgive the name; this converter's scope widened after its creation.
|
||||
<tt>java org.thdl.tib.input.TibetanConverter</tt> from a
|
||||
command prompt. You will see usage information appear.
|
||||
</p>
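<p>
If you would rather not change CLASSPATH globally, a small launcher
can set it for a single child process. This is a rough sketch under
stated assumptions: the class <tt>UsageLauncher</tt> and its
<tt>usageProcess</tt> method are hypothetical names, and
<tt>Jskad.jar</tt> is assumed to be in the current directory. Running
the returned process with no further arguments should show the usage
information described above.
</p>

```java
import java.io.File;
import java.util.Map;

// Hypothetical launcher (not part of Jskad): prepares a child process that
// runs the converter with Jskad.jar prepended to CLASSPATH.
public class UsageLauncher {
    public static ProcessBuilder usageProcess(String jarPath) {
        // No options are passed, so the converter is expected to print usage.
        ProcessBuilder pb = new ProcessBuilder("java", "org.thdl.tib.input.TibetanConverter");
        Map<String, String> env = pb.environment();
        String existing = env.get("CLASSPATH");
        // Prepend the JAR, keeping any CLASSPATH entries already present.
        String classpath = (existing == null || existing.isEmpty())
                ? jarPath
                : jarPath + File.pathSeparator + existing;
        env.put("CLASSPATH", classpath);
        // Show the child's output in this console once it is started.
        pb.inheritIO();
        return pb;
    }

    public static void main(String[] args) {
        String jar = args.length > 0 ? args[0] : "Jskad.jar"; // assumed location
        ProcessBuilder pb = usageProcess(jar);
        // Print what would be run; call pb.start().waitFor() to actually run it.
        System.out.println("CLASSPATH=" + pb.environment().get("CLASSPATH"));
        System.out.println(String.join(" ", pb.command()));
    }
}
```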
|
||||
|
||||
<h3><a name="knownbugs"></a>Known Bugs</h3>
|
||||
|
@@ -304,7 +342,9 @@ non-TMW character newline in the font Times New Roman appears first at location
|
|||
<p>
|
||||
If the TMW given is not syntactically legal, then the Wylie that
|
||||
results will not necessarily yield, if imported into Jskad, the same
|
||||
Tibetan with which the converter started.
|
||||
Tibetan with which the converter started. The glyphs
|
||||
corresponding to the Wylie 'jaskadaskeda' have this problem, for
|
||||
example.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
|