Renamed TMW_RTF_TO_THDL_WYLIE to TibetanConverter. Updated
docs.
parent eb7013dc39
commit 7c7d57ecd7
1 changed file with 62 additions and 22 deletions
|
@ -193,9 +193,10 @@ The first section of text is the short "introduction" about the Theme and the va
|
||||||
<p>
|
<p>
|
||||||
In the same JAR file as Jskad, power users will find a command-line
|
In the same JAR file as Jskad, power users will find a command-line
|
||||||
utility that converts a Tibetan Machine Web-encoded (TMW-encoded) Rich
|
utility that converts a Tibetan Machine Web-encoded (TMW-encoded) Rich
|
||||||
Text Format (RTF) file to either of these two output formats:
|
Text Format (RTF) file to either of these three output formats:
|
||||||
</p>
|
</p>
|
||||||
<ul>
|
<ul>
|
||||||
|
<li>RTF files in Unicode</li>
|
||||||
<li>RTF files with the appropriate THDL Extended Wylie (Wylie) used
|
<li>RTF files with the appropriate THDL Extended Wylie (Wylie) used
|
||||||
instead of TMW</li>
|
instead of TMW</li>
|
||||||
<li>RTF files in Tibetan Machine (used in legacy systems)</li>
|
<li>RTF files in Tibetan Machine (used in legacy systems)</li>
|
||||||
|
@ -203,16 +204,31 @@ The first section of text is the short "introduction" about the Theme and the va
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
In addition, this converter can convert Tibetan Machine RTF files to
|
In addition, this converter can convert Tibetan Machine RTF files to
|
||||||
Tibetan Machine RTF files, and takes precautions to ensure that only
|
Tibetan Machine Web RTF files, and takes precautions to ensure that
|
||||||
a 100% perfect conversion is done.<!-- FIXME: talk about the
|
only a 100% perfect conversion is done in both directions
|
||||||
my-own-tibwn.ini validation exercise and the oddballs and their
|
(TM->TMW and TMW->TM). One such precaution is that two
|
||||||
handling -->
|
independent teams (Garrett and Garson, Chandler) turned the Tibetan
|
||||||
|
Machine Web <a
|
||||||
|
href="http://iris.lib.virginia.edu/tibet/tools/tmw.html#doc">
|
||||||
|
documentation</a> into TM<->TMW tables. These tables
|
||||||
|
were compared, giving full confidence that the tables are as
|
||||||
|
accurate as the documentation (which has a <a
|
||||||
|
href="http://sourceforge.net/tracker/index.php?func=detail&aid=746871&group_id=61934&atid=502515">
|
||||||
|
few flaws</a> itself). That documentation has not been
|
||||||
|
extensively verified against the actual fonts, however.
|
||||||
|
Another precaution is that any unknown characters cause the
|
||||||
|
conversion to fail, and the result is a document containing merely
|
||||||
|
the unknown characters. (There are some known, illegal glyphs
|
||||||
|
created by Tibet Doc, and the converter handles the ones it knows of
|
||||||
|
and treats the rest as unknown.)
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
This converter is smart enough to solve the "curly-brace
|
This converter is smart enough to solve the "curly-brace
|
||||||
problem". This problem originates with certain versions
|
problem", wherein Tahoma '{', '}', and '\' characters appear
|
||||||
of Microsoft Word's Rich Text Format writing capabilities.
|
instead of the TMW stacks they are supposed to represent. This
|
||||||
|
problem originates with certain versions of Microsoft Word's Rich
|
||||||
|
Text Format writing capabilities.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
|
@ -230,7 +246,7 @@ The first section of text is the short "introduction" about the Theme and the va
|
||||||
</p>
|
</p>
|
||||||
<pre>
|
<pre>
|
||||||
java -cp Jskad.jar \
|
java -cp Jskad.jar \
|
||||||
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE \
|
org.thdl.tib.input.TibetanConverter \
|
||||||
--find-some-non-tmw \
|
--find-some-non-tmw \
|
||||||
"Dalai Lama Fifth History 01.rtf"
|
"Dalai Lama Fifth History 01.rtf"
|
||||||
non-TMW character newline in the font Tahoma appears first at location 39
|
non-TMW character newline in the font Tahoma appears first at location 39
|
||||||
|
@ -245,7 +261,7 @@ non-TMW character newline in the font Times New Roman appears first at location
|
||||||
Given the above output, you can be sure that a flawless conversion
|
Given the above output, you can be sure that a flawless conversion
|
||||||
(barring the appearance of <a href="#knownbugs">known bugs</a>) will
|
(barring the appearance of <a href="#knownbugs">known bugs</a>) will
|
||||||
result when you run <tt>java -cp Jskad.jar
|
result when you run <tt>java -cp Jskad.jar
|
||||||
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE --to-wylie "Dalai Lama
|
org.thdl.tib.input.TibetanConverter --to-wylie "Dalai Lama
|
||||||
Fifth History 01.rtf" > "Dalai Lama Fifth History 01 in THDL
|
Fifth History 01.rtf" > "Dalai Lama Fifth History 01 in THDL
|
||||||
Extended Wylie.rtf"</tt>. This is because the only text in the
|
Extended Wylie.rtf"</tt>. This is because the only text in the
|
||||||
input file besides Tibetan is whitespace and the Tahoma characters
|
input file besides Tibetan is whitespace and the Tahoma characters
|
||||||
|
@ -254,6 +270,15 @@ non-TMW character newline in the font Times New Roman appears first at location
|
||||||
"curly-brace problem".
|
"curly-brace problem".
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
<h3>Failed Conversions</h3>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
In this section, you'll learn how to tell if a conversion has
|
||||||
|
succeeded in full, ran into minor problems, or failed altogether.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<h4>TMW to Wylie</h4>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Note that some TMW glyphs have no transliteration in Extended
|
Note that some TMW glyphs have no transliteration in Extended
|
||||||
Wylie. When you encounter such a glyph, you'll find a message
|
Wylie. When you encounter such a glyph, you'll find a message
|
||||||
|
@ -278,25 +303,38 @@ non-TMW character newline in the font Times New Roman appears first at location
|
||||||
Wylie by the tool, please report this as a bug.
|
Wylie by the tool, please report this as a bug.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
<h4>Other Conversions</h4>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Note also that there is one TMW glyph (TibetanMachineWeb7, glyph 91)
|
The other conversions are all-or-nothing. That is, if you run
|
||||||
that has no Tibetan Machine equivalent. A 72-point copy of the
|
into any trouble whatsoever, the result will be a file containing
|
||||||
Tibetan alphabet will be inserted (in TMW) before this glyph.
|
just the problematic glyphs. If your result is as long as your
|
||||||
Some common-but-illegal TibetanMachine input will also cause the
|
input, then the conversion went flawlessly.
|
||||||
alphabet to appear before the offending glyph. Please use
|
</p>
|
||||||
Jskad to convert such documents, as it has better error checking and
|
|
||||||
can tell you just what's wrong. If you ever encounter these
|
<p>
|
||||||
problems, please send us mail with the error report (and the problem
|
There is one TMW glyph (TibetanMachineWeb7, glyph 91) that has no
|
||||||
input document) so that we can improve our tools.
|
Tibetan Machine equivalent. This glyph is the only TMW glyph
|
||||||
|
that can cause a TMW->TM conversion to fail.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
You might consider using Jskad to convert documents that give
|
||||||
|
errors, as it has better error reporting and can tell you just
|
||||||
|
what's wrong.
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
If you ever encounter problems in a TM->TMW conversion, please
|
||||||
|
send us mail with the error report (and the output document produced
|
||||||
|
from the problem input) so that we can improve our tools.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<h3>Invoking the Converter</h3>
|
<h3>Invoking the Converter</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
First add Jskad.jar to your CLASSPATH. Now run the command
|
First add Jskad.jar to your CLASSPATH. Now run the command
|
||||||
<tt>java org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE</tt> from a
|
<tt>java org.thdl.tib.input.TibetanConverter</tt> from a
|
||||||
command prompt. You will see usage information appear.
|
command prompt. You will see usage information appear.
|
||||||
Forgive the name; this converter's scope widened after its creation.
|
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<h3><a name="knownbugs"></a>Known Bugs</h3>
|
<h3><a name="knownbugs"></a>Known Bugs</h3>
|
||||||
|
@ -304,7 +342,9 @@ non-TMW character newline in the font Times New Roman appears first at location
|
||||||
<p>
|
<p>
|
||||||
If the TMW given is not syntactically legal, then the Wylie that
|
If the TMW given is not syntactically legal, then the Wylie that
|
||||||
results will not necessarily yield, if imported into Jskad, the same
|
results will not necessarily yield, if imported into Jskad, the same
|
||||||
Tibetan with which the converter started.
|
Tibetan with which the converter started. The glyphs
|
||||||
|
corresponding to the Wylie 'jaskadaskeda' have this problem, for
|
||||||
|
example.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
|
|
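Reviewer note: the `--find-some-non-tmw` report quoted in the docs above prints one line per offending character, in the form "non-TMW character newline in the font Tahoma appears first at location 39". The following is a minimal sketch of how a script might pull the font and location out of such lines. It is not part of Jskad: the names `REPORT_LINE`, `NonTmwReport`, and `parse_report_line` are hypothetical, and the pattern is inferred only from the single complete sample line shown in the docs, so other report lines may not match it.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the one complete sample line in the docs (an assumption):
#   non-TMW character newline in the font Tahoma appears first at location 39
REPORT_LINE = re.compile(
    r"^non-TMW character (?P<char>.+?) in the font (?P<font>.+) "
    r"appears first at location (?P<location>\d+)$"
)


class NonTmwReport(NamedTuple):
    """One parsed report line (hypothetical structure, not a Jskad type)."""
    char: str       # name of the character as printed, e.g. "newline"
    font: str       # font name as printed, e.g. "Tahoma"
    location: int   # location number as printed in the report


def parse_report_line(line: str) -> Optional[NonTmwReport]:
    """Return the parsed fields, or None if the line does not fit the pattern."""
    match = REPORT_LINE.match(line.strip())
    if match is None:
        return None
    return NonTmwReport(
        char=match.group("char"),
        font=match.group("font"),
        location=int(match.group("location")),
    )
```

For the sample line from the docs, `parse_report_line("non-TMW character newline in the font Tahoma appears first at location 39")` returns `NonTmwReport(char='newline', font='Tahoma', location=39)`. Lines that do not fit the assumed pattern return `None`, so a caller can tell unparsed output apart from a parsed report rather than silently dropping it.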