Major improvements to the documentation for the command-line converter.
parent a96e14a245
commit 64e47d96db
1 changed file with 90 additions and 35 deletions
@ -192,8 +192,13 @@ The first section of text is the short "introduction" about the Theme and the va

<p>
In the same JAR file as Jskad, power users will find a command-line
utility that converts a Tibetan Machine Web-encoded (TMW-encoded) Rich
Text Format (RTF) file to any of these three output formats:
utility that converts Tibetan documents from one digital
representation to another. The converter embodies the same
technology as Jskad itself, but often works even when Jskad fails
due to Java's presently poor support for viewing RTF
documents. This command-line utility converts a Tibetan
Machine Web-encoded (TMW-encoded) Rich Text Format (RTF) file to
any of these three output formats:
</p>
<ul>
<li>RTF files in Unicode</li>
@ -245,10 +250,11 @@ The first section of text is the short "introduction" about the Theme and the va
font. Here is some example output:
</p>
<pre>
java -cp Jskad.jar \
java -cp "c:\my thdl tools\Jskad.jar" \
org.thdl.tib.input.TibetanConverter \
--find-some-non-tmw \
"Dalai Lama Fifth History 01.rtf"

non-TMW character newline in the font Tahoma appears first at location 39
non-TMW character ' ' in the font TimesNewRoman appears first at location 45
non-TMW character '}' in the font Tahoma appears first at location 66
@ -260,14 +266,15 @@ non-TMW character newline in the font Times New Roman appears first at location
<p>
Given the above output, you can be sure that a flawless conversion
(barring the appearance of <a href="#knownbugs">known bugs</a>) will
result when you run <tt>java -cp Jskad.jar
org.thdl.tib.input.TibetanConverter --to-wylie "Dalai Lama
Fifth History 01.rtf" > "Dalai Lama Fifth History 01 in THDL
Extended Wylie.rtf"</tt>. This is because the only text in the
input file besides Tibetan is whitespace and the Tahoma characters
<tt>'{'</tt>, <tt>'}'</tt>, and <tt>'\'</tt>. These Tahoma
characters are understood by the tool; they are symptoms of the
"curly-brace problem".
result when you run <tt>java -cp "c:\my thdl tools\Jskad.jar"
org.thdl.tib.input.TibetanConverter --to-wylie "Dalai Lama Fifth
History 01.rtf" > "Dalai Lama Fifth History 01 in THDL Extended
Wylie.rtf"</tt>. (Note that the '>' causes the output to be
directed to the file named thereafter; this is quite handy.)
This is because the only text in the input file besides Tibetan is
whitespace and the Tahoma characters <tt>'{'</tt>, <tt>'}'</tt>, and
<tt>'\'</tt>. These Tahoma characters are understood by the tool;
they are symptoms of the "curly-brace problem".
</p>
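<p>
The check described above can also be done mechanically. The following
Python sketch is not part of Jskad. It assumes that every report line has
exactly the shape shown in the example output, and the names
<tt>REPORT_LINE</tt>, <tt>parse_report</tt>, and <tt>only_harmless</tt>
are hypothetical. Treating whitespace and the Tahoma characters
<tt>'{'</tt>, <tt>'}'</tt>, and <tt>'\'</tt> as harmless is one reading of
the paragraph above, not a rule taken from the converter's source.
</p>

```python
import re

# Assumed line shape, inferred from the example output above.
REPORT_LINE = re.compile(
    r"^non-TMW character (?P<char>newline|'.+') "
    r"in the font (?P<font>.+) appears first at location (?P<loc>\d+)$"
)

# Tahoma characters the text calls symptoms of the "curly-brace problem".
CURLY_BRACE_CHARS = {"{", "}", "\\"}


def parse_report(text):
    """Return (char, font, location) tuples for every recognized line."""
    found = []
    for line in text.splitlines():
        match = REPORT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything in another shape
        raw = match.group("char")
        # The report spells a newline out as a word; other characters are quoted.
        char = "\n" if raw == "newline" else raw[1:-1]
        found.append((char, match.group("font"), int(match.group("loc"))))
    return found


def only_harmless(entries):
    """True if every entry is whitespace or a Tahoma curly-brace character."""
    for char, font, _loc in entries:
        if char.isspace():
            continue
        if font == "Tahoma" and char in CURLY_BRACE_CHARS:
            continue
        return False
    return True
```

<p>
If <tt>only_harmless(parse_report(report_text))</tt> is true for the text
printed by <tt>--find-some-non-tmw</tt>, the input contains nothing beyond
what the paragraph above describes as safe.
</p>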

<h3>Failed Conversions</h3>

@ -281,26 +288,56 @@ non-TMW character newline in the font Times New Roman appears first at location

<p>
Note that some TMW glyphs have no transliteration in Extended
Wylie. When you encounter such a glyph, you'll find a message
like the following in your RTF output:
Wylie. When you encounter such a glyph, you'll find
<tt>\tmwXYYY</tt> in your output, where X tells you which TMW font
the troublesome glyph comes from and YYY is the decimal number of
the glyph in that font (which is a number between 000 and 255
inclusive, usually between 33 and 126). The following are
values corresponding to X:
</p>

<p>
<tt><<[[JSKAD_TMW_TO_WYLIE_ERROR_NO_SUCH_WYLIE: Cannot convert
DuffCode <duffcode font=TibetanMachineWeb8 charNum=101
character=e/> to THDL Extended Wylie. Please see the <a
href="http://iris.lib.virginia.edu/tibet/tools/tmw.html#doc">
documentation for the TMW font</a> and transcribe this
yourself.]]>></tt>
</p>
<ul>
<li>
When X is 0, the TibetanMachineWeb font contains the glyph.
</li>
<li>
When X is 1, the TibetanMachineWeb1 font contains the glyph.
</li>
<li>
When X is 2, the TibetanMachineWeb2 font contains the glyph.
</li>
<li>
When X is 3, the TibetanMachineWeb3 font contains the glyph.
</li>
<li>
When X is 4, the TibetanMachineWeb4 font contains the glyph.
</li>
<li>
When X is 5, the TibetanMachineWeb5 font contains the glyph.
</li>
<li>
When X is 6, the TibetanMachineWeb6 font contains the glyph.
</li>
<li>
When X is 7, the TibetanMachineWeb7 font contains the glyph.
</li>
<li>
When X is 8, the TibetanMachineWeb8 font contains the glyph.
</li>
<li>
When X is 9, the TibetanMachineWeb9 font contains the glyph.
</li>
</ul>
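<p>
The X and YYY fields can be decoded mechanically. The Python sketch below
is not part of Jskad, and the names <tt>TMW_ESCAPE</tt>,
<tt>font_name</tt>, and <tt>find_unconverted_glyphs</tt> are hypothetical.
It assumes that the sequence appears literally as a backslash,
<tt>tmw</tt>, and four decimal digits in the text you search, and it does
not check that YYY is at most 255.
</p>

```python
import re

# X is one digit (the font index); YYY is three decimal digits (the glyph number).
TMW_ESCAPE = re.compile(r"\\tmw(\d)(\d{3})")


def font_name(x):
    """Map X to a font name: 0 is TibetanMachineWeb, 1 through 9 append the digit."""
    return "TibetanMachineWeb" if x == 0 else f"TibetanMachineWeb{x}"


def find_unconverted_glyphs(text):
    """Return a (font name, glyph number) pair for each \\tmwXYYY sequence in text."""
    return [
        (font_name(int(m.group(1))), int(m.group(2)))
        for m in TMW_ESCAPE.finditer(text)
    ]
```

<p>
Each pair returned names the font and the glyph number to look up in the
documentation for that TMW font.
</p>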

<p>
Upon seeing this, you should consult the <a
Upon finding a <tt>\tmwXYYY</tt> sequence in your output, you should
consult the <a
href="http://iris.lib.virginia.edu/tibet/tools/tmw.html#doc">
documentation</a> for the specific TMW font named. Find the
glyph (by its charNum) and decide how to proceed. If you find
a glyph that you believe should have been converted into Extended
Wylie by the tool, please report this as a bug.
glyph (by its YYY value) and decide how to proceed. If you
find a glyph that you believe should have been converted into
Extended Wylie by the tool, please report this as a bug through the
SourceForge website or via e-mail.
</p>

<h4>Other Conversions</h4>

@ -313,9 +350,13 @@ non-TMW character newline in the font Times New Roman appears first at location
</p>

<p>
There is one TMW glyph (TibetanMachineWeb7, glyph 91) that has no
Tibetan Machine equivalent. This glyph is the only TMW glyph
that can cause a TMW->TM conversion to fail.
There is one TMW glyph (TibetanMachineWeb7, glyph 91 [\tmw7091])
that has no Tibetan Machine equivalent. This glyph is the only
TMW glyph that can cause a TMW->TM conversion to fail. It
is fairly common, though, especially if you've used Jskad to prepare
your document. It might be appropriate to change the document
to use TibetanMachineWeb7, glyph 90 [\tmw7090], a similar glyph that
does have a TM equivalent.
</p>

<p>

@ -326,23 +367,37 @@ non-TMW character newline in the font Times New Roman appears first at location
<p>
If you ever encounter problems in a TM->TMW conversion, please
send us mail with the error report (and the problem input document's
resulting document) so that we can improve our tools.
resulting document) so that we can improve our tools.
</p>

<h3>Invoking the Converter</h3>

<p>
First add Jskad.jar to your CLASSPATH. Now run the command
<tt>java org.thdl.tib.input.TibetanConverter</tt> from a
command prompt. You will see usage information appear.
First add Jskad.jar to your CLASSPATH. You can do this by
setting an environment variable CLASSPATH to contain the absolute
path of the Jskad.jar file and then running the command <tt>java
org.thdl.tib.input.TibetanConverter</tt>. Alternatively, you
can use <code>java -cp "c:\my tibetan documents\Jskad.jar"
org.thdl.tib.input.TibetanConverter</code> where you put in the
appropriate path to Jskad.jar. You will see usage information
appear if you do this correctly; you'll see a message like
<code>java.lang.NoClassDefFoundError:
org/thdl/tib/input/TibetanConverter; Exception in thread
"main"</code> if you've not correctly told Java where to find
Jskad.jar.
</p>
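<p>
For those who would rather script the invocation than type it, here is a
minimal Python sketch of the <tt>-cp</tt> form described above. It is not
part of Jskad. The helper names <tt>build_command</tt> and
<tt>run_to_file</tt> are hypothetical, and it assumes that <tt>java</tt>
is on your PATH. Passing the arguments as a list means that paths
containing spaces need no extra quoting.
</p>

```python
import subprocess

CONVERTER_CLASS = "org.thdl.tib.input.TibetanConverter"


def build_command(jar_path, *converter_args):
    """Build the argument list for running the converter via `java -cp`."""
    return ["java", "-cp", jar_path, CONVERTER_CLASS, *converter_args]


def run_to_file(jar_path, flag, input_path, output_path):
    """Run the converter and write its standard output to output_path."""
    command = build_command(jar_path, flag, input_path)
    with open(output_path, "wb") as out:
        # Plays the role of the shell's '>' redirection.
        return subprocess.run(command, stdout=out).returncode
```

<p>
For example, <tt>run_to_file(jar, "--to-wylie", "in.rtf", "out.rtf")</tt>
mirrors a <tt>--to-wylie</tt> command line whose output is redirected with
<tt>'>'</tt>; the file names here are placeholders.
</p>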

<h3><a name="knownbugs"></a>Known Bugs</h3>

<p>
If the TMW given is not syntactically legal, then the Wylie that
results will not necessarily yield, if imported into Jskad, the same
Tibetan with which the converter started. The glyphs
All known bugs are listed in this section. They're more likely
to be fixed if users complain, so complain away.
</p>

<p>
First, if the TMW given is not syntactically legal, then the Wylie
that results will not necessarily yield, if imported into Jskad, the
same Tibetan with which the converter started. The glyphs
corresponding to the Wylie 'jaskadaskeda' have this problem, for
example.
</p>