transformations. I haven't actually used it with Xalan XSLT yet, but
it ought to work if TibetanHTML did (which it must have at one point).
I do have a unit test, but an end-to-end test with Xalan is what we
need.
table exactly and I fear that it makes the ACIP->Tibetan converter code
a lot uglier. The TODO(DLC)[EWTS->Tibetan] comments littered throughout
are part of the ugliness; they point to the ugliness. If each were addressed,
cleanliness could perhaps be achieved.
I've largely forgotten exactly what this change does, but it attempts to
improve EWTS->Tibetan conversion. The lexer is probably really, really
primitive. I concentrate here on converting a single tsheg bar rather than
a whole document.
Eclipse was used during part of my journey here and some imports were
reorganized merely because I could. :)
(Eclipse was needed when the usual ant build failed to run a new test
EWTSTest. And I wanted its debugger.)
Next steps: end-to-end EWTS tests should bring many problems to light. Fix
those. Triage all the TODO comments.
I don't know that I'll ever really trust the implementation. The tests are
valuable, though. A clean implementation of EWTS->Tibetan in Jython
might hold enough interest for me; I'd like to learn Python.
Servlet version's minimum is now 1.3, which is the version number in UVa's iris server.
Handheld version's minimum is now 1.4. Seems high, but it has been designed for some
time to run on WebSphere Everyplace Micro Environment Personal Profile 1.0 and 1.4 is
the maximum it takes. I didn't find any compatible freeware Java Runtime Environment for windows ce.
For everything else, the minimum is still set at 1.2.
standalone and the servlet version of the translation tool respectively. The
specific parameters are for debugging in Netbeans IDE but probably could
be generalized. Added comments on how to debug the servlet version
via sockets or via shared memory.
conversion. The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the
place. EWTS->Tibetan isn't here yet; lexing isn't here yet; this is
mainly a refactoring so that the ACIP->Tibetan code can be reused to
do EWTS->Tibetan.
I'm committing this because tests pass (it shouldn't be breaking
anything), because I want a checkpoint, and because the laptop this
sandbox was on isn't my preferred development environment.
Fixed part of bug 998476 and part of an undocumented bug. Discovered a
new bug, "aM" should be generated but only "M" is.
The undocumented bug was that laMA was generated when lAM should have been.
The part of bug 998476 that was fixed: laM, laH, etc. are now generated.
This does nothing about paN etc.
Some refactoring here; this is not a minimal diff.
Added tests of TMW->EWTS that use ACIP to get the TMW in place
because EWTS->TMW is a faulty keyboard at present.
warnings in ACIP->Tibetan conversion much more configurable. You can
now choose from short or long error messages, for one thing. You can change
the severity of almost all warnings. Each error and warning has an error code.
Errors and warnings are better tested.
The converter GUI has a new checkbox for short messages; the converter
CLI has a new mandatory option for short messages.
I also fixed a bug whereby certain errors were not being appended to the
'errors' StringBuffer.
non-reference, publicly available ACIP files (hundreds of megabytes of
them) through the converter. The frequencies of these tsheg bars in
in the file, too.
\, the Sanskrit virama, is not used. Of the 1370-odd ACIP texts I've
got here, about 57% make it through the gauntlet (fewer if you demand
a vowel or disambiguator on every stack of a non-Tibetan tsheg bar).
(which I found on suigeneris.org, not apache.org) in order to bulletproof the
Tibetan Converter tests. They used to fail due to nondeterminism in the
Java RTF writer; they should no longer fail.
I've also changed it so that the Tibetan Converter tests run in headless
mode, which means that they'll run on the nightly builds server.
noticed that formatting is mostly OK but sometimes gets bungled slightly.
I tried everything I could think of, and now I'm passing the buck to Java's
RTF support.
TMW_RTF_TO_THDL_WYLIE (now misnamed) support TMW->TM
conversion (but not TM->TMW). There is an automated test case for a
TMW->TM conversion.
I have full confidence in this conversion. Even the smallest glitch in the core
functionality (not formatting) would surprise me.
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE. It converts RTF files
consisting of TMW characters to the corresponding THDL Extended Wylie.
It supports --find-some-non-tmw mode, which allows you to ensure that no
unusual characters will spoil the conversion. The converter has built-in
intelligence that allows it to handle Tahoma '{', '}', and '\\' characters
properly.
The converter works on mixed Roman/TMW also, but --find-some-non-tmw
and --find-all-non-tmw modes are not as useful.
Invoke org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE, which resides in
Jskad's jar, with no command-line options to see usage information.
Noted some failures. "Fixed" the code to do what I want it to do for
the (no sanskrit stacking, tibetan stacking) case [which is exercised
by this keyboard only].
clean check'. Right now there are tests to ensure that typing certain
sequences of keys in the Extended Wylie keyboard gives the expected
Extended Wylie back when "Tools/Convert Tibetan to Wylie" is invoked.
The syntactically illegal d.wa now converts to Tibetan and then back
to d.wa (not dwa, as it did); likewise with the illegal g.wa. wa
doesn't take any prefixes, but I prefer clean end-to-end
behavior. (jeskd doesn't go end-to-end, though.)
Note that you cannot successfully run the DuffPane tests on a Linux
box unless your DISPLAY variable is set correctly. Thus, my nightly
builds will fail with an Error (as opposed to a Failure).