6 commits

Author SHA1 Message Date
amontano
835e74c0cd Changed converters from Unicode non-breaking tsheg to Unicode non-breaking Wylie space. 2009-02-20 23:11:17 +00:00
dchandler
7198f23361 I really hesitate to commit this because I'm not sure what it brings to the
table exactly and I fear that it makes the ACIP->Tibetan converter code
a lot uglier.  The TODO(DLC)[EWTS->Tibetan] comments littered throughout
are part of the ugliness; they point to the ugliness.  If each were addressed,
cleanliness could perhaps be achieved.

I've largely forgotten exactly what this change does, but it attempts to
improve EWTS->Tibetan conversion.  The lexer is probably really, really
primitive.  I concentrate here on converting a single tsheg bar rather than
a whole document.

Eclipse was used during part of my journey here and some imports were
reorganized merely because I could.  :)

(Eclipse was needed when the usual ant build failed to run a new test
EWTSTest.  And I wanted its debugger.)

Next steps: end-to-end EWTS tests should bring many problems to light.  Fix
those.  Triage all the TODO comments.

I don't know that I'll ever really trust the implementation.  The tests are
valuable, though.  A clean implementation of EWTS->Tibetan in Jython
might hold enough interest for me; I'd like to learn Python.
2005-06-20 06:18:00 +00:00
dchandler
6bb0646f1c Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:48:27 +00:00
dchandler
de6ae79959 Fixes bug 624133, "Input freezes after impossible character". Try 'shsM' in
ACIP or 'ShSm' in Extended Wylie to see the new behavior.

We use a trie to store valid input sequences.  In the future, we could use
the same trie as a replacement for the less efficient HashSets we use to
store characters, vowels, and punctuation.  For example, we'd use
'validInputSequences.put("K", new Pair("consonant", "k"))' when reading
in the ACIP keyboard's description of the first consonant of the Tibetan
alphabet in 'TibetanKeyboard.java'.

Note that the current trie implementation is only useful for 7- or 8-bit
transcription systems, and works best for tries with low average depth, which
describes a transcription system's trie very well.  If you used arbitrary
Unicode in your keyboard, you'd need a different trie implementation.
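
To make the idea above concrete, here is a minimal sketch of a trie restricted to 8-bit characters. It is not the Xalan-derived Trie this commit uses. The class name InputSequenceTrie, the method hasPrefix, and the sample sequences in main are all assumptions made up for illustration. Each node holds a 256-slot child array, which is why such a structure only suits 7- or 8-bit transcription systems and shallow tries.

```java
/** Hypothetical sketch of an 8-bit trie; not the Xalan-derived Trie from the commit. */
class InputSequenceTrie<V> {
    // Assumption: every key character fits in 8 bits.
    private static final int ALPHABET = 256;

    private static final class Node<V> {
        // One slot per possible 8-bit character: fast lookup, memory-hungry per node.
        @SuppressWarnings("unchecked")
        final Node<V>[] next = (Node<V>[]) new Node[ALPHABET];
        V value; // null unless a complete valid sequence ends at this node
    }

    private final Node<V> root = new Node<>();

    /** Stores value under key; rejects keys containing characters beyond 8 bits. */
    void put(String key, V value) {
        // Validate first so a rejected key leaves no partial path behind.
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) >= ALPHABET) {
                throw new IllegalArgumentException("not an 8-bit character: " + key.charAt(i));
            }
        }
        Node<V> node = root;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (node.next[c] == null) {
                node.next[c] = new Node<>();
            }
            node = node.next[c];
        }
        node.value = value;
    }

    /** Returns the value stored under exactly this key, or null if there is none. */
    V get(String key) {
        Node<V> node = find(key);
        return node == null ? null : node.value;
    }

    /** True if some stored key starts with prefix (the empty prefix always matches). */
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Walks the trie along s; returns null as soon as the path does not exist.
    private Node<V> find(String s) {
        Node<V> node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            char c = s.charAt(i);
            node = c < ALPHABET ? node.next[c] : null;
        }
        return node;
    }

    public static void main(String[] args) {
        InputSequenceTrie<String> valid = new InputSequenceTrie<>();
        // Made-up sample sequences, not the real ACIP keyboard description.
        valid.put("sh", "consonant");
        valid.put("s", "consonant");
        valid.put("M", "punctuation");
        // An input buffer that is no prefix of any valid sequence can never
        // become valid, so an input method could reset instead of waiting.
        System.out.println(valid.hasPrefix("sh"));
        System.out.println(valid.hasPrefix("shs"));
    }
}
```

With a structure like this, an input method could call hasPrefix on its pending keystrokes and discard them as soon as the answer is false, which is one plausible way to avoid freezing on an impossible character.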

Improved the optional keyboard input mode status messages.
2002-11-02 18:44:24 +00:00
dchandler
a6cc4a7ff3 Removed/commented out/tagged some unused local variables.
Added a JUnit test for the new Trie that fails at present since the Trie is
case-insensitive.  Running JUnit tests is not something our build system
knows about at present, but Eclipse 2.0 makes it very easy.
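
Why case-insensitivity matters here can be sketched with ordinary maps standing in for the trie. This is not the JUnit test from this commit, and the class name CaseSensitivityDemo and its method are hypothetical. The example relies on Extended Wylie writing two different letters as 'sh' (U+0F64) and 'Sh' (U+0F65).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical demo: maps stand in for the trie to show the effect of case folding. */
class CaseSensitivityDemo {
    // Registers two Extended Wylie consonants that differ only in case.
    static void fill(Map<String, String> sequences) {
        sequences.put("sh", "U+0F64");
        sequences.put("Sh", "U+0F65");
    }

    public static void main(String[] args) {
        // Case-sensitive lookup keeps both consonants apart.
        Map<String, String> sensitive = new HashMap<>();
        fill(sensitive);

        // Case-insensitive lookup treats "Sh" as the same key as "sh",
        // so the second put overwrites the first.
        Map<String, String> insensitive = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        fill(insensitive);

        System.out.println(sensitive.size());
        System.out.println(insensitive.size());
        System.out.println(insensitive.get("sh"));
    }
}
```

A store that folds case cannot hold both sequences at once, so any test that expects them to stay distinct fails against it.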

Fixed a few compiler errors due to imports I'd forgotten.
2002-11-02 16:01:40 +00:00
dchandler
b8391e923d Borrowed a trie implementation from Apache's Xalan 2.4.0. 2002-11-02 13:39:29 +00:00