I really hesitate to commit this because I'm not sure what it brings to the table exactly, and I fear that it makes the ACIP->Tibetan converter code a lot uglier. The TODO(DLC)[EWTS->Tibetan] comments littered throughout are part of the ugliness; they point to the ugliness. If each were addressed, cleanliness could perhaps be achieved.

I've largely forgotten exactly what this change does, but it attempts to improve EWTS->Tibetan conversion. The lexer is probably really, really primitive. I concentrate here on converting a single tsheg bar rather than a whole document.

Eclipse was used during part of my journey here, and some imports were reorganized merely because I could. :) (Eclipse was needed when the usual ant build failed to run a new test, EWTSTest. And I wanted its debugger.)

Next steps: end-to-end EWTS tests should bring many problems to light. Fix those. Triage all the TODO comments.

I don't know that I'll ever really trust the implementation. The tests are valuable, though. A clean implementation of EWTS->Tibetan in Jython might hold enough interest for me; I'd like to learn Python.
2005-06-20 06:18:00 +00:00 · 7198f23361
commit 7198f23361
parent f64bae8ea6
45 changed files with 1666 additions and 695 deletions
--- a/source/org/thdl/tib/text/tshegbar/ValidatingUnicodeReader.java
+++ b/source/org/thdl/tib/text/tshegbar/ValidatingUnicodeReader.java
@@ -258,7 +258,7 @@ class ValidatingUnicodeReader implements UnicodeReadingStateMachineConstants {
        throws TibetanSyntaxException
    {
        Vector syllables = new Vector();
-        int grcls_len = grcls.length();
+        int grcls_len = grcls.size();
        int beginning_of_cluster = 0;
        for (int i = 0; i < grcls_len; i++) {
            UnicodeGraphemeCluster current_grcl