I really hesitate to commit this because I'm not sure what it brings to the
table exactly and I fear that it makes the ACIP->Tibetan converter code
a lot uglier.  The TODO(DLC)[EWTS->Tibetan] comments littered throughout
are part of the ugliness; they point to the ugliness.  If each were addressed,
cleanliness could perhaps be achieved.

I've largely forgotten exactly what this change does, but it attempts to
improve EWTS->Tibetan conversion.  The lexer is probably really, really
primitive.  I concentrate here on converting a single tsheg bar rather than
a whole document.
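
The test changes in this commit replace the static call TPairListFactory.breakACIPIntoChunks(acip, true) with ACIPTraits.instance().breakTshegBarIntoChunks(acip, true), and pass ACIPTraits.instance() into getWarning. The sketch below is only a rough illustration of that shape: one shared routine handles a single tsheg bar, and a singleton "traits" object supplies whatever differs between ACIP and EWTS. Every name in it (SketchTraits, AcipSketchTraits, EwtsSketchTraits, TraitsSketch) is hypothetical, and the vowel rule is a toy assumption, not the real grammar or the real org.thdl.tib.text.ttt API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical traits interface: what differs between transliteration schemes. */
interface SketchTraits {
    /** True if c counts as a vowel in this scheme (toy rule, not the real grammar). */
    boolean isVowel(char c);
}

/** Toy ACIP traits (assumption): only uppercase vowels count. */
final class AcipSketchTraits implements SketchTraits {
    private static final AcipSketchTraits INSTANCE = new AcipSketchTraits();
    private AcipSketchTraits() {}
    static AcipSketchTraits instance() { return INSTANCE; }
    public boolean isVowel(char c) { return "AEIOU".indexOf(c) >= 0; }
}

/** Toy EWTS traits (assumption): only lowercase vowels count. */
final class EwtsSketchTraits implements SketchTraits {
    private static final EwtsSketchTraits INSTANCE = new EwtsSketchTraits();
    private EwtsSketchTraits() {}
    static EwtsSketchTraits instance() { return INSTANCE; }
    public boolean isVowel(char c) { return "aeiou".indexOf(c) >= 0; }
}

final class TraitsSketch {
    /** Splits one tsheg bar into chunks, closing a chunk after each vowel. */
    static List<String> chunk(SketchTraits traits, String tshegBar) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < tshegBar.length(); i++) {
            char c = tshegBar.charAt(i);
            current.append(c);
            if (traits.isVowel(c)) {
                chunks.add(current.toString());
                current.setLength(0);
            }
        }
        // Trailing consonants with no vowel form a final chunk of their own.
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(chunk(AcipSketchTraits.instance(), "BARMA"));
        System.out.println(chunk(EwtsSketchTraits.instance(), "bsgrubs"));
    }
}
```

The point of the sketch is that chunk() never names a scheme: callers pick the scheme by passing a different traits instance, which is roughly what the extra ACIPTraits.instance() argument in the tests below does.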

Eclipse was used during part of my journey here and some imports were
reorganized merely because I could.  :)

(Eclipse was needed when the usual ant build failed to run a new test
EWTSTest.  And I wanted its debugger.)

Next steps: end-to-end EWTS tests should bring many problems to light.  Fix
those.  Triage all the TODO comments.

I don't know that I'll ever really trust the implementation.  The tests are
valuable, though.  A clean implementation of EWTS->Tibetan in Jython
might hold enough interest for me; I'd like to learn Python.
dchandler 2005-06-20 06:18:00 +00:00
parent f64bae8ea6
commit 7198f23361
45 changed files with 1666 additions and 695 deletions


@@ -21,12 +21,12 @@ Contributor(s): ______________________________________.
package org.thdl.tib.text.ttt;
-import org.thdl.util.ThdlOptions;
import java.util.ArrayList;
import junit.framework.TestCase;
+import org.thdl.util.ThdlOptions;
/** Tests this package, especially {@link #TPairListFactory} and
* {@link TPairList}. Tests use ACIP more than EWTS.
@@ -275,7 +275,8 @@ public class PackageTest extends TestCase {
String[] expectedLegalParses,
String expectedBestParse,
int pairListToUse) {
-TPairList[] la = TPairListFactory.breakACIPIntoChunks(acip, true);
+TPairList[] la
+= ACIPTraits.instance().breakTshegBarIntoChunks(acip, true);
TPairList l = la[(pairListToUse == -1) ? 0 : ((pairListToUse >= 1) ? 1 : pairListToUse)];
if (sdebug || debug)
System.out.println("ACIP=" + acip + " and l'=" + l);
@@ -302,9 +303,9 @@ public class PackageTest extends TestCase {
return;
} else {
String s;
-if ((s = pt.getWarning("Most", l, acip, false)) != null) {
+if ((s = pt.getWarning("Most", l, acip, false, ACIPTraits.instance())) != null) {
System.out.println(s);
-} else if ((s = pt.getWarning("All", l, acip, false)) != null)
+} else if ((s = pt.getWarning("All", l, acip, false, ACIPTraits.instance())) != null)
if (sdebug || debug) System.out.println("Paranoiac warning is this: " + s);
}
int np = pt.numberOfParses();
@@ -447,9 +448,9 @@ public class PackageTest extends TestCase {
tstHelper("9012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678");
}
-/** Tests {@link TPairListFactory#breakACIPIntoChunks(String,
-* boolean)}, {@link TPairList#getACIPError(String, boolean)}, and {@link
-* TPairList#recoverACIP()}. */
+/** Tests {@link ACIPTraits#breakTshegBarIntoChunks(String,
+* boolean)}, {@link TPairList#getACIPError(String, boolean)},
+* and {@link TPairList#recoverACIP()}. */
public void testBreakACIPIntoChunks() {
tstHelper("GASN"); // ambiguous with regard to prefix rules
tstHelper("BARMA"); // ambiguous with regard to prefix rules