195 commits

Author SHA1 Message Date
dchandler
3feef9232a I got this by running Ant with the environment variable ANT_OPTS set to
'-Xmx512m'.  Otherwise I ran out of memory.

This file allows for a sizeable ACIP->TMW regression test.
2005-02-21 05:52:16 +00:00
dchandler
4c268c5ea2 Refactored so that there can be an EWTS scanner and an ACIP scanner. 2005-02-21 05:37:01 +00:00
dchandler
4b4787411b All I *meant* to do with this commit (you tell me if I did more) was to change
from \r\r\n to \r\n.  I added these files on Linux with \r\n and should've
added them as binary or done a dos2unix first.
2005-02-21 05:05:13 +00:00
dchandler
3e0168b384 Renamed ACIPConverter to TConverter. Added a needed parameter (the
only needed parameter in that class's interface AFAIK).
2005-02-21 01:35:23 +00:00
dchandler
37bf9a736d I did this stuff back in August. It's all in support of EWTS->Tibetan
conversion.  The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the
place.  EWTS->Tibetan isn't here yet; lexing isn't here yet; this is
mainly a refactoring so that the ACIP->Tibetan code can be reused to
do EWTS->Tibetan.

I'm committing this because tests pass (it shouldn't be breaking
anything), because I want a checkpoint, and because the laptop this
sandbox was on isn't my preferred development environment.
2005-02-21 01:16:10 +00:00
dchandler
83f499b7a8 Formatting in TMW documents is not preserved. I've added an identity
transformation, TMW->TMW, to help me debug this problem.
2005-02-13 00:34:47 +00:00
dchandler
9025fb42d6 TMW->EWTS 998476 partial fix: "aM" is now generated correctly. Before,
you got "M".
2005-02-07 04:00:42 +00:00
dchandler
8dcb623382 TMW->EWTS:
Fixed part of bug 998476 and part of an undocumented bug.  Discovered a
new bug, "aM" should be generated but only "M" is.

The undocumented bug was that laMA was generated when lAM should have been.

The part of bug 998476 that was fixed: laM, laH, etc. are now generated.

This does nothing about paN etc.

Some refactoring here; this is not a minimal diff.

Added tests of TMW->EWTS that use ACIP to get the TMW in place
because EWTS->TMW is a faulty keyboard at present.
2005-02-07 03:17:40 +00:00
dchandler
96d0d0d9d0 My previous commit message failed to mention the following:
I refactored the code trying to fit it onto one screen.  So not all of the
changes are material to the bug fix.

About this commit: TMW->Wylie for {b.s.d} now gives bsad instead of bas.d.
This fixes part of bug 998476, and is done because Andres thinks it'll work
most of the time.  But don't be surprised if an exception comes up in the
future and we have to trivially change the code to catch it.
2005-02-05 22:37:02 +00:00
dchandler
287fc181a0 Fix for part of bug 998476. 2005-02-05 22:16:39 +00:00
dchandler
00961b633f Added a test case for bug 998476. No fix, just a test case verifying the bug. 2005-02-05 18:47:17 +00:00
dchandler
0b0af67ed9 Ximalaya is not nearly as nice as Tibetan Machine Uni, so use the latter. 2005-01-04 02:20:59 +00:00
eg3p
c4f4288d2f eliminated some unnecessary comments i had left in there 2004-08-19 19:51:21 +00:00
dchandler
11c3898ad2 Now the Sambhota keyboard crashing bug.
Fixed crashing bug reported by Teresa Lam.  Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:46:39 +00:00
dchandler
6cbea9f894 Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:10:38 +00:00
amontano
650109200f fixed the paste. when pasting from text (non-rtf) it used to produce garbage.
now it interprets it as wylie. also made some attributes protected to inherit it.
2004-07-05 03:43:42 +00:00
dchandler
8ccf57dccb TMW->{Wylie,ACIP} conversions now preserve font size information. 2004-06-15 02:20:28 +00:00
dchandler
1db0ec7bb5 Fixed javadoc comments. 2004-06-06 21:39:45 +00:00
dchandler
fd7cba4439 Changed menu item name. 2004-05-01 20:52:22 +00:00
dchandler
1a055f3472 I don't think warning level "None" was really doing the trick. Fixed that.
You can now customize the severities of all warnings, even 504 and 510.

When warning level is "None", scanning, i.e. lexical analysis, is faster.
2004-04-25 00:37:57 +00:00
dchandler
e2d42f36eb Robert Chilton's experience inspired me to make the handling of errors and
warnings in ACIP->Tibetan conversion much more configurable.  You can
now choose from short or long error messages, for one thing.  You can change
the severity of almost all warnings.  Each error and warning has an error code.
Errors and warnings are better tested.

The converter GUI has a new checkbox for short messages; the converter
CLI has a new mandatory option for short messages.

I also fixed a bug whereby certain errors were not being appended to the
'errors' StringBuffer.
2004-04-24 17:49:16 +00:00
dchandler
cc5d096918 David Chapman's latest fix to tibwn.ini (clearing up an issue that Than or I
dropped the ball on) introduced two lines for 8,95.  This is a bad thing, so
I've taken out the second line.  I've also introduced a check in
TibetanMachineWeb.java such that we'll know that tibwn.ini has no such
error in the future just by running 'ant clean jskad-run' and making sure that
the GUI is indeed visible.

I also updated the test baselines now that F03A and 0F82 are squared away.
2004-04-24 13:23:56 +00:00
dchandler
1bfd3772e6 TMW->ACIP is much improved. V and W were confused, # and * were
confused; many glyphs that should have yielded errors were not.

I've added a test case that transforms every TMW glyph save the one with
no TM mapping to ACIP.  I hand-checked that it was correct.

ACIP->TMW is fixed for # and *.  I never noticed it, but each needed an
extra swoosh (U+0F05).

Round-tripping would be good, as would testing real-world use of
TMW->ACIP.
2004-04-14 05:44:51 +00:00
dchandler
9e7ccf2894 TMW->Unicode conversions have changed; now using U+0F6A for the stacks
whose EWTS transliteration begins with "R+".

ACIP->* conversions and test baselines were updated to deal with the
"r+..."=>"R+..."  change.
2004-04-10 16:58:45 +00:00
dchandler
e0928d8472 New EWTS for 0F82 and 0F83. 2004-03-06 23:00:40 +00:00
dchandler
9dd95c5524 I saw this error when I wasn't expecting it, so now, curious, I print more details. 2004-01-17 16:51:33 +00:00
dchandler
01e65176d4 Using less memory and time to figure out if warnings occurred. 2003-12-14 07:41:15 +00:00
dchandler
a0e6db11c0 Very minor cleanup. 2003-12-13 21:59:31 +00:00
dchandler
4c30657afa Adding tests for an ACIP keyboard that will never work correctly, and
probably never even be useful.  But they were lying around from a
while back, so here are the tests.
2003-12-13 21:34:33 +00:00
dchandler
02967539b0 Slightly improved Jskad's internal documentation. Links to converters' docs. 2003-12-10 07:04:35 +00:00
dchandler
8f7322a056 Use absolute paths when invoking the external viewer; it doesn't know what our current working directory is. 2003-12-08 06:53:37 +00:00
dchandler
597cf408dd Fixed help message. 2003-12-07 19:10:36 +00:00
dchandler
dfaae4be93 ACIP->TMW and ACIP->Unicode now allow for Unicode escapes like K\u0F84. This means that the lack of support for ACIP's backslash, '\\', is mitigated because you can turn ACIP {K\} into ACIP {K\u0F84}.
Support for U+F021-U+F0FF, the PUA that the latest EWTS uses, is not provided.
2003-11-29 22:56:18 +00:00
dchandler
946d8cbc72 Updated the code I used for testing to generate the file containing all glyphs in TM and all glyphs but one in TMW. 2003-11-29 16:22:26 +00:00
dchandler
8d18ac53cb N+D+Ya, not N+D+ya; w+Wa, not w+wa: use W, R, and Y where appropriate.
Found another inconsistency between Unicode and the TM/TMW docs.  I've sent e-mail to Tony Duff asking who's right, but I'm putting this in the errata under the assumption that even if Unicode is wrong, Unicode's wrong view will somehow rule the day.

Also, TMW->EWTS now generates \uF021-\uF0FF or \u0F00-\u0FFF escapes when appropriate.  A few TMW glyphs still give errors.

Also, there's now a test to be sure that TM<->TMW and TMW->EWTS won't break in the future (except for the one glyph in TMW that isn't in TM; that one isn't tested).  The baselines have not been hand-verified, but changes will be detected.
2003-11-24 05:50:42 +00:00
dchandler
216c5b0d54 Fixed TMW->Wylie for achen. I even tested this by pretending achen could take a da prefix (when in reality it takes no prefixes). 2003-11-23 01:22:27 +00:00
dchandler
37e8dfa917 The menu now says (Buggy) in front of "Convert Selection from Wylie to Tibetan" because this feature is, you guessed it, buggy. 2003-11-22 22:48:41 +00:00
dchandler
113480a882 X is now better supported, so this changed. 2003-11-15 20:00:59 +00:00
dchandler
084e12a02c Import Wylie is a buggy feature. The menu now calls it "(Buggy) Import Wylie...". t+s+w doesn't even convert correctly!
Bug-free EWTS->TMW using the org.thdl.tib.text.ttt codebase will be here soon.
2003-11-09 01:25:58 +00:00
dchandler
dbd9c80ca0 Special tests for rwa and r+wa, which are the only two different stacks with the same hash key modulo - and +. 2003-11-09 01:06:26 +00:00
dchandler
85e1e0701e Fixed crashing bug in Import Wylie. 2003-11-08 23:32:53 +00:00
dchandler
8fbd8850f8 New feature: Convert Selection from TMW to ACIP. 2003-11-08 23:22:06 +00:00
dchandler
bab47c4910 There are now extensive tests to make sure that each Tibetan stack in TMW can be typed in using EWTS and correctly converted to TMW and then back to EWTS. These tests unearthed new bugs in the Tibetan! 5.1 docs. 2003-11-08 22:11:24 +00:00
dchandler
f626a04d72 Tests t+r+n glyph. 2003-11-08 20:28:34 +00:00
dchandler
5c36dd81d3 Fixed bug 830332, "Convert selected ACIP=>Tibetan busted". 2003-10-26 18:25:25 +00:00
dchandler
31b3020d07 Added a test case that runs almost all the tsheg bars from all
non-reference, publicly available ACIP files (hundreds of megabytes of
them) through the converter.  The frequencies of these tsheg bars are
in the file, too.
2003-10-26 06:02:48 +00:00
dchandler
1415fc43e3 The ACIP "BNA" was converting to B-NA instead of B+NA, even though NA cannot take a BA prefix. This was because BNA was interpreted as root-suffix. In ACIP, BN is surely B+N unless N takes a B prefix, so root-suffix is out of the question. 2003-10-26 00:21:54 +00:00
dchandler
f106deb884 Private correspondence with Robert Chilton led me to add and remove a few prefix rules. BLC and BGL are here; BLK, BLG, BLNG, BLJ, BNG, BJ, BNY, BN, and BDZ are gone.
Added a few new tests.
2003-10-25 21:40:21 +00:00
dchandler
5d9305c9d5 "Browse..." buttons are smart about file types now. 2003-10-19 23:17:25 +00:00
dchandler
3aa3859354 ACIP->Unicode crash fixed.
5% of the code for support of ACIP->Unicode.rtf is here.
2003-10-19 22:19:16 +00:00