Commit graph

231 commits

Author SHA1 Message Date
dchandler
6cbea9f894 Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:10:38 +00:00
amontano
650109200f Fixed paste. Pasting plain (non-RTF) text used to produce garbage;
it is now interpreted as Wylie. Also made some attributes protected so that subclasses can inherit them.
2004-07-05 03:43:42 +00:00
dchandler
8ccf57dccb TMW->{Wylie,ACIP} conversions now preserve font size information. 2004-06-15 02:20:28 +00:00
dchandler
1db0ec7bb5 Fixed javadoc comments. 2004-06-06 21:39:45 +00:00
dchandler
fd7cba4439 Changed menu item name. 2004-05-01 20:52:22 +00:00
dchandler
1a055f3472 I don't think warning level "None" was really doing the trick. Fixed that.
You can now customize the severities of all warnings, even 504 and 510.

When warning level is "None", scanning, i.e. lexical analysis, is faster.
2004-04-25 00:37:57 +00:00
dchandler
e2d42f36eb Robert Chilton's experience inspired me to make the handling of errors and
warnings in ACIP->Tibetan conversion much more configurable.  You can
now choose from short or long error messages, for one thing.  You can change
the severity of almost all warnings.  Each error and warning has an error code.
Errors and warnings are better tested.

The converter GUI has a new checkbox for short messages; the converter
CLI has a new mandatory option for short messages.

I also fixed a bug whereby certain errors were not being appended to the
'errors' StringBuffer.
2004-04-24 17:49:16 +00:00
dchandler
cc5d096918 David Chapman's latest fix to tibwn.ini (clearing up an issue that Than or I
dropped the ball on) introduced two lines for 8,95.  This is a bad thing, so
I've taken out the second line.  I've also introduced a check in
TibetanMachineWeb.java such that we'll know that tibwn.ini has no such
error in the future just by running 'ant clean jskad-run' and making sure that
the GUI is indeed visible.

I also updated the test baselines now that F03A and 0F82 are squared away.
2004-04-24 13:23:56 +00:00
dchandler
1bfd3772e6 TMW->ACIP is much improved. V and W were confused, # and * were
confused; many glyphs that should have yielded errors were not.

I've added a test case that transforms every TMW glyph save the one with
no TM mapping to ACIP.  I hand-checked that it was correct.

ACIP->TMW is fixed for # and *.  I never noticed it, but each needed an
extra swoosh (U+0F05).

Round-tripping would be good, as would testing real-world use of
TMW->ACIP.
2004-04-14 05:44:51 +00:00
dchandler
9e7ccf2894 TMW->Unicode conversions have changed; now using U+0F6A for the stacks
whose EWTS transliteration begins with "R+".

ACIP->* conversions and test baselines were updated to deal with the
"r+..."=>"R+..."  change.
2004-04-10 16:58:45 +00:00
dchandler
e0928d8472 New EWTS for 0F82 and 0F83. 2004-03-06 23:00:40 +00:00
dchandler
9dd95c5524 I saw this error when I wasn't expecting it, so now, curious, I print more details. 2004-01-17 16:51:33 +00:00
dchandler
01e65176d4 Using less memory and time to figure out if warnings occurred. 2003-12-14 07:41:15 +00:00
dchandler
a0e6db11c0 Very minor cleanup. 2003-12-13 21:59:31 +00:00
dchandler
4c30657afa Adding tests for an ACIP keyboard that will never work correctly, and
probably never even be useful.  But they were lying around from a
while back, so here are the tests.
2003-12-13 21:34:33 +00:00
dchandler
02967539b0 Slightly improved Jskad's internal documentation. Links to converters' docs. 2003-12-10 07:04:35 +00:00
dchandler
8f7322a056 Use absolute paths when invoking the external viewer; it doesn't know what our current working directory is. 2003-12-08 06:53:37 +00:00
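A minimal sketch of the idea in the commit above, not Jskad's actual code: resolve the document path to an absolute path before handing it to the external program, since that program may start in a different working directory. The class name, method name, viewer name, and file name are all hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; names are illustrative, not taken from Jskad. */
class ExternalViewerSketch {
    /** Builds a command line whose document argument is an absolute path. */
    static List<String> viewerCommand(String viewer, String documentPath) {
        // Resolve against our own working directory now, because the
        // viewer cannot know what that directory is.
        String absolute = new File(documentPath).getAbsolutePath();
        return Arrays.asList(viewer, absolute);
    }

    public static void main(String[] args) {
        // "some-viewer" and "output.rtf" are placeholders.
        List<String> command = viewerCommand("some-viewer", "output.rtf");
        System.out.println(command);
        // A real caller might then run: new ProcessBuilder(command).start();
    }
}
```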
dchandler
597cf408dd Fixed help message. 2003-12-07 19:10:36 +00:00
dchandler
dfaae4be93 ACIP->TMW and ACIP->Unicode now allow for Unicode escapes like K\u0F84. This means that the lack of support for ACIP's backslash, '\\', is mitigated because you can turn ACIP {K\} into ACIP {K\u0F84}.
Support for U+F021-U+F0FF, the PUA that the latest EWTS uses, is not provided.
2003-11-29 22:56:18 +00:00
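As an illustration of the escape handling described in the commit above, here is a small sketch that expands four-hex-digit Unicode escapes in an input string. It is an assumption about how such expansion could be done, not the converter's real implementation; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not the ACIP converter's actual code. */
class UnicodeEscapeSketch {
    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE =
        Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    /** Replaces each escape in the input with the character it names. */
    static String expand(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());                   // text before the escape
            out.append((char) Integer.parseInt(m.group(1), 16));  // the escaped character
            last = m.end();
        }
        out.append(input, last, input.length());                  // trailing text
        return out.toString();
    }

    public static void main(String[] args) {
        // The ACIP letter K followed by an escape for U+0F84.
        String expanded = expand("K\\u0F84");
        System.out.println(expanded.length());
    }
}
```

Text without escapes passes through unchanged, so the helper can be applied to a whole input before any other processing.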
dchandler
946d8cbc72 Updated the code I used for testing to generate the file containing all glyphs in TM and all glyphs but one in TMW. 2003-11-29 16:22:26 +00:00
dchandler
8d18ac53cb N+D+Ya, not N+D+ya, w+Wa, not w+wa .. use W, R, and Y where appropriate.
Found another inconsistency between Unicode and the TM/TMW docs.  I've sent e-mail to Tony Duff asking who's right, but I'm putting this in the errata under the assumption that even if Unicode is wrong, Unicode's wrong view will somehow rule the day.

Also, TMW->EWTS now generates \uF021-\uF0FF or \u0F00-\u0FFF escapes when appropriate.  A few TMW glyphs still give errors.

Also, there's now a test to be sure that TM<->TMW and TMW->EWTS won't break in the future (except for the one glyph in TMW that isn't in TM, that one isn't tested).  The baselines have not been hand-verified, but changes will be detected.
2003-11-24 05:50:42 +00:00
dchandler
216c5b0d54 Fixed TMW->Wylie for achen. I even tested this by pretending achen could take a da prefix (when in reality it takes no prefixes). 2003-11-23 01:22:27 +00:00
dchandler
37e8dfa917 The menu now says (Buggy) in front of "Convert Selection from Wylie to Tibetan" because this feature is, you guessed it, buggy. 2003-11-22 22:48:41 +00:00
dchandler
113480a882 X is now better supported, so this changed. 2003-11-15 20:00:59 +00:00
dchandler
084e12a02c Import Wylie is a buggy feature. The menu now calls it "(Buggy) Import Wylie...". t+s+w doesn't even convert correctly!
Bug-free EWTS->TMW using the org.thdl.tib.text.ttt codebase will be here soon.
2003-11-09 01:25:58 +00:00
dchandler
dbd9c80ca0 Special tests for rwa and r+wa, which are the only two different stacks with the same hash key modulo - and +. 2003-11-09 01:06:26 +00:00
dchandler
85e1e0701e Fixed crashing bug in Import Wylie. 2003-11-08 23:32:53 +00:00
dchandler
8fbd8850f8 New feature: Convert Selection from TMW to ACIP. 2003-11-08 23:22:06 +00:00
dchandler
bab47c4910 There are now extensive tests to make sure that each Tibetan stack in TMW can be typed in using EWTS and correctly converted to TMW and then back to EWTS. These tests unearthed new bugs in the Tibetan! 5.1 docs. 2003-11-08 22:11:24 +00:00
dchandler
f626a04d72 Tests t+r+n glyph. 2003-11-08 20:28:34 +00:00
dchandler
5c36dd81d3 Fixed bug 830332, "Convert selected ACIP=>Tibetan busted". 2003-10-26 18:25:25 +00:00
dchandler
31b3020d07 Added a test case that runs almost all the tsheg bars from all
non-reference, publicly available ACIP files (hundreds of megabytes of
them) through the converter.  The frequencies of these tsheg bars are
in the file, too.
2003-10-26 06:02:48 +00:00
dchandler
1415fc43e3 The ACIP "BNA" was converting to B-NA instead of B+NA, even though NA cannot take a BA prefix. This was because BNA was interpreted as root-suffix. In ACIP, BN is surely B+N unless N takes a B prefix, so root-suffix is out of the question. 2003-10-26 00:21:54 +00:00
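The rule in the commit above can be sketched as a lookup: two leading consonants are read as prefix plus root (written with "-") only if the second can take the first as a prefix; otherwise they form a stack (written with "+"). The table below is a deliberately tiny, illustrative subset based only on examples in this log (BDE is B-DE, BNA is B+NA); it is not the converter's real prefix-rule list, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the prefix-or-stack decision; not the real rule table. */
class PrefixRuleSketch {
    // Illustrative subset: per this log, DA can take a BA prefix and NA cannot.
    private static final Map<String, Set<String>> ROOTS_BY_PREFIX =
        Map.of("B", Set.of("D"));

    /** Joins two consonants with "-" (prefix-root) or "+" (stack). */
    static String join(String first, String second) {
        Set<String> roots = ROOTS_BY_PREFIX.getOrDefault(first, Set.of());
        return roots.contains(second)
            ? first + "-" + second   // legal prefix: prefix plus root
            : first + "+" + second;  // no such prefix rule: must be a stack
    }

    public static void main(String[] args) {
        System.out.println(join("B", "D") + "E");  // B-DE
        System.out.println(join("B", "N") + "A");  // B+NA
    }
}
```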
dchandler
f106deb884 Private correspondence with Robert Chilton led me to add and remove a few prefix rules. BLC and BGL are here; BLK, BLG, BLNG, BLJ, BNG, BJ, BNY, BN, and BDZ are gone.
Added a few new tests.
2003-10-25 21:40:21 +00:00
dchandler
5d9305c9d5 "Browse..." buttons are smart about file types now. 2003-10-19 23:17:25 +00:00
dchandler
3aa3859354 ACIP->Unicode crash fixed.
5% of the code for support of ACIP->Unicode.rtf is here.
2003-10-19 22:19:16 +00:00
dchandler
4b1395e0ba Jskad has a new feature: Convert Selection from ACIP to Tibetan. It uses the ACIP converter to do its work.
Improved some error messages from the ACIP->Tibetan converter.
2003-10-19 20:16:06 +00:00
dchandler
e5534f69ee Untabified -- whitespace only has changed. Use 'cvs diff -wb' to avoid seeing these differences. 2003-10-18 18:29:46 +00:00
dchandler
e799438f86 CVS ignoring backup files. 2003-10-18 17:47:56 +00:00
dchandler
3b55ea509f Prefix rules have changed. A few are gone; a few new ones are here. I've implemented here a list that Robert Chilton sent me in private correspondence. He doesn't describe it as definitive, but since it affects ACIP->Tibetan conversions, and it's the best I've got, here they are. There's still an optional warning about "Hey, prefix rules matter for this tsheg bar."
I've left in a few rules that I didn't find on RC's list; I've asked him to look into these further.
2003-10-18 05:48:53 +00:00
dchandler
8c99adeb63 TMW->EWTS, TMW->ACIP, and ACIP->Unicode/TMW now support more appendages. Personal correspondence with Robert Chilton led me to support, besides 'am, 'ang, 'o, 'i, and 'u, the following:
'e (used in foreign transliteration)
'ongs
'is
'os
'ur
'us
'ung
2003-10-18 03:04:47 +00:00
dchandler
129ebccd67 In TCC #1 keyboard, h>cj now works. I may have fixed this in a terrible way, breaking other things even. Hard to say because I don't really understand the code I changed. But DuffPaneTest passes.
If we ever clean up the keyboards, the changes made here to tcc_keyboard.ini should probably be undone.
2003-10-12 18:16:17 +00:00
dchandler
d7fdacfcdc Open menu is now Open..., Save as is now Save as... 2003-10-12 18:12:19 +00:00
dchandler
8dbfff17e1 All .rtf and .Rtf and .RTF files are selectable now. 2003-10-12 18:11:50 +00:00
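One way to accept every capitalization of the extension is to compare in lower case. The sketch below uses Swing's standard FileFilter class; whether Jskad does it this way is an assumption, and the class name and description string are hypothetical.

```java
import java.io.File;
import java.util.Locale;
import javax.swing.filechooser.FileFilter;

/** Hypothetical case-insensitive RTF filter for a JFileChooser. */
class RtfFileFilterSketch extends FileFilter {
    @Override
    public boolean accept(File f) {
        if (f.isDirectory()) {
            return true;  // keep directories visible so the user can navigate
        }
        // Lower-casing first makes .rtf, .Rtf, and .RTF all match.
        return f.getName().toLowerCase(Locale.ROOT).endsWith(".rtf");
    }

    @Override
    public String getDescription() {
        return "Rich Text Format (*.rtf)";
    }
}
```

Such a filter would typically be installed with JFileChooser.setFileFilter.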
dchandler
35209ce7fd I'm going to have to debug this, and the tab stops make the source unreadable. I don't like messing with whitespace, but it seems like I'll be the main maintainer for a while, and the people after me can use cvs diff -wb. So I'm untabifying. 2003-10-12 16:44:28 +00:00
dchandler
115d0e0e6c Fixed ACIP->TMW vowels like 'I etc.
Fixed ACIP->Unicode/TMW for BDE, which should be B-DE, not B+DE, because the former is legal Tibetan.

The ACIP->EWTS subroutine has improved.

TMW->Wylie and TMW->ACIP are improved in error cases.

TMW->ACIP has friendly embedded error messages now.
2003-09-12 05:06:37 +00:00
dchandler
6872ea8028 Corrected the usage info. 2003-09-07 22:08:00 +00:00
dchandler
07e360d9a8 The ACIP {NYA%} is supported. {NYAo} and {NYAx} are confusing to me,
because I don't know which glyphs o and x correspond to.  For that
reason, they cause ERRORs.

The proposed THDL Extended Wylie ~X and X is now used for U+0F35 and
U+0F37 respectively.
2003-09-07 16:19:50 +00:00
amontano
b489034598 Fixed a call to a deprecated method 2003-09-07 03:39:08 +00:00
dchandler
0d6d6ed611 Added GUI support for color-coding. Added support for color-coding
and choosing the warning level to TibetanConverter.

Better error checking in the GUI converter.
2003-09-06 22:56:10 +00:00
dchandler
717c3b94f3 Fixed ACIP->Unicode spaces/tshegs and newlines, especially with shads.
"NGA," becomes "NGA-tsheg-," automatically now.
2003-09-05 05:08:47 +00:00
dchandler
5c240ac072 From the converter GUI, you can now choose TMW->ACIP text and
TMW->Wylie text.  All the conversions show you which format they take
as input and which format they give as output.

File filter for ACIP files added.

The GUI converter suggests a file extension wisely.

Fixed newline bug in ACIP->Unicode converter.
2003-09-05 02:05:34 +00:00
dchandler
4abbf6db37 --to-acip-text and --to-wylie-text added; these get you text files,
not RTF files like --to-acip and --to-wylie do.  The GUI converter
doesn't yet allow you to get text files.
2003-09-04 05:16:47 +00:00
dchandler
cc615f34df ACIP->TMW and ACIP->Unicode have my pre-stamp of non-approval. Except
for {NYAx} and {NYAo}, they're as good as I'll get them without input
from experts or the employ of a complementary, syllabary-based
approach.
2003-09-04 04:34:18 +00:00
dchandler
316f59107b A preliminary TMW->ACIP converter is here. There are known bugs, mostly with rare punctuation. 2003-09-02 06:39:33 +00:00
dchandler
045c4069c9 Preliminary ACIP->TMW support is in place. {DU} gives you something
less beautiful than what Jskad would give, so more work is needed.
2003-08-31 16:06:35 +00:00
dchandler
dd22e161a5 Code cleanup for Jskad's Tibetan font converter GUI. 2003-08-30 05:01:15 +00:00
dchandler
896344f2d1 David Chapman removed some lines from tibwn.ini. That breaks TM<->TMW
mappings, so I've put them back, but with the EWTS non-correspondences
\tmwXYYY.

Jskad no longer supports superscribed or subscribed numerals, because
EWTS does not.
2003-08-26 01:28:02 +00:00
dchandler
1982c5847b Jskad's converter now has ACIP-to-Unicode built in. There are known
bugs; it is pre-alpha.  It's usable, though, and finds tons of errors
in ACIP input files, with the user deciding just how pedantic to be.
The biggest outstanding bug is the silent one: treating { }, space, as
tsheg instead of whitespace when we ought to know better.
2003-08-24 06:40:53 +00:00
dchandler
d5ad760230 TMW->Wylie conversion now takes advantage of prefix rules, the rules
that say "ya can take a ga prefix" etc.

The ACIP->Unicode converter now gives warnings (optionally, and by
default, inline).  This converter now produces output even when
lexical errors occur, but the output has errors and warnings inline.
2003-08-23 22:03:37 +00:00
dchandler
bcf1c12b6a We now produce EWTS m.ya, g.rwa, d.rwa, and b.ya during TMW->Wylie.
Our disambiguation is now perfect, happening when and only when it is
necessary.  These are all illegal, so it shouldn't affect many
existing conversions.  But if there were typos, it could.
2003-08-10 18:46:01 +00:00
dchandler
251d8feae5 brtan now gives TMW->Wylie brtan, not b.rtan. Etc. See bug report
http://sourceforge.net/tracker/index.php?func=detail&aid=785791&group_id=61934&atid=502515.
2003-08-09 17:48:40 +00:00
amontano
8e4b508de8 Made a new class for the preference window so that other software
(e.g. the translation tool) can reuse that same code to set up the
attributes of the Tibetan and Roman fonts.
2003-08-09 07:57:21 +00:00
dchandler
a7f0c35738 Added a test for ts.ha vs. tsha ambiguity; there is no ambiguity. 2003-07-18 03:51:29 +00:00
dchandler
dc454b8c0c More test cases related to the following:
The Tibetan d.za was being converted into the Wylie dza incorrectly.  This
is a rare case, but I want TMW->Wylie to be perfectly unambiguous.
2003-07-18 02:31:02 +00:00
dchandler
1c29566aee I'm now using the Unix diff built in to Apache Jakarta Commons JRCS
(which I found on suigeneris.org, not apache.org) in order to bulletproof the
Tibetan Converter tests.  They used to fail due to nondeterminism in the
Java RTF writer; they should no longer fail.

I've also changed it so that the Tibetan Converter tests run in headless
mode, which means that they'll run on the nightly builds server.
2003-07-14 12:26:26 +00:00
dchandler
f900154e7a Tests disambiguation in TMW->Wylie conversion. 2003-07-14 12:21:02 +00:00
dchandler
79b3b97326 Remove warning message from menu item. 2003-07-13 23:19:11 +00:00
dchandler
c986684beb Updated help to talk about new features. 2003-07-13 22:51:35 +00:00
dchandler
f695b1a6c1 Updated baselines because conversions have improved since the last
update.
2003-07-13 19:14:41 +00:00
dchandler
d10f97fc06 Disambiguation was not being used appropriately. This makes previous
TMW->Wylie conversions with the new-and-improved TMW->Wylie
algorithm faulty.

Now I'm using it a little more than strictly necessary, e.g. b.lha instead of blha is
generated because bla and b.la are ambiguous.
2003-07-13 19:14:15 +00:00
dchandler
02558a1d78 Jskad supports <7, >8, etc. again; it no longer supports the punctuation
'<' and '>'.  The current keyboard implementation makes this an either-or
proposition, when fundamentally it need not be.

Added a <?Numbers?> command and an <?Input:Numbers?> command to
tibwn.ini; broke the numbers apart from the consonants.  This facilitates the
new-and-improved Tibetan->Wylie conversion.

Tibetan->Wylie is now done by forming legal tsheg bars.  A legal tsheg bar
is converted into perfect THDL Wylie.  See the code comments to learn what
it considers a legal tsheg bar; one example is bskyUMbsH minus the trailing
punctuation (H).

Illegal sequences, such as runs of transliterated Sanskrit, are turned into
unambiguous Wylie; each glyph is followed by a vowel or a disambiguator
('.').

I've made it so that the illegal sequences are as beautiful as possible.  You
get 'pad+me', for example, not the equivalent but uglier 'pad+m.e.'.
2003-07-08 14:30:17 +00:00
dchandler
c04a3f189b Rearranged the topics. 2003-07-08 12:50:27 +00:00
dchandler
24ac6fd06c The Trie of possible inputs fixed this bug. 2003-07-06 16:31:13 +00:00
dchandler
d88141512b Small changes w.r.t. clearing preferences. Some code cleanup. 2003-07-06 16:24:29 +00:00
dchandler
086f4bb6ec Renamed the Info menu Help.
Now using CalHTMLPane to surf the offline and the online help.
2003-07-05 22:25:21 +00:00
dchandler
8c4ab30a52 Rearranged the Tools menu; made the converter smart about "find some..."
and "find all..." modes.
2003-07-05 21:02:46 +00:00
dchandler
72d2eee503 Code cleanup. 2003-07-05 19:26:58 +00:00
dchandler
a463b686b3 Jskad now ships with both TibetanMachine and TibetanMachineWeb fonts
by default, not just TMW.  Thus users need not install these fonts on their
systems.
2003-07-05 18:00:29 +00:00
dchandler
9effee0564 If you opened a file from the recently opened files list and very quickly
mouse-clicked on the new Jskad window, you could cause an infinite
regression of requestFocus() operations because the menu would try
to get focus back.  I grab focus from the menu now.
2003-07-05 02:30:00 +00:00
dchandler
51679c158b Final fixes completed; recently opened files can now be selected from
Jskad's file menu.
2003-07-05 02:15:33 +00:00
dchandler
4410b52c07 There's still a small bug in this, but here's the real stuff:
Recently opened files can now be selected from Jskad's file menu.

A Jskad now gives the focus to the DuffPane when that Jskad gets the
focus.
2003-07-04 03:29:25 +00:00
dchandler
d863446d25 I think *this* compiles... 2003-07-04 02:32:40 +00:00
dchandler
407020108f I didn't mean to commit the previous revision; I'm still tweaking it. 2003-07-04 02:32:03 +00:00
dchandler
9f0b1c3250 Recently opened files can now be selected from Jskad's file menu.
A Jskad now gives the focus to the DuffPane when that Jskad gets the
focus.
2003-07-04 02:31:23 +00:00
dchandler
7500b4e06b Jskad won't allow you to exit by closing the last window anymore. Instead,
you get a dialog box saying to use File/Exit.
2003-07-04 00:21:07 +00:00
dchandler
6c286573ba Fixed Javadocs. 2003-07-04 00:12:59 +00:00
dchandler
0a1bc0d30b getWylie now takes a parameter for error detection; I'm not detecting errors
here though.

Fixed a typo in a property name.
2003-07-01 23:20:08 +00:00
dchandler
a48ec641d5 Better error messages in TMW->Wylie conversions. The user knows what's
up.
2003-07-01 03:43:33 +00:00
dchandler
e7e7c2bf15 The command-line tool runs in headless mode by default, so it will
work on, e.g., a Linux console.  The JUnit tests will too, though 'ant
check' still fails because we don't sneak the -Djava.awt.headless=true
into the process early enough.
2003-07-01 02:50:09 +00:00
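The timing issue mentioned in the commit above comes from the fact that the java.awt.headless property has to be in place before AWT first checks it. A minimal sketch, assuming the property is set programmatically at startup rather than with -D on the command line; the class and method names are hypothetical.

```java
import java.awt.GraphicsEnvironment;

/** Hypothetical startup helper; not the command-line tool's actual code. */
class HeadlessSketch {
    /** Requests headless mode, then reports what AWT actually decided. */
    static boolean forceHeadless() {
        // Must run before any AWT or Swing class asks about the display;
        // once AWT has read the property, changing it has no effect.
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println("headless: " + forceHeadless());
    }
}
```

Passing -Djava.awt.headless=true to the JVM achieves the same result without code changes, provided the flag reaches the process that runs the tests.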
dchandler
6151a7bc94 TMW->Wylie now occurs in the TibetanDocument, not in DuffPane,
which means that the command-line tool can finally function with a headless
graphics device.  Hopefully it will speed things up, too.  It also means that
entering Roman text into the TMW->Unicode conversion and TMW->TM
conversion will be easy.
2003-07-01 01:21:57 +00:00
dchandler
dc03083433 I've validated by hand the TM<->TMW mappings. A few things changed, so
no previous TM->TMW conversions can be trusted.
2003-06-30 02:22:09 +00:00
dchandler
58644a6ef9 Better error handling. 2003-06-30 02:20:52 +00:00
dchandler
aedef4b44d An error now appears if you try to convert from format A to format B but no
glyphs in format A appear.  In this case, it is likely that you meant to convert
a different file or do a different conversion.
2003-06-29 21:31:48 +00:00
dchandler
ee14b7b97f Jskad now has the ability to open its buffer with an external viewer, e.g.
Microsoft Word.

Better OOM error handling in the GUI converter; untested, though.
2003-06-29 20:49:30 +00:00
dchandler
646e23b4a4 Tweaked the converter GUI so that you can open the old and the new files
with the external viewer.
2003-06-29 16:45:15 +00:00
dchandler
b841a7f14b The converter GUI can now be run standalone or from Jskad's Tools menu.
The converter GUI gives nicer error messages in at least one case.
2003-06-29 04:18:36 +00:00
dchandler
7938648ca8 TM->TMW conversion has no known bugs. Oddballs have been
comprehensively handled.
2003-06-29 03:03:07 +00:00
dchandler
c39d8d6326 My earlier code cleanup introduced this bug; TMW->TM conversion was
busted.
2003-06-26 22:48:51 +00:00
dchandler
25510542b2 Now with a nicer error message in one case. 2003-06-26 22:48:05 +00:00