Commit graph

32 commits

Author SHA1 Message Date
dchandler
5e18feb47d ACIP now stacks greedily. TTTTTA is T+T+T+T+TA, even though that stack doesn't exist in TM or TMW. Robert Chilton, in personal correspondence, agreed that this is the way to do things.
ACIP handles the appendages 'AM, 'ANG, 'US, 'UR, 'I, 'O, and 'U correctly.
2003-10-16 04:15:10 +00:00
dchandler
16817d0b8e Fixed Javadocs. 2003-09-10 01:19:05 +00:00
dchandler
0d6d6ed611 Added GUI support for color-coding. Added support for color-coding
and choosing the warning level to TibetanConverter.

Better error checking in the GUI converter.
2003-09-06 22:56:10 +00:00
dchandler
899b042ec0 Preliminary, untested color support in ACIP->TMW conversion. 2003-09-05 05:54:35 +00:00
dchandler
4abbf6db37 --to-acip-text and --to-wylie-text added; these get you text files,
not RTF files like --to-acip and --to-wylie do.  The GUI converter
doesn't yet allow you to get text files.
2003-09-04 05:16:47 +00:00
dchandler
316f59107b A preliminary TMW->ACIP converter is here. There are known bugs, mostly with rare punctuation. 2003-09-02 06:39:33 +00:00
dchandler
045c4069c9 Preliminary ACIP->TMW support is in place. {DU} gives you something
less beautiful than what Jskad would give, so more work is needed.
2003-08-31 16:06:35 +00:00
dchandler
6c286573ba Fixed Javadocs. 2003-07-04 00:12:59 +00:00
dchandler
a48ec641d5 Better error messages in TMW->Wylie conversions. The user knows what's
up.
2003-07-01 03:43:33 +00:00
dchandler
6151a7bc94 TMW->Wylie now occurs in the TibetanDocument, not in DuffPane,
which means that the command-line tool can finally function with a headless
graphics device.  Hopefully it will speed things up, too.  It also means that
entering Roman text into the TMW->Unicode conversion and TMW->TM
conversion will be easy.
2003-07-01 01:21:57 +00:00
dchandler
229536884f I've validated by hand the TM<->TMW mappings. A few things changed, so
no previous TM->TMW or TMW->TM conversions can be trusted.
2003-06-30 02:24:11 +00:00
dchandler
aedef4b44d An error now appears if you try to convert from format A to format B but no
glyphs in format A appear.  In this case, it is likely that you meant to convert
a different file or do a different conversion.
2003-06-29 21:31:48 +00:00
dchandler
2a359c45ef Bad conversions were not leaving the unconvertable characters at the
beginning of the document as they should and as they are documented to.

They now do, and they bracket the bad characters with the TM or TMW for
U+0F3C on the left and the TM or TMW for U+0F3D on the right.

Some cleanup.
2003-06-28 16:20:19 +00:00
dchandler
f547734043 Added Than's converter GUI code; adapted it to work with Jskad's
converters.

TMW->Unicode now uses Ximalaya by default.
2003-06-24 03:02:29 +00:00
dchandler
917864574c Fixed a logic bug in mapTMWtoTM and mapTMtoTMW.
You can now specify which Unicode font to use via 'java
-Dthdl.tmw.to.unicode.font=Ximalaya ...'.
2003-06-23 01:58:11 +00:00
dchandler
b6d8fd89f9 When errors in (all but TMW->Wylie and Wylie->TMW) conversion occur,
the troublesome glyphs are now put at the beginning of the document
AFTER AN ACHEN.  This makes a glyph like \tmw7095 visible atop the
achen.

Major fix to the handling of paragraphs in conversion; we were (for
whatever reason) dropping paragraphs before.
2003-06-23 01:24:02 +00:00
dchandler
1f4343bed0 TMW->TM, TM->TMW, and TMW->Unicode conversions are all (at least 2)
orders of magnitude faster.
2003-06-22 22:10:58 +00:00
dchandler
900f7492b0 'ant clean check' was failing because I hadn't updated the
--find-some-non-tmw and --find-all-non-tmw baselines.

Code cleanup.
2003-06-22 16:11:58 +00:00
dchandler
6540b260bd Fixes a (small, I think) TMW->Unicode performance glitch. I was
inserting 5 characters at a time and then skipping ahead just one
position.  I don't think this affected correctness.

I believe there's still a terrible (exponential?) slowdown as the
input file gets bigger, however.  Perhaps not -- but we run through
the first 1000 TMW glyphs in 6 seconds, the 20th thousand takes at
least 60 seconds.  Is TMW->Wylie faster than TMW->Unicode?  If so,
why?

Thought: don't use a DuffPane within TibetanConverter -- it can only
add overhead, right?  My hprof profile said that the conversion was
taking just a couple of percent of the work; the rest was going to
display-related stuff that you should only see if you were displaying
the document.  I'm not!
2003-06-22 04:08:33 +00:00
dchandler
dfe64a1927 Added --find-some-non-tm and --find-all-non-tm modes to the converter to
help ensure worry-free TM->TMW conversions.
2003-06-22 00:14:18 +00:00
dchandler
da70434e52 Jskad now allows for TMW->Unicode conversion. 2003-06-15 16:27:36 +00:00
dchandler
70b31558fa Tried to fix a crashing bug that happened when you converted TM->TMW
and then tried to convert that TMW to Wylie.  I swear it's Java's
problem (see the ugly stack trace in the code and decide for
yourself), and I tried replacing rather than
inserting-and-then-removing, but it didn't work.  I've left these
things as options.
2003-06-08 23:12:52 +00:00
dchandler
32831b698f If bad (oddball) TM glyphs appear, then converting to TMW causes, by
default, all oddballs to appear once in the resulting document.
This'll help me find the correct glyphs for the oddballs, and it'll
prevent the average user from converting a document with oddballs.
2003-06-08 22:37:38 +00:00
dchandler
0f724989b5 The Wylie 'M' used to map to TMW7.91, when it should map to TMW7.90.
I've fixed that.

I've also added a couple of Unicode mappings to give a flavor for how
multi-codepoint mappings will be represented.

TM->TMW conversion takes about 1 second per thousand glyphs on my
PIII-550.
2003-06-01 23:05:32 +00:00
dchandler
0235263ddf TM->TMW and TMW->TM conversion in RTF is now supported. I've
noticed that formatting is mostly OK but sometimes gets bungled slightly.
I tried everything I could think of, and now I'm passing the buck to Java's
RTF support.

TMW_RTF_TO_THDL_WYLIE (now misnamed) support TMW->TM
conversion (but not TM->TMW).  There is an automated test case for a
TMW->TM conversion.

I have full confidence in this conversion.  Even the smallest glitch in the core
functionality (not formatting) would surprise me.

Note that the JUnit test TMW_RTF_TO_THDL_WYLIETest sometimes fails
due to one- or two-line diffs between the actual and expected outputs.  This
is because Java's RTF support is not deterministic, I'm guessing, and is not
a real failure.  I'm too lazy to make a more elaborate sed/diff mechanism
that works on all platforms, and that would complicate the build anyway.
2003-05-31 23:21:29 +00:00
dchandler
6f0390c5d6 By default (controllable via options.txt), Jskad now fixes the Tahoma curly
brace problem upon opening any RTF document.

The TMW_RTF_TO_THDL_WYLIE test baselines changed because
I fixed (a while ago) some inconsistencies between the EWTS standard and
Jskad.

Conversion of TibetanMachineWeb8.40, @#, to Wylie now works correctly.

Unfortunately, though, typing @# doesn't produce 8.40, it still produces
8.38 and 8.39, two glyphs.
2003-05-28 00:40:59 +00:00
dchandler
ec7fec695f Added some automated JUnit tests for TMW_RTF_TO_THDL_WYLIE. 2003-05-18 17:17:52 +00:00
dchandler
e2a9720d9b I've added a command-line converter,
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE.  It converts RTF files
consisting of TMW characters to the corresponding THDL Extended Wylie.

It supports --find-some-non-tmw mode, which allows you to ensure that no
unusual characters will spoil the conversion.  The converter has built-in
intelligence that allows it to handle Tahoma '{', '}', and '\\' characters
properly.

The converter works on mixed Roman/TMW also, but --find-some-non-tmw
and --find-all-non-tmw modes are not as useful.

Invoke org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE, which resides in
Jskad's jar, with no command-line options to see usage information.
2003-05-18 14:14:47 +00:00
dchandler
ecf61bc892 A DuffPane is now a TibetanPane. A TibetanPane is much more lightweight
but does line breaks correctly.  I.e., I refactored DuffPane into two classes.

I did this trying to track down a subtle bug in line breaking: 'gye ' breaks
after 'gy' sometimes, with the dreng bo on the next line, but only when you
resize the window certain ways, and only in Savant (and maybe QD and the
translation tool, I don't know) but not in Jskad.

I was not successful in finding the bug, but it still exists when I use
TibetanPanes instead of DuffPanes in org.thdl.savant.tib.*.
2002-11-08 04:11:42 +00:00
dchandler
abcf8f19b3 Factored TibetanDocument into two classes, one that is a
DefaultStyledDocument, and another consisting entirely of static utility
methods for processing Tibetan text.  Moved TibetanDocument.DuffData
into its own class.

I think this makes things a bit more transparent, and gets us a little closer to
making clean use of Swing.
2002-11-02 03:38:59 +00:00
dchandler
403f21c8db Added Javadoc files overview.html and several package.html files.
Added a "Quit" option to Savant's File menu.  Factored out the Close
option in doing so.

Exceptions in many action listeners are now handled by
org.thdl.util.ThdlActionListener or org.thdl.util.ThdlAbstractAction.

Many exceptions that we used to just log now optionally cause aborts.
This option is on by default for developers using 'ant savant-run'-style
targets, but it is off for users.

An erroneous CLASSPATH now causes a useful error message in almost
all situations.

Fixed some typos and bad links in Javadoc comments.

Added a simple assertion facility, but the overhead is suffered even in
release builds.

Factored out the code that sets up log files like savant.log and jskad.log.
2002-10-06 18:23:27 +00:00
dchandler
c6d6116ff2 Initial revision 2002-09-23 23:15:39 +00:00