Commit graph

406 commits

Author SHA1 Message Date
dchandler
689c1910aa To deal with javax.swing.text.rtf bugs regarding hexadecimal escape
sequences, I've created RTFFixerInputStream.  It turns illegal hexadecimal
escapes into Unicode escapes.
2003-06-29 02:30:08 +00:00
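The rewrite that commit 689c1910aa describes can be pictured with a small sketch. It is not the project's RTFFixerInputStream: the class and method names below are hypothetical, it works on a String rather than a stream, and it converts every `\'hh` escape because the log does not say which escapes count as illegal. It only shows the arithmetic that a neighboring entry gives as an example (`\'97` becomes `\u151`), and it ignores details a real fixer would have to handle, such as escaped backslashes and any fallback character the RTF reader expects after a Unicode escape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; NOT the project's RTFFixerInputStream.
class RtfHexEscapeSketch {
    // An RTF hexadecimal escape: backslash, apostrophe, two hex digits (e.g. \'97).
    private static final Pattern HEX_ESCAPE =
            Pattern.compile("\\\\'([0-9a-fA-F]{2})");

    // Rewrites each hex escape as a decimal Unicode escape (backslash, 'u', number).
    static String fixHexEscapes(String rtf) {
        Matcher m = HEX_ESCAPE.matcher(rtf);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int value = Integer.parseInt(m.group(1), 16); // hex 97 is decimal 151
            // quoteReplacement keeps the backslash literal in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement("\\u" + value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fixHexEscapes("tsheg\\'97bar"));
    }
}
```

Doing this before the RTF reader sees the text matches the approach the log argues for: the input is repaired up front instead of looking up dropped glyphs afterwards.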
dchandler
0b849aed97 Fixed comments w.r.t. javadoc warnings. 2003-06-29 02:22:20 +00:00
dchandler
4e279defb4 Fixed a couple of array bounds checks.
Added support for two more oddballs.

Deprecated the oddball lookup method because it drops up to 30 glyphs in
TibetanMachine.  The correct solution is to transform the RTF before Java's
busted RTF readers ever see it.  \'97 becomes \u151, e.g.
2003-06-28 16:33:58 +00:00
dchandler
2a359c45ef Bad conversions were not leaving the unconvertible characters at the
beginning of the document as they should and as they are documented to.

They now do, and they bracket the bad characters with the TM or TMW for
U+0F3C on the left and the TM or TMW for U+0F3D on the right.

Some cleanup.
2003-06-28 16:20:19 +00:00
dchandler
c39d8d6326 My earlier code cleanup introduced this bug; TMW->TM conversion was
busted.
2003-06-26 22:48:51 +00:00
dchandler
25510542b2 Now with a nicer error message in one case. 2003-06-26 22:48:05 +00:00
dchandler
c34259b105 Code cleanup. 2003-06-25 01:04:24 +00:00
dchandler
9e6c3009ac Added an About button. Code cleanup. Changed the Cancel button to the
Close button.
2003-06-25 00:49:11 +00:00
dchandler
569fba6467 Made the comments in the my_thdl_preferences.txt file use standard line
separators.
2003-06-25 00:03:46 +00:00
dchandler
0f3c4174b6 Made the comments in the my_thdl_preferences.txt file more useful. 2003-06-24 23:48:00 +00:00
dchandler
c67ddb2d6c Use Ximalaya, not Arial Unicode MS, by default. 2003-06-24 12:51:32 +00:00
dchandler
33beb7b782 Bye bye debugging output. 2003-06-24 12:23:37 +00:00
dchandler
f547734043 Added Than's converter GUI code; adapted it to work with Jskad's
converters.

TMW->Unicode now uses Ximalaya by default.
2003-06-24 03:02:29 +00:00
dchandler
19d7cabfe6 Forget the final=faster myth. 2003-06-24 03:01:13 +00:00
dchandler
917864574c Fixed a logic bug in mapTMWtoTM and mapTMtoTMW.
You can now specify which Unicode font to use via 'java
-Dthdl.tmw.to.unicode.font=Ximalaya ...'.
2003-06-23 01:58:11 +00:00
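A minimal sketch of how a `-D` option like the one in commit 917864574c can be read. The property name comes from the commit message; the class name is hypothetical, and the Ximalaya fallback is an assumption based on the later entries that make Ximalaya the default. The project's own option handling may well differ.

```java
// Hypothetical class name; the property name comes from the commit message.
class UnicodeFontOptionSketch {
    static final String PROPERTY = "thdl.tmw.to.unicode.font";
    // Assumed default, taken from later log entries.
    static final String DEFAULT_FONT = "Ximalaya";

    // Returns the font named on the command line, or the default if none was given.
    static String unicodeFontName() {
        return System.getProperty(PROPERTY, DEFAULT_FONT);
    }

    public static void main(String[] args) {
        System.out.println("TMW->Unicode font: " + unicodeFontName());
    }
}
```

Running it as `java -Dthdl.tmw.to.unicode.font=SomeFont UnicodeFontOptionSketch` would report SomeFont instead of the default.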
dchandler
b6d8fd89f9 When errors in (all but TMW->Wylie and Wylie->TMW) conversion occur,
the troublesome glyphs are now put at the beginning of the document
AFTER AN ACHEN.  This makes a glyph like \tmw7095 visible atop the
achen.

Major fix to the handling of paragraphs in conversion; we were (for
whatever reason) dropping paragraphs before.
2003-06-23 01:24:02 +00:00
dchandler
1f4343bed0 TMW->TM, TM->TMW, and TMW->Unicode conversions are all (at least 2)
orders of magnitude faster.
2003-06-22 22:10:58 +00:00
dchandler
afe73c2228 The pseudo-file '-', referring to standard input, is now accepted as a
command-line argument.
2003-06-22 21:05:16 +00:00
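The '-' convention from commit afe73c2228 is commonly implemented along these lines. This is a sketch under assumptions, not the converter's actual code: the class and method names are hypothetical, and wrapping the checked exception is a choice made here to keep the example short.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper; shows the usual handling of the pseudo-file "-".
class StdinArgumentSketch {
    // Returns standard input for "-", otherwise opens the named file.
    static InputStream openInput(String arg) {
        if ("-".equals(arg)) {
            return System.in;
        }
        try {
            return new FileInputStream(arg);
        } catch (FileNotFoundException e) {
            // Rethrown unchecked so callers in this sketch need no throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream in = openInput(args.length > 0 ? args[0] : "-");
        System.out.println(in == System.in ? "reading standard input" : "reading a file");
    }
}
```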
dchandler
900f7492b0 'ant clean check' was failing because I hadn't updated the
--find-some-non-tmw and --find-all-non-tmw baselines.

Code cleanup.
2003-06-22 16:11:58 +00:00
dchandler
66287f3cc9 Small TMW->Wylie performance improvements. TMW->Wylie is *much*
faster than TMW->Unicode etc.; this is because many fewer replacements
are made (i.e., more text is replaced each time a replacement is
performed).

I must find a way to still preserve formatting but do many fewer
replacements in TMW->{Unicode,TM} and TM->TMW.
2003-06-22 04:32:59 +00:00
dchandler
6540b260bd Fixes a (small, I think) TMW->Unicode performance glitch. I was
inserting 5 characters at a time and then skipping ahead just one
position.  I don't think this affected correctness.

I believe there's still a terrible (exponential?) slowdown as the
input file gets bigger, however.  Perhaps not -- but we run through
the first 1000 TMW glyphs in 6 seconds, but the 20th thousand takes at
least 60 seconds.  Is TMW->Wylie faster than TMW->Unicode?  If so,
why?

Thought: don't use a DuffPane within TibetanConverter -- it can only
add overhead, right?  My hprof profile said that the conversion was
taking just a couple of percent of the work; the rest was going to
display-related stuff that you should only see if you were displaying
the document.  I'm not!
2003-06-22 04:08:33 +00:00
dchandler
dfe64a1927 Added --find-some-non-tm and --find-all-non-tm modes to the converter to
help ensure worry-free TM->TMW conversions.
2003-06-22 00:14:18 +00:00
dchandler
80101666c7 Included a fix from WylieWord's tibwn.ini. Removed some needless trailing
tildes.
2003-06-21 02:35:21 +00:00
dchandler
9a41f512d9 It used to be the case that you could select 'Close', and then when asked
"do you want to save?" you could press yes and then press cancel and
Jskad would still exit.  That's no longer the case.

Added File->Exit to Jskad.
2003-06-21 02:07:51 +00:00
dchandler
45b87b0fb4 In Jskad, you can now clear the preferences and return to default values. 2003-06-21 01:26:17 +00:00
eg3p
fbb6245fdb Added cut() and copy() methods to override JTextPane's methods of same name. 2003-06-20 15:27:20 +00:00
dchandler
5067683121 Edward corrected me; he had intended to have M map to 7.91, not 7.90. 2003-06-17 01:46:19 +00:00
dchandler
6712b47e13 Added an option to control the Unicode font for TMW->Unicode
conversions.
2003-06-15 20:28:56 +00:00
dchandler
ced830a7d3 Renamed TMW_RTF_TO_THDL_WYLIE TibetanConverter. 2003-06-15 19:19:23 +00:00
dchandler
34a7b5da9b This converter now performs TMW->Unicode conversions. 2003-06-15 18:38:42 +00:00
dchandler
da70434e52 Jskad now allows for TMW->Unicode conversion. 2003-06-15 16:27:36 +00:00
dchandler
af5b95b08d A TMW->Unicode table is here. Note these issues, however:
Is the EWTS '_' to be represented as U+0020, or is it a wider space?

Does TMW9.42, Dza, map to U+0F5F,U+0F39?

Does TMW6.60, r+y, map to U+0F62,U+0FBB or to U+0F6A,U+0FBB?  (Likewise with r+w, TMW6.61, TMW6.62, etc.)

Is U+0F7E a bindu?  What Unicode does TMW7.96 map to, for example?  What does TMW7.91 map to?

Should TMW8.97 and TMW8.98 map to swastikas elsewhere in Unicode?  If so, which codepoints?  Likewise with TMW9.60, a Chinese character.

Does TMW7.68 map to U+0F39?

Does TMW7.74, the ITHI secret sign, have a Unicode mapping?  f68,fa0,f80,f72 comes close, but fa0 would be too large, wouldn't it?

What Unicode does TMW9.61 map to?  Is it for sequences like f40,f7c,f60,f72?  Or is it for f60,f72,f7c?
2003-06-15 03:25:45 +00:00
dchandler
b387c512e9 Fixed two bugs. 2003-06-15 03:08:57 +00:00
dchandler
189fef9aec Made Jskad smart enough to handle a few more EWTS characters; some
it can only convert to Wylie, others are live key sequences.  This will make
converting the shechen documents go more smoothly.
2003-06-09 13:35:43 +00:00
dchandler
09a55110b7 Handles more TibetanMachine oddballs. 2003-06-09 02:01:13 +00:00
dchandler
b9219640e5 Handles more TibetanMachine oddballs. 2003-06-09 01:53:01 +00:00
dchandler
e97e1c8464 Handles more TibetanMachine oddballs. 2003-06-09 01:20:32 +00:00
dchandler
651a599188 Fixed usage info. 2003-06-08 23:23:12 +00:00
dchandler
70b31558fa Tried to fix a crashing bug that happened when you converted TM->TMW
and then tried to convert that TMW to Wylie.  I swear it's Java's
problem (see the ugly stack trace in the code and decide for
yourself), and I tried replacing rather than
inserting-and-then-removing, but it didn't work.  I've left these
things as options.
2003-06-08 23:12:52 +00:00
dchandler
212414edef TMW_RTF_TO_THDL_WYLIE now converts TM->TMW. 2003-06-08 22:43:27 +00:00
dchandler
32831b698f If bad (oddball) TM glyphs appear, then converting to TMW causes, by
default, all oddballs to appear once in the resulting document.
This'll help me find the correct glyphs for the oddballs, and it'll
prevent the average user from converting a document with oddballs.
2003-06-08 22:37:38 +00:00
dchandler
d45f5ab8c8 Improved performance (I suppose). 2003-06-03 23:49:34 +00:00
dchandler
7d768c9e06 Fixed a crashing bug that happened upon converting wylie to tibetan. 2003-06-03 23:45:15 +00:00
dchandler
0f724989b5 The Wylie 'M' used to map to TMW7.91, when it should map to TMW7.90.
I've fixed that.

I've also added a couple of Unicode mappings to give a flavor for how
multi-codepoint mappings will be represented.

TM->TMW conversion takes about 1 second per thousand glyphs on my
PIII-550.
2003-06-01 23:05:32 +00:00
dchandler
54ca37c824 The Wylie 'M' used to map to TMW7.91, when it should map to TMW7.90.
I've fixed that.

I've also added a couple of Unicode mappings to give a flavor for how
multi-codepoint mappings will be represented.
2003-06-01 19:14:08 +00:00
dchandler
e2caf99085 Some code cleanup.
tibwn.ini must now have, in the Unicode column, either nothing, or
0FXX(,0FXX)*.  E.g., 0F04,0F05 is valid.  Debugging code ensures this is
the case.
2003-06-01 18:09:49 +00:00
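The constraint stated in commit e2caf99085 maps directly onto a regular expression. The sketch below is one way to check it, assuming that each X stands for a hexadecimal digit; the class and method names are hypothetical and this is not the project's debugging code.

```java
import java.util.regex.Pattern;

// Hypothetical validator for the Unicode column of tibwn.ini.
class UnicodeColumnSketch {
    // Either empty, or 0FXX followed by zero or more ",0FXX" groups.
    // Assumption: each X is one hexadecimal digit.
    private static final Pattern VALID =
            Pattern.compile("(0F[0-9A-Fa-f]{2}(,0F[0-9A-Fa-f]{2})*)?");

    static boolean isValid(String column) {
        return VALID.matcher(column).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("0F04,0F05")); // the example from the commit message
        System.out.println(isValid("0F04,"));     // trailing comma, not allowed
    }
}
```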
dchandler
1f6bb07d53 Fixes bogus Unicode mappings mentioned in
http://sourceforge.net/tracker/index.php?func=detail&aid=746871&group_id=61934&atid=502515.
2003-06-01 04:02:04 +00:00
dchandler
7a8264d87c Fixed typo. 2003-06-01 03:30:49 +00:00
dchandler
0235263ddf TM->TMW and TMW->TM conversion in RTF is now supported. I've
noticed that formatting is mostly OK but sometimes gets bungled slightly.
I tried everything I could think of, and now I'm passing the buck to Java's
RTF support.

TMW_RTF_TO_THDL_WYLIE (now misnamed) supports TMW->TM
conversion (but not TM->TMW).  There is an automated test case for a
TMW->TM conversion.

I have full confidence in this conversion.  Even the smallest glitch in the core
functionality (not formatting) would surprise me.

Note that the JUnit test TMW_RTF_TO_THDL_WYLIETest sometimes fails
due to one- or two-line diffs between the actual and expected outputs.  This
is because Java's RTF support is not deterministic, I'm guessing, and is not
a real failure.  I'm too lazy to make a more elaborate sed/diff mechanism
that works on all platforms, and that would complicate the build anyway.
2003-05-31 23:21:29 +00:00
dchandler
bfacd6c998 Accurate TM->TMW and TMW->TM mappings are now available. I've
verified this extensively and have full confidence that these mappings
agree with Tony Duff's Tibetan! 5.1 documentation (except as described
below).

To get them, I had to disregard Tony Duff's tables for a few glyphs: the
characters with ordinal 32 and 45 (space and hyphen in Roman ASCII,
space and tsheg in Tibetan).  For these glyphs, we must have mappings
from TibetanMachineSkt4.32 to something, etc., and those mappings were
not present.  I've normalized the mapping for these glyphs, as it is arbitrary
because the same two glyphs just appear fifteen times each.
2003-05-31 20:13:15 +00:00
dchandler
a4bc23a9ab Made performance improvements, doc improvements, and code cleanup to
DuffCode.
2003-05-31 17:02:06 +00:00
dchandler
08d2ea3e2d Jeff C. H. Wu found a bug whereby typing 'cuig' just after starting Jskad fails
(by producing 'cug') although typing 'kcuig' succeeds.

This is now fixed, and test cases now exist to ensure that the problem
doesn't reappear.
2003-05-31 12:58:36 +00:00
dchandler
bc9a8f4754 Jeff C. H. Wu found a bug whereby typing 'cuig' just after starting Jskad fails
(by producing 'cug') although typing 'kcuig' succeeds.

This is now fixed.
2003-05-31 12:49:44 +00:00
dchandler
6f0390c5d6 By default (controllable via options.txt), Jskad now fixes the Tahoma curly
brace problem upon opening any RTF document.

The TMW_RTF_TO_THDL_WYLIE test baselines changed because
I fixed (a while ago) some inconsistencies between the EWTS standard and
Jskad.

Conversion of TibetanMachineWeb8.40, @#, to Wylie now works correctly.

Unfortunately, though, typing @# doesn't produce 8.40, it still produces
8.38 and 8.39, two glyphs.
2003-05-28 00:40:59 +00:00
dchandler
a144b125ca I've made Jskad adhere to the THDL Extended Wylie spec. Some
punctuation has changed {@, #, %, and $}.

Fixed some errors in tibwn.ini so that all the TM<->TMW mappings are
correct.
2003-05-26 13:11:51 +00:00
dchandler
ec7fec695f Added some automated JUnit tests for TMW_RTF_TO_THDL_WYLIE. 2003-05-18 17:17:52 +00:00
dchandler
e2a9720d9b I've added a command-line converter,
org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE.  It converts RTF files
consisting of TMW characters to the corresponding THDL Extended Wylie.

It supports --find-some-non-tmw mode, which allows you to ensure that no
unusual characters will spoil the conversion.  The converter has built-in
intelligence that allows it to handle Tahoma '{', '}', and '\\' characters
properly.

The converter works on mixed Roman/TMW also, but --find-some-non-tmw
and --find-all-non-tmw modes are not as useful.

Invoke org.thdl.tib.input.TMW_RTF_TO_THDL_WYLIE, which resides in
Jskad's jar, with no command-line options to see usage information.
2003-05-18 14:14:47 +00:00
dchandler
17ea8fdf2a Copying from Word XP used to crash Jskad sometimes. Now you get a
dialog box telling you something about RTF support in Java.
2003-05-15 01:41:56 +00:00
dchandler
78dc46a979 Jskad keyboards are now configured via keyboards.ini, a file that has
comments that explain its function.  It's quite simple.  This is in
response to Jeff C. H. Wu's request.
2003-05-14 03:25:36 +00:00
dchandler
dcb36ec338 Clearer status message; cleanup. 2003-05-14 02:37:28 +00:00
dchandler
8958366a07 Bad RTF now causes an error message to appear in the transcription
instead of causing a fatal exception.  The error allows you to look up
the DuffCode that caused the trouble.
2003-05-14 01:37:49 +00:00
dchandler
8275afeb41 Bad RTF files cause a polite error message to appear instead of an
exception to be thrown.

Jskad windows now always have "Jskad" in their window titles.
2003-05-14 01:34:39 +00:00
eg3p
3e847ed009 DELETE was not working properly in Roman entry mode.
Now it works ok.
2003-04-17 19:48:22 +00:00
amontano
0bacdcc229 fixed the paste problem for the translation tool 2003-04-17 11:12:59 +00:00
dchandler
59175ccfd6 Added a few tests for the ACIP keyboard, which I've improved a bit.
Noted some failures.  "Fixed" the code to do what I want it to do for
the (no sanskrit stacking, tibetan stacking) case [which is exercised
by this keyboard only].
2003-04-14 23:55:00 +00:00
dchandler
efa8fc1f25 DuffPane now has the start of a unit test suite. Invoke it via 'ant
clean check'.  Right now there are tests to ensure that typing certain
sequences of keys in the Extended Wylie keyboard gives the expected
Extended Wylie back when "Tools/Convert Tibetan to Wylie" is invoked.

The syntactically illegal d.wa now converts to Tibetan and then back
to d.wa (not dwa, as it did); likewise with the illegal g.wa.  wa
doesn't take any prefixes, but I prefer clean end-to-end
behavior. (jeskd doesn't go end-to-end, though.)

Note that you cannot successfully run the DuffPane tests on a Linux
box unless your DISPLAY variable is set correctly.  Thus, my nightly
builds will fail with an Error (as opposed to a Failure).
2003-04-14 05:22:27 +00:00
dchandler
6636d03a41 ant private-javadocs runs without warnings; cleaned up some
as-yet-unused code.
2003-04-13 01:46:20 +00:00
dchandler
644c0d3801 Updated the HTML help file; removed some useless code. 2003-04-13 01:17:10 +00:00
dchandler
daacf6ee3b I've got too many sandboxes, so I'm committing these changes,
half-done, from one sandbox so as to consolidate my sandboxes.
2003-04-12 20:56:20 +00:00
dchandler
6e05b60cff I'll need these when I turn a sequence of UnicodeGraphemeClusters into
LegalTshegBars.
2003-04-12 20:19:02 +00:00
dchandler
66e34aadfd Code cleanup -- removed cruft. 2003-04-12 16:28:56 +00:00
dchandler
cbccfc5277 Fixed bug 718207. 'byungs now converts from Tibetan to Wylie
correctly.
2003-04-10 02:14:15 +00:00
amontano
bc8b5f724b nothing 2003-04-08 13:28:38 +00:00
eg3p
995817eb98 no message 2003-04-08 12:14:03 +00:00
dchandler
7dd67bbf6a Now turns Tibetan into pa'am, not pa'm. Works with or without vowels
in the part preceding the 'am or 'ang, overcoming the inconsistency
that I'd put here for a short time.
2003-04-08 04:56:40 +00:00
dchandler
eb71fb6075 "sgom pa'am " is correct, not "sgom pa'm ". 2003-04-07 23:49:07 +00:00
eg3p
df4f8b8a45 processRomanChar now sets aside formatting
like TAB, ENTER, etc.
2003-04-07 19:41:48 +00:00
eg3p
275cf9d79d Improved handling of backspace based on my understanding
of various known Java bugs. Those who mess around with
backspace take note of the following:

The Java bug database has several related bugs concerning the treatment
of backspace. Here I adopt a solution based on the fix for bug 4402080:
Evaluation  The text components now key off of KEY_TYPED with a keyChar == 8 to do the
deletion. The motivation for this can be found in bug 4256901.
xxxxx@xxxxx 2001-01-05
2003-04-07 16:41:49 +00:00
amontano
e7684dedcd nothing 2003-04-05 00:03:44 +00:00
amontano
341bea3c16 Added a line to the paste method so that if text is selected, the pasted text replaces the selected text. 2003-04-03 05:17:40 +00:00
amontano
5423bc19d4 Updated the clipboard calls to DuffPane. Ed: there are some mistakes that didn't happen before. There are certain combinations that use a header letter that when pasted from the DuffPane to the DuffPane fail. Try writing "rgyas", copying it and pasting it beside it. 2003-04-03 05:16:14 +00:00
eg3p
7a495bc720 Made the following changes: (1) renamed DuffPane's copySelection, pasteSelection, etc. to copy, paste and so forth, which override JTextComponent's methods by those names: Andres, please change the translation tool accordingly to use these new methods if that is necessary; (2) in order to allow for easier integration of Jskad with other tools such as QuillDriver, I changed DuffPane to rely on a Keymap instead of a KeyListener for its default key intercepts; this addresses the comments to bug 617156. Note that I have been working on Mac OS X and have not extensively tested my changes on a PC yet. 2003-04-02 20:37:14 +00:00
amontano
2250e03766 Updated copyright and version info. 2003-04-01 13:38:16 +00:00
amontano
a7a573020f Renamed LinkedList to SimplifiedLinkList and moved it from org.thdl.tib.scanner to org.thdl.util. This linked list was implemented because the VM running on handhelds does not include java.util.LinkedList. 2003-04-01 13:08:38 +00:00
dchandler
d836b850e8 "sgom pa'm ", not "sgom pa'am", is now used. "pe'm " was being
produced already, so the code was inconsistent.  If it turns out that
"pe'am " is preferred, I'll fix it later.  Consistency is very
appealing.
2003-03-31 01:38:27 +00:00
dchandler
33b3080068 Fixed a bunch of bugs; supports le'u'i'o, sgom pa'am, etc.
Better tests.  As part of that, I had to break TibetanMachineWeb into
TibetanMachineWeb+THDLWylieConstants, because I don't want the
class-wide initialization code from TibetanMachineWeb causing errors
in LegalTshegBarTest.
2003-03-31 00:33:50 +00:00
dchandler
1987f7d80a b-r-g, b-l-g-s, etc., when converted from Tibetan to Wylie, give
correct, unambiguous Wylie.
2003-03-30 21:49:55 +00:00
amontano
8565855dd1 Now the handheld version supports both portrait and landscape. 2003-03-30 17:09:09 +00:00
dchandler
f9670233ba Removed documentation FIXMEs from this code; did away for good with
some really iffy code that I think was behind the "Tibetan->Wylie
conversion fails when keyboard isn't Extended Wylie" bug.
2003-03-30 16:13:00 +00:00
dchandler
58f7371e66 Revamped the "Tools>Convert Tibetan To Wylie" feature that
converts TibetanMachineWeb glyphs to THDL Wylie.  Three-glyph and
four-glyph sequences with implicit "a" vowels are now handled
correctly, except for disambiguation w.r.t. things like b-la-g
vs. bla-g and d-wa vs. dwa.

pa'am, pa'ang etc. now work too.

Illegal Tibetan sequences now become very ugly, but "correct" Wylie.
Correct in the sense that converting it back to glyphs should get you
the glyphs you started with.

I also made a change to TibetanMachineWeb.java that I hope will clear
up problems with this feature when keyboards other than "Extended
Wylie" are selected.

Took nga out of the farRightSet [postsuffixes]; only da and sa belong
there, right?

I tried to get the system in a state such that I could run automated
tests of this stuff, but I ran into difficulties.  I have some manual
test cases; ask if you're interested.
2003-03-30 02:31:16 +00:00
dchandler
2b81020b0e More and better tests; fixed some bugs in LegalTshegBar. 2003-03-28 03:49:49 +00:00
amontano
35a9869aac 1. Fixed parsing error
2. Added support for extreme uses of 'a' like le'u'i'o
3. Now correctly parses syllables that have the particles "ang" and "am" added to them. The second works only in "roman script" mode. The converter from Tibetan script to roman script does not convert these combinations correctly ("pa'ang" is wrongly converted into "pa'ng" and "pa'am" into "pa'ma").
2003-03-23 20:27:54 +00:00
dchandler
08d2a5d702 Added a test for org.thdl.tib.text.tshegbar.UnicodeCodepointToThdlWylie. 2003-03-22 04:55:17 +00:00
dchandler
f2dcb0cbc3 I said I removed this earlier; I lied. Now it's gone. 2003-03-22 03:58:13 +00:00
dchandler
16cbfb6033 Moved ad-hoc test.java test cases to UnicodeGraphemeClusterTest.java,
a JUnit test which can be run via 'ant check'.  Removed test.java and
its build process.
2003-03-22 03:55:39 +00:00
dchandler
395eca7bb1 Moved ad-hoc test.java test cases to LegalTshegBarTest.java, a JUnit
test which can be run via 'ant check'.
2003-03-22 03:46:32 +00:00
dchandler
879b477902 Made some ad-hoc tests in test.java into JUnit tests, run by 'ant
check'.

NORM_NFD was replaced with NORM_NFKD in three cases in testMostlyNFKD.
2003-03-22 03:24:56 +00:00
dchandler
12eb7cf4cf Updated this file, which is used in making the Javadocs. 2003-03-22 03:14:02 +00:00
dchandler
3da52c5f0c Removed these files, which are for Savant and QuillDriver. They're
still in the CVS Attic if you need them.
2003-03-22 03:10:59 +00:00
dchandler
1e326bb06d Removing these QuillDriver leftovers. They're still in the CVS Attic,
if anyone needs them.
2003-03-22 02:38:24 +00:00
eg3p
fed25e27ee No longer necessary now that Savant & QuillDriver
have been moved out of THDL Tools.
2003-03-14 00:33:24 +00:00
eg3p
715203a12e Savant and QuillDriver are being removed from THDL Tools
and moved to a new site: Tools for Field Linguistics. So I removed
the Savant & QD related options.
2003-03-14 00:28:28 +00:00
eg3p
c280d0fc96 Savant and QuillDriver are being removed from THDL Tools
and moved to a new site: Tools for Field Linguistics.
2003-03-13 20:00:51 +00:00
eg3p
6cc0c5e99b Savant and QuillDriver are being removed from THDL Tools
and moved to a new site: Tools for Field Linguistics.
2003-03-13 19:57:12 +00:00
eg3p
4dff3b4ae0 Added new words 2003-03-12 13:01:16 +00:00
eg3p
a98849d3eb QD as XML editor. More details later. 2003-03-12 12:48:18 +00:00
eg3p
4070c5ccee Latest QD 2003-03-12 12:46:44 +00:00
dchandler
9e0dc68d12 Feature Request 697358 is done. The working directory for Jskad is
now a preference.

In addition, Jskad now raises an error dialog when you try to "Save
As" to a bad place or open a file that doesn't exist or isn't
readable.
2003-03-11 01:03:19 +00:00
dchandler
aa144dd599 Javadoc 1.4.1_01 no longer has a single warning about this package. 2003-02-03 01:36:56 +00:00
dchandler
c379db6ff5 Javadoc 1.4.1_01 no longer has a single warning about this file as we
use &#64; to represent the at sign @.
2003-02-03 01:36:08 +00:00
dchandler
e6a10d052f Added a "Help" menu item that pulls up jskad_doc.html, which is now put
into Jskad's JAR file.

Doing so required that I cut out a lot of fancy HTML code.  The correct fix
is to use XML to store the meat and then use XSL to generate two forms of
HTML: one dumb enough for Java, one for use on the THDL tools website.
2003-02-01 06:42:07 +00:00
dchandler
cf279bb620 Added a JScrollPane that views a noneditable HTML file found inside a JAR
file.
2003-02-01 06:37:32 +00:00
dchandler
bde0cc8381 Slapped on copyright boilerplate. 2003-02-01 05:52:03 +00:00
dchandler
a1f6b9e117 Each class's author is now listed as Than. 2003-02-01 05:38:48 +00:00
dchandler
d453e801ef Windows directory separators (backslashes) have been replaced with
java.io.File.separatorChar.  This means tibbibl puts its temporary
files under Jskad/bin in my Linux sandbox.
2003-02-01 05:30:22 +00:00
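The portability fix in commit d453e801ef amounts to building paths from the platform separator instead of a hard-coded backslash. A minimal sketch, with a hypothetical class and method name; the path components are taken from the commit message.

```java
import java.io.File;

// Hypothetical helper; joins path components with the platform's separator.
class SeparatorSketch {
    static String joinPath(String... parts) {
        // File.separatorChar is '\\' on Windows and '/' on Unix-like systems.
        return String.join(String.valueOf(File.separatorChar), parts);
    }

    public static void main(String[] args) {
        System.out.println(joinPath("Jskad", "bin"));
    }
}
```

On Linux this prints a path with forward slashes, which is why the temporary files end up under Jskad/bin there.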
dchandler
72ee4fc7d2 Added the initial version of Tibbibl, which Nathaniel Garson of UVa
e-mailed to me.  Tibbibl is an editor for XML-based bibliographies of
Tibetan texts.  All I did was change the package from org.thdl.xml to
org.thdl.tib.bibl and add boilerplate; no changes to Than's code were
made.

Tibbibl features a diacritic input tool which Jskad might want to
swipe.
2003-02-01 05:08:02 +00:00
dchandler
190a3d9b60 achen must appear before a vowel. 2003-01-05 05:58:32 +00:00
dchandler
fcb75c55eb Small performance improvement involving String.intern(). Plus a
little bit of code cleanup.
2003-01-05 05:57:44 +00:00
dchandler
e5a63df1c1 Added a class skeleton that may not stay for long.
I'm committing in order to sync with my laptop, really.  This stuff will disappear
and reappear in better form later, after a holiday of coding and eggless,
alcohol-free nog.
2002-12-20 04:46:13 +00:00
dchandler
fdfedb4419 Added some tests for org.thdl.tib.text.tshegbar. These tests are preliminary,
and for this package only.

I'm committing in order to sync with my laptop, really.  This stuff will disappear
and reappear in better form later, after a holiday of coding and eggless,
alcohol-free nog.
2002-12-20 04:34:56 +00:00
dchandler
7ea185fa01 Renamed UnicodeCharToExtendedWylie to
UnicodeCodepointToThdlWylie.java.

Added a new class, UnicodeGraphemeCluster, that can tell you
the components of a grapheme cluster from top to bottom.  It does not
yet have good error checking; it is not yet finished.

Next is to parse clean Unicode into GraphemeClusters.  After that comes
scanning dirty Unicode into best-guess GraphemeClusters, and scanning
dirty Unicode to get nice error messages.
2002-12-17 13:51:18 +00:00
dchandler
8e8a23c6a6 Extended Wylie is referred to as THDL Extended Wylie or THDL Wylie
because a Japanese scholar has an "Extended Wylie" also.

NFKD and NFD have a new brother, NFTHDL.  I wish there weren't a need,
but as my yet-to-be-put-into-CVS break-unicode-into-grapheme-clusters code
demonstrates, the-need-is-there.  forgive-me for the hyphens, it's late.
2002-12-15 06:57:32 +00:00
dchandler
a42347b224 Now uses terminology from the Unicode standard. No more talk of
characters, for example.

Normalization forms NFKD and NFD are supported for the Tibetan Unicode
range.  I don't like either, actually.  I've tested NFKD, but I've not yet
committed the tests.
2002-12-15 03:35:24 +00:00
eg3p
3199ff7926 There are two classes here. One renders XML transcripts in JTextPane, and the other uses XPath to navigate the transcripts. Neither is part of the build yet. I'll document them more fully later when I've got to a point where they are worth sharing. 2002-12-12 15:17:42 +00:00
eg3p
86c2374706 New QD files that don't do anything yet. 2002-12-10 20:53:55 +00:00
dchandler
26993a5093 So that Unicode escape sequences appear correctly in javadocs. 2002-12-09 02:35:39 +00:00
dchandler
2d6c8be804 So that Unicode escape sequences appear correctly in javadocs. 2002-12-09 02:29:09 +00:00
dchandler
22c6ec5406 Javadoc now works without warnings. 2002-12-09 01:48:34 +00:00
dchandler
f4a16f8e9d This commit is for my benefit only; these classes are not ready for prime time,
and the build system is not yet aware of them.

I'm adding some classes for representing legal tsheg-bars (syllables, for the
most part) in Unicode.  These classes were designed bottom-up (OK, OK --
they weren't designed designed, but I had to write down everything I knew
about Tibetan syntax somewhere).  The classes are aware of extended
wylie.  I doubt the Javadocs work yet, and I'm still testing (and am not
committing my testing code with these as it is not yet ready).

Next on my list--fix these up to reflect my new awareness of suffix particles
(like le'u'i'o) and add classes to support syntactically incorrect Unicode
sequences.  Then add a UnicodeReader, and we've got the back end of
a Tibetan Unicode shaping system (like half of MS's Uniscribe or Apple's
Worldscript or FreeType Layout or Omega's OTPs).

A top-down design would not have included LegalTshegBar.  But now that
my itch has been scratched, potential uses are lingering about.  For example,
it would be nice to scan some input and break it into LegalTshegBars,
punctuation/marks/signs, and illegal stacks.  Then we could alert the client
of the illegality, its precise form, and its precise location.

The real system for turning a Unicode stream into an internal representation
suitable for conversion to EWTS/ACIP/XHTML/what-have-you need not be
aware of Tibetan syntax.  But to make the very best conversion from
Unicode to, e.g., EWTS, it is necessary to know that gaskad is better
represented as gskad, but that jaskad is not the same as jskad.
2002-12-09 01:02:23 +00:00
dchandler
03688b6137 Dza, fa, and va go together, not dza, fa, and va. 2002-12-08 21:08:58 +00:00
dchandler
53aa2e2309 Added jskad_doc.html (a revision of which is up at
http://iris.lib.virginia.edu/tibet/tools/jskad_doc.html) to the repository.  The
build puts this into Jskad's JARs, but Jskad itself does not allow for viewing
it.  In Java, that's a ten-minute job, but I haven't done it.
2002-12-07 17:53:24 +00:00
eg3p
9eedfcd909 This is Tashi's TibetanSyllable class for sorting Wylie Tibetan.
It does not have many methods for determining the root letter, suffix,
and so on, but these should be easy to add. David, please use this
class to the extent that it and your new work overlap.
2002-12-05 01:48:41 +00:00
eg3p
d14aa87fda Removed egotistical self-reference from About Jskad text. 2002-12-04 16:02:16 +00:00
eg3p
1aad72f81b Just testing cvs commit from my Mac. 2002-12-04 15:16:40 +00:00
eg3p
15404dc3b1 String resources for Spanish version of software. 2002-12-02 20:39:43 +00:00
amontano
569b2bb608 changed them to public 2002-11-29 08:08:54 +00:00
michel_jacobson
f869f054b7 new applet for QT4J 2002-11-28 16:18:19 +00:00
michel_jacobson
3d215caf53 modifications to handle url in place of file and to be used with SmartQT4JApplet 2002-11-28 16:16:27 +00:00
amontano
178ffcb800 added documentation 2002-11-28 06:54:46 +00:00
amontano
c81241e309 put in comments the association of menus with shortcuts. With the shortcut sometimes it seems to copy stuff twice. Without the shortcut seems to work anyway. 2002-11-27 23:32:57 +00:00
amontano
4acb2aa77e fixed grammar mistake 2002-11-27 23:31:22 +00:00
amontano
c12088ce5d fixed the importing of dictionaries using '-' as a separator, without confusing such character with reverse vowel in the tibetanized sanskrit. 2002-11-27 23:30:44 +00:00
amontano
c13adf9d14 now having copied a selection, if you paste it over selected text, the selected text is substituted with the text being pasted. 2002-11-27 23:29:31 +00:00
amontano
93eeae2118 Fixed bug that recently made it crash.
Enabled the property thdl.rely.on.system.tmw.fonts before the production of TibetanMachineWeb HTML. This avoids the call to readInFontFiles() within the TibetanMachineWeb class (which raises an exception when it cannot find for whatever reason the fonts). The servlet doesn't need to load the fonts anyway!
2002-11-23 21:13:47 +00:00
amontano
5432168694 Fixed bug that recently appeared that made it crash.
Enabled the property thdl.rely.on.system.tmw.fonts before
the production of TibetanMachineWeb HTML. This avoids
2002-11-23 21:03:33 +00:00
amontano
b73760009c added warning against using tibetanmachineweb, while the html script is not working. 2002-11-23 01:57:00 +00:00
amontano
dbf900b08b minor change 2002-11-22 22:51:11 +00:00
michel_jacobson
a7bc9e97c0 No longer needed; it has been replaced by SmartJMFApplet.java 2002-11-19 21:26:01 +00:00
amontano
5d205ca9d9 minor changes to about window. 2002-11-19 18:47:43 +00:00
amontano
06fa7f020e added timestamp to about window. 2002-11-19 18:46:47 +00:00
amontano
1fb425c6cd Corrected a possible error with '-' being used both as the marker separating definition and definiendum and as a valid Wylie character (transliterated Sanskrit). 2002-11-19 18:46:14 +00:00
michel_jacobson
cc0097f2ae no message 2002-11-19 17:51:45 +00:00
thangarson
384ad282b5 This is the SGML version of the TIBBIBL DTD used to catalog Tibetan
texts. All the catalogs of the Literary Collection are at this time using this
DTD and a proprietary program, Dynaweb. There is also an XML version
of it, xtibbibl.dtd, which has some slight modifications.
2002-11-19 16:51:23 +00:00
dchandler
07fe242596 Very minor cleanup to fix Javadocs and make the source code more
readable; comments added.
2002-11-18 21:33:44 +00:00
dchandler
d200b03d66 Updated the build system so that you must do a cvs checkout of the
'Fonts' module inside the 'Jskad' module.  I.e., you must now have the
tree like so:

Jskad/
   source/
   dist/
   Fonts/
       TibetanMachineWeb/
   .
   .
   .

This is because the THDL tools now optionally (and by default) load
the TibetanMachineWeb fonts automatically.

Updated the build system so that the 'web-start-releases' and
'self-contained-dist' targets JAR up optional JARs to create
double-clickable, self-contained joy.  Even the TMW fonts are in the
JARs now.

Changed the strings describing two Jskad keyboards so that "keyboard"
is no longer in the description.  It's in the label next to the combo
box.

Jskad now saves preferences on exit or when the user selects a menu
item (that is there for debugging mainly) to ~/my_thdl_preferences.txt
on *nix or C:\my_thdl_preferences.txt on Win32.  I don't know the
correct Mac location.

There's a new paradigm for telling org.thdl.util.ThdlOptions that a
user preference has been changed.  If, for example, a combo box is
manipulated so that the ACIP keyboard is selected, then you must call
a certain method in ThdlOptions.
2002-11-18 16:12:25 +00:00
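The per-platform preferences location described in the commit above could be resolved roughly as follows. This is only a sketch: the class and method names are hypothetical, the operating-system check is an assumption, and it is not the code in org.thdl.util.ThdlOptions. The Mac location is left unhandled, as in the commit message.

```java
import java.io.File;

/** Hypothetical helper; not part of the THDL tools. */
class PrefsLocationSketch {
    /**
     * Picks the preferences file from an OS name and a home directory,
     * e.g. the values of the "os.name" and "user.home" system properties.
     */
    static File preferencesFile(String osName, String userHome) {
        if (osName.toLowerCase().startsWith("windows")) {
            // Fixed location on Win32, as stated in the commit message.
            return new File("C:\\my_thdl_preferences.txt");
        }
        // On *nix, the file lives in the user's home directory.
        return new File(userHome, "my_thdl_preferences.txt");
    }
}
```

A caller would typically pass System.getProperty("os.name") and System.getProperty("user.home").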
amontano
77b8c5e424 Added timestamp to about window (of all versions of translation tool except servlet). 2002-11-17 09:09:10 +00:00
dchandler
5ffb813019 Jskad's "About" dialog box now lists the time of compilation. Ant
creates source/org/thdl/util/ThdlVersion.java when you execute the
jskad-compile target.
2002-11-16 19:18:44 +00:00
michel_jacobson
872232c108 no message 2002-11-15 20:38:25 +00:00
michel_jacobson
3e71ff8351 changed to work with lacito applet - but not done 2002-11-14 21:13:08 +00:00
eg3p
c9349f6846 These files are not used. 2002-11-12 16:47:02 +00:00
eg3p
7c47e89811 This file is no longer used. 2002-11-12 16:44:20 +00:00
dchandler
ecf61bc892 A DuffPane is now a TibetanPane. A TibetanPane is much more lightweight
but does line breaks correctly.  I.e., I refactored DuffPane into two classes.

I did this trying to track down a subtle bug in line breaking: 'gye ' breaks
after 'gy' sometimes, with the dreng bo on the next line, but only when you
resize the window certain ways, and only in Savant (and maybe QD and the
translation tool, I don't know) but not in Jskad.

I was not successful in finding the bug, but it still exists when I use
TibetanPanes instead of DuffPanes in org.thdl.savant.tib.*.
2002-11-08 04:11:42 +00:00
dchandler
04da61688d A DuffPane is now a TibetanPane. A TibetanPane is much more lightweight
but does line breaks correctly.  I.e., I refactored DuffPane into two classes.

I did this trying to track down a subtle bug in line breaking: 'gye ' breaks
after 'gy' sometimes, with the dreng bo on the next line, but only when you
resize the window certain ways, and only in Savant (and maybe QD and the
translation tool, I don't know) but not in Jskad.

I was not successful in finding the bug, but it still exists when I use
TibetanPanes instead of DuffPanes in org.thdl.savant.tib.*.
2002-11-08 04:05:06 +00:00
dchandler
86e384352b Jskad's "Do you want to save your changes before you quit?" dialog is now
optional.
2002-11-08 03:58:35 +00:00
amontano
947ac5537a Updated the comments and made them a bit more consistent. 2002-11-03 17:42:11 +00:00
dchandler
d462f4e41c Fixes all known bugs with the ACIP keyboard except for one:
ACIP's 'WA' represents Wylie's 'wa', but ACIP's 'ZHVA' represents Wylie's
'zhwa'.  The key for wasur is the same as the key for the twentieth
consonant in extended Wylie, but not in ACIP.
2002-11-03 17:34:33 +00:00
dchandler
22141248e7 Terribly minor cleanup. 2002-11-03 17:05:44 +00:00
dchandler
7adfddfb43 Fixed my fix to the "Jskad freezes on impossible input" bug.
Typing 'lKU' in Extended Wylie is now equivalent to 'lU'.  I'm not sure if
this is a change or not.
2002-11-03 17:05:05 +00:00
amontano
37b29c8d33 Added comments to all class headers. Comments to individual methods will
be added as needed.
2002-11-03 08:56:11 +00:00
eg3p
b4e4decc2e Updates, including support for
internationalized strings.
2002-11-02 22:11:02 +00:00
eg3p
fab76cb82e no message 2002-11-02 22:10:12 +00:00
eg3p
392b2b180a These files have been updated for use with
Savant. That is, org.thdl.savant.SoundPanel
has been eliminated in favour of these classes,
which are shared between QD and Savant.

The main change is that SmartMoviePanels
can now communicate with the outside world,
for example to send messages to a Savant
text window telling it to update highlights.
2002-11-02 20:20:30 +00:00
eg3p
da9a576e02 New strings added. 2002-11-02 20:13:16 +00:00
eg3p
5bfaccded7 This class provides static methods for dealing
with THDL's internationalization issues.
2002-11-02 20:11:42 +00:00
eg3p
59d65bedc3 Change scrolling policy w/in Savant. Now highlighted text stays in the middle of the window instead of at the bottom. 2002-11-02 20:10:05 +00:00
dchandler
de6ae79959 Fixes bug 624133, "Input freezes after impossible character". Try 'shsM' in
ACIP or 'ShSm' in Extended Wylie to see the new behavior.

We use a trie to store valid input sequences.  In the future, we could use
the same trie as a replacement for the less efficient HashSets we use to
store characters, vowels, and punctuation.  For example, we'd use
'validInputSequences.put("K", new Pair("consonant", "k"))' when reading
in the ACIP keyboard's description of the first consonant of the Tibetan
alphabet in 'TibetanKeyboard.java'.

Note that the current trie implementation is only useful for 7- or 8-bit
transcription systems, and works best for tries with low average depth, which
describes a transcription system's trie very well.  If you used arbitrary
Unicode in your keyboard, you'd need a different trie implementation.

Improved the optional keyboard input mode status messages.
2002-11-02 18:44:24 +00:00
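For readers unfamiliar with the data structure mentioned in the commit above, here is a minimal trie sketch in modern Java. It is not the Xalan-derived implementation Jskad borrowed: the class name, the HashMap-based nodes, and the plain String values (standing in for the Pair mentioned in the message) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative trie over input sequences; hypothetical, not Jskad's class. */
class InputSequenceTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String value; // non-null only where a complete valid sequence ends
    }

    private final Node root = new Node();

    /** Registers a complete valid input sequence with an associated value. */
    void put(String sequence, String value) {
        Node node = root;
        for (char c : sequence.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.value = value;
    }

    /** Returns the value stored for exactly this sequence, or null. */
    String get(String sequence) {
        Node node = find(sequence);
        return node == null ? null : node.value;
    }

    /** True if at least one registered sequence starts with this prefix. */
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    /** Walks the trie along s; returns null as soon as a character has no child. */
    private Node find(String s) {
        Node node = root;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node;
    }
}
```

One plausible use, assuming the keyboard buffers pending keystrokes: when hasPrefix(pending + key) is false, the input can never become valid, so the handler can flush the buffer instead of waiting for more keys.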
dchandler
a6cc4a7ff3 Removed/commented out/tagged some unused local variables.
Added a JUnit test for the new Trie that fails at present since the Trie is
case-insensitive.  Running JUnit tests is not something our build system
knows about at present, but Eclipse 2.0 makes it very easy.

Fixed a few compiler errors due to imports I'd forgotten.
2002-11-02 16:01:40 +00:00
dchandler
b8391e923d Borrowed a trie implementation from Apache's Xalan 2.4.0. 2002-11-02 13:39:29 +00:00
dchandler
29042638e2 In the ACIP keyboard, 'KEE' and 'KOO', which are equivalent to Wylie's
'kai' and 'kau', now work.

The optional status messages have been improved.
2002-11-02 05:21:12 +00:00
dchandler
aa580e0bea Undoing my erroneous commit of buggy code. 2002-11-02 03:46:44 +00:00
dchandler
abcf8f19b3 Factored TibetanDocument into two classes, one that is a
DefaultStyledDocument, and another consisting entirely of static utility
methods for processing Tibetan text.  Moved TibetanDocument.DuffData
into its own class.

I think this makes things a bit more transparent, and gets us a little closer to
making clean use of Swing.
2002-11-02 03:38:59 +00:00
dchandler
5249c48807 Factored TibetanDocument into two classes, one that is a
DefaultStyledDocument, and another consisting entirely of static utility
methods for processing Tibetan text.  Moved TibetanDocument.DuffData
into its own class.

I think this makes things a bit more transparent, and gets us a little closer to
making clean use of Swing.
2002-11-02 03:33:09 +00:00
eg3p
d070e470ef Updated these files to use DuffPane instead of JTextPane and so take advantage of DLC's new line wrapping code. 2002-10-31 19:06:47 +00:00
dchandler
97c530e974 GHA and KR'i now work. 2002-10-28 05:31:19 +00:00
dchandler
1ecbfe6a7c Fixed some Javadoc comments in preparation for putting up new Javadocs
on http://thdltools.sf.net/.
2002-10-28 04:49:24 +00:00
dchandler
fd1b4dd468 Now breaks the line after the last whitespace, not the first.
I cleaned things up a bit, and I've made logging optional since I don't yet
trust the code fully.

A Wylie underscore at the end of a line is worth looking into further, at the
very least.
2002-10-28 04:12:49 +00:00
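The "break after the last whitespace, not the first" rule from the commit above can be illustrated with a small sketch. The real change works inside a Swing view and measures pixel widths; this hypothetical helper uses a character count as a stand-in for the available width, so it shows only the rule itself.

```java
/** Hypothetical illustration of the break rule; not the Jskad view code. */
class LineBreakSketch {
    /**
     * Returns the index at which to break, so that the first line is
     * text.substring(0, index). The break falls just after the last
     * whitespace character within the first maxChars characters. If there
     * is no whitespace in that range, the break falls at maxChars (clamped
     * to the text length).
     */
    static int breakIndex(String text, int maxChars) {
        int limit = Math.min(maxChars, text.length());
        // Scan backwards so the last whitespace wins, not the first.
        for (int i = limit - 1; i >= 0; i--) {
            if (Character.isWhitespace(text.charAt(i))) {
                return i + 1;
            }
        }
        return limit;
    }
}
```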
dchandler
8433369d60 Now with slightly better error handling. 2002-10-28 03:17:28 +00:00
dchandler
0ad135f8f1 This may well be a fix to the "Improper line wrapping" bug. The fix
is basically that we use our own special ViewFactory, with a new
subclass of LabelView (the view RTFEditorKit uses for the nitty
gritty) that is aware of Tibetan.

There are a couple of nasty hacks still here, and Swing's
documentation for doing what I did was quite poor.  I searched the web
for hours, read the Javadocs and the tutorials, and consulted a Swing
reference book, but I still don't have tremendous confidence in this
solution.  If it fundamentally doesn't work, though, we have to define
our own first-class Document, Element hierarchy, ViewFactory, Views,
and EditorKit.  So let's hope it *does* work fundamentally.

I can't say for sure if this even works, as I have yet to run this
code on a machine where Jskad works properly.  I had major trouble
installing the TMW fonts on Linux, and have yet to resolve it, even
after verifying via xlsfonts that the fonts were installed and then
changing TibetanMachineWeb.java to look for them.  Because I haven't
tested this yet, a lot of nasty code is tagged 'DLC' and commented
out.
2002-10-28 03:08:04 +00:00
dchandler
f26dd53da3 Changed the build so that Savant and QuillDriver's builds include
Smart*Player.java, which are accessed via reflection.  Cleaned up the
code a bit so that it would compile in so doing.

Changed the 'options.txt' preferences file to reflect the new method
of selecting media players.
2002-10-27 19:12:13 +00:00
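Accessing a class via reflection, as the commit above describes for the Smart*Player classes, generally looks like the sketch below. The helper name is hypothetical, and the assumption that a player class has a public no-argument constructor is mine, not something the commit states.

```java
/** Hypothetical helper showing reflective instantiation by class name. */
class ReflectiveLoader {
    /**
     * Instantiates the named class through its no-argument constructor.
     * Returns null if the class is missing or cannot be instantiated, so a
     * caller can fall back to another implementation.
     */
    static Object newInstanceOrNull(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError e) {
            return null;
        }
    }
}
```

Because the class name is just a string, it can come from a preferences file, and the calling code compiles even when an optional media library is absent.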
amontano
e4aa52a6eb re-arranged the display. Now the buttons are closer to the text input. 2002-10-27 18:48:48 +00:00
amontano
8391f19a8d copy and paste features are fixed. 2002-10-27 18:48:03 +00:00
amontano
7336d27a33 fixing the copy-paste issue for the translation tool. 2002-10-26 18:15:34 +00:00
dchandler
b6b8cd73ff Moved JskadKeyboard-related code into separate files; made many things public. 2002-10-26 17:40:51 +00:00
dchandler
3ee1fbd3fa Removed backup copies of .java files. 2002-10-26 15:57:06 +00:00
amontano
d35048a067 Fixing copy and paste. It works, except when pasting from a TextArea through the Windows pop-up menu. 2002-10-26 15:49:55 +00:00
eg3p
34b660b8f9 Moved all media related stuff to new package.
This makes more sense, since all this stuff is
accessed by both Savant and QuillDriver.
2002-10-25 20:19:56 +00:00
eg3p
7f3f0eb8e1 Various changes related to Quicktime and JMF
support, as well as keyboard modularization. Not
quite done yet, though, so may not compile.
2002-10-25 20:18:22 +00:00
eg3p
3f17daab67 no message 2002-10-25 19:48:23 +00:00
eg3p
27dfa66b02 Ongoing work with Andres to change paste so that
isRomanEnabled = false implies auto conversion
of Wylie to Tibetan. Doesn't work yet.
2002-10-25 19:47:14 +00:00
eg3p
91b8fd3cd9 Edited JskadKeyboard code slightly so that it is
easier to use these keyboards outside of Jskad
(for example from QuillDriver).
2002-10-25 19:41:43 +00:00