666 Commits

Author SHA1 Message Date
dchandler 05214b8f14 EWTS->Uni was crashing for tabs. 2005-07-14 04:16:36 +00:00
dchandler dc18165992 Added a class for performing EWTS->Unicode conversions during XSLT
transformations.  I haven't actually used it with Xalan XSLT yet, but
it ought to work if TibetanHTML did (which it must have at one point).

I do have a unit test, but an end-to-end test with Xalan is what we
need.
2005-07-13 07:25:18 +00:00
dchandler 6260c0889d Mentions how to salvage this class. 2005-07-13 06:23:02 +00:00
dchandler bef1d1b625 Added boilerplate and a class comment and traded in tabs for four
spaces.  A unittest and an example would be great, but this is a
start.
2005-07-13 06:19:27 +00:00
dchandler 8ccd68789a Since I had Eclipse fired up, I had it automatically organize the
imports.  It made two errors, but the compiler found them.  I've cvs
tagged the tree before doing this, just in case.
2005-07-11 03:10:32 +00:00
dchandler 6d419fe641 Numerous EWTS->Unicode and especially EWTS->TMW improvements.
Fixed ordering of Unicode wowels.  [ku+A] gives the correct Unicode
now, e.g.

EWTS->TMW looks better for some wacky wowels like, I'm guessing here, [ku+A].

EWTS->TMW should now give errors any time the full input isn't used.
Previously, wacky wowels like [kai+-i] would lead to some droppage.

EWTS->TMW->Unicode testing is now in effect.  This found a ton of
EWTS->TMW bugs, most or all of which are fixed now.

TMW->Unicode is improved/fixed for {
\u5350,\u534D,\u0F88+k,\u0F88+kh,U }.  (Why U?  "\u0f75" is
discouraged in favor of "\u0f71\u0f74".)

NOTE: TMW_RTF_TO_THDL_WYLIETest is still disabled for the nightly
builds' sake, but I ran it in my sandbox and it passed.
2005-07-11 02:51:06 +00:00
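The note above about U+0F75 can be illustrated with a minimal sketch. The class and method names are hypothetical and are not taken from the converter's source; the sketch only shows the discouraged single code point being replaced by the two-character sequence the message names.

```java
final class DiscouragedVowelSketch {
    /**
     * Replaces every U+0F75 with the sequence U+0F71 U+0F74,
     * the spelling the commit message says is preferred.
     * Hypothetical helper, for illustration only.
     */
    static String expandLongU(String text) {
        return text.replace("\u0F75", "\u0F71\u0F74");
    }
}
```

Text that contains no U+0F75 comes back unchanged, so the helper is safe to apply to any converter output.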
dchandler 36122778b4 EWTS->TMW works now for [#] and for [//]. 2005-07-10 05:36:35 +00:00
dchandler 33fc836e81 EWTS->Unicode for // now produces \u0f0e as it should. 2005-07-10 05:01:03 +00:00
dchandler 64625fd445 Removed the 'Import Wylie' menu item; 'Launch Converter...' is the way to go.
Fixed the converter GUI to work perfectly (AFAIK) for EWTS->Tibetan.
2005-07-07 03:15:59 +00:00
dchandler cddbbae9a1 Bulletproofed EWTS->Tibetan against nasty pseudo-EWTS like [RAM].
Renamed recoverACIP methods.
2005-07-07 02:54:36 +00:00
dchandler 982350371d EWTS->TMW fixes. Wowel handling still isn't perfect but I'm lazy.
Jskad now uses the new EWTS->TMW routine, not the old, and thus the
"(Buggy)" label is [unfairly, perhaps] dropped.
2005-07-07 01:30:03 +00:00
dchandler 0f99c402df My last commit left the tests broken. Doh.
Also, I'm enabling EWTS->Tibetan converters in the GUI so that I can
ask folks to try them out.
2005-07-06 22:55:19 +00:00
dchandler b74af71efc Better, but still flawed, handling of EWTS [^] (i.e., U+0F39). 2005-07-06 22:26:55 +00:00
dchandler f5d87ab226 Fixed EWTS->Tibetan [g.yogs] bug. 2005-07-06 18:37:22 +00:00
dchandler 63ff0fb0c9 Fixed important EWTS->Tibetan conversion bugs. [g.yogs] (and maybe
[hUM^]) are not yet converting correctly.

I have not yet committed the end-to-end test that I'm manually doing
to find these problems.  It will be another document for
TMW_RTF_TO_THDL_WYLIETest.java.  Note that thdl.debug=true is
essential to access the GUI for the EWTS->* converters.
2005-07-06 07:46:21 +00:00
dchandler 0b3a636f63 Tremendously better EWTS->Unicode and EWTS->TMW conversion, though still not tested end-to-end and without perfect unit tests. See EWTSTest.RUN_FAILING_TESTS, for example, to find imperfection. 2005-07-06 02:19:38 +00:00
dchandler affb9e4b5e Still trying to get the tests to complete on thdl.org's servers.
This will surely work.
2005-06-25 21:51:13 +00:00
dchandler 1062ce9b6a Trying to make the tests run on thdl.org's servers. Yesterday's change didn't do it; maybe this will but it's just a guess as I can't log on to their servers without time and effort. Reverting yesterday's change since it didn't matter. 2005-06-23 18:46:39 +00:00
dchandler b9f4ed21ab Disabling a test that ran for *way* too long on thdl.org servers 2005-06-22 19:01:17 +00:00
dchandler 2678fc134a Added UI for EWTS->Tibetan conversions. GUI is disabled except in
debug mode for now.

I tested against a really simple-but-real document, found a bug with '*', tried
to implement TMW vowel code but I don't trust it yet.  Differentiated EWTS
code from ACIP where needed.

Several bugs in ewts->tibetan have been exposed; see the TODO
comments.
2005-06-20 09:30:35 +00:00
dchandler 7198f23361 I really hesitate to commit this because I'm not sure what it brings to the
table exactly and I fear that it makes the ACIP->Tibetan converter code
a lot uglier.  The TODO(DLC)[EWTS->Tibetan] comments littered throughout
are part of the ugliness; they point to the ugliness.  If each were addressed,
cleanliness could perhaps be achieved.

I've largely forgotten exactly what this change does, but it attempts to
improve EWTS->Tibetan conversion.  The lexer is probably really, really
primitive.  I concentrate here on converting a single tsheg bar rather than
a whole document.

Eclipse was used during part of my journey here and some imports were
reorganized merely because I could.  :)

(Eclipse was needed when the usual ant build failed to run a new test
EWTSTest.  And I wanted its debugger.)

Next steps: end-to-end EWTS tests should bring many problems to light.  Fix
those.  Triage all the TODO comments.

I don't know that I'll ever really trust the implementation.  The tests are
valuable, though.  A clean implementation of EWTS->Tibetan in Jython
might hold enough interest for me; I'd like to learn Python.
2005-06-20 06:18:00 +00:00
amontano f64bae8ea6 Fixed loading of the default dictionary at startup of the installed stand-alone version,
while keeping it easy to change if the user selects another dictionary.
2005-05-13 04:25:59 +00:00
amontano 91d67f360f *** empty log message *** 2005-05-13 04:24:03 +00:00
amontano e096aee67b Gave higher priority to the dictionary name specified in the properties file than to the dictionary name passed as an argument. 2005-04-26 05:25:19 +00:00
amontano 14ad55aa3f *** keyword substitution change *** 2005-04-26 05:24:02 +00:00
amontano 30470cb634 Script to generate off-line installer for the Translation Tool using NSIS. 2005-04-26 05:21:21 +00:00
amontano c0638d3284 icon for the translation tool's short-cut 2005-04-26 05:19:53 +00:00
amontano 1c521d757c added more methods 2005-04-25 09:34:14 +00:00
amontano 4f821b7965 fixed double quote problem that showed up on some dictionaries (that were pre-processed in Excel) 2005-04-25 09:29:39 +00:00
amontano 83f5c19a13 Now it catches the exception and prints it to System.err when vowel errors (oo, ie, ee, etc.) come up in converting from wylie to tmw. 2005-04-25 09:28:34 +00:00
amontano b1cc500abf Fixed stability bug 2005-03-08 09:50:52 +00:00
amontano 57895d945f Fixed whitespaces that got screwed up. 2005-03-08 07:59:05 +00:00
amontano 92e9a15d84 Updated the version numbers to reflect the new release with the conversion errors fixed. 2005-03-07 05:48:18 +00:00
amontano 07fd75f2c7 Added a method to write the list into a file. Useful for debugging. 2005-03-06 10:03:39 +00:00
amontano 063e04b7c4 Fixed scope of class and methods. (from protected to public or private). 2005-03-06 10:02:43 +00:00
amontano db86a19d2b Added code to avoid an ArrayIndexOutOfBoundsException. Still not clear why it happens. 2005-03-06 10:00:24 +00:00
amontano a86bad152f These classes were for private use in processing some dictionaries. They are out of place here. 2005-02-28 03:22:10 +00:00
amontano d4dc004c2a Made some changes to the OnlineTranslationTool. Now when text is redisplayed as links, it shows up with the declension marks. 2005-02-28 03:04:01 +00:00
dchandler ff130e6fb1 Organized imports in Eclipse. 2005-02-28 01:37:54 +00:00
dchandler f8bd7e83cb Fixed comment. 2005-02-27 22:08:47 +00:00
dchandler 6bc72956b6 New tests based on Paul Hackett's e-mail from today. 2005-02-27 20:59:33 +00:00
dchandler 24403bb395 Jskad has two new menu items that are a shortcut for running a standalone
TMW->Unicode conversion: 'Save as Unicode RTF...' and 'Save as
Unicode UTF-8 Text...'.
2005-02-27 10:27:37 +00:00
amontano 107fcce565 Added debug info to the servlet version. 2005-02-27 05:48:03 +00:00
dchandler 9507ff3694 At David Germano's request, Jskad's UI now uses
'Tibetan Machine Web (non-Unicode)' rather than 'Tibetan'.

I often use the term 'Tibetan' to mean 'Tibetan (either in Unicode,
TM or TMW in RTF, or any other scheme where it appears as Tibetan
instead of Roman transliteration)'.  So this is a good change in my opinion,
though 'TMW' or 'Legacy TMW' is shorter.
2005-02-27 00:14:37 +00:00
dchandler 06d93082cb Better help - OO.org 1.9.79 isn't broken. 2005-02-26 23:58:17 +00:00
dchandler 9cfedadab7 Jskad has a new menu item, 'Copy as Unicode', which does a
TMW->Unicode conversion.
2005-02-26 22:57:38 +00:00
dchandler 25f5440218 We had some useless code for enabling and disabling the clipboard; I've
killed it.  Jskad in applets would take QA and some code changes, I bet, so
a TODO is good enough.

Documented better the clipboard code.
2005-02-26 20:46:40 +00:00
dchandler c16f633ecf Two things:
One, TMW->EWTS gives dbas and dngas instead of dabs and dangs
because Chris Fynn's e-mail from today has dbas and dngas.

Second, Down with ACIPRules.  Long live ACIPTraits.  EWTS->Tibetan
conversion is closer still.
2005-02-22 04:36:54 +00:00
dchandler 03483d88c6 This memory hog of a test means that ACIP->TMW regressions will be
harder to introduce.
2005-02-21 06:08:19 +00:00
dchandler 3feef9232a I got this by running Ant with the environment variable ANT_OPTS set to
'-Xmx512m'.  Otherwise I ran out of memory.

This file allows for a sizeable ACIP->TMW regression test.
2005-02-21 05:52:16 +00:00
dchandler 4c268c5ea2 Refactored so that there can be an EWTS scanner and an ACIP scanner. 2005-02-21 05:37:01 +00:00
amontano 7854e4fd93 Changed minor optimization things 2005-02-21 05:27:36 +00:00
amontano 686756116f Fixed importation errors for the translation tool. 2005-02-21 05:10:00 +00:00
dchandler 4b4787411b All I *meant* to do with this commit (you tell me if I did more) was to change
from \r\r\n to \r\n.  I added these files on Linux with \r\n and should've
added them as binary or done a dos2unix first.
2005-02-21 05:05:13 +00:00
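A minimal sketch of the line-ending repair described above, assuming the file content is already in memory as a string. The class and method names are hypothetical; the commit does not say which tool was actually used.

```java
final class LineEndingSketch {
    /**
     * Collapses the doubled carriage return (CR CR LF) back to a
     * plain CR LF. Correct CR LF endings are left untouched.
     * Hypothetical helper, for illustration only.
     */
    static String collapseDoubledCr(String text) {
        return text.replace("\r\r\n", "\r\n");
    }
}
```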
dchandler 3e0168b384 Renamed ACIPConverter to TConverter. Added a needed parameter (the
only needed parameter in that class's interface AFAIK).
2005-02-21 01:35:23 +00:00
dchandler 37bf9a736d I did this stuff back in August. It's all in support of EWTS->Tibetan
conversion.  The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the
place.  EWTS->Tibetan isn't here yet; lexing isn't here yet; this is
mainly a refactoring so that the ACIP->Tibetan code can be reused to
do EWTS->Tibetan.

I'm committing this because tests pass (it shouldn't be breaking
anything), because I want a checkpoint, and because the laptop this
sandbox was on isn't my preferred development environment.
2005-02-21 01:16:10 +00:00
dchandler 83f499b7a8 Formatting in TMW documents is not preserved. I've added an identity
transformation, TMW->TMW, to help me debug this problem.
2005-02-13 00:34:47 +00:00
dchandler 9025fb42d6 TMW->EWTS 998476 partial fix: "aM" is now generated correctly. Before
you got "M".
2005-02-07 04:00:42 +00:00
dchandler 8dcb623382 TMW->EWTS:
Fixed part of bug 998476 and part of an undocumented bug.  Discovered a
new bug, "aM" should be generated but only "M" is.

The undocumented bug was that laMA was generated when lAM should have been.

The part of bug 998476 that was fixed: laM, laH, etc. are now generated.

This does nothing about paN etc.

Some refactoring here; this is not a minimal diff.

Added tests of TMW->EWTS that use ACIP to get the TMW in place
because EWTS->TMW is a faulty keyboard at present.
2005-02-07 03:17:40 +00:00
amontano a82afad92c Updated methods to help with the dictionary clean up. 2005-02-06 23:19:44 +00:00
dchandler 96d0d0d9d0 My previous commit message failed to mention the following:
I refactored the code trying to fit it onto one screen.  So not all of the
changes are material to the bug fix.

About this commit: TMW->Wylie for {b.s.d} now gives bsad instead of bas.d.
This fixes part of bug 998476, and is done because Andres thinks it'll work
most of the time.  But don't be surprised if an exception comes up in the
future and we have to trivially change the code to catch it.
2005-02-05 22:37:02 +00:00
dchandler 287fc181a0 Fix for part of bug 998476. 2005-02-05 22:16:39 +00:00
dchandler be632e1874 Cleaned up some code that is relevant to the bug I'm looking into. I need to
instrument it.  Functionally, this change is a no-op.  I just don't want to
confuse refactoring with the actual bug fix.
2005-02-05 19:29:37 +00:00
dchandler 00961b633f Added a test case for bug 998476. No fix, just a test case verifying the bug. 2005-02-05 18:47:17 +00:00
dchandler b4155c3264 After a1tsal's changes to tibwn.ini, the tests failed. I'm a bit disheartened
that more tests didn't fail.
2005-02-05 16:51:13 +00:00
dchandler 7304c770c9 Just a better comment. 2005-02-05 16:27:34 +00:00
amontano 78373b3094 updated the servlet version. logging still needs work. 2005-01-23 00:57:02 +00:00
a1tsal 28d46bb207 Add corrective comment regarding the bogus Unicode OM characters. 2005-01-19 01:07:52 +00:00
a1tsal affccad9e5 Use 00A0 rather than 0020 for _, per unicode spec. 2005-01-17 08:58:56 +00:00
a1tsal 91d0e7f4da Use "precomposed" sanskrit consonant combinations consistently throughout. 2005-01-17 08:49:04 +00:00
a1tsal 22cfee69db Had wrong Unicode for n+n+y.
Had wrong Unicode for space -- but only in comment.

Cleaned up punctuation in another comment.
2005-01-16 01:17:44 +00:00
dchandler 0b0af67ed9 Ximalaya is not nearly as nice as Tibetan Machine Uni, so use the latter. 2005-01-04 02:20:59 +00:00
amontano 88ebeaed3b Fixed pocket pc version to be completely swing-less. 2004-12-23 19:31:26 +00:00
dchandler aa5d86a6e3 The *->Unicode conversions were outputting Unicode that was not
well-formed.  They still do, but they do it less often.

Chris Fynn wrote this a while back:

   By normal Tibetan & Dzongkha spelling, writing, and input rules
   Tibetan script stacks should be entered and written: 1 headline
   consonant (0F40->0F6A), any subjoined consonant(s) (0F90-> 0F9C),
   achung (0F71), shabkyu (0F74), any above headline vowel(s) (0F72
   0F7A 0F7B 0F7C 0F7D and 0F80); any ngaro (0F7E, 0F82 and 0F83).

Now efforts are made to ensure that the converters conform to the
above rules.
2004-12-13 02:32:46 +00:00
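The ordering rule quoted above can be sketched as a stable sort over the characters of a single stack. This is an illustration under stated assumptions, not the converters' actual code: the class and method names are hypothetical, the input is assumed to hold exactly one stack, and the subjoined range is taken as U+0F90 through U+0FBC (the quote gives 0F9C as the upper bound).

```java
import java.util.ArrayList;
import java.util.List;

final class StackOrderSketch {
    /** Position of a character in the quoted ordering; lower sorts first. */
    static int rank(char c) {
        if (c >= 0x0F40 && c <= 0x0F6A) return 0; // headline consonant
        // Assumption: full subjoined-consonant range, up to U+0FBC.
        if (c >= 0x0F90 && c <= 0x0FBC) return 1; // subjoined consonant
        if (c == 0x0F71) return 2;                // achung
        if (c == 0x0F74) return 3;                // shabkyu
        switch (c) {
            case 0x0F72: case 0x0F7A: case 0x0F7B:
            case 0x0F7C: case 0x0F7D: case 0x0F80:
                return 4;                         // above-headline vowels
            case 0x0F7E: case 0x0F82: case 0x0F83:
                return 5;                         // ngaro
            default:
                return 6;                         // anything else goes last
        }
    }

    /** Reorders the characters of one stack; equal ranks keep their input order. */
    static String reorder(String stack) {
        List<Character> chars = new ArrayList<>();
        for (char c : stack.toCharArray()) {
            chars.add(c);
        }
        // List.sort is stable, so characters of the same class are not swapped.
        chars.sort((a, b) -> Integer.compare(rank(a), rank(b)));
        StringBuilder sb = new StringBuilder(chars.size());
        for (char c : chars) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, a stack given as U+0F40, U+0F72, U+0F71 comes back as U+0F40, U+0F71, U+0F72, with the achung ahead of the above-headline vowel.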
eg3p c4f4288d2f eliminated some unnecessary comments i had left in there 2004-08-19 19:51:21 +00:00
eg3p a39d5c2ba3 changed all occurrences of 'Color.BLACK' to 'Color.black', since the former causes a runtime error on Mac OS X (for Java 1.3.1 at least). not sure why this is, but may be related to the bug mentioned at http://www.oxygenxml.com/forum/viewtopic.php?p=239 2004-08-19 14:59:06 +00:00
amontano afd3a95a21 Updated the dictionary structure to allow grouping of dictionaries; this is the first step toward cleaning up the massive repetitions in dictionaries. 2004-08-13 04:47:35 +00:00
dchandler 6bb0646f1c Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:48:27 +00:00
dchandler 11c3898ad2 Now the Sambhota keyboard crashing bug.
Fixed crashing bug reported by Teresa Lam.  Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:46:39 +00:00
dchandler e101cc8294 Now the Sambhota keyboard crashing bug.
Fixed crashing bug reported by Teresa Lam.  Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:36:35 +00:00
dchandler 6cbea9f894 Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:10:38 +00:00
amontano 805124ebc0 fixed some stuff in dictionary importation, 2004-07-05 03:56:20 +00:00
amontano 0fd26a6c56 now uses strictduffpane instead of duffpane for "smart-pasting". 2004-07-05 03:54:45 +00:00
amontano 353db0c352 fixed the interface a bit for the handheld version of the translation tool. 2004-07-05 03:52:30 +00:00
amontano 07c045341b included a guessIfAcip method. Used by the translation tool to paste wylie or
ACIP accordingly.
2004-07-05 03:51:11 +00:00
amontano 5b2bb9af11 identical to duffpane but customized to the translation tool. the only change is
that the paste method is over-ridden for "smart-pasting". If pasting TMW paste
as is. If pasting TM, converts to TMW. If neither of these fonts is used,
assumes transliteration and converts to TMW.
2004-07-05 03:49:50 +00:00
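The "smart-pasting" decision described above can be sketched as a three-way choice on the font of the pasted text. Everything named here is hypothetical: the class, the enum, and in particular the font-family prefixes are assumptions made for illustration and are not taken from the overridden paste method itself.

```java
final class SmartPasteSketch {
    /** What to do with pasted text, per the three cases in the message. */
    enum Action {
        PASTE_AS_IS,                    // already TMW
        CONVERT_TM_TO_TMW,              // TM font, convert to TMW
        CONVERT_TRANSLITERATION_TO_TMW  // neither font, treat as transliteration
    }

    /**
     * Picks an action from the font family of the pasted text.
     * Assumption: TMW families start with "TibetanMachineWeb" and
     * TM families start with "TibetanMachine". The TMW check must
     * come first because its prefix also matches the TM prefix.
     */
    static Action decide(String fontFamily) {
        if (fontFamily != null) {
            if (fontFamily.startsWith("TibetanMachineWeb")) {
                return Action.PASTE_AS_IS;
            }
            if (fontFamily.startsWith("TibetanMachine")) {
                return Action.CONVERT_TM_TO_TMW;
            }
        }
        return Action.CONVERT_TRANSLITERATION_TO_TMW;
    }
}
```

Plain text with no font information (a null family in this sketch) falls through to the transliteration case, matching the last sentence of the message.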
amontano 707ae5e526 updated version number 2004-07-05 03:45:02 +00:00
amontano 650109200f fixed the paste. when pasting from text (non-rtf) it used to produce garbage.
now it interprets it as wylie. also made some attributes protected so subclasses can inherit them.
2004-07-05 03:43:42 +00:00
dchandler bff0e6b2fc Fixed TMW->Wylie/ACIP when multiple font sizes are in use. I was not
incrementing the offset at which I was inserting text properly.
2004-06-25 00:22:10 +00:00
dchandler 14fb449f95 I thought my earlier commit preserved font size info for TMW->ACIP/Wylie
conversions.  It was only at a very coarse level.  The feature is now truly
here.
2004-06-20 02:57:28 +00:00
dchandler 8ccf57dccb TMW->{Wylie,ACIP} conversions now preserve font size information. 2004-06-15 02:20:28 +00:00
amontano aee8630986 updated version info and fixed a parsing error. 2004-06-14 03:42:35 +00:00
dchandler e18a4417dc Added a FIXME comment. 2004-06-12 02:26:28 +00:00
dchandler 9f78cabb18 TMW->{Wylie,ACIP} conversions now preserve font size information. 2004-06-12 02:09:28 +00:00
dchandler 7acbce3361 Added errors 142 and 143, which are produced when converting yig chung
to a Unicode text file, which cannot support font size changes.
2004-06-06 21:59:16 +00:00
dchandler 1db0ec7bb5 Fixed javadoc comments. 2004-06-06 21:39:45 +00:00
dchandler df262aa148 It is now a compile-time option whether to treat []- and {}-bracketed sequences
as text to be passed through (without the brackets in the case of {}) literally,
which is the case by default because Robert Chilton requested it, or the old,
ad-hoc mechanism which could be useful for finding some ugly input.

Made a couple of error messages a little more verbose now that we have
short-message mode.
2004-06-06 21:39:06 +00:00
dchandler a69f7588b2 I broke warning 507 into two warnings, one high-priority (512) and one
low-priority (507).
2004-05-01 20:55:13 +00:00
dchandler fd7cba4439 Changed menu item name. 2004-05-01 20:52:22 +00:00
dchandler 8a9271a3d8 I broke warning 507 into two warnings, one high-priority (512) and one
low-priority (507).
2004-05-01 20:49:53 +00:00