666 Commits

Author SHA1 Message Date
dchandler 4c268c5ea2 Refactored so that there can be an EWTS scanner and an ACIP scanner. 2005-02-21 05:37:01 +00:00
amontano 7854e4fd93 Changed minor optimization things 2005-02-21 05:27:36 +00:00
amontano 686756116f Fixed importation errors for the translation tool. 2005-02-21 05:10:00 +00:00
dchandler 4b4787411b All I *meant* to do with this commit (you tell me if I did more) was to change
from \r\r\n to \r\n.  I added these files on Linux with \r\n and should've
added them as binary or done a dos2unix first.
2005-02-21 05:05:13 +00:00
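
The line-ending repair described in the commit above can be sketched in a few lines of Java. This is only an illustration, not the tool that was actually used (the message mentions dos2unix as the alternative); the class and method names are hypothetical.

```java
// Hypothetical helper: collapses the doubled carriage return that appears when
// a file that already uses CRLF line endings is converted to CRLF a second time.
class LineEndings {
    // Replaces every CR CR LF sequence with a single CR LF; other text is untouched.
    static String fixDoubledCarriageReturns(String text) {
        return text.replace("\r\r\n", "\r\n");
    }

    public static void main(String[] args) {
        String broken = "first line\r\r\nsecond line\r\r\n";
        String repaired = fixDoubledCarriageReturns(broken);
        // Report whether any doubled carriage returns survive the repair.
        System.out.println(repaired.contains("\r\r\n") ? "still broken" : "repaired");
    }
}
```

The sketch works on a string held in memory; applying it to files would also require reading and writing them with an encoding that round-trips their bytes.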
dchandler 3e0168b384 Renamed ACIPConverter to TConverter. Added a needed parameter (the
only needed parameter in that class's interface AFAIK).
2005-02-21 01:35:23 +00:00
dchandler 37bf9a736d I did this stuff back in August. It's all in support of EWTS->Tibetan
conversion.  The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the
place.  EWTS->Tibetan isn't here yet; lexing isn't here yet; this is
mainly a refactoring so that the ACIP->Tibetan code can be reused to
do EWTS->Tibetan.

I'm committing this because tests pass (it shouldn't be breaking
anything), because I want a checkpoint, and because the laptop this
sandbox was on isn't my preferred development environment.
2005-02-21 01:16:10 +00:00
dchandler 83f499b7a8 Formatting in TMW documents is not preserved. I've added an identity
transformation, TMW->TMW, to help me debug this problem.
2005-02-13 00:34:47 +00:00
dchandler 9025fb42d6 TMW->EWTS 998476 partial fix: "aM" is generated now correctly. Before
you got "M".
2005-02-07 04:00:42 +00:00
dchandler 8dcb623382 TMW->EWTS:
Fixed part of bug 998476 and part of an undocumented bug.  Discovered a
new bug, "aM" should be generated but only "M" is.

The undocumented bug was that laMA was generated when lAM should have been.

The part of bug 998476 that was fixed: laM, laH, etc. are now generated.

This does nothing about paN etc.

Some refactoring here; this is not a minimal diff.

Added tests of TMW->EWTS that use ACIP to get the TMW in place
because EWTS->TMW is a faulty keyboard at present.
2005-02-07 03:17:40 +00:00
amontano a82afad92c Updated methods to help with the dictionary clean up. 2005-02-06 23:19:44 +00:00
dchandler 96d0d0d9d0 My previous commit message failed to mention the following:
I refactored the code trying to fit it onto one screen.  So not all of the
changes are material to the bug fix.

About this commit: TMW->Wylie for {b.s.d} now gives bsad instead of bas.d.
This fixes part of bug 998476, and is done because Andres thinks it'll work
most of the time.  But don't be surprised if an exception comes up in the
future and we have to trivially change the code to catch it.
2005-02-05 22:37:02 +00:00
dchandler 287fc181a0 Fix for part of bug 998476. 2005-02-05 22:16:39 +00:00
dchandler be632e1874 Cleaned up some code that is relevant to the bug I'm looking into. I need to
instrument it.  Functionally, this change is a no-op.  I just don't want to
confuse refactoring with the actual bug fix.
2005-02-05 19:29:37 +00:00
dchandler 00961b633f Added a test case for bug 998476. No fix, just a test case verifying the bug. 2005-02-05 18:47:17 +00:00
dchandler b4155c3264 After a1tsal's changes to tibwn.ini, the tests failed. I'm a bit disheartened
that more tests didn't fail.
2005-02-05 16:51:13 +00:00
dchandler 7304c770c9 Just a better comment. 2005-02-05 16:27:34 +00:00
amontano 78373b3094 updated the servlet version. logging still needs work. 2005-01-23 00:57:02 +00:00
a1tsal 28d46bb207 Add corrective comment regarding the bogus Unicode OM characters. 2005-01-19 01:07:52 +00:00
a1tsal affccad9e5 Use 00A0 rather than 0020 for _, per unicode spec. 2005-01-17 08:58:56 +00:00
a1tsal 91d0e7f4da Use "precomposed" sanskrit consonant combinations consistently throughout. 2005-01-17 08:49:04 +00:00
a1tsal 22cfee69db Had wrong Unicode for n+n+y.
Had wrong Unicode for space -- but only in comment.

Cleaned up punctuation in another comment.
2005-01-16 01:17:44 +00:00
dchandler 0b0af67ed9 Ximalaya is not nearly as nice as Tibetan Machine Uni, so use the latter. 2005-01-04 02:20:59 +00:00
amontano 88ebeaed3b Fixed pocket pc version to be completely swing-less. 2004-12-23 19:31:26 +00:00
dchandler aa5d86a6e3 The *->Unicode conversions were outputting Unicode that was not
well-formed.  They still do, but they do it less often.

Chris Fynn wrote this a while back:

   By normal Tibetan & Dzongkha spelling, writing, and input rules
   Tibetan script stacks should be entered and written: 1 headline
   consonant (0F40->0F6A), any subjoined consonant(s) (0F90-> 0F9C),
   achung (0F71), shabkyu (0F74), any above headline vowel(s) (0F72
   0F7A 0F7B 0F7C 0F7D and 0F80); any ngaro (0F7E, 0F82 and 0F83).

Now efforts are made to ensure that the converters conform to the
above rules.
2004-12-13 02:32:46 +00:00
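
The ordering rule quoted in the commit above lends itself to a small checker. The following is a minimal sketch, not the converters' actual code: the class and method names are hypothetical, and it assumes the subjoined consonants span the full Unicode range U+0F90 to U+0FBC, whereas the quoted message gives 0F90 to 0F9C.

```java
// Hypothetical checker for the stack ordering quoted from Chris Fynn:
// headline consonant, subjoined consonant(s), achung, shabkyu,
// above-headline vowel(s), then any ngaro.
class StackOrder {
    // Position of a character in the quoted ordering, or -1 if the rule does not cover it.
    static int rank(char c) {
        if (c >= 0x0F40 && c <= 0x0F6A) return 0; // headline consonant
        if (c >= 0x0F90 && c <= 0x0FBC) return 1; // subjoined consonant (assumed range)
        if (c == 0x0F71) return 2;                // achung
        if (c == 0x0F74) return 3;                // shabkyu
        if (c == 0x0F72 || (c >= 0x0F7A && c <= 0x0F7D) || c == 0x0F80) return 4; // above-headline vowel
        if (c == 0x0F7E || c == 0x0F82 || c == 0x0F83) return 5;                  // ngaro
        return -1;
    }

    // True if the stack starts with exactly one headline consonant and every
    // following character is covered by the rule and never moves back in rank.
    static boolean isWellOrdered(String stack) {
        if (stack.isEmpty() || rank(stack.charAt(0)) != 0) return false;
        int previous = 0;
        for (int i = 1; i < stack.length(); i++) {
            int r = rank(stack.charAt(i));
            if (r < 1 || r < previous) return false;
            previous = r;
        }
        return true;
    }

    public static void main(String[] args) {
        // U+0F40, subjoined U+0FB1, shabkyu U+0F74, ngaro U+0F7E: in the quoted order.
        System.out.println(isWellOrdered("\u0F40\u0FB1\u0F74\u0F7E"));
        // Ngaro placed before the shabkyu: out of order.
        System.out.println(isWellOrdered("\u0F40\u0F7E\u0F74"));
    }
}
```

The checker only validates a single stack; splitting running text into stacks, and deciding what to do with characters the rule does not mention, is left out of the sketch.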
eg3p c4f4288d2f eliminated some unnecessary comments i had left in there 2004-08-19 19:51:21 +00:00
eg3p a39d5c2ba3 changed all occurrences of 'Color.BLACK' to 'Color.black', since the former causes a runtime error on Mac OS X (for Java 1.3.1 at least). not sure why this is, but may be related to the bug mentioned at http://www.oxygenxml.com/forum/viewtopic.php?p=239 2004-08-19 14:59:06 +00:00
amontano afd3a95a21 Updated the dictionary structure to allow grouping of dictionaries; this is the first step toward cleaning up the massive repetitions in dictionaries. 2004-08-13 04:47:35 +00:00
dchandler 6bb0646f1c Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:48:27 +00:00
dchandler 11c3898ad2 Now the Sambhota keyboard crashing bug.
Fixed crashing bug reported by Teresa Lam.  Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:46:39 +00:00
dchandler e101cc8294 Now the Sambhota keyboard crashing bug.
Fixed crashing bug reported by Teresa Lam.  Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:36:35 +00:00
dchandler 6cbea9f894 Fixed crashing bug reported by Teresa Lam. Added tests so that I'm fairly
certain that no more crashing bugs exist.  Removed a marker for iffy code
after understanding that code via test cases.
2004-07-05 04:10:38 +00:00
amontano 805124ebc0 fixed some stuff in dictionary importation. 2004-07-05 03:56:20 +00:00
amontano 0fd26a6c56 now uses strictduffpane instead of duffpane for "smart-pasting". 2004-07-05 03:54:45 +00:00
amontano 353db0c352 fixed up the interface a bit for the handheld version of the translation tool. 2004-07-05 03:52:30 +00:00
amontano 07c045341b included a guessIfAcip method. Used by the translation tool to paste wylie or
ACIP accordingly.
2004-07-05 03:51:11 +00:00
amontano 5b2bb9af11 identical to duffpane but customized to the translation tool. the only change is
that the paste method is over-ridden for "smart-pasting". If pasting TMW paste
as is. If pasting TM, converts to TMW. If neither of these fonts are used,
assumes transliteration and converts to TMW.
2004-07-05 03:49:50 +00:00
amontano 707ae5e526 updated version number 2004-07-05 03:45:02 +00:00
amontano 650109200f fixed the paste. when pasting from text (non-rtf) it used to produce garbage.
now it interprets it as wylie. also made some attributes protected so that subclasses can inherit them.
2004-07-05 03:43:42 +00:00
dchandler bff0e6b2fc Fixed TMW->Wylie/ACIP when multiple font sizes are in use. I was not
incrementing the offset at which I was inserting text properly.
2004-06-25 00:22:10 +00:00
dchandler 14fb449f95 I thought my earlier commit preserved font size info for TMW->ACIP/Wylie
conversions.  It was only at a very coarse level.  The feature is now truly
here.
2004-06-20 02:57:28 +00:00
dchandler 8ccf57dccb TMW->{Wylie,ACIP} conversions now preserve font size information. 2004-06-15 02:20:28 +00:00
amontano aee8630986 updated version info and fixed a parsing error. 2004-06-14 03:42:35 +00:00
dchandler e18a4417dc Added a FIXME comment. 2004-06-12 02:26:28 +00:00
dchandler 9f78cabb18 TMW->{Wylie,ACIP} conversions now preserve font size information. 2004-06-12 02:09:28 +00:00
dchandler 7acbce3361 Added errors 142 and 143, which are produced when converting yig chung
to a Unicode text file, which cannot support font size changes.
2004-06-06 21:59:16 +00:00
dchandler 1db0ec7bb5 Fixed javadoc comments. 2004-06-06 21:39:45 +00:00
dchandler df262aa148 It is now a compile-time option whether to treat []- and {}-bracketed sequences
as text to be passed through (without the brackets in the case of {}) literally,
which is the case by default because Robert Chilton requested it, or the old,
ad-hoc mechanism which could be useful for finding some ugly input.

Made a couple of error messages a little more verbose now that we have
short-message mode.
2004-06-06 21:39:06 +00:00
dchandler a69f7588b2 I broke warning 507 into two warnings, one high-priority (512) and one
low-priority (507).
2004-05-01 20:55:13 +00:00
dchandler fd7cba4439 Changed menu item name. 2004-05-01 20:52:22 +00:00
dchandler 8a9271a3d8 I broke warning 507 into two warnings, one high-priority (512) and one
low-priority (507).
2004-05-01 20:49:53 +00:00
dchandler 31bdd39fec The TMW for 'da'i was converting to 'aad'i. Andres found this; it is bug
945744.  I've made it more correct -- 'ad'i is now produced.  The wrong stack
is thought to be the root stack still.
2004-05-01 19:11:15 +00:00
dchandler 1a055f3472 I don't think warning level "None" was really doing the trick. Fixed that.
You can now customize the severities of all warnings, even 504 and 510.

When warning level is "None", scanning, i.e. lexical analysis, is faster.
2004-04-25 00:37:57 +00:00
dchandler e2d42f36eb Robert Chilton's experience inspired me to make the handling of errors and
warnings in ACIP->Tibetan conversion much more configurable.  You can
now choose from short or long error messages, for one thing.  You can change
the severity of almost all warnings.  Each error and warning has an error code.
Errors and warnings are better tested.

The converter GUI has a new checkbox for short messages; the converter
CLI has a new mandatory option for short messages.

I also fixed a bug whereby certain errors were not being appended to the
'errors' StringBuffer.
2004-04-24 17:49:16 +00:00
dchandler cc5d096918 David Chapman's latest fix to tibwn.ini (clearing up an issue that Than or I
dropped the ball on) introduced two lines for 8,95.  This is a bad thing, so
I've taken out the second line.  I've also introduced a check in
TibetanMachineWeb.java such that we'll know that tibwn.ini has no such
error in the future just by running 'ant clean jskad-run' and making sure that
the GUI is indeed visible.

I also updated the test baselines now that F03A and 0F82 are squared away.
2004-04-24 13:23:56 +00:00
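
A duplicate-mapping check of the kind described in the commit above might look like the sketch below. It is not the check that was added to TibetanMachineWeb.java, which is not shown here. It assumes tilde-separated fields like those in the tibwn.ini lines quoted elsewhere in this log, assumes that lines starting with `//` are comments, and takes the column to inspect as a hypothetical parameter.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical check: report values that occur more than once in one column
// of a tilde-separated mapping file.
class DuplicateMappings {
    static Set<String> findDuplicates(List<String> lines, int column) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String line : lines) {
            // Skip blank lines and (assumed) comment lines.
            if (line.isEmpty() || line.startsWith("//")) continue;
            // A negative limit keeps empty fields, so column positions stay stable.
            String[] fields = line.split("~", -1);
            if (column >= fields.length || fields[column].isEmpty()) continue;
            // add() returns false when the value was already present.
            if (!seen.add(fields[column])) duplicates.add(fields[column]);
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Made-up sample lines; only the field layout mimics the real file.
        List<String> lines = Arrays.asList(
            "a~1,2~~8,95",
            "b~3,4~~8,95",
            "c~5,6~~8,96");
        System.out.println(findDuplicates(lines, 3));
    }
}
```

Failing fast at load time when the returned set is non-empty would match the intent stated in the message, that a bad tibwn.ini is noticed simply by starting the GUI.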
a1tsal 9e071ea178 Differentiated 0F82 (~M`) and F03A (nyi.zla editor's mark). 2004-04-21 10:04:11 +00:00
dchandler 72442788c1 This displayed poorly for me, so I untabified it. Whitespace changed only. 2004-04-18 18:56:01 +00:00
dchandler 0ee90a0fb0 Added many ACIP->TMW->ACIP tests. They found no bugs. 2004-04-17 17:28:26 +00:00
dchandler 63438d243b getACIP was getting EWTS, not ACIP. 2004-04-17 15:49:40 +00:00
dchandler de3a19761e Fixes for javadoc tool. 2004-04-17 15:48:50 +00:00
dchandler adcf9de952 Two new tests. 2004-04-17 15:14:46 +00:00
dchandler 1bfd3772e6 TMW->ACIP is much improved. V and W were confused, # and * were
confused; many glyphs that should have yielded errors were not.

I've added a test case that transforms every TMW glyph save the one with
no TM mapping to ACIP.  I hand-checked that it was correct.

ACIP->TMW is fixed for # and *.  I never noticed it, but each needed an
extra swoosh (U+0F05).

Round-tripping would be good, as would testing real-world use of
TMW->ACIP.
2004-04-14 05:44:51 +00:00
dchandler 244a9d1370 TiblEdit's diacritics panel now works -- dia.dat has been added to the
repository and to TiblEdit's jar.
2004-04-14 05:12:00 +00:00
dchandler 56a02ba41d Fixed the worst TMW->ACIP bug, the one regarding U+0F04 and U+0F05.
TMW->EWTS requires no context information, but TMW->ACIP does.
2004-04-10 18:26:57 +00:00
dchandler 9e7ccf2894 TMW->Unicode conversions have changed; now using U+0F6A for the stacks
whose EWTS transliteration begins with "R+".

ACIP->* conversions and test baselines were updated to deal with the
"r+..."=>"R+..."  change.
2004-04-10 16:58:45 +00:00
dchandler 7eca276a62 TMW->Unicode conversions have changed; now using U+0F6A for the stacks
whose EWTS transliteration begins with "R+".

ACIP->* conversions and test baselines were updated to deal with the
"r+..."=>"R+..."  change.
2004-04-10 16:03:25 +00:00
dchandler aff34174ab The new EWTS rule regarding R, W, and Y requires that these change. It
may also require changes to the following, but I'm going to ask if it really
should or not.

// Y+Y~185,3~~6,98~1,109~6,120~1,123~1,125~6,106~6,113~f61,fbb
// Y+r~186,3~~6,99~1,109~6,120~1,123~1,125~6,106~6,113~f61,fb2
// Y+w~187,3~~6,100~1,109~6,120~1,123~1,125~6,106~6,113~f61,fad
// Y+s~188,3~~6,101~1,109~6,120~1,123~1,125~6,106~6,113~f61,fb6

// W+y~69,4~~7,79~1,109~8,121~1,123~1,125~8,107~8,114~f5d,fb1
// W+r~70,4~~7,80~1,109~8,121~1,123~1,125~8,107~8,114~f5d,fb2
// W+n~195,4~~7,81~1,109~8,120~1,123~1,125~8,106~8,113~f5d,fa3
// W+W~194,4~~7,82~1,109~8,120~1,123~1,125~8,106~8,113~f5d,fba
2004-04-08 02:55:59 +00:00
dchandler 76356f4009 ACIP->Tibetan now gives an error when {?} is seen alone (not in {[?]} or {[*FOO?]}, but alone). Bug 860192 is fixed. 2004-03-15 00:49:01 +00:00
dchandler 542fb50bf1 The ~M and ~M` EWTS change had not fully been made. Someone submitted a bug report 911472 that alerted me to this. 2004-03-07 17:02:35 +00:00
dchandler e0928d8472 New EWTS for 0F82 and 0F83. 2004-03-06 23:00:40 +00:00
amontano bb8fa6c58f Now the clear button in the http servlet version actually clears. Also added "synchronized" to some methods to ensure that concurrent threads don't crash. 2004-03-03 00:33:18 +00:00
dchandler d436a4d462 Removed David Chapman's recently added line for U+0F82 -- a line for U+0F82 already existed, and the new line had incorrect TM and incorrect TMW mappings. I changed the existing line for U+0F82 to use the EWTS {~M`}. 2004-03-02 04:29:41 +00:00
a1tsal 8eaaeaa202 Fix careless error: I had the same TMW character for ~M and ~M`! 2004-02-22 09:14:56 +00:00
a1tsal b14833b5b9 Change ^M to ~M to conform to spec.
Introduce ~M` (for 0F82).
2004-02-20 15:07:49 +00:00
amontano e5454d3720 Updated the translation tool to conform to the Personal Profile specification of Java.
Before, it would run on Pocket PCs through the more restricted PersonalJava specification,
but Sun's VM project for Pocket PCs was terminated. Now it is designed to run under
IBM's VM for Pocket PCs, called J9, which implements the Personal Profile specification.
That specification also supports AWT but not Swing, so there is still no (hope for) support
of Tibetan script on Pocket PCs.
2004-02-07 18:21:17 +00:00
dchandler 274e1736be Deleted cut-and-paste goof. 2004-01-17 19:45:31 +00:00
dchandler c69ba26c60 TString now tracks which Roman transliteration system it is using. Next up is to make ACIPConverter handle EWTS or ACIP TStrings. 2004-01-17 19:28:54 +00:00
dchandler 48b4c5cb07 Added a Unicode->ASCII dump for debugging *->Unicode conversions. To use it, use 'java -cp Jskad.jar org.thdl.util.VerboseUnicodeDump'. 2004-01-17 17:10:12 +00:00
dchandler 6fdb2a26bb Added a Unicode->ASCII dump for debugging *->Unicode conversions. To use it, use 'java -cp Jskad.jar org.thdl.util.VerboseUnicodeDump'. 2004-01-17 16:52:38 +00:00
dchandler 9dd95c5524 I saw this error when I wasn't expecting it, so now, curious, I print more details. 2004-01-17 16:51:33 +00:00
dchandler 4dd40809a5 A user reported that q` caused a crash with TCC keyboard #1. Fixed. TCC keyboard #1 does not support q~ though. 2003-12-21 06:27:36 +00:00
dchandler c1aa81e943 RFE 860190: ACIP->Unicode now gives a warning when it outputs something that can't be represented in TMW. 2003-12-16 07:45:40 +00:00
dchandler 848349fd3a More tests. 2003-12-15 08:16:06 +00:00
dchandler e7a9e7968f ACIP->Unicode now uses two characters for consonants instead of one. This matches the dislike for characters like U+0F77 etc.
ACIP->Tibetan was not giving an error for BCWA because it parsed like BCVA.  Fixed.
2003-12-15 07:32:14 +00:00
dchandler e9f7b2dfed If you want curly brackets around folio markers, you'll have to set
the system property
thdl.acip.to.x.output.curly.brackets.around.folio.markers to true.
2003-12-14 08:47:03 +00:00
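
Reading a boolean system property such as the one named in the commit above is typically done with `Boolean.getBoolean`. The sketch below shows that pattern only; the class, the `formatFolioMarker` helper, and the sample marker text are hypothetical and do not reflect how the converter itself uses the property.

```java
// Hypothetical illustration of consulting the system property named in the commit.
class FolioMarkerOption {
    static final String PROPERTY =
        "thdl.acip.to.x.output.curly.brackets.around.folio.markers";

    // Boolean.getBoolean returns true only if the property is set to "true"
    // (case-insensitive); an unset property counts as false.
    static boolean curlyBracketsEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    // Wraps the marker in curly brackets only when the property is enabled.
    static String formatFolioMarker(String marker) {
        return curlyBracketsEnabled() ? "{" + marker + "}" : marker;
    }

    public static void main(String[] args) {
        // The marker text is an arbitrary example.
        System.out.println(formatFolioMarker("@001A"));
    }
}
```

A system property like this is normally supplied on the command line with `-D`, for example `java -Dthdl.acip.to.x.output.curly.brackets.around.folio.markers=true ...`.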
dchandler 8664571577 Warnings were not being detected correctly. Fixed.
ACIP->Unicode uses U+0020, ' ', for whitespace.  ACIP->TMW uses the
TMW whitespace for whitespace.
2003-12-14 08:38:10 +00:00
dchandler 01e65176d4 Using less memory and time to figure out if warnings occurred. 2003-12-14 07:41:15 +00:00
dchandler 76c2e969ac Fixed ACIP->Unicode bug for YYE etc., things with full-formed
subjoined consonants and vowels.

Fixed ACIP->TMW for YYA etc., things with full-formed subjoined
consonants.
2003-12-14 07:36:21 +00:00
dchandler f625c937ee ACIP {B} was not being treated like {BA}; instead, an error was resulting. All the five prefixes were affected. 2003-12-14 05:54:07 +00:00
dchandler a0e6db11c0 Very minor cleanup. 2003-12-13 21:59:31 +00:00
dchandler 4c30657afa Adding tests for an ACIP keyboard that will never work correctly, and
probably never even be useful.  But they were lying around from a
while back, so here are the tests.
2003-12-13 21:34:33 +00:00
dchandler 02967539b0 Slightly improved Jskad's internal documentation. Links to converters' docs. 2003-12-10 07:04:35 +00:00
dchandler 581643cf59 {DAN,\nLHAG} used to be treated like {DAN, LHAG} but that got broken. Fixed.
Added tests for lexer's handling of ACIP spaces etc.
2003-12-10 06:55:16 +00:00
dchandler 8e673bbc2c {NGA,} becomes {NGA\u0f0c,} now instead of {NGA\u0f0b,}.
Note: ACIP->Unicode for {NGA,} was not giving the Unicode that {NGA\u0f0b,} gives before.
2003-12-10 06:50:14 +00:00
dchandler a466bad939 ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing. 2003-12-08 07:51:45 +00:00
dchandler a39c5c12b0 ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing. 2003-12-08 07:15:27 +00:00
dchandler 8f7322a056 Use absolute paths when invoking the external viewer; it doesn't know what our current working directory is. 2003-12-08 06:53:37 +00:00
dchandler b617f761d5 ACIP->TMW for {^GONG SA } used to fail; fixed. 2003-12-07 20:05:41 +00:00
dchandler 115534e688 ACIP->TMW for {^GONG SA } used to fail because we had \u0F38 in the ToWylie section. Now it's in the <?Input:Numbers?> section because I didn't want to introduce a new section. If WylieWord has trouble due to this misuse of the 'numbers' category, we'll introduce a new category, 'other'.
TMW->EWTS improved as a result -- {\u0F38.gonga sa } is produced now where {\u0F38agonga sa } was once produced.  Even the better version is imperfect; see bug 855877.
2003-12-07 19:40:59 +00:00
dchandler 597cf408dd Fixed help message. 2003-12-07 19:10:36 +00:00
dchandler 4adf87c401 Updated comments only. 2003-12-06 20:36:56 +00:00
dchandler 3f18623977 Added comments only. 2003-12-06 20:26:45 +00:00
dchandler 6232ee9170 Added comments referring to a user guide in development now. 2003-12-06 20:26:15 +00:00
dchandler c43e9a446b Revamped some ACIP->Tibetan error messages. 2003-12-06 20:19:40 +00:00
dchandler c9c771d1ee ACIP {&}, as in {KO&HAm,}, is supported. 2003-11-30 02:18:59 +00:00
dchandler ac412c994b Now {Pm} is treated like {PAm}; {Pm:} is like {PAm:}; {P:} is like {PA:}. 2003-11-30 02:06:48 +00:00
dchandler e7c4cc1874 Updated to be in sync with latest EWTS draft. 2003-11-29 22:59:39 +00:00
dchandler ffd041e32c ACIP->TMW and ACIP->Unicode now allow for Unicode escapes like K\u0F84. This means that the lack of support for ACIP's backslash, '\\', is mitigated because you can turn ACIP {K\} into ACIP {K\u0F84}.
Support for U+F021-U+F0FF, the PUA that the latest EWTS uses, is not provided.

Also, we've traded some speed for memory -- DuffCode now uses bytes, not ints.
2003-11-29 22:57:12 +00:00
dchandler dfaae4be93 ACIP->TMW and ACIP->Unicode now allow for Unicode escapes like K\u0F84. This means that the lack of support for ACIP's backslash, '\\', is mitigated because you can turn ACIP {K\} into ACIP {K\u0F84}.
Support for U+F021-U+F0FF, the PUA that the latest EWTS uses, is not provided.
2003-11-29 22:56:18 +00:00
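
The Unicode escapes mentioned in the two commits above (a backslash, `u`, and four hex digits embedded in the ACIP input) can be decoded with a regular expression. This is a minimal sketch under that assumption, with hypothetical names; it is not the lexer's actual implementation and does no validation beyond the four-hex-digit shape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical decoder for backslash-u escapes with four hex digits inside transliterated text.
class UnicodeEscapes {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    // Replaces each escape with the character it names; other text is copied as is.
    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or '\' being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below holds the seven characters K, backslash, u, 0, F, 8, 4.
        String decoded = decode("K\\u0F84");
        System.out.println(decoded.length()); // K plus one decoded character
    }
}
```

Decoding before conversion is one possible design; the messages above do not say at which stage the converters interpret the escapes.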
dchandler 946d8cbc72 Updated the code I used for testing to generate the file containing all glyphs in TM and all glyphs but one in TMW. 2003-11-29 16:22:26 +00:00
dchandler 16bfeac641 These issues are non-issues; removing these comments. 2003-11-25 00:31:33 +00:00
dchandler d3d0ff23a8 Chris Fynn and Tony Duff answered my questions about U+0F3F and U+0F3E. 2003-11-25 00:28:18 +00:00
dchandler b8608797aa Updated the code I used for testing to generate the file containing all glyphs in TM and all glyphs but one in TMW. 2003-11-24 05:59:32 +00:00
dchandler 8d18ac53cb N+D+Ya, not N+D+ya, w+Wa, not w+wa .. use W, R, and Y where appropriate.
Found another inconsistency between Unicode and the TM/TMW docs.  I've sent e-mail to Tony Duff asking who's right, but I'm putting this in the errata under the assumption that even if Unicode is wrong, Unicode's wrong view will somehow rule the day.

Also, TMW->EWTS now generates \uF021-\uF0FF or \u0F00-\u0FFF escapes when appropriate.  A few TMW glyphs still give errors.

Also, there's now a test to be sure that TM<->TMW and TMW->EWTS won't break in the future (except for the one glyph in TMW that isn't in TM, that one isn't tested).  The baselines have not been hand-verified, but changes will be detected.
2003-11-24 05:50:42 +00:00
dchandler 5d053b41fe Found another inconsistency between Unicode and the TM/TMW docs. I've sent e-mail to Tony Duff asking who's right, but I'm putting this in the errata under the assumption that even if Unicode is wrong, Unicode's wrong view will somehow rule the day.
Also, TMW->EWTS now generates \uF021-\uF0FF or \u0F00-\u0FFF escapes when appropriate.  A few TMW glyphs still give errors.

Also, there's now a test to be sure that TM<->TMW and TMW->EWTS won't break in the future (except for the one glyph in TMW that isn't in TM, that one isn't tested).  The baselines have not been hand-verified, but changes will be detected.
2003-11-24 05:49:15 +00:00
dchandler 9a247f5932 N+D+Ya, not N+D+ya, w+Wa, not w+wa .. use W, R, and Y where appropriate. 2003-11-24 04:55:11 +00:00
dchandler 1ec668c018 Dza is not in the latest EWTS draft. 2003-11-24 04:28:55 +00:00
dchandler f76c089366 Using Y, R, and W everywhere needed. R+... is never needed in TM/TMW, I concluded (with 50% certainty). 2003-11-24 04:05:59 +00:00
dchandler 08c676c186 Bug fixes. Plus, now 99% in sync with the new EWTS draft. Search for 'DLC' to find a few open issues.
Readded the line for reversed dza; it should never have been deleted, as that breaks TM<->TMW.  I tested the whole mapping by hand once; this incident shows that automation is very helpful.

'{' and '}' were swapped...

The Unicode for something was "", not "none".

+R, +W, +Y, R+ now in use (though more testing is needed)
2003-11-24 02:40:40 +00:00
dchandler 216c5b0d54 Fixed TMW->Wylie for achen. I even tested this by pretending achen could take a da prefix (when in reality it takes no prefixes). 2003-11-23 01:22:27 +00:00
dchandler 37e8dfa917 The menu now says (Buggy) in front of "Convert Selection from Wylie to Tibetan" because this feature is, you guessed it, buggy. 2003-11-22 22:48:41 +00:00
dchandler 113480a882 X is now better supported, so this changed. 2003-11-15 20:00:59 +00:00
dchandler 8d4fb5d13f We crashed before when '~' was entered. 2003-11-14 04:50:55 +00:00
dchandler b59b86fd73 Commented this to mention some recent testing. 2003-11-11 03:45:58 +00:00
dchandler 4023be9612 Better prettyprinting. Untested. 2003-11-11 03:43:26 +00:00
dchandler 4e6a9c299f ACIP % {MTHAR%} and o {Ko} and ^ {^GONG SA} are now supported. A % always causes a warning. 2003-11-11 03:43:11 +00:00
dchandler 2cb90bd231 ACIP->Tibetan converters now warn every time {%} is encountered that U+0F14 might've been intended.
The Unicode for ACIP {o} is U+0F37.
2003-11-09 23:15:58 +00:00
dchandler 084e12a02c Import Wylie is a buggy feature. The menu now calls it "(Buggy) Import Wylie...". t+s+w doesn't even convert correctly!
Bug-free EWTS->TMW using the org.thdl.tib.text.ttt codebase will be here soon.
2003-11-09 01:25:58 +00:00
dchandler 04816acb74 ACIP->Unicode was broken for KshR, ndRY, ndY, YY, and RY -- those
stacks that use full-form subjoined RA and YA consonants.

ACIP {RVA} was converting to the wrong things.

The TMW for {RVA} was converting to the wrong ACIP.

Checked all the 'DLC' tags in the ttt (ACIP->Tibetan) package.
2003-11-09 01:07:45 +00:00
dchandler 8193cef5d1 Better comments. 2003-11-09 01:07:07 +00:00
dchandler dbd9c80ca0 Special tests for rwa and r+wa, which are the only two different stacks with the same hash key modulo - and +. 2003-11-09 01:06:26 +00:00
dchandler 85e1e0701e Fixed crashing bug in Import Wylie. 2003-11-08 23:32:53 +00:00
dchandler 8fbd8850f8 New feature: Convert Selection from TMW to ACIP. 2003-11-08 23:22:06 +00:00
dchandler bab47c4910 There are now extensive tests to make sure that each Tibetan stack in TMW can be typed in using EWTS and correctly converted to TMW and then back to EWTS. These tests unearthed new bugs in the Tibetan! 5.1 docs. 2003-11-08 22:11:24 +00:00
dchandler 3fa417d3ee phywI, phywU, drwI and drwU now produce vowels and subjoined a-chungs. The Tibetan! 5.1 docs say I and U are not applicable to these stacks, but I say Jskad lets the user decide what's applicable. If you disagree, be sure to give an error message before dropping the I or U request -- we were silent. 2003-11-08 21:53:34 +00:00
dchandler e058d6252e phywu and drwu now produce zhabs-kyus. The Tibetan! 5.1 docs say the zhabs-kyu is not applicable to these stacks, but I say Jskad lets the user decide what's applicable. If you disagree, be sure to give an error message before dropping the zhabs-kyu request -- we were silent. 2003-11-08 21:48:08 +00:00
dchandler 55aaeef9d0 l+h+wu now produces a zhabs-kyu. The Tibetan! 5.1 docs say the zhabs-kyu is not applicable to l+h+w, but I say Jskad lets the user decide what's applicable. If you disagree, be sure to give an error message before dropping the zhabs-kyu request -- we were silent. 2003-11-08 21:23:50 +00:00
dchandler 06edf17b04 Once again, the wrong 'dreng-bu glyphs were listed in the Tibetan! 5.1 docs -- they were na-ro glyphs, actually. 2003-11-08 21:17:18 +00:00
dchandler f626a04d72 Tests t+r+n glyph. 2003-11-08 20:28:34 +00:00
dchandler 74d6bc61ab The wrong 'dreng-bu glyphs were listed in the Tibetan! 5.1 docs -- they were na-ro glyphs, actually. 2003-11-08 20:25:16 +00:00
dchandler a0ae0bf70d Fixes bug 800164. Jskad users can now enter t+r+n on the keyboard. Wylie Word should work for t+r+n too. 2003-11-08 17:50:10 +00:00
dchandler 0ac90d7c0f Nathanial -> Nathaniel 2003-11-08 03:42:51 +00:00
dchandler e3f1ed5914 Removed a DOS EOF character (^Z). I haven't a clue how it crept in -- the lexer doesn't let that kind of thing get into tsheg bars. 2003-10-27 13:58:45 +00:00
dchandler 94a43d3f39 Now anything not clearly native Tibetan is colored green when coloring is enabled. G'EEm is "native", though -- the only "vowel" that implies non-nativeness is {:}, as in {KA:}. 2003-10-26 18:56:48 +00:00
dchandler 5c36dd81d3 Fixed bug 830332, "Convert selected ACIP=>Tibetan busted". 2003-10-26 18:25:25 +00:00
dchandler e74547d743 GA-YOGS now parses like G-YOGS and GAYOGS do. 2003-10-26 18:06:38 +00:00
dchandler 61cf19932e ACIP {B5} and {7'} were problematic; that's fixed. 2003-10-26 17:47:35 +00:00
dchandler ad7b20e485 Added yet more metadata. 2003-10-26 16:05:30 +00:00
dchandler 1550fee41a Removed garbage. 2003-10-26 16:05:07 +00:00
dchandler fe33d67573 Added more metadata. There are 35 million+ tsheg bars here. 2003-10-26 15:35:08 +00:00
dchandler 050666d735 I'm committing this at 1:55 am EST on Sunday, October 26, 2003. There
is no compelling technical reason, but this way I get to have two
commits that are both before and after each other.

Freaky.
2003-10-26 06:56:12 +00:00