Jskad

Author	SHA1	Message	Date
dchandler	551f4f094e	More EWTS->Unicode tests.	2005-07-14 04:53:11 +00:00
dchandler	05214b8f14	EWTS->Uni was crashing for tabs.	2005-07-14 04:16:36 +00:00
dchandler	dc18165992	Added a class for performing EWTS->Unicode conversions during XSLT transformations. I haven't actually used it with Xalan XSLT yet, but it ought to work if TibetanHTML did (which it must have at one point). I do have a unit test, but an end-to-end test with Xalan is what we need.	2005-07-13 07:25:18 +00:00
dchandler	8ccd68789a	Since I had Eclipse fired up, I had it automatically organize the imports. It made two errors, but the compiler found them. I've cvs tagged the tree before doing this, just in case.	2005-07-11 03:10:32 +00:00
dchandler	6d419fe641	Numerous EWTS->Unicode and especially EWTS->TMW improvements. Fixed ordering of Unicode wowels; [ku+A] gives the correct Unicode now, for example. EWTS->TMW looks better for some wacky wowels like, I'm guessing here, [ku+A]. EWTS->TMW should now give errors any time the full input isn't used. Previously, wacky wowels like [kai+-i] would lead to some droppage. EWTS->TMW->Unicode testing is now in effect. This found a ton of EWTS->TMW bugs, most or all of which are fixed now. TMW->Unicode is improved/fixed for { \u5350,\u534D,\u0F88+k,\u0F88+kh,U }. (Why U? "\u0f75" is discouraged in favor of "\u0f71\u0f74".) NOTE: TMW_RTF_TO_THDL_WYLIETest is still disabled for the nightly builds' sake, but I ran it in my sandbox and it passed.	2005-07-11 02:51:06 +00:00
dchandler	36122778b4	EWTS->TMW works now for [#] and for [//].	2005-07-10 05:36:35 +00:00
dchandler	33fc836e81	EWTS->Unicode for // now produces \u0f0e as it should.	2005-07-10 05:01:03 +00:00
dchandler	cddbbae9a1	Bulletproofed EWTS->Tibetan against nasty pseudo-EWTS like [RAM]. Renamed recoverACIP methods.	2005-07-07 02:54:36 +00:00
dchandler	982350371d	EWTS->TMW fixes. Wowel handling still isn't perfect but I'm lazy. Jskad now uses the new EWTS->TMW routine, not the old, and thus the "(Buggy)" label is [unfairly, perhaps] dropped.	2005-07-07 01:30:03 +00:00
dchandler	0f99c402df	My last commit left the tests broken. Doh. Also, I'm enabling EWTS->Tibetan converters in the GUI so that I can ask folks to try them out.	2005-07-06 22:55:19 +00:00
dchandler	b74af71efc	Better, but still flawed, handling of EWTS [^] (i.e., U+0F39).	2005-07-06 22:26:55 +00:00
dchandler	f5d87ab226	Fixed EWTS->Tibetan [g.yogs] bug.	2005-07-06 18:37:22 +00:00
dchandler	63ff0fb0c9	Fixed important EWTS->Tibetan conversion bugs. [g.yogs] (and maybe [hUM^]) are not yet converting correctly. I have not yet committed the end-to-end test that I'm manually doing to find these problems. It will be another document for TMW_RTF_TO_THDL_WYLIETest.java. Note that thdl.debug=true is essential to access the GUI for the EWTS->* converters.	2005-07-06 07:46:21 +00:00
dchandler	0b3a636f63	Tremendously better EWTS->Unicode and EWTS->TMW conversion, though still not tested end-to-end and without perfect unit tests. See EWTSTest.RUN_FAILING_TESTS, for example, to find imperfection.	2005-07-06 02:19:38 +00:00
dchandler	1062ce9b6a	Trying to make the tests run on thdl.org's servers. Yesterday's change didn't do it; maybe this will but it's just a guess as I can't log on to their servers without time and effort. Reverting yesterday's change since it didn't matter.	2005-06-23 18:46:39 +00:00
dchandler	b9f4ed21ab	Disabling a test that ran for way too long on thdl.org servers	2005-06-22 19:01:17 +00:00
dchandler	2678fc134a	Added UI for EWTS->Tibetan conversions. GUI is disabled except in debug mode for now. I tested against a really simple-but-real document, found a bug with '*', tried to implement TMW vowel code but I don't trust it yet. Differentiated EWTS code from ACIP where needed. Several bugs in ewts->tibetan have been exposed; see the TODO comments.	2005-06-20 09:30:35 +00:00
dchandler	7198f23361	I really hesitate to commit this because I'm not sure what it brings to the table exactly and I fear that it makes the ACIP->Tibetan converter code a lot uglier. The TODO(DLC)[EWTS->Tibetan] comments littered throughout are part of the ugliness; they point to the ugliness. If each were addressed, cleanliness could perhaps be achieved. I've largely forgotten exactly what this change does, but it attempts to improve EWTS->Tibetan conversion. The lexer is probably really, really primitive. I concentrate here on converting a single tsheg bar rather than a whole document. Eclipse was used during part of my journey here and some imports were reorganized merely because I could. :) (Eclipse was needed when the usual ant build failed to run a new test EWTSTest. And I wanted its debugger.) Next steps: end-to-end EWTS tests should bring many problems to light. Fix those. Triage all the TODO comments. I don't know that I'll ever really trust the implementation. The tests are valuable, though. A clean implementation of EWTS->Tibetan in Jython might hold enough interest for me; I'd like to learn Python.	2005-06-20 06:18:00 +00:00
dchandler	c16f633ecf	Two things: One, TMW->EWTS gives dbas and dngas instead of dabs and dangs because Chris Fynn's e-mail from today has dbas and dngas. Second, Down with ACIPRules. Long live ACIPTraits. EWTS->Tibetan conversion is closer still.	2005-02-22 04:36:54 +00:00
dchandler	4c268c5ea2	Refactored so that there can be an EWTS scanner and an ACIP scanner.	2005-02-21 05:37:01 +00:00
dchandler	3e0168b384	Renamed ACIPConverter to TConverter. Added a needed parameter (the only needed parameter in that class's interface AFAIK).	2005-02-21 01:35:23 +00:00
dchandler	37bf9a736d	I did this stuff back in August. It's all in support of EWTS->Tibetan conversion. The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the place. EWTS->Tibetan isn't here yet; lexing isn't here yet; this is mainly a refactoring so that the ACIP->Tibetan code can be reused to do EWTS->Tibetan. I'm committing this because tests pass (it shouldn't be breaking anything), because I want a checkpoint, and because the laptop this sandbox was on isn't my preferred development environment.	2005-02-21 01:16:10 +00:00
dchandler	9025fb42d6	TMW->EWTS 998476 partial fix: "aM" is generated now correctly. Before you got "M".	2005-02-07 04:00:42 +00:00
dchandler	8dcb623382	TMW->EWTS: Fixed part of bug 998476 and part of an undocumented bug. Discovered a new bug, "aM" should be generated but only "M" is. The undocumented bug was that laMA was generated when lAM should have been. The part of bug 998476 that was fixed: laM, laH, etc. are now generated. This does nothing about paN etc. Some refactoring here; this is not a minimal diff. Added tests of TMW->EWTS that use ACIP to get the TMW in place because EWTS->TMW is a faulty keyboard at present.	2005-02-07 03:17:40 +00:00
eg3p	a39d5c2ba3	changed all occurrences of 'Color.BLACK' to 'Color.black', since the former causes a runtime error on Mac OS X (for Java 1.3.1 at least). not sure why this is, but may be related to the bug mentioned at http://www.oxygenxml.com/forum/viewtopic.php?p=239	2004-08-19 14:59:06 +00:00
dchandler	7acbce3361	Added errors 142 and 143, which are produced when converting yig chung to a Unicode text file, which cannot support font size changes.	2004-06-06 21:59:16 +00:00
dchandler	df262aa148	It is now a compile-time option whether to treat []- and {}-bracketed sequences as text to be passed through (without the brackets in the case of {}) literally, which is the case by default because Robert Chilton requested it, or the old, ad-hoc mechanism which could be useful for finding some ugly input. Made a couple of error messages a little more verbose now that we have short-message mode.	2004-06-06 21:39:06 +00:00
dchandler	8a9271a3d8	I broke warning 507 into two warnings, one high-priority (512) and one low-priority (507).	2004-05-01 20:49:53 +00:00
dchandler	1a055f3472	I don't think warning level "None" was really doing the trick. Fixed that. You can now customize the severities of all warnings, even 504 and 510. When warning level is "None", scanning, i.e. lexical analysis, is faster.	2004-04-25 00:37:57 +00:00
dchandler	e2d42f36eb	Robert Chilton's experience inspired me to make the handling of errors and warnings in ACIP->Tibetan conversion much more configurable. You can now choose from short or long error messages, for one thing. You can change the severity of almost all warnings. Each error and warning has an error code. Errors and warnings are better tested. The converter GUI has a new checkbox for short messages; the converter CLI has a new mandatory option for short messages. I also fixed a bug whereby certain errors were not being appended to the 'errors' StringBuffer.	2004-04-24 17:49:16 +00:00
dchandler	0ee90a0fb0	Added many ACIP->TMW->ACIP tests. They found no bugs.	2004-04-17 17:28:26 +00:00
dchandler	de3a19761e	Fixes for javadoc tool.	2004-04-17 15:48:50 +00:00
dchandler	adcf9de952	Two new tests.	2004-04-17 15:14:46 +00:00
dchandler	1bfd3772e6	TMW->ACIP is much improved. V and W were confused, # and * were confused; many glyphs that should have yielded errors were not. I've added a test case that transforms every TMW glyph save the one with no TM mapping to ACIP. I hand-checked that it was correct. ACIP->TMW is fixed for # and *. I never noticed it, but each needed an extra swoosh (U+0F05). Round-tripping would be good, as would testing real-world use of TMW->ACIP.	2004-04-14 05:44:51 +00:00
dchandler	7eca276a62	TMW->Unicode conversions have changed; now using U+0F6A for the stacks whose EWTS transliteration begins with "R+". ACIP->* conversions and test baselines were updated to deal with the "r+..."=>"R+..." change.	2004-04-10 16:03:25 +00:00
dchandler	76356f4009	ACIP->Tibetan now gives an error when {?} is seen alone (not in {[?]} or {[*FOO?]}, but alone). Bug 860192 is fixed.	2004-03-15 00:49:01 +00:00
dchandler	274e1736be	Deleted cut-and-paste goof.	2004-01-17 19:45:31 +00:00
dchandler	c69ba26c60	TString now tracks which Roman transliteration system it is using. Next up is to make ACIPConverter handle EWTS or ACIP TStrings.	2004-01-17 19:28:54 +00:00
dchandler	c1aa81e943	RFE 860190: ACIP->Unicode now gives a warning when it outputs something that can't be represented in TMW.	2003-12-16 07:45:40 +00:00
dchandler	848349fd3a	More tests.	2003-12-15 08:16:06 +00:00
dchandler	e7a9e7968f	ACIP->Unicode now uses two characters for consonants instead of one. This matches the dislike for characters like U+0F77 etc. ACIP->Tibetan was not giving an error for BCWA because it parsed like BCVA. Fixed.	2003-12-15 07:32:14 +00:00
dchandler	e9f7b2dfed	If you want curly brackets around folio markers, you'll have to set the system property thdl.acip.to.x.output.curly.brackets.around.folio.markers to true.	2003-12-14 08:47:03 +00:00
dchandler	8664571577	Warnings were not being detected correctly. Fixed. ACIP->Unicode uses U+0020, ' ', for whitespace. ACIP->TMW uses the TMW whitespace for whitespace.	2003-12-14 08:38:10 +00:00
dchandler	01e65176d4	Using less memory and time to figure out if warnings occurred.	2003-12-14 07:41:15 +00:00
dchandler	76c2e969ac	Fixed ACIP->Unicode bug for YYE etc., things with full-formed subjoined consonants and vowels. Fixed ACIP->TMW for YYA etc., things with full-formed subjoined consonants.	2003-12-14 07:36:21 +00:00
dchandler	f625c937ee	ACIP {B} was not being treated like {BA}; instead, an error was resulting. All the five prefixes were affected.	2003-12-14 05:54:07 +00:00
dchandler	581643cf59	{DAN,\nLHAG} used to be treated like {DAN, LHAG} but that got broken. Fixed. Added tests for lexer's handling of ACIP spaces etc.	2003-12-10 06:55:16 +00:00
dchandler	8e673bbc2c	{NGA,} becomes {NGA\u0f0c,} now instead of {NGA\u0f0b,}. Note: ACIP->Unicode for {NGA,} was not giving the Unicode that {NGA\u0f0b,} gives before.	2003-12-10 06:50:14 +00:00
dchandler	a466bad939	ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing.	2003-12-08 07:51:45 +00:00
dchandler	a39c5c12b0	ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing.	2003-12-08 07:15:27 +00:00

116 commits