Jskad

Author	SHA1	Message	Date
dchandler	3e0168b384	Renamed ACIPConverter to TConverter. Added a needed parameter (the only needed parameter in that class's interface AFAIK).	2005-02-21 01:35:23 +00:00
dchandler	37bf9a736d	I did this stuff back in August. It's all in support of EWTS->Tibetan conversion. The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the place. EWTS->Tibetan isn't here yet; lexing isn't here yet; this is mainly a refactoring so that the ACIP->Tibetan code can be reused to do EWTS->Tibetan. I'm committing this because tests pass (it shouldn't be breaking anything), because I want a checkpoint, and because the laptop this sandbox was on isn't my preferred development environment.	2005-02-21 01:16:10 +00:00
eg3p	a39d5c2ba3	changed all occurrences of 'Color.BLACK' to 'Color.black', since the former causes a runtime error on Mac OS X (for Java 1.3.1 at least). not sure why this is, but may be related to the bug mentioned at http://www.oxygenxml.com/forum/viewtopic.php?p=239	2004-08-19 14:59:06 +00:00
dchandler	7acbce3361	Added errors 142 and 143, which are produced when converting yig chung to a Unicode text file, which cannot support font size changes.	2004-06-06 21:59:16 +00:00
dchandler	1a055f3472	I don't think warning level "None" was really doing the trick. Fixed that. You can now customize the severities of all warnings, even 504 and 510. When warning level is "None", scanning, i.e. lexical analysis, is faster.	2004-04-25 00:37:57 +00:00
dchandler	e2d42f36eb	Robert Chilton's experience inspired me to make the handling of errors and warnings in ACIP->Tibetan conversion much more configurable. You can now choose from short or long error messages, for one thing. You can change the severity of almost all warnings. Each error and warning has an error code. Errors and warnings are better tested. The converter GUI has a new checkbox for short messages; the converter CLI has a new mandatory option for short messages. I also fixed a bug whereby certain errors were not being appended to the 'errors' StringBuffer.	2004-04-24 17:49:16 +00:00
dchandler	1bfd3772e6	TMW->ACIP is much improved. V and W were confused, # and * were confused; many glyphs that should have yielded errors were not. I've added a test case that transforms every TMW glyph save the one with no TM mapping to ACIP. I hand-checked that it was correct. ACIP->TMW is fixed for # and *. I never noticed it, but each needed an extra swoosh (U+0F05). Round-tripping would be good, as would testing real-world use of TMW->ACIP.	2004-04-14 05:44:51 +00:00
dchandler	c1aa81e943	RFE 860190: ACIP->Unicode now gives a warning when it outputs something that can't be represented in TMW.	2003-12-16 07:45:40 +00:00
dchandler	e9f7b2dfed	If you want curly brackets around folio markers, you'll have to set the system property thdl.acip.to.x.output.curly.brackets.around.folio.markers to true.	2003-12-14 08:47:03 +00:00
dchandler	8664571577	Warnings were not being detected correctly. Fixed. ACIP->Unicode uses U+0020, ' ', for whitespace. ACIP->TMW uses the TMW whitespace for whitespace.	2003-12-14 08:38:10 +00:00
dchandler	01e65176d4	Using less memory and time to figure out if warnings occurred.	2003-12-14 07:41:15 +00:00
dchandler	8e673bbc2c	{NGA,} becomes {NGA\u0f0c,} now instead of {NGA\u0f0b,}. Note: ACIP->Unicode for {NGA,} was not giving the Unicode that {NGA\u0f0b,} gives before.	2003-12-10 06:50:14 +00:00
dchandler	a39c5c12b0	ACIP->TMW now supports EWTS PUA {\uF021}-style escapes. Our extended ACIP is thus TMW-complete and useful for testing.	2003-12-08 07:15:27 +00:00
dchandler	c43e9a446b	Revamped some ACIP->Tibetan error messages.	2003-12-06 20:19:40 +00:00
dchandler	ac412c994b	Now {Pm} is treated like {PAm}; {Pm:} is like {PAm:}; {P:} is like {PA:}.	2003-11-30 02:06:48 +00:00
dchandler	dfaae4be93	ACIP->TMW and ACIP->Unicode now allow for Unicode escapes like K\u0F84. This means that the lack of support for ACIP's backslash, '\\', is mitigated because you can turn ACIP {K\} into ACIP {K\u0F84}. Support for U+F021-U+F0FF, the PUA that the latest EWTS uses, is not provided.	2003-11-29 22:56:18 +00:00
dchandler	4e6a9c299f	ACIP % {MTHAR%} and o {Ko} and ^ {^GONG SA} are now supported. A % always causes a warning.	2003-11-11 03:43:11 +00:00
dchandler	04816acb74	ACIP->Unicode was broken for KshR, ndRY, ndY, YY, and RY -- those stacks that use full-form subjoined RA and YA consonants. ACIP {RVA} was converting to the wrong things. The TMW for {RVA} was converting to the wrong ACIP. Checked all the 'DLC' tags in the ttt (ACIP->Tibetan) package.	2003-11-09 01:07:45 +00:00
dchandler	94a43d3f39	Now anything not clearly native Tibetan is colored green when coloring is enabled. G'EEm is "native", though -- the only "vowel" that implies non-nativeness is {:}, as in {KA:}.	2003-10-26 18:56:48 +00:00
dchandler	5c36dd81d3	Fixed bug 830332, "Convert selected ACIP=>Tibetan busted".	2003-10-26 18:25:25 +00:00
dchandler	7d24ab393f	Code cleanup.	2003-10-21 03:44:02 +00:00
dchandler	2f81a801ef	Added three new kinds of warnings to ACIP->Tibetan conversions.	2003-10-21 02:00:49 +00:00
dchandler	3aa3859354	ACIP->Unicode crash fixed. 5% of the code for support of ACIP->Unicode.rtf is here.	2003-10-19 22:19:16 +00:00
dchandler	5aab4acc93	I've undone the SNYAM'AM == SNYAMA'AM hack. The only occurrence of SNYAM'AM in the ACIP texts I've got is likely a typo, says Robert Chilton. The code would be cleaner if I could bear to delete my terrible hack. Maybe in a month, when I don't feel so dumb for coding it up in the first place. The correct solution for such things is to give the ACIP->Tibetan converters a pre-filter mechanism. This would be before the lexer or part of the lexer (maybe you only want to filter tsheg bars), and it would allow the end user to specify things like "s/SNYAM'AM/S+NYAMA'AMA/g".	2003-10-19 20:48:22 +00:00
dchandler	4b1395e0ba	Jskad has a new feature: Convert Selection from ACIP to Tibetan. It uses the ACIP converter to do its work. Improved some error messages from the ACIP->Tibetan converter.	2003-10-19 20:16:06 +00:00
dchandler	5ce84d4d9a	Tiny code cleanup.	2003-10-19 04:43:34 +00:00
dchandler	5e18feb47d	ACIP now stacks greedily. TTTTTA is T+T+T+T+TA, even though that stack doesn't exist in TM or TMW. Robert Chilton, in personal correspondence, agreed that this is the way to do things. ACIP handles the appendages 'AM, 'ANG, 'US, 'UR, 'I, 'O, and 'U correctly.	2003-10-16 04:15:10 +00:00
dchandler	6a11eddb1e	Warning level "None" wasn't working.	2003-10-04 16:12:48 +00:00
dchandler	ee50291ed4	Andres found that "THAG PA" caused a NullPointerException. That's fixed. Renamed ACIPString to TString -- we'll use this for EWTS and ACIP both. TMW->ACIP for TMW9.61 should work now.	2003-10-04 01:22:59 +00:00
dchandler	115d0e0e6c	Fixed ACIP->TMW vowels like 'I etc. Fixed ACIP->Unicode/TMW for BDE, which should be B-DE, not B+DE, because the former is legal Tibetan. The ACIP->EWTS subroutine has improved. TMW->Wylie and TMW->ACIP are improved in error cases. TMW->ACIP has friendly embedded error messages now.	2003-09-12 05:06:37 +00:00
dchandler	e42d76b3b8	Nicer default Latin font for ACIP->* conversions. Performance improvement in non-color-coding mode.	2003-09-07 22:08:35 +00:00
dchandler	d8657abd44	ACIP font shrinking as in {KA (GA)} is now supported.	2003-09-07 18:30:59 +00:00
dchandler	07e360d9a8	The ACIP {NYA%} is supported. {NYAo} and {NYAx} are confusing to me, because I don't know which glyphs o and x correspond to. For that reason, they cause ERRORs. The proposed THDL Extended Wylie ~X and X are now used for U+0F35 and U+0F37 respectively.	2003-09-07 16:19:50 +00:00
dchandler	0d6d6ed611	Added GUI support for color-coding. Added support for color-coding and choosing the warning level to TibetanConverter. Better error checking in the GUI converter.	2003-09-06 22:56:10 +00:00
dchandler	1308f14807	sanskrit=green, prefix-rule-afflicted-tsheg-bar=yellow	2003-09-05 06:05:46 +00:00
dchandler	899b042ec0	Preliminary, untested color support in ACIP->TMW conversion.	2003-09-05 05:54:35 +00:00
dchandler	717c3b94f3	Fixed ACIP->Unicode spaces/tshegs and newlines, especially with shads. "NGA," becomes "NGA-tsheg-," automatically now.	2003-09-05 05:08:47 +00:00
dchandler	cc615f34df	ACIP->TMW and ACIP->Unicode have my pre-stamp of non-approval. Except for {NYAx} and {NYAo}, they're as good as I'll get them without input from experts or the employ of a complementary, syllabary-based approach.	2003-09-04 04:34:18 +00:00
dchandler	d2749cecd0	ACIP->TMW and ACIP->Unicode are now smart about when a newline is really a newline and when a space is really a tsheg. The space in {KA ,MDO} is a tsheg, but the space in {GA ,MDO} is not.	2003-09-04 04:04:21 +00:00
dchandler	045c4069c9	Preliminary ACIP->TMW support is in place. {DU} gives you something less beautiful than what Jskad would give, so more work is needed.	2003-08-31 16:06:35 +00:00
dchandler	1982c5847b	Jskad's converter now has ACIP-to-Unicode built in. There are known bugs; it is pre-alpha. It's usable, though, and finds tons of errors in ACIP input files, with the user deciding just how pedantic to be. The biggest outstanding bug is the silent one: treating { }, space, as tsheg instead of whitespace when we ought to know better.	2003-08-24 06:40:53 +00:00
dchandler	d5ad760230	TMW->Wylie conversion now takes advantage of prefix rules, the rules that say "ya can take a ga prefix" etc. The ACIP->Unicode converter now gives warnings (optionally, and by default, inline). This converter now produces output even when lexical errors occur, but the output has errors and warnings inline.	2003-08-23 22:03:37 +00:00
dchandler	1afb3a0fdd	ACIP->Unicode, without going through TMW, is now possible, so long as \, the Sanskrit virama, is not used. Of the 1370-odd ACIP texts I've got here, about 57% make it through the gauntlet (fewer if you demand a vowel or disambiguator on every stack of a non-Tibetan tsheg bar).	2003-08-18 02:38:54 +00:00

43 commits