Jskad

Author	SHA1	Message	Date
dchandler	37bf9a736d	I did this stuff back in August. It's all in support of EWTS->Tibetan conversion. The tag 'TODO(DLC)[EWTS->Tibetan]' exists all over the place. EWTS->Tibetan isn't here yet; lexing isn't here yet; this is mainly a refactoring so that the ACIP->Tibetan code can be reused to do EWTS->Tibetan. I'm committing this because tests pass (it shouldn't be breaking anything), because I want a checkpoint, and because the laptop this sandbox was on isn't my preferred development environment.	2005-02-21 01:16:10 +00:00
dchandler	1bfd3772e6	TMW->ACIP is much improved. V and W were confused, # and * were confused; many glyphs that should have yielded errors were not. I've added a test case that transforms every TMW glyph save the one with no TM mapping to ACIP. I hand-checked that it was correct. ACIP->TMW is fixed for # and *. I never noticed it, but each needed an extra swoosh (U+0F05). Round-tripping would be good, as would testing real-world use of TMW->ACIP.	2004-04-14 05:44:51 +00:00
dchandler	7eca276a62	TMW->Unicode conversions have changed; now using U+0F6A for the stacks whose EWTS transliteration begins with "R+". ACIP->* conversions and test baselines were updated to deal with the "r+..."=>"R+..." change.	2004-04-10 16:03:25 +00:00
dchandler	e7a9e7968f	ACIP->Unicode now uses two characters for consonants instead of one. This matches the dislike for characters like U+0F77 etc. ACIP->Tibetan was not giving an error for BCWA because it parsed like BCVA. Fixed.	2003-12-15 07:32:14 +00:00
dchandler	b617f761d5	ACIP->TMW for {^GONG SA } used to fail; fixed.	2003-12-07 20:05:41 +00:00
dchandler	ac412c994b	Now {Pm} is treated like {PAm}; {Pm:} is like {PAm:}; {P:} is like {PA:}.	2003-11-30 02:06:48 +00:00
dchandler	4e6a9c299f	ACIP % {MTHAR%} and o {Ko} and ^ {^GONG SA} are now supported. A % always causes a warning.	2003-11-11 03:43:11 +00:00
dchandler	2cb90bd231	ACIP->Tibetan converters now warn every time {%} is encountered that U+0F14 might've been intended. The Unicode for ACIP {o} is U+0F37.	2003-11-09 23:15:58 +00:00
dchandler	04816acb74	ACIP->Unicode was broken for KshR, ndRY, ndY, YY, and RY -- those stacks that use full-form subjoined RA and YA consonants. ACIP {RVA} was converting to the wrong things. The TMW for {RVA} was converting to the wrong ACIP. Checked all the 'DLC' tags in the ttt (ACIP->Tibetan) package.	2003-11-09 01:07:45 +00:00
dchandler	557ed7ed44	DKY'O etc. weren't being handled properly by ACIP->Tibetan. Now they are.	2003-10-18 17:49:29 +00:00
dchandler	5e18feb47d	ACIP now stacks greedily. TTTTTA is T+T+T+T+TA, even though that stack doesn't exist in TM or TMW. Robert Chilton, in personal correspondence, agreed that this is the way to do things. ACIP handles the appendages 'AM, 'ANG, 'US, 'UR, 'I, 'O, and 'U correctly.	2003-10-16 04:15:10 +00:00
dchandler	ee50291ed4	Andres found that "THAG PA" caused a NullPointerException. That's fixed. Renamed ACIPString to TString -- we'll use this for EWTS and ACIP both. TMW->ACIP for TMW9.61 should work now.	2003-10-04 01:22:59 +00:00
dchandler	115d0e0e6c	Fixed ACIP->TMW vowels like 'I etc. Fixed ACIP->Unicode/TMW for BDE, which should be B-DE, not B+DE, because the former is legal Tibetan. The ACIP->EWTS subroutine has improved. TMW->Wylie and TMW->ACIP are improved in error cases. TMW->ACIP has friendly embedded error messages now.	2003-09-12 05:06:37 +00:00
dchandler	07e360d9a8	The ACIP {NYA%} is supported. {NYAo} and {NYAx} are confusing to me, because I don't know which glyphs o and x correspond to. For that reason, they cause ERRORs. The proposed THDL Extended Wylie ~X and X is now used for U+0F35 and U+0F37 respectively.	2003-09-07 16:19:50 +00:00
dchandler	717c3b94f3	Fixed ACIP->Unicode spaces/tshegs and newlines, especially with shads. "NGA," becomes "NGA-tsheg-," automatically now.	2003-09-05 05:08:47 +00:00
dchandler	5c240ac072	From the converter GUI, you can now choose TMW->ACIP text and TMW->Wylie text. All the conversions show you which format they take as input and which format they give as output. File filter for ACIP files added. The GUI converter suggests a file extension wisely. Fixed newline bug in ACIP->Unicode converter.	2003-09-05 02:05:34 +00:00
dchandler	cc615f34df	ACIP->TMW and ACIP->Unicode have my pre-stamp of non-approval. Except for {NYAx} and {NYAo}, they're as good as I'll get them without input from experts or the employ of a complementary, syllabary-based approach.	2003-09-04 04:34:18 +00:00
dchandler	045c4069c9	Preliminary ACIP->TMW support is in place. {DU} gives you something less beautiful than what Jskad would give, so more work is needed.	2003-08-31 16:06:35 +00:00
dchandler	21ef657921	I'd broken the ACIP->Wylie for ACIP vowels {'A}, {'I}, etc.	2003-08-22 05:13:32 +00:00
dchandler	1afb3a0fdd	ACIP->Unicode, without going through TMW, is now possible, so long as \, the Sanskrit virama, is not used. Of the 1370-odd ACIP texts I've got here, about 57% make it through the gauntlet (fewer if you demand a vowel or disambiguator on every stack of a non-Tibetan tsheg bar).	2003-08-18 02:38:54 +00:00
dchandler	e21d3774a9	Added an unfinished ACIP->Tibetan converter. Once it works properly for ACIP, it'll easily be made to work as a perfect EWTS Wylie->Tibetan converter. It has an extensive suite of tests for the existing functionality.	2003-08-10 19:30:07 +00:00

21 commits