Jskad

Author	SHA1	Message	Date
dchandler	6d419fe641	Numerous EWTS->Unicode and especially EWTS->TMW improvements. Fixed ordering of Unicode wowels. [ku+A] gives the correct Unicode now, e.g. EWTS->TMW looks better for some wacky wowels like, I'm guessing here, [ku+A]. EWTS->TMW should now give errors any time the full input isn't used. Previously, wacky wowels like [kai+-i] would lead to some droppage. EWTS->TMW->Unicode testing is now in effect. This found a ton of EWTS->TMW bugs, most or all of which are fixed now. TMW->Unicode is improved/fixed for { \u5350,\u534D,\u0F88+k,\u0F88+kh,U }. (Why U? "\u0f75" is discouraged in favor of "\u0f71\u0f74".) NOTE: TMW_RTF_TO_THDL_WYLIETest is still disabled for the nightly builds' sake, but I ran it in my sandbox and it passed.	2005-07-11 02:51:06 +00:00
dchandler	b74af71efc	Better, but still flawed, handling of EWTS [^] (i.e., U+0F39).	2005-07-06 22:26:55 +00:00
dchandler	c16f633ecf	Two things: One, TMW->EWTS gives dbas and dngas instead of dabs and dangs because Chris Fynn's e-mail from today has dbas and dngas. Second, Down with ACIPRules. Long live ACIPTraits. EWTS->Tibetan conversion is closer still.	2005-02-22 04:36:54 +00:00
dchandler	3e0168b384	Renamed ACIPConverter to TConverter. Added a needed parameter (the only needed parameter in that class's interface, AFAIK).	2005-02-21 01:35:23 +00:00
dchandler	b4155c3264	After a1tsal's changes to tibwn.ini, the tests failed. I'm a bit disheartened that more tests didn't fail.	2005-02-05 16:51:13 +00:00
a1tsal	28d46bb207	Add corrective comment regarding the bogus Unicode OM characters.	2005-01-19 01:07:52 +00:00
a1tsal	affccad9e5	Use 00A0 rather than 0020 for _, per unicode spec.	2005-01-17 08:58:56 +00:00
a1tsal	91d0e7f4da	Use "precomposed" sanskrit consonant combinations consistently throughout.	2005-01-17 08:49:04 +00:00
a1tsal	22cfee69db	Had wrong Unicode for n+n+y. Had wrong Unicode for space -- but only in comment. Cleaned up punctuation in another comment.	2005-01-16 01:17:44 +00:00
dchandler	e2d42f36eb	Robert Chilton's experience inspired me to make the handling of errors and warnings in ACIP->Tibetan conversion much more configurable. You can now choose from short or long error messages, for one thing. You can change the severity of almost all warnings. Each error and warning has an error code. Errors and warnings are better tested. The converter GUI has a new checkbox for short messages; the converter CLI has a new mandatory option for short messages. I also fixed a bug whereby certain errors were not being appended to the 'errors' StringBuffer.	2004-04-24 17:49:16 +00:00
dchandler	cc5d096918	David Chapman's latest fix to tibwn.ini (clearing up an issue that Than or I dropped the ball on) introduced two lines for 8,95. This is a bad thing, so I've taken out the second line. I've also introduced a check in TibetanMachineWeb.java such that we'll know that tibwn.ini has no such error in the future just by running 'ant clean jskad-run' and making sure that the GUI is indeed visible. I also updated the test baselines now that F03A and 0F82 are squared away.	2004-04-24 13:23:56 +00:00
a1tsal	9e071ea178	Differentiated 0F82 (~M`) and F03A (nyi.zla editor's mark).	2004-04-21 10:04:11 +00:00
dchandler	1bfd3772e6	TMW->ACIP is much improved. V and W were confused, # and * were confused; many glyphs that should have yielded errors were not. I've added a test case that transforms every TMW glyph save the one with no TM mapping to ACIP. I hand-checked that it was correct. ACIP->TMW is fixed for # and *. I never noticed it, but each needed an extra swoosh (U+0F05). Round-tripping would be good, as would testing real-world use of TMW->ACIP.	2004-04-14 05:44:51 +00:00
dchandler	7eca276a62	TMW->Unicode conversions have changed; now using U+0F6A for the stacks whose EWTS transliteration begins with "R+". ACIP->* conversions and test baselines were updated to deal with the "r+..."=>"R+..." change.	2004-04-10 16:03:25 +00:00
dchandler	aff34174ab	The new EWTS rule regarding R, W, and Y requires that these change. It may also require changes to the following, but I'm going to ask if it really should or not. // Y+Y~185,3~~6,98~1,109~6,120~1,123~1,125~6,106~6,113~f61,fbb // Y+r~186,3~~6,99~1,109~6,120~1,123~1,125~6,106~6,113~f61,fb2 // Y+w~187,3~~6,100~1,109~6,120~1,123~1,125~6,106~6,113~f61,fad // Y+s~188,3~~6,101~1,109~6,120~1,123~1,125~6,106~6,113~f61,fb6 // W+y~69,4~~7,79~1,109~8,121~1,123~1,125~8,107~8,114~f5d,fb1 // W+r~70,4~~7,80~1,109~8,121~1,123~1,125~8,107~8,114~f5d,fb2 // W+n~195,4~~7,81~1,109~8,120~1,123~1,125~8,106~8,113~f5d,fa3 // W+W~194,4~~7,82~1,109~8,120~1,123~1,125~8,106~8,113~f5d,fba	2004-04-08 02:55:59 +00:00
dchandler	d436a4d462	Removed David Chapman's recently added line for U+0F82 -- a line for U+0F82 already existed, and the new line had incorrect TM and incorrect TMW mappings. I changed the existing line for U+0F82 to use the EWTS {~M`}.	2004-03-02 04:29:41 +00:00
a1tsal	8eaaeaa202	Fix careless error: I had the same TMW character for ~M and ~M`!	2004-02-22 09:14:56 +00:00
a1tsal	b14833b5b9	Change ^M to ~M to conform to spec. Introduce ~M` (for 0F82).	2004-02-20 15:07:49 +00:00
dchandler	115534e688	ACIP->TMW for {^GONG SA } used to fail because we had \u0F38 in the ToWylie section. Now it's in the <?Input:Numbers?> section because I didn't want to introduce a new section. If WylieWord has trouble due to this misuse of the 'numbers' category, we'll introduce a new category, 'other'. TMW->EWTS improved as a result -- {\u0F38.gonga sa } is produced now where {\u0F38agonga sa } was once produced. Even the better version is imperfect; see bug 855877.	2003-12-07 19:40:59 +00:00
dchandler	3f18623977	Added comments only.	2003-12-06 20:26:45 +00:00
dchandler	16bfeac641	These issues are non-issues; removing these comments.	2003-11-25 00:31:33 +00:00
dchandler	d3d0ff23a8	Chris Fynn and Tony Duff answered my questions about U+0F3F and U+0F3E.	2003-11-25 00:28:18 +00:00
dchandler	5d053b41fe	Found another inconsistency between Unicode and the TM/TMW docs. I've sent e-mail to Tony Duff asking who's right, but I'm putting this in the errata under the assumption that even if Unicode is wrong, Unicode's wrong view will somehow rule the day. Also, TMW->EWTS now generates \uF021-\uF0FF or \u0F00-\u0FFF escapes when appropriate. A few TMW glyphs still give errors. Also, there's now a test to be sure that TM<->TMW and TMW->EWTS won't break in the future (except for the one glyph in TMW that isn't in TM, that one isn't tested). The baselines have not been hand-verified, but changes will be detected.	2003-11-24 05:49:15 +00:00
dchandler	9a247f5932	N+D+Ya, not N+D+ya, w+Wa, not w+wa .. use W, R, and Y where appropriate.	2003-11-24 04:55:11 +00:00
dchandler	f76c089366	Using Y, R, and W everywhere needed. R+... is never needed in TM/TMW, I concluded (with 50% certainty).	2003-11-24 04:05:59 +00:00
dchandler	08c676c186	Bug fixes. Plus, now 99% in sync with the new EWTS draft. Search for 'DLC' to find a few open issues. Readded the line for reversed dza; it should never have been deleted, as that breaks TM<->TMW. I tested the whole mapping by hand once; this incident shows that automation is very helpful. '{' and '}' were swapped... The Unicode for something was "", not "none". +R, +W, +Y, R+ now in use (though more testing is needed)	2003-11-24 02:40:40 +00:00
dchandler	b59b86fd73	Commented this to mention some recent testing.	2003-11-11 03:45:58 +00:00
dchandler	3fa417d3ee	phywI, phywU, drwI and drwU now produce vowels and subjoined a-chungs. The Tibetan! 5.1 docs say I and U are not applicable to these stacks, but I say Jskad lets the user decide what's applicable. If you disagree, be sure to give an error message before dropping the I or U request -- we were silent.	2003-11-08 21:53:34 +00:00
dchandler	e058d6252e	phywu and drwu now produce zhabs-kyus. The Tibetan! 5.1 docs say the zhabs-kyu is not applicable to these stacks, but I say Jskad lets the user decide what's applicable. If you disagree, be sure to give an error message before dropping the zhabs-kyu request -- we were silent.	2003-11-08 21:48:08 +00:00
dchandler	55aaeef9d0	l+h+wu now produces a zhabs-kyu. The Tibetan! 5.1 docs say the zhabs-kyu is not applicable to l+h+w, but I say Jskad lets the user decide what's applicable. If you disagree, be sure to give an error message before dropping the zhabs-kyu request -- we were silent.	2003-11-08 21:23:50 +00:00
dchandler	06edf17b04	Once again, the wrong 'dreng-bu glyphs were listed in the Tibetan! 5.1 docs -- they were na-ro glyphs, actually.	2003-11-08 21:17:18 +00:00
dchandler	74d6bc61ab	The wrong 'dreng-bu glyphs were listed in the Tibetan! 5.1 docs -- they were na-ro glyphs, actually.	2003-11-08 20:25:16 +00:00
dchandler	a0ae0bf70d	Fixes bug 800164. Jskad users can now enter t+r+n on the keyboard. Wylie Word should work for t+r+n too.	2003-11-08 17:50:10 +00:00
dchandler	b983af8031	r-t, not rt. This was why converting 'brtul' from TMW to Wylie didn't work.	2003-10-04 16:33:23 +00:00
dchandler	07e360d9a8	The ACIP {NYA%} is supported. {NYAo} and {NYAx} are confusing to me, because I don't know which glyphs o and x correspond to. For that reason, they cause ERRORs. The proposed THDL Extended Wylie ~X and X is now used for U+0F35 and U+0F37 respectively.	2003-09-07 16:19:50 +00:00
dchandler	72e531e515	Use shortened 'dreng-bu, not regular. As per TM glyphs. I suspect that the following would look better with shortened 'dreng-bu also, but I'm sticking with the TM/TMW docs: dz+r~137,2~~4,46~1,110~4,120~1,123~1,126~4,106~4,113~f5b,fb2 dz+w~138,2~~4,47~1,110~4,120~1,123~1,126~4,106~4,113~f5b,fad dz+h~139,2~~4,48~1,110~4,120~1,123~1,126~4,106~4,113~0F5C dz+h+y~140,2~~4,49~1,110~4,121~1,123~1,126~4,107~4,114~f5c,fb1 dz+h+r~141,2~~4,50~1,110~4,121~1,123~1,126~4,107~4,114~f5c,fb2 dz+h+l~249,2~~4,51~1,110~4,123~1,123~1,126~4,110~4,117~f5c,fb3 dz+h+w~143,2~~4,52~1,110~4,122~1,123~1,126~4,108~4,115~f5c,fad	2003-09-04 03:46:35 +00:00
a1tsal	2f58ec2760	A bunch of Sanskrit stacks of the form ts+... and dz+...had 1,125 for their drengbu, but that is actually a naro. I changed it to 1,123 (which is one of the two drengbus).	2003-09-04 02:06:58 +00:00
dchandler	045c4069c9	Preliminary ACIP->TMW support is in place. {DU} gives you something less beautiful than what Jskad would give, so more work is needed.	2003-08-31 16:06:35 +00:00
a1tsal	1f4d53be2e	Moved ^M to punctuation section. Removed obsolete comment.	2003-08-31 00:44:23 +00:00
a1tsal	522812996e	Remove unused sections of tibwn.ini.	2003-08-31 00:34:15 +00:00
dchandler	896344f2d1	David Chapman removed some lines from tibwn.ini. That breaks TM<->TMW mappings, so I've put them back, but with the EWTS non-correspondences \tmwXYYY. Jskad no longer supports superscribed or subscribed numerals, because EWTS does not.	2003-08-26 01:28:02 +00:00
a1tsal	ccdebf6719	Removed half numbers (no longer in EWTS) Brought <?Other?> closer to EWTS Removed __TILDE__ (no longer in EWTS) Changed M^ to ^M per new EWTS draft Added ai, au, -i from WW tibwn.ini -- they were missing in this version	2003-08-25 23:19:48 +00:00
dchandler	d5ad760230	TMW->Wylie conversion now takes advantage of prefix rules, the rules that say "ya can take a ga prefix" etc. The ACIP->Unicode converter now gives warnings (optionally, and by default, inline). This converter now produces output even when lexical errors occur, but the output has errors and warnings inline.	2003-08-23 22:03:37 +00:00
dchandler	e198519c5f	Jskad now supports EWTS ~, i.e. TMW8.91.	2003-07-25 02:35:31 +00:00
dchandler	d726bc0258	A couple of changes to TMW->Unicode thanks to Than's reply to my questions.	2003-07-09 01:44:15 +00:00
dchandler	02558a1d78	Jskad supports <7, >8, etc. again; it no longer supports the punctuation '<' and '>'. The current keyboard implementation makes this an either-or proposition, when fundamentally it need not be. Added a <?Numbers?> command and an <?Input:Numbers?> command to tibwn.ini; broke the numbers apart from the consonants. This facilitates the new-and-improved Tibetan->Wylie conversion. Tibetan->Wylie is now done by forming legal tsheg-bars. A legal tsheg-bar is converted into perfect THDL Wylie. See code comments to learn what it thinks is a legal tsheg-bar, but it includes, e.g., bskyUMbsH minus the trailing punctuation (H). Illegal sequences, such as runs of transliterated Sanskrit, are turned into unambiguous Wylie; each glyph is followed by a vowel or a disambiguator ('.'). I've made it so that the illegal sequences are as beautiful as possible. You get 'pad+me', for example, not the equivalent but uglier 'pad+m.e.'.	2003-07-08 14:30:17 +00:00
dchandler	23d18c925f	Tibetan! 5.1's docs were again faulty. fa and va were getting the wrong vowels.	2003-07-08 02:59:17 +00:00
dchandler	72d2eee503	Code cleanup.	2003-07-05 19:26:58 +00:00
dchandler	3113a4b8de	Some of the \tmw80.. mappings were out of date. 3+1/2 is not EWTS; took these out.	2003-07-01 03:42:30 +00:00
dchandler	61d29fc355	The TMW->Wylie mapping was busted w.r.t. tshegs. Also, I now map both TMW7.90 and TMW7.91 to EWTS 'M'.	2003-07-01 00:17:18 +00:00
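
Several of the messages above quote tibwn.ini mapping lines such as `Y+Y~185,3~~6,98~...~f61,fbb`. As a rough illustration only, the Python sketch below splits one such line on `~` and decodes the final field as hexadecimal Unicode code points. The function name `parse_mapping_line` and the dictionary keys are hypothetical. The middle fields are left uninterpreted because their meanings are not stated in this log, and treating an empty or `none` final field as "no Unicode mapping" is an assumption based on the 2003-11-24 message that mentions `"none"`.

```python
def parse_mapping_line(line: str) -> dict:
    """Split a tibwn.ini-style mapping line into its '~'-separated fields.

    Only the first field (the transliteration) and the last field (hex
    Unicode code points) are interpreted here; everything in between is
    returned untouched as raw strings.
    """
    fields = line.strip().split("~")
    last = fields[-1].strip()
    # Assumption: an empty or "none" last field means no Unicode mapping.
    if last == "" or last.lower() == "none":
        code_points = []
    else:
        code_points = [int(h, 16) for h in last.split(",")]
    return {
        "transliteration": fields[0],
        "raw_fields": fields[1:-1],
        "unicode": "".join(chr(cp) for cp in code_points),
    }


# Example line quoted in commit aff34174ab above.
entry = parse_mapping_line(
    "Y+Y~185,3~~6,98~1,109~6,120~1,123~1,125~6,106~6,113~f61,fbb"
)
print(entry["transliteration"], [f"U+{ord(c):04X}" for c in entry["unicode"]])
```

The same function also accepts a single-code-point final field such as the `0F5C` in the `dz+h` line quoted in commit 72e531e515.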

62 commits