Jskad

Author	SHA1	Message	Date
dchandler	ef24c608bf	Added a mechanism for end users to customize ACIP/EWTS=>Tibetan conversions by giving a list of substitutions to be performed. E.g., when I invoke Jskad via 'java -Dorg.thdl.tib.text.ttt.VerboseReplacementMap=false -Dorg.thdl.tib.text.ttt.ReplacementMap="KAsh=>K+sh" -jar Jskad.jar', then the ACIP KAsh becomes K+sh automatically. This mechanism is for Andres (who noticed KAsh=>K+sh in practice) and power users only, and not power users until I document the thing outside of the source code.	2003-10-26 02:17:19 +00:00
dchandler	6bda550157	The ACIP "BNA" was converting to B-NA instead of B+NA, even though NA cannot take a BA prefix. This was because BNA was interpreted as root-suffix. In ACIP, BN is surely B+N unless N takes a B prefix, so root-suffix is out of the question. Now Jskad has two "Convert selected ACIP to Tibetan" conversions, one with and one without warnings, built in to Jskad proper (not the converter, that is).	2003-10-26 00:32:55 +00:00
dchandler	d99ae50d8a	The ACIP "BNA" was converting to B-NA instead of B+NA, even though NA cannot take a BA prefix. This was because BNA was interpreted as root-suffix. In ACIP, BN is surely B+N unless N takes a B prefix, so root-suffix is out of the question. Now Jskad has two "Convert selected ACIP to Tibetan" conversions, one with and one without warnings, built in to Jskad proper (not the converter, that is).	2003-10-26 00:24:28 +00:00
dchandler	1415fc43e3	The ACIP "BNA" was converting to B-NA instead of B+NA, even though NA cannot take a BA prefix. This was because BNA was interpreted as root-suffix. In ACIP, BN is surely B+N unless N takes a B prefix, so root-suffix is out of the question.	2003-10-26 00:21:54 +00:00
dchandler	306cf2817c	Private correspondence with Robert Chilton led me to add and remove a few prefix rules. BLC and BGL are here, BLK, BLG, BLNG, BLJ, BNG, BJ, BNY, BN, and BDZ are gone. Added a few new tests.	2003-10-25 21:47:34 +00:00
dchandler	f106deb884	Private correspondence with Robert Chilton led me to add and remove a few prefix rules. BLC and BGL are here, BLK, BLG, BLNG, BLJ, BNG, BJ, BNY, BN, and BDZ are gone. Added a few new tests.	2003-10-25 21:40:21 +00:00
dchandler	af013a6a39	I renamed this function a while ago.	2003-10-22 02:49:16 +00:00
dchandler	7d24ab393f	Code cleanup.	2003-10-21 03:44:02 +00:00
dchandler	c764eee8d0	Added a new warning for DMAR and others similarly affected by prefix rules, where seeing D+MAR, not D-MAR, could have caused an input operator to type in DMAR. This is a "Most" warning, but DMA causes a higher-priority "Some" warning.	2003-10-21 03:36:57 +00:00
dchandler	2f39921381	Added more test cases.	2003-10-21 02:14:45 +00:00
dchandler	2f81a801ef	Added three new kinds of warnings to ACIP->Tibetan conversions.	2003-10-21 02:00:49 +00:00
dchandler	a47af2c165	Bulletproofing -- code cleanup.	2003-10-21 00:31:10 +00:00
dchandler	188b9c322e	Warn about prefix rules only in Most and All modes.	2003-10-21 00:23:55 +00:00
dchandler	1224030898	Speedup.	2003-10-21 00:19:15 +00:00
dchandler	1d9b405bb8	Forgot to add this file earlier.	2003-10-20 13:49:54 +00:00
dchandler	5d9305c9d5	"Browse..." buttons are smart about file types now.	2003-10-19 23:17:25 +00:00
dchandler	3aa3859354	ACIP->Unicode crash fixed. 5% of the code for support of ACIP->Unicode.rtf is here.	2003-10-19 22:19:16 +00:00
dchandler	5aab4acc93	I've undone the SNYAM'AM == SNYAMA'AM hack. The only occurrence of SNYAM'AM in the ACIP texts I've got is likely a typo, says Robert Chilton. The code would be cleaner if I could bear to delete my terrible hack. Maybe in a month, when I don't feel so dumb for coding it up in the first place. The correct solution for such things is to give the ACIP->Tibetan converters a pre-filter mechanism. This would be before the lexer or part of the lexer (maybe you only want to filter tsheg bars), and it would allow the end user to specify things like "s/SNYAM'AM/S+NYAMA'AMA/g".	2003-10-19 20:48:22 +00:00
dchandler	4b1395e0ba	Jskad has a new feature: Convert Selection from ACIP to Tibetan. It uses the ACIP converter to do its work. Improved some error messages from the ACIP->Tibetan converter.	2003-10-19 20:16:06 +00:00
dchandler	5ce84d4d9a	Tiny code cleanup.	2003-10-19 04:43:34 +00:00
dchandler	0edebd55d7	We were dying in the "can ts+h take a ga prefix?" check for GTZHAN.	2003-10-19 03:47:33 +00:00
dchandler	47648186b4	Untabified -- whitespace only has changed. Use 'cvs diff -wb' to avoid seeing these differences.	2003-10-18 18:34:49 +00:00
dchandler	e5534f69ee	Untabified -- whitespace only has changed. Use 'cvs diff -wb' to avoid seeing these differences.	2003-10-18 18:29:46 +00:00
dchandler	557ed7ed44	DKY'O etc. weren't being handled properly by ACIP->Tibetan. Now they are.	2003-10-18 17:49:29 +00:00
dchandler	e799438f86	CVS ignoring backup files.	2003-10-18 17:47:56 +00:00
dchandler	3b55ea509f	Prefix rules have changed. A few are gone; a few new ones are here. I've implemented here a list that Robert Chilton sent me in private correspondence. He doesn't describe it as definitive, but since it affects ACIP->Tibetan conversions, and it's the best I've got, here they are. There's still an optional warning about "Hey, prefix rules matter for this tsheg bar." I've left in a few rules that I didn't find on RC's list; I've asked him to look into these further.	2003-10-18 05:48:53 +00:00
dchandler	f28bee4c71	The appendage 'um is here too.	2003-10-18 05:10:49 +00:00
dchandler	8c99adeb63	TMW->EWTS, TMW->ACIP, and ACIP->Unicode/TMW now support more appendages. Personal correspondence with Robert Chilton led me to support, besides 'am, 'ang, 'o, 'i, and 'u, the following: 'e (used in foreign transliteration), 'ongs, 'is, 'os, 'ur, 'us, and 'ung.	2003-10-18 03:04:47 +00:00
dchandler	5e18feb47d	ACIP now stacks greedily. TTTTTA is T+T+T+T+TA, even though that stack doesn't exist in TM or TMW. Robert Chilton, in personal correspondence, agreed that this is the way to do things. ACIP handles the appendages 'AM, 'ANG, 'US, 'UR, 'I, 'O, and 'U correctly.	2003-10-16 04:15:10 +00:00
dchandler	5f4fbfab7c	Bulletproofing and debugging support.	2003-10-16 04:13:14 +00:00
dchandler	129ebccd67	In TCC #1 keyboard, h>cj now works. I may have fixed this in a terrible way, breaking other things even. Hard to say because I don't really understand the code I changed. But DuffPaneTest passes. If we ever clean up the keyboards, the changes made here to tcc_keyboard.ini should probably be undone.	2003-10-12 18:16:17 +00:00
dchandler	d7fdacfcdc	Open menu is now Open..., Save as is now Save as...	2003-10-12 18:12:19 +00:00
dchandler	8dbfff17e1	All .rtf and .Rtf and .RTF files are selectable now.	2003-10-12 18:11:50 +00:00
dchandler	35209ce7fd	I'm going to have to debug this, and the tab stops make the source unreadable. I don't like messing with whitespace, but it seems like I'll be the main maintainer for a while, and the people after me can use cvs diff -wb. So I'm untabifying.	2003-10-12 16:44:28 +00:00
dchandler	749b8d6727	Added toString for debugging.	2003-10-04 16:33:47 +00:00
dchandler	b983af8031	r-t, not rt. This was why converting 'brtul' from TMW to Wylie didn't work.	2003-10-04 16:33:23 +00:00
dchandler	6a11eddb1e	Warning level "None" wasn't working.	2003-10-04 16:12:48 +00:00
dchandler	b10098cc61	"Most" warnings now exclude "the last stack has no vowel", making this level much more useful.	2003-10-04 15:10:18 +00:00
dchandler	ee50291ed4	Andres found that "THAG PA" caused a NullPointerException. That's fixed. Renamed ACIPString to TString -- we'll use this for EWTS and ACIP both. TMW->ACIP for TMW9.61 should work now.	2003-10-04 01:22:59 +00:00
amontano	c8927b827c	Fixed bugs in the scanner. Added reference to yogacara bhumi in the about window.	2003-09-23 19:05:23 +00:00
amontano	e89c49651c	Now translation tool accepts synonyms separated by ';' in the entry field.	2003-09-14 05:56:20 +00:00
dchandler	115d0e0e6c	Fixed ACIP->TMW vowels like 'I etc. Fixed ACIP->Unicode/TMW for BDE, which should be B-DE, not B+DE, because the former is legal Tibetan. The ACIP->EWTS subroutine has improved. TMW->Wylie and TMW->ACIP are improved in error cases. TMW->ACIP has friendly embedded error messages now.	2003-09-12 05:06:37 +00:00
dchandler	16817d0b8e	Fixed Javadocs.	2003-09-10 01:19:05 +00:00
amontano	cc853be387	Fixed a bug with regards to the word order in the servlet version.	2003-09-09 16:02:03 +00:00
amontano	1467f9cd3f	Fixed display of servlet version and added option to include links to other versions. See http://iris.lib.virginia.edu/tibetan/servlet/org.thdl.tib.scanner.OnLineScannerFilter?thdlBanner=on	2003-09-08 21:32:40 +00:00
amontano	73d01111ca	Fixed the "clicking on the translate button makes the thdl menu go away" error on the servlet version of the translation tool.	2003-09-08 16:39:18 +00:00
amontano	07fbbcaf45	Solved some sorting errors with the servlet version. Also if the service parameter thdlBanner=anything is sent, the THDL's java script menu is displayed (if it is running on the thdl server). There is still a bug. Menu goes away when pressing "translate" button. See: http://iris.lib.virginia.edu/tibetan/servlet/org.thdl.tib.scanner.OnLineScannerFilter?thdlBanner=on	2003-09-08 08:12:56 +00:00
dchandler	e42d76b3b8	Nicer default Latin font for ACIP->* conversions. Performance improvement in non-color-coding mode.	2003-09-07 22:08:35 +00:00
dchandler	6872ea8028	Corrected the usage info.	2003-09-07 22:08:00 +00:00
dchandler	d8657abd44	ACIP font shrinking as in {KA (GA)} is now supported.	2003-09-07 18:30:59 +00:00
dchandler	07e360d9a8	The ACIP {NYA%} is supported. {NYAo} and {NYAx} are confusing to me, because I don't know which glyphs o and x correspond to. For that reason, they cause ERRORs. The proposed THDL Extended Wylie ~X and X are now used for U+0F35 and U+0F37 respectively.	2003-09-07 16:19:50 +00:00
amontano	f57cdda867	Now translation tool displays where it is connected	2003-09-07 03:40:51 +00:00
amontano	b489034598	Fixed a call to a deprecated method	2003-09-07 03:39:08 +00:00
dchandler	0d6d6ed611	Added GUI support for color-coding. Added support for color-coding and choosing the warning level to TibetanConverter. Better error checking in the GUI converter.	2003-09-06 22:56:10 +00:00
dchandler	1308f14807	sanskrit=green, prefix-rule-afflicted-tsheg-bar=yellow	2003-09-05 06:05:46 +00:00
dchandler	899b042ec0	Preliminary, untested color support in ACIP->TMW conversion.	2003-09-05 05:54:35 +00:00
dchandler	717c3b94f3	Fixed ACIP->Unicode spaces/tshegs and newlines, especially with shads. "NGA," becomes "NGA-tsheg-," automatically now.	2003-09-05 05:08:47 +00:00
dchandler	5c240ac072	From the converter GUI, you can now choose TMW->ACIP text and TMW->Wylie text. All the conversions show you which format they take as input and which format they give as output. File filter for ACIP files added. The GUI converter suggests a file extension wisely. Fixed newline bug in ACIP->Unicode converter.	2003-09-05 02:05:34 +00:00
dchandler	4abbf6db37	--to-acip-text and --to-wylie-text added; these get you text files, not RTF files like --to-acip and --to-wylie do. The GUI converter doesn't yet allow you to get text files.	2003-09-04 05:16:47 +00:00
dchandler	cc615f34df	ACIP->TMW and ACIP->Unicode have my pre-stamp of non-approval. Except for {NYAx} and {NYAo}, they're as good as I'll get them without input from experts or the employ of a complementary, syllabary-based approach.	2003-09-04 04:34:18 +00:00
dchandler	ae7a7577bc	ACIP->TMW and ACIP->Unicode are now smart about when a newline is really a newline and when a space is really a tsheg. The space in {KA ,MDO} is a tsheg, but the space in {GA ,MDO} is not.	2003-09-04 04:13:01 +00:00
dchandler	d2749cecd0	ACIP->TMW and ACIP->Unicode are now smart about when a newline is really a newline and when a space is really a tsheg. The space in {KA ,MDO} is a tsheg, but the space in {GA ,MDO} is not.	2003-09-04 04:04:21 +00:00
dchandler	72e531e515	Use shortened 'dreng-bu, not regular. As per TM glyphs. I suspect that the following would look better with shortened 'dreng-bu also, but I'm sticking with the TM/TMW docs: dz+r~137,2~~4,46~1,110~4,120~1,123~1,126~4,106~4,113~f5b,fb2 dz+w~138,2~~4,47~1,110~4,120~1,123~1,126~4,106~4,113~f5b,fad dz+h~139,2~~4,48~1,110~4,120~1,123~1,126~4,106~4,113~0F5C dz+h+y~140,2~~4,49~1,110~4,121~1,123~1,126~4,107~4,114~f5c,fb1 dz+h+r~141,2~~4,50~1,110~4,121~1,123~1,126~4,107~4,114~f5c,fb2 dz+h+l~249,2~~4,51~1,110~4,123~1,123~1,126~4,110~4,117~f5c,fb3 dz+h+w~143,2~~4,52~1,110~4,122~1,123~1,126~4,108~4,115~f5c,fad	2003-09-04 03:46:35 +00:00
a1tsal	2f58ec2760	A bunch of Sanskrit stacks of the form ts+... and dz+...had 1,125 for their drengbu, but that is actually a naro. I changed it to 1,123 (which is one of the two drengbus).	2003-09-04 02:06:58 +00:00
dchandler	316f59107b	A preliminary TMW->ACIP converter is here. There are known bugs, mostly with rare punctuation.	2003-09-02 06:39:33 +00:00
dchandler	cc9ab06864	Added utility routine. Better comments.	2003-08-31 20:38:28 +00:00
dchandler	045c4069c9	Preliminary ACIP->TMW support is in place. {DU} gives you something less beautiful than what Jskad would give, so more work is needed.	2003-08-31 16:06:35 +00:00
a1tsal	1f4d53be2e	Moved ^M to punctuation section. Removed obsolete comment.	2003-08-31 00:44:23 +00:00
a1tsal	522812996e	Remove unused sections of tibwn.ini.	2003-08-31 00:34:15 +00:00
dchandler	dd22e161a5	Code cleanup for Jskad's Tibetan font converter GUI.	2003-08-30 05:01:15 +00:00
dchandler	896344f2d1	David Chapman removed some lines from tibwn.ini. That breaks TM<->TMW mappings, so I've put them back, but with the EWTS non-correspondences \tmwXYYY. Jskad no longer supports superscribed or subscribed numerals, because EWTS does not.	2003-08-26 01:28:02 +00:00
a1tsal	ccdebf6719	Removed half numbers (no longer in EWTS). Brought <?Other?> closer to EWTS. Removed __TILDE__ (no longer in EWTS). Changed M^ to ^M per new EWTS draft. Added ai, au, -i from WW tibwn.ini -- they were missing in this version.	2003-08-25 23:19:48 +00:00
dchandler	1982c5847b	Jskad's converter now has ACIP-to-Unicode built in. There are known bugs; it is pre-alpha. It's usable, though, and finds tons of errors in ACIP input files, with the user deciding just how pedantic to be. The biggest outstanding bug is the silent one: treating { }, space, as tsheg instead of whitespace when we ought to know better.	2003-08-24 06:40:53 +00:00
dchandler	d5ad760230	TMW->Wylie conversion now takes advantage of prefix rules, the rules that say "ya can take a ga prefix" etc. The ACIP->Unicode converter now gives warnings (optionally, and by default, inline). This converter now produces output even when lexical errors occur, but the output has errors and warnings inline.	2003-08-23 22:03:37 +00:00
dchandler	21ef657921	I'd broken the ACIP->Wylie for ACIP vowels {'A}, {'I}, etc.	2003-08-22 05:13:32 +00:00
dchandler	1afb3a0fdd	ACIP->Unicode, without going through TMW, is now possible, so long as \, the Sanskrit virama, is not used. Of the 1370-odd ACIP texts I've got here, about 57% make it through the gauntlet (fewer if you demand a vowel or disambiguator on every stack of a non-Tibetan tsheg bar).	2003-08-18 02:38:54 +00:00
dchandler	245aac4911	I'm now stricter about accepting alphabetic characters. F, Q, X, a, b, c, d, e, ... do not belong in ACIP, so the scanner rejects them. This should make it even easier to distinguish automatically between Tibetan and English texts.	2003-08-17 02:38:58 +00:00
dchandler	39451d8879	Fixed a couple of small bugs. Only 250 errors are reported now; this is important if you try to convert an English document.	2003-08-17 02:12:49 +00:00
dchandler	4581a2d8ab	Improved the ACIP scanner (the part of the converter that says, "This is a correction, that's a comment, this is Tibetan, that's Latin (English), that's Tibetan inter-tsheg-bar punctuation, etc."). It now accepts more real-world ACIP files, i.e. it handles illegal constructs. The error checking is more user-friendly. There are now tests. Added some tsheg bars that Peter E. Hauer of Linguasoft sent me to the tests. Many thanks, Peter. I still need to implement rules that say, "This is not Tibetan, it must be Sanskrit, because that letter doesn't take a MA prefix."	2003-08-17 01:45:55 +00:00
dchandler	0b91ed0beb	I've improved the ACIP tsheg bar scanner to handle a lot of illegal constructions that occur in practice.	2003-08-16 16:13:53 +00:00
amontano	2a57439516	Updated the info displayed on the about window.	2003-08-14 14:16:49 +00:00
amontano	da384c6c2f	Now when loading, takes the default font options from the DuffPane.	2003-08-14 14:16:23 +00:00
dchandler	2b59d9838d	I now have a function that takes as input a String of ACIP and breaks up that String into tsheg bars, punctuation, etc., while finding errors. I've tested it some, but I'm not yet committing the tests. Next step: a converter that takes an ACIP file as input and outputs TMW+Latin.	2003-08-14 05:10:47 +00:00
dchandler	57f506384f	The ACIP->Tibetan converter now has perfect low-level functionality, and it has the capability to produce error messages and warnings that make sense to the user. One can now get the correct parse, if one exists, for an ACIP tsheg bar. One could even feed in ACIP and get a list of warnings about things as innocuous as PADMA, which a dumb converter would have trouble with. One could then turn ACIP into well-behaved ACIP for that dumb converter, if you really wanted to. Still to do: (1) scan ACIP files into tsheg bars; (2) produce TMW/Latin (from which you can get Unicode, etc.); (3) e-mail the illegal tsheg bars to the ACIP fellows so they can fix the affected documents (most of the Kangyur has unparseable creatures).	2003-08-12 04:13:11 +00:00
dchandler	87266646fb	Removed misinformation.	2003-08-10 19:33:01 +00:00
dchandler	e21d3774a9	Added an unfinished ACIP->Tibetan converter. Once it works properly for ACIP, it'll easily be made to work as a perfect EWTS Wylie->Tibetan converter. It has an extensive suite of tests for the existing functionality.	2003-08-10 19:30:07 +00:00
dchandler	39e0435b6b	Refactored this code so that Wylie->Tibetan and ACIP->Tibetan conversions can make use of it. Hooray for reuse.	2003-08-10 19:02:56 +00:00
dchandler	bcf1c12b6a	We now produce EWTS m.ya, g.rwa, d.rwa, and b.ya during TMW->Wylie. Our disambiguation is now perfect, happening when and only when it is necessary. These are all illegal, so it shouldn't affect many existing conversions. But if there were typos, it could.	2003-08-10 18:46:01 +00:00
dchandler	9093fd3c05	We now produce EWTS m.ya, g.rwa, d.rwa, and b.ya during TMW->Wylie. Our disambiguation is now perfect, happening when and only when it is necessary. These are all illegal, so it shouldn't affect many existing conversions. But if there were typos, it could.	2003-08-10 18:38:20 +00:00
dchandler	251d8feae5	brtan now gives TMW->Wylie brtan, not b.rtan. Etc. See bug report http://sourceforge.net/tracker/index.php?func=detail&aid=785791&group_id=61934&atid=502515.	2003-08-09 17:48:40 +00:00
dchandler	7dffc47cb7	'bad now gives TMW->Wylie 'bad, not TMW->Wylie 'abd. Andres came across this one, so we've added it to the list of ambiguous three-consonant combos.	2003-08-09 17:05:43 +00:00
amontano	52cdc17794	Added support for multiple keyboards and ability to set the preferences for size of tibetan font and type and size of roman font.	2003-08-09 08:00:58 +00:00
amontano	8e4b508de8	Made a new class for the preference window so that other software (i.e. the translation tool) can re-use that same code to set up the attributes of the tibetan and roman fonts.	2003-08-09 07:57:21 +00:00
amontano	ef0df405d9	Redesigned the interface of the handheld version.	2003-08-03 06:29:08 +00:00
amontano	2b5a5fe67a	Got rid of redundant code	2003-08-03 06:28:22 +00:00
amontano	cce779bf88	Added a wizard window to avoid as much as possible using the command line. This way through clicking on the application through the wizard one can choose to connect to the available on-line dicts, open a local dict or generate a dict database.	2003-08-03 06:27:30 +00:00
dchandler	4caeafa1b1	You shouldn't have one of these without the other, now that there are two. This way neither TM nor TMW fonts will be loaded.	2003-07-26 00:55:32 +00:00
dchandler	2bb499e5a7	This was dying with a NullPointerException when you started it up using 'ant tt-run' with no dictionary. Now it starts up and shows you a nice error message, "Dictionary could not be loaded!", instead.	2003-07-26 00:53:59 +00:00
dchandler	e198519c5f	Jskad now supports EWTS ~, i.e. TMW8.91.	2003-07-25 02:35:31 +00:00
amontano	5df9b5b91a	now supports sorting	2003-07-25 01:43:58 +00:00
amontano	97f5fe91b3	when invalid wylie is encountered, instead of displaying a message it raises an exception.	2003-07-25 01:43:18 +00:00
amontano	7cdbf33333	changed it to support for 30 dictionaries (instead of just 15)	2003-07-25 01:42:17 +00:00
amontano	7b04d7bca5	changed the "about" info	2003-07-25 01:41:30 +00:00
dchandler	a7f0c35738	Added a test for ts.ha vs. tsha ambiguity; there is no ambiguity.	2003-07-18 03:51:29 +00:00
dchandler	dc454b8c0c	More test cases related to the following: The Tibetan d.za was being converted into the Wylie dza incorrectly. This is a rare case, but I want TMW->Wylie to be perfectly unambiguous.	2003-07-18 02:31:02 +00:00
dchandler	f8c959bfb0	The Tibetan d.za was being converted into the Wylie dza incorrectly. This is a rare case, but I want TMW->Wylie to be perfectly unambiguous.	2003-07-18 00:30:27 +00:00
dchandler	1c29566aee	I'm now using the Unix diff built in to Apache Jakarta Commons JRCS (which I found on suigeneris.org, not apache.org) in order to bulletproof the Tibetan Converter tests. They used to fail due to nondeterminism in the Java RTF writer; they should no longer fail. I've also changed it so that the Tibetan Converter tests run in headless mode, which means that they'll run on the nightly builds server.	2003-07-14 12:26:26 +00:00
dchandler	f900154e7a	Tests disambiguation in TMW->Wylie conversion.	2003-07-14 12:21:02 +00:00
dchandler	0622ac5062	Jskad no longer relies on the <?Consonants?>, <?Vowels?>, <?Other?>, or <?Numbers?> commands; it instead hard-codes the appropriate comma-delimited lists. This is cleaner because WylieWord and Jskad had different values for these lists.	2003-07-14 12:19:46 +00:00
dchandler	fb85f6e8ce	Fix comment.	2003-07-14 12:17:04 +00:00
dchandler	79b3b97326	Remove warning message from menu item.	2003-07-13 23:19:11 +00:00
dchandler	c986684beb	Updated help to talk about new features.	2003-07-13 22:51:35 +00:00
dchandler	f695b1a6c1	Updated baselines because conversions have improved since the last update.	2003-07-13 19:14:41 +00:00
dchandler	d10f97fc06	Disambiguation was not being used appropriately. This makes previous TMW->Wylie conversions with the new-and-improved TMW->Wylie algorithm faulty. Now I'm using it a little more than you need to, e.g. b.lha instead of blha is generated because bla and b.la are ambiguous.	2003-07-13 19:14:15 +00:00
dchandler	96afae795c	Disambiguation was not being used appropriately. This makes previous TMW->Wylie conversions with the new-and-improved TMW->Wylie algorithm faulty. Now I'm using it a little more than you need to, e.g. b.lha instead of blha is generated because bla and b.la are ambiguous.	2003-07-13 18:46:29 +00:00
dchandler	802e0cb588	If this method uses the Wylie representation, you get an infinite recursion when you do a TMW->Wylie conversion for a document with glyphs that have no known Wylie.	2003-07-13 17:40:02 +00:00
dchandler	a86a0f235b	I was missing a break; statement; this caused an Error to be thrown during some TMW->Wylie conversions. No conversions were erroneous, though.	2003-07-13 17:38:00 +00:00
dchandler	6677d1e245	Code cleanup.	2003-07-13 16:53:03 +00:00
dchandler	3b6eaa792e	Fixed javadocs.	2003-07-11 13:33:30 +00:00
dchandler	85176cd9f3	Put in a fix for a new bug in Swing's RTF support. This bug is w.r.t. escapes like \bullet, \emdash, etc., and this fix only works for Windows or OS/2 RTF files, not for Mac RTF files. So if you want a TM->TMW conversion to work, use MS Word for Windows, not for the Mac.	2003-07-11 13:30:22 +00:00
dchandler	d726bc0258	A couple of changes to TMW->Unicode thanks to Than's reply to my questions.	2003-07-09 01:44:15 +00:00
dchandler	9db233bdf8	Cosmetic change.	2003-07-08 14:31:14 +00:00
dchandler	02558a1d78	Jskad supports <7, >8, etc. again; it no longer supports the punctuation '<' and '>'. The current keyboard implementation makes this an either-or proposition, when fundamentally it need not be. Added a <?Numbers?> command and an <?Input:Numbers?> command to tibwn.ini; broke the numbers apart from the consonants. This facilitates the new-and-improved Tibetan->Wylie conversion. Tibetan->Wylie is now done by forming legal tsheg-bars. A legal tsheg bar is converted into perfect THDL Wylie. See code comments to learn what it thinks is a legal tsheg-bar, but it includes, e.g., bskyUMbsH minus the trailing punctuation (H). Illegal sequences, such as runs of transliterated Sanskrit, are turned into unambiguous Wylie; each glyph is followed by a vowel or a disambiguator ('.'). I've made it so that the illegal sequences are as beautiful as possible. You get 'pad+me', for example, not the equivalent but uglier 'pad+m.e.'.	2003-07-08 14:30:17 +00:00
dchandler	c04a3f189b	Rearranged the topics.	2003-07-08 12:50:27 +00:00
dchandler	23d18c925f	Tibetan! 5.1's docs were again faulty. fa and va were getting the wrong vowels.	2003-07-08 02:59:17 +00:00
dchandler	24ac6fd06c	The Trie of possible inputs fixed this bug.	2003-07-06 16:31:13 +00:00
dchandler	d88141512b	Small changes w.r.t. clearing preferences. Some code cleanup.	2003-07-06 16:24:29 +00:00
dchandler	086f4bb6ec	Renamed the Info menu Help. Now using CalHTMLPane to surf the offline and the online help.	2003-07-05 22:25:21 +00:00
dchandler	8c4ab30a52	Rearranged the Tools menu; made the converter smart about "find some..." and "find all..." modes.	2003-07-05 21:02:46 +00:00
dchandler	72d2eee503	Code cleanup.	2003-07-05 19:26:58 +00:00
dchandler	a463b686b3	Jskad now ships with both TibetanMachine and TibetanMachineWeb fonts by default, not just TMW. Thus users need not install these fonts on their systems.	2003-07-05 18:00:29 +00:00
dchandler	9effee0564	If you opened a file from the recently opened files list and very quickly mouse-clicked on the new Jskad window, you could cause an infinite regression of requestFocus() operations because the menu would try to get focus back. I grab focus from the menu now.	2003-07-05 02:30:00 +00:00
dchandler	51679c158b	Final fixes completed; recently opened files can now be selected from Jskad's file menu.	2003-07-05 02:15:33 +00:00
dchandler	4410b52c07	There's still a small bug in this, but here's the real stuff: Recently opened files can now be selected from Jskad's file menu. A Jskad now gives the focus to the DuffPane when that Jskad gets the focus.	2003-07-04 03:29:25 +00:00
dchandler	d863446d25	I think this compiles...	2003-07-04 02:32:40 +00:00
dchandler	407020108f	I didn't mean to commit the previous revision; I'm still tweaking it.	2003-07-04 02:32:03 +00:00
dchandler	9f0b1c3250	Recently opened files can now be selected from Jskad's file menu. A Jskad now gives the focus to the DuffPane when that Jskad gets the focus.	2003-07-04 02:31:23 +00:00
dchandler	7500b4e06b	Jskad won't allow you to exit by closing the last window anymore. Instead, you get a dialog box saying to use File/Exit.	2003-07-04 00:21:07 +00:00
dchandler	6c286573ba	Fixed Javadocs.	2003-07-04 00:12:59 +00:00
dchandler	0a1bc0d30b	getWylie now takes a parameter for error detection; I'm not detecting errors here though. Fixed a typo in a property name.	2003-07-01 23:20:08 +00:00
dchandler	0d1999d055	getWylie now takes a parameter for error detection; I'm not detecting errors here though.	2003-07-01 22:52:18 +00:00
dchandler	a48ec641d5	Better error messages in TMW->Wylie conversions. The user knows what's up.	2003-07-01 03:43:33 +00:00
dchandler	3113a4b8de	Some of the \tmw80.. mappings were out of date. 3+1/2 is not EWTS; took these out.	2003-07-01 03:42:30 +00:00
dchandler	e7e7c2bf15	The command-line tool runs in headless mode by default, so it will work on, e.g., a Linux console. The JUnit tests will too, though 'ant check' still fails because we don't sneak the -Djava.awt.headless=true into the process early enough.	2003-07-01 02:50:09 +00:00
dchandler	6151a7bc94	TMW->Wylie now occurs in the TibetanDocument, not in DuffPane, which means that the command-line tool can finally function with a headless graphics device. Hopefully it will speed things up, too. It also means that entering Roman text into the TMW->Unicode conversion and TMW->TM conversion will be easy.	2003-07-01 01:21:57 +00:00
dchandler	61d29fc355	The TMW->Wylie mapping was busted w.r.t. tshegs. Also, I now map both TMW7.90 and TMW7.91 to EWTS 'M'.	2003-07-01 00:17:18 +00:00
dchandler	229536884f	I've validated by hand the TM<->TMW mappings. A few things changed, so no previous TM->TMW or TMW->TM conversions can be trusted.	2003-06-30 02:24:11 +00:00
dchandler	dc03083433	I've validated by hand the TM<->TMW mappings. A few things changed, so no previous TM->TMW conversions can be trusted.	2003-06-30 02:22:09 +00:00
dchandler	58644a6ef9	Better error handling.	2003-06-30 02:20:52 +00:00
dchandler	b16fb8a85c	This is correct; the Tibetan! 5.1 documentation is not. This affects TM->TMW conversions. See http://sourceforge.net/tracker/index.php?func=detail&aid=746871&group_id=61934&atid=502515 for a full list of Tibetan! 5.1 documentation errors.	2003-06-29 22:11:00 +00:00
dchandler	aedef4b44d	An error now appears if you try to convert from format A to format B but no glyphs in format A appear. In this case, it is likely that you meant to convert a different file or do a different conversion.	2003-06-29 21:31:48 +00:00
dchandler	ee14b7b97f	Jskad now has the ability to open its buffer with an external viewer, e.g. Microsoft Word. Better OOM error handling in the GUI converter; untested, though.	2003-06-29 20:49:30 +00:00
dchandler	646e23b4a4	Tweaked the converter GUI so that you can open the old and the new files with the external viewer.	2003-06-29 16:45:15 +00:00
dchandler	3f76c3692d	Fixed Javadoc warnings.	2003-06-29 15:37:35 +00:00
dchandler	b841a7f14b	The converter GUI can now be run standalone or from Jskad's Tools menu. The converter GUI gives nicer error messages in at least one case.	2003-06-29 04:18:36 +00:00
dchandler	7938648ca8	TM->TMW conversion has no known bugs. Oddballs have been comprehensively handled.	2003-06-29 03:03:07 +00:00
dchandler	689c1910aa	To deal with java.swing.text.rtf bugs regarding hexadecimal escape sequences, I've created RTFFixerInputStream. It turns illegal hexadecimal escapes into Unicode escapes.	2003-06-29 02:30:08 +00:00
dchandler	0b849aed97	Fixed comments w.r.t. javadoc warnings.	2003-06-29 02:22:20 +00:00
dchandler	4e279defb4	Fixed a couple of array bounds checks. Added support for two more oddballs. Deprecated the oddball lookup method because it drops up to 30 glyphs in TibetanMachine. The correct solution is to transform the RTF before Java's busted RTF readers ever see it. \'97 becomes \u151, e.g.	2003-06-28 16:33:58 +00:00
dchandler	2a359c45ef	Bad conversions were not leaving the unconvertible characters at the beginning of the document as they should and as they are documented to. They now do, and they bracket the bad characters with the TM or TMW for U+0F3C on the left and the TM or TMW for U+0F3D on the right. Some cleanup.	2003-06-28 16:20:19 +00:00
dchandler	c39d8d6326	My earlier code cleanup introduced this bug; TMW->TM conversion was busted.	2003-06-26 22:48:51 +00:00
dchandler	25510542b2	Now with a nicer error message in one case.	2003-06-26 22:48:05 +00:00
dchandler	c34259b105	Code cleanup.	2003-06-25 01:04:24 +00:00
dchandler	9e6c3009ac	Added an About button. Code cleanup. Changed the Cancel button to the Close button.	2003-06-25 00:49:11 +00:00
dchandler	569fba6467	Made the comments in the my_thdl_preferences.txt file use standard line separators.	2003-06-25 00:03:46 +00:00
dchandler	0f3c4174b6	Made the comments in the my_thdl_preferences.txt file more useful.	2003-06-24 23:48:00 +00:00
dchandler	33beb7b782	Bye bye debugging output.	2003-06-24 12:23:37 +00:00
dchandler	f547734043	Added Than's converter GUI code; adapted it to work with Jskad's converters. TMW->Unicode now uses Ximalaya by default.	2003-06-24 03:02:29 +00:00
dchandler	19d7cabfe6	Forget the final=faster myth.	2003-06-24 03:01:13 +00:00
dchandler	917864574c	Fixed a logic bug in mapTMWtoTM and mapTMtoTMW. You can now specify which Unicode font to use via 'java -Dthdl.tmw.to.unicode.font=Ximalaya ...'.	2003-06-23 01:58:11 +00:00
dchandler	b6d8fd89f9	When errors in (all but TMW->Wylie and Wylie->TMW) conversion occur, the troublesome glyphs are now put at the beginning of the document AFTER AN ACHEN. This makes a glyph like \tmw7095 visible atop the achen. Major fix to the handling of paragraphs in conversion; we were (for whatever reason) dropping paragraphs before.	2003-06-23 01:24:02 +00:00
dchandler	1f4343bed0	TMW->TM, TM->TMW, and TMW->Unicode conversions are all (at least 2) orders of magnitude faster.	2003-06-22 22:10:58 +00:00
dchandler	afe73c2228	The pseudo-file '-', referring to standard input, is now accepted as a command-line argument.	2003-06-22 21:05:16 +00:00
dchandler	900f7492b0	'ant clean check' was failing because I hadn't updated the --find-some-non-tmw and --find-all-non-tmw baselines. Code cleanup.	2003-06-22 16:11:58 +00:00
dchandler	66287f3cc9	Small TMW->Wylie performance improvements. TMW->Wylie is much faster than TMW->Unicode etc.; this is because many fewer replacements are made (i.e., more text is replaced each time a replacement is performed). I must find a way to still preserve formatting but do many fewer replacements in TMW->{Unicode,TM} and TM->TMW.	2003-06-22 04:32:59 +00:00
dchandler	6540b260bd	Fixes a (small, I think) TMW->Unicode performance glitch. I was inserting 5 characters at a time and then skipping ahead just one position. I don't think this affected correctness. I believe there's still a terrible (exponential?) slowdown as the input file gets bigger, however. Perhaps not -- but we run through the first 1000 TMW glyphs in 6 seconds, the 20th thousand takes at least 60 seconds. Is TMW->Wylie faster than TMW->Unicode? If so, why? Thought: don't use a DuffPane within TibetanConverter -- it can only add overhead, right? My hprof profile said that the conversion was taking just a couple of percent of the work; the rest was going to display-related stuff that you should only see if you were displaying the document. I'm not!	2003-06-22 04:08:33 +00:00
dchandler	dfe64a1927	Added --find-some-non-tm and --find-all-non-tm modes to the converter to help ensure worry-free TM->TMW conversions.	2003-06-22 00:14:18 +00:00
dchandler	80101666c7	Included a fix from WylieWord's tibwn.ini. Removed some needless trailing tildes.	2003-06-21 02:35:21 +00:00
dchandler	9a41f512d9	It used to be the case that you could select 'Close', and then when asked "do you want to save?" you could press yes and then press cancel and Jskad would still exit. That's no longer the case. Added File->Exit to Jskad.	2003-06-21 02:07:51 +00:00
dchandler	45b87b0fb4	In Jskad, you can now clear the preferences and return to default values.	2003-06-21 01:26:17 +00:00
eg3p	fbb6245fdb	Added cut() and copy() methods to override JTextPane's methods of same name.	2003-06-20 15:27:20 +00:00
dchandler	5067683121	Edward corrected me; he had intended to have M map to 7.91, not 7.90.	2003-06-17 01:46:19 +00:00
dchandler	ced830a7d3	Renamed TMW_RTF_TO_THDL_WYLIE to TibetanConverter.	2003-06-15 19:19:23 +00:00
dchandler	34a7b5da9b	This converter now performs TMW->Unicode conversions.	2003-06-15 18:38:42 +00:00
dchandler	da70434e52	Jskad now allows for TMW->Unicode conversion.	2003-06-15 16:27:36 +00:00
dchandler	af5b95b08d	A TMW->Unicode table is here. Note these issues, however: Is the EWTS '_' to be represented as U+0020, or is it a wider space? Does TMW9.42, Dza, map to U+0F5F,U+0F39? Does TMW6.60, r+y, map to U+0F62,U+0FBB or to U+0F6A,U+0FBB? (Likewise with r+w, TMW6.61, TMW6.62, etc.) Is U+0F7E a bindu? What Unicode does TMW7.96 map to, for example? What does TMW7.91 map to? Should TMW8.97 and TMW8.98 map to swastikas elsewhere in Unicode? If so, which codepoints? Likewise with TMW9.60, a Chinese character. Does TMW7.68 map to U+0F39? Does TMW7.74, the ITHI secret sign, have a Unicode mapping? f68,fa0,f80,f72 comes close, but fa0 would be too large, wouldn't it? What Unicode does TMW9.61 map to? Is it for sequences like f40,f7c,f60,f72? Or is it for f60,f72,f7c?	2003-06-15 03:25:45 +00:00
dchandler	b387c512e9	Fixed two bugs.	2003-06-15 03:08:57 +00:00
dchandler	189fef9aec	Made Jskad smart enough to handle a few more EWTS characters; some it can only convert to Wylie, others are live key sequences. This will make converting the shechen documents go more smoothly.	2003-06-09 13:35:43 +00:00
dchandler	09a55110b7	Handles more TibetanMachine oddballs.	2003-06-09 02:01:13 +00:00
dchandler	b9219640e5	Handles more TibetanMachine oddballs.	2003-06-09 01:53:01 +00:00
dchandler	e97e1c8464	Handles more TibetanMachine oddballs.	2003-06-09 01:20:32 +00:00
dchandler	651a599188	Fixed usage info.	2003-06-08 23:23:12 +00:00
dchandler	70b31558fa	Tried to fix a crashing bug that happened when you converted TM->TMW and then tried to convert that TMW to Wylie. I swear it's Java's problem (see the ugly stack trace in the code and decide for yourself), and I tried replacing rather than inserting-and-then-removing, but it didn't work. I've left these things as options.	2003-06-08 23:12:52 +00:00
dchandler	212414edef	TMW_RTF_TO_THDL_WYLIE now converts TM->TMW.	2003-06-08 22:43:27 +00:00
dchandler	32831b698f	If bad (oddball) TM glyphs appear, then converting to TMW causes, by default, all oddballs to appear once in the resulting document. This'll help me find the correct glyphs for the oddballs, and it'll prevent the average user from converting a document with oddballs.	2003-06-08 22:37:38 +00:00
dchandler	d45f5ab8c8	Improved performance (I suppose).	2003-06-03 23:49:34 +00:00
dchandler	7d768c9e06	Fixed a crashing bug that happened upon converting Wylie to Tibetan.	2003-06-03 23:45:15 +00:00
dchandler	0f724989b5	The Wylie 'M' used to map to TMW7.91, when it should map to TMW7.90. I've fixed that. I've also added a couple of Unicode mappings to give a flavor for how multi-codepoint mappings will be represented. TM->TMW conversion takes about 1 second per thousand glyphs on my PIII-550.	2003-06-01 23:05:32 +00:00
dchandler	54ca37c824	The Wylie 'M' used to map to TMW7.91, when it should map to TMW7.90. I've fixed that. I've also added a couple of Unicode mappings to give a flavor for how multi-codepoint mappings will be represented.	2003-06-01 19:14:08 +00:00
dchandler	e2caf99085	Some code cleanup. tibwn.ini must now have, in the Unicode column, either nothing, or 0FXX(,0FXX)*. E.g., 0F04,0F05 is valid. Debugging code ensures this is the case.	2003-06-01 18:09:49 +00:00
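
The last entry above constrains the Unicode column of tibwn.ini to be either empty or of the form 0FXX(,0FXX)*. A minimal sketch of such a check follows. It is not Jskad's actual debugging code: the class name UnicodeColumnCheck and the method isValid are hypothetical, and it assumes that each X is a single hexadecimal digit in either case.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not taken from the Jskad source.
public class UnicodeColumnCheck {
    // Either nothing, or one or more comma-separated codepoints of the
    // form 0FXX. Assumption: X is one hex digit, upper or lower case.
    private static final Pattern UNICODE_COLUMN =
        Pattern.compile("(0F[0-9A-Fa-f]{2}(,0F[0-9A-Fa-f]{2})*)?");

    // Returns true if the column text is empty or matches 0FXX(,0FXX)*.
    public static boolean isValid(String column) {
        return UNICODE_COLUMN.matcher(column).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("0F04,0F05")); // the example from the commit message
        System.out.println(isValid(""));          // an empty column is allowed
        System.out.println(isValid("0F04,"));     // trailing comma is rejected
        System.out.println(isValid("0E04"));      // outside the 0FXX pattern
    }
}
```

Because matches() requires the whole string to fit the pattern, stray whitespace or a dangling comma is rejected rather than silently ignored.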

544 commits