Jskad

Author	SHA1	Message	Date
dchandler	5c240ac072	From the converter GUI, you can now choose TMW->ACIP text and TMW->Wylie text. All the conversions show you which format they take as input and which format they give as output. File filter for ACIP files added. The GUI converter suggests a file extension wisely. Fixed newline bug in ACIP->Unicode converter.	2003-09-05 02:05:34 +00:00
dchandler	4abbf6db37	--to-acip-text and --to-wylie-text added; these get you text files, not RTF files like --to-acip and --to-wylie do. The GUI converter doesn't yet allow you to get text files.	2003-09-04 05:16:47 +00:00
dchandler	cc615f34df	ACIP->TMW and ACIP->Unicode have my pre-stamp of non-approval. Except for {NYAx} and {NYAo}, they're as good as I'll get them without input from experts or the employ of a complementary, syllabary-based approach.	2003-09-04 04:34:18 +00:00
dchandler	ae7a7577bc	ACIP->TMW and ACIP->Unicode are now smart about when a newline is really a newline and when a space is really a tsheg. The space in {KA ,MDO} is a tsheg, but the space in {GA ,MDO} is not.	2003-09-04 04:13:01 +00:00
dchandler	d2749cecd0	ACIP->TMW and ACIP->Unicode are now smart about when a newline is really a newline and when a space is really a tsheg. The space in {KA ,MDO} is a tsheg, but the space in {GA ,MDO} is not.	2003-09-04 04:04:21 +00:00
dchandler	72e531e515	Use shortened 'dreng-bu, not regular. As per TM glyphs. I suspect that the following would look better with shortened 'dreng-bu also, but I'm sticking with the TM/TMW docs: dz+r~137,2~~4,46~1,110~4,120~1,123~1,126~4,106~4,113~f5b,fb2 dz+w~138,2~~4,47~1,110~4,120~1,123~1,126~4,106~4,113~f5b,fad dz+h~139,2~~4,48~1,110~4,120~1,123~1,126~4,106~4,113~0F5C dz+h+y~140,2~~4,49~1,110~4,121~1,123~1,126~4,107~4,114~f5c,fb1 dz+h+r~141,2~~4,50~1,110~4,121~1,123~1,126~4,107~4,114~f5c,fb2 dz+h+l~249,2~~4,51~1,110~4,123~1,123~1,126~4,110~4,117~f5c,fb3 dz+h+w~143,2~~4,52~1,110~4,122~1,123~1,126~4,108~4,115~f5c,fad	2003-09-04 03:46:35 +00:00
a1tsal	2f58ec2760	A bunch of Sanskrit stacks of the form ts+... and dz+... had 1,125 for their drengbu, but that is actually a naro. I changed it to 1,123 (which is one of the two drengbus).	2003-09-04 02:06:58 +00:00
dchandler	316f59107b	A preliminary TMW->ACIP converter is here. There are known bugs, mostly with rare punctuation.	2003-09-02 06:39:33 +00:00
dchandler	cc9ab06864	Added utility routine. Better comments.	2003-08-31 20:38:28 +00:00
dchandler	045c4069c9	Preliminary ACIP->TMW support is in place. {DU} gives you something less beautiful than what Jskad would give, so more work is needed.	2003-08-31 16:06:35 +00:00
a1tsal	1f4d53be2e	Moved ^M to punctuation section. Removed obsolete comment.	2003-08-31 00:44:23 +00:00
a1tsal	522812996e	Remove unused sections of tibwn.ini.	2003-08-31 00:34:15 +00:00
dchandler	dd22e161a5	Code cleanup for Jskad's Tibetan font converter GUI.	2003-08-30 05:01:15 +00:00
dchandler	896344f2d1	David Chapman removed some lines from tibwn.ini. That breaks TM<->TMW mappings, so I've put them back, but with the EWTS non-correspondences \tmwXYYY. Jskad no longer supports superscribed or subscribed numerals, because EWTS does not.	2003-08-26 01:28:02 +00:00
a1tsal	ccdebf6719	Removed half numbers (no longer in EWTS) Brought <?Other?> closer to EWTS Removed __TILDE__ (no longer in EWTS) Changed M^ to ^M per new EWTS draft Added ai, au, -i from WW tibwn.ini -- they were missing in this version	2003-08-25 23:19:48 +00:00
dchandler	1982c5847b	Jskad's converter now has ACIP-to-Unicode built in. There are known bugs; it is pre-alpha. It's usable, though, and finds tons of errors in ACIP input files, with the user deciding just how pedantic to be. The biggest outstanding bug is the silent one: treating { }, space, as tsheg instead of whitespace when we ought to know better.	2003-08-24 06:40:53 +00:00
dchandler	d5ad760230	TMW->Wylie conversion now takes advantage of prefix rules, the rules that say "ya can take a ga prefix" etc. The ACIP->Unicode converter now gives warnings (optionally, and by default, inline). This converter now produces output even when lexical errors occur, but the output has errors and warnings inline.	2003-08-23 22:03:37 +00:00
dchandler	21ef657921	I'd broken the ACIP->Wylie for ACIP vowels {'A}, {'I}, etc.	2003-08-22 05:13:32 +00:00
dchandler	1afb3a0fdd	ACIP->Unicode, without going through TMW, is now possible, so long as \, the Sanskrit virama, is not used. Of the 1370-odd ACIP texts I've got here, about 57% make it through the gauntlet (fewer if you demand a vowel or disambiguator on every stack of a non-Tibetan tsheg bar).	2003-08-18 02:38:54 +00:00
dchandler	245aac4911	I'm now stricter about accepting alphabetic characters. F, Q, X, a, b, c, d, e, ... do not belong in ACIP, so the scanner rejects them. This should make it even easier to distinguish automatically between Tibetan and English texts.	2003-08-17 02:38:58 +00:00
dchandler	39451d8879	Fixed a couple of small bugs. Only 250 errors are reported now; this is important if you try to convert an English document.	2003-08-17 02:12:49 +00:00
dchandler	4581a2d8ab	Improved the ACIP scanner (the part of the converter that says, "This is a correction, that's a comment, this is Tibetan, that's Latin (English), that's Tibetan inter-tsheg-bar punctuation, etc."). It now accepts more real-world ACIP files, i.e. it handles illegal constructs. The error checking is more user-friendly. There are now tests. Added some tsheg bars that Peter E. Hauer of Linguasoft sent me to the tests. Many thanks, Peter. I still need to implement rules that say, "This is not Tibetan, it must be Sanskrit, because that letter doesn't take a MA prefix."	2003-08-17 01:45:55 +00:00
dchandler	0b91ed0beb	I've improved the ACIP tsheg bar scanner to handle a lot of illegal constructions that occur in practice.	2003-08-16 16:13:53 +00:00
amontano	2a57439516	Updated the info displayed on the about window.	2003-08-14 14:16:49 +00:00
amontano	da384c6c2f	Now when loading, takes the default font options from the DuffPane.	2003-08-14 14:16:23 +00:00
dchandler	2b59d9838d	I now have a function that takes as input a String of ACIP and breaks up that String into tsheg bars, punctuation, etc., while finding errors. I've tested it some, but I'm not yet committing the tests. Next step: a converter that takes an ACIP file as input and outputs TMW+Latin.	2003-08-14 05:10:47 +00:00
dchandler	57f506384f	The ACIP->Tibetan converter now has perfect low-level functionality, and it has the capability to produce error messages and warnings that make sense to the user. One can now get the correct parse, if one exists, for an ACIP tsheg bar. One could even feed in ACIP and get a list of warnings about things as innocuous as PADMA, which a dumb converter would have trouble with. One could then turn ACIP into well-behaved ACIP for that dumb converter, if one really wanted to. Still to do: (1) scan ACIP files into tsheg bars; (2) produce TMW/Latin (from which you can get Unicode, etc.); (3) e-mail the illegal tsheg bars to the ACIP fellows so they can fix the affected documents (most of the Kangyur has unparseable creatures).	2003-08-12 04:13:11 +00:00
dchandler	87266646fb	Removed misinformation.	2003-08-10 19:33:01 +00:00
dchandler	e21d3774a9	Added an unfinished ACIP->Tibetan converter. Once it works properly for ACIP, it'll easily be made to work as a perfect EWTS Wylie->Tibetan converter. It has an extensive suite of tests for the existing functionality.	2003-08-10 19:30:07 +00:00
dchandler	39e0435b6b	Refactored this code so that Wylie->Tibetan and ACIP->Tibetan conversions can make use of it. Hooray for reuse.	2003-08-10 19:02:56 +00:00
dchandler	bcf1c12b6a	We now produce EWTS m.ya, g.rwa, d.rwa, and b.ya during TMW->Wylie. Our disambiguation is now perfect, happening when and only when it is necessary. These are all illegal, so it shouldn't affect many existing conversions. But if there were typos, it could.	2003-08-10 18:46:01 +00:00
dchandler	9093fd3c05	We now produce EWTS m.ya, g.rwa, d.rwa, and b.ya during TMW->Wylie. Our disambiguation is now perfect, happening when and only when it is necessary. These are all illegal, so it shouldn't affect many existing conversions. But if there were typos, it could.	2003-08-10 18:38:20 +00:00
dchandler	251d8feae5	brtan now gives TMW->Wylie brtan, not b.rtan. Etc. See bug report http://sourceforge.net/tracker/index.php?func=detail&aid=785791&group_id=61934&atid=502515.	2003-08-09 17:48:40 +00:00
dchandler	7dffc47cb7	'bad now gives TMW->Wylie 'bad, not TMW->Wylie 'abd. Andres came across this one, so we've added it to the list of ambiguous three-consonant combos.	2003-08-09 17:05:43 +00:00
amontano	52cdc17794	Added support for multiple keyboards and the ability to set preferences for the size of the Tibetan font and the type and size of the Roman font.	2003-08-09 08:00:58 +00:00
amontano	8e4b508de8	Made a new class for the preference window so that other software (i.e. the translation tool) can reuse that same code to set up the attributes of the Tibetan and Roman fonts.	2003-08-09 07:57:21 +00:00
amontano	ef0df405d9	Redesigned the interface of the handheld version.	2003-08-03 06:29:08 +00:00
amontano	2b5a5fe67a	Got rid of redundant code	2003-08-03 06:28:22 +00:00
amontano	cce779bf88	Added a wizard window to avoid using the command line as much as possible. After launching the application by clicking on it, one can use the wizard to connect to the available on-line dicts, open a local dict, or generate a dict database.	2003-08-03 06:27:30 +00:00
dchandler	4caeafa1b1	You shouldn't have one of these without the other, now that there are two. This way neither TM nor TMW fonts will be loaded.	2003-07-26 00:55:32 +00:00
dchandler	2bb499e5a7	This was dying with a NullPointerException when you started it up using 'ant tt-run' with no dictionary. Now it starts up and shows you a nice error message, "Dictionary could not be loaded!", instead.	2003-07-26 00:53:59 +00:00
dchandler	e198519c5f	Jskad now supports EWTS ~, i.e. TMW8.91.	2003-07-25 02:35:31 +00:00
amontano	5df9b5b91a	now supports sorting	2003-07-25 01:43:58 +00:00
amontano	97f5fe91b3	When invalid Wylie is encountered, it now raises an exception instead of displaying a message.	2003-07-25 01:43:18 +00:00
amontano	7cdbf33333	Changed it to support 30 dictionaries (instead of just 15).	2003-07-25 01:42:17 +00:00
amontano	7b04d7bca5	changed the "about" info	2003-07-25 01:41:30 +00:00
dchandler	a7f0c35738	Added a test for ts.ha vs. tsha ambiguity; there is no ambiguity.	2003-07-18 03:51:29 +00:00
dchandler	dc454b8c0c	More test cases related to the following: The Tibetan d.za was being converted into the Wylie dza incorrectly. This is a rare case, but I want TMW->Wylie to be perfectly unambiguous.	2003-07-18 02:31:02 +00:00
dchandler	f8c959bfb0	The Tibetan d.za was being converted into the Wylie dza incorrectly. This is a rare case, but I want TMW->Wylie to be perfectly unambiguous.	2003-07-18 00:30:27 +00:00
dchandler	1c29566aee	I'm now using the Unix diff built in to Apache Jakarta Commons JRCS (which I found on suigeneris.org, not apache.org) in order to bulletproof the Tibetan Converter tests. They used to fail due to nondeterminism in the Java RTF writer; they should no longer fail. I've also changed it so that the Tibetan Converter tests run in headless mode, which means that they'll run on the nightly builds server.	2003-07-14 12:26:26 +00:00

357 commits