174 commits

Author SHA1 Message Date
amontano
2250e03766 Updated copyright and version info. 2003-04-01 13:38:16 +00:00
amontano
a7a573020f Renamed LinkedList to SimplifiedLinkList and moved it from org.thdl.tib.scanner to org.thdl.util. This linked list was implemented because the VM running on handhelds does not include java.util.LinkedList. 2003-04-01 13:08:38 +00:00
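
For readers wondering what a replacement for java.util.LinkedList on a restricted VM might look like, here is a minimal sketch. It is not the actual org.thdl.util.SimplifiedLinkList; the class name SimpleLinkedListSketch and its methods are hypothetical, and it avoids generics on the assumption that the target VM predates them.

```java
import java.util.NoSuchElementException;

/** Hypothetical minimal singly linked list; NOT the real org.thdl.util class. */
public class SimpleLinkedListSketch {
    /** One cell of the list: a value plus a pointer to the next cell. */
    private static final class Node {
        final Object value;
        Node next;
        Node(Object value) { this.value = value; }
    }

    private Node head;  // first cell, or null when the list is empty
    private Node tail;  // last cell, or null when the list is empty
    private int size;

    /** Appends a value at the end in constant time. */
    public void addLast(Object value) {
        Node n = new Node(value);
        if (tail == null) {
            head = n;
        } else {
            tail.next = n;
        }
        tail = n;
        size++;
    }

    /** Removes and returns the first value; throws if the list is empty. */
    public Object removeFirst() {
        if (head == null) {
            throw new NoSuchElementException("list is empty");
        }
        Object v = head.value;
        head = head.next;
        if (head == null) {
            tail = null;  // the list just became empty
        }
        size--;
        return v;
    }

    public int size() { return size; }

    public boolean isEmpty() { return size == 0; }
}
```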
dchandler
d836b850e8 "sgom pa'm ", not "sgom pa'am", is now used. "pe'm " was being
produced already, so the code was inconsistent.  If it turns out that
"pe'am " is preferred, I'll fix it later.  Consistency is very
appealing.
2003-03-31 01:38:27 +00:00
dchandler
33b3080068 Fixed a bunch of bugs; supports le'u'i'o, sgom pa'am, etc.
Better tests.  As part of that, I had to break TibetanMachineWeb into
TibetanMachineWeb+THDLWylieConstants, because I don't want the
class-wide initialization code from TibetanMachineWeb causing errors
in LegalTshegBarTest.
2003-03-31 00:33:50 +00:00
dchandler
1987f7d80a b-r-g, b-l-g-s, etc., when converted from Tibetan to Wylie, give
correct, unambiguous Wylie.
2003-03-30 21:49:55 +00:00
amontano
8565855dd1 Now the handheld version supports both portrait and landscape. 2003-03-30 17:09:09 +00:00
dchandler
f9670233ba Removed documentation FIXMEs from this code; did away for good with
some really iffy code that I think was behind the "Tibetan->Wylie
conversion fails when keyboard isn't Extended Wylie" bug.
2003-03-30 16:13:00 +00:00
dchandler
58f7371e66 Revamped the "Tools>Convert Tibetan To Wylie" feature that
converts TibetanMachineWeb glyphs to THDL Wylie.  Three-glyph and
four-glyph sequences with implicit "a" vowels are now handled
correctly, except for disambiguation w.r.t. things like b-la-g
vs. bla-g and d-wa vs. dwa.

pa'am, pa'ang etc. now work too.

Illegal Tibetan sequences now become very ugly, but "correct" Wylie.
Correct in the sense that converting it back to glyphs should get you
the glyphs you started with.

I also made a change to TibetanMachineWeb.java that I hope will clear
up problems with this feature when keyboards other than "Extended
Wylie" are selected.

Took nga out of the farRightSet [postsuffixes]; only da and sa belong
there, right?

I tried to get the system in a state such that I could run automated
tests of this stuff, but I ran into difficulties.  I have some manual
test cases; ask if you're interested.
2003-03-30 02:31:16 +00:00
dchandler
2b81020b0e More and better tests; fixed some bugs in LegalTshegBar. 2003-03-28 03:49:49 +00:00
amontano
35a9869aac 1. Fixed parsing error
2. Added support for extreme uses of 'a' like le'u'i'o
3. Syllables that have the particles "ang" and "am" added to them are now parsed correctly. The second works only in "roman script" mode. The converter from Tibetan script to roman script does not convert these combinations correctly ("pa'ang" is wrongly converted to "pa'ng" and "pa'am" to "pa'ma").
2003-03-23 20:27:54 +00:00
dchandler
08d2a5d702 Added a test for org.thdl.tib.text.tshegbar.UnicodeCodepointToThdlWylie. 2003-03-22 04:55:17 +00:00
dchandler
f2dcb0cbc3 I said I removed this earlier; I lied. Now it's gone. 2003-03-22 03:58:13 +00:00
dchandler
16cbfb6033 Moved ad-hoc test.java test cases to UnicodeGraphemeClusterTest.java,
a JUnit test which can be run via 'ant check'.  Removed test.java and
its build process.
2003-03-22 03:55:39 +00:00
dchandler
395eca7bb1 Moved ad-hoc test.java test cases to LegalTshegBarTest.java, a JUnit
test which can be run via 'ant check'.
2003-03-22 03:46:32 +00:00
dchandler
879b477902 Made some ad-hoc tests in test.java into JUnit tests, run by 'ant
check'.

NORM_NFD was replaced with NORM_NFKD in three cases in testMostlyNFKD.
2003-03-22 03:24:56 +00:00
dchandler
12eb7cf4cf Updated this file, which is used in making the Javadocs. 2003-03-22 03:14:02 +00:00
dchandler
3da52c5f0c Removed these files, which are for Savant and QuillDriver. They're
still in the CVS Attic if you need them.
2003-03-22 03:10:59 +00:00
dchandler
1e326bb06d Removing these QuillDriver leftovers. They're still in the CVS Attic,
if anyone needs them.
2003-03-22 02:38:24 +00:00
eg3p
fed25e27ee No longer necessary now that Savant & QuillDriver
have been moved out of THDL Tools.
2003-03-14 00:33:24 +00:00
eg3p
715203a12e Savant and QuillDriver are being removed from THDL Tools
and moved to a new site: Tools for Field Linguistics. So I removed
the Savant & QD related options.
2003-03-14 00:28:28 +00:00
eg3p
c280d0fc96 Savant and QuillDriver are being removed from THDL Tools
and moved to a new site: Tools for Field Linguistics.
2003-03-13 20:00:51 +00:00
eg3p
6cc0c5e99b Savant and QuillDriver are being removed from THDL Tools
and moved to a new site: Tools for Field Linguistics.
2003-03-13 19:57:12 +00:00
eg3p
4dff3b4ae0 Added new words 2003-03-12 13:01:16 +00:00
eg3p
a98849d3eb QD as XML editor. More details later. 2003-03-12 12:48:18 +00:00
eg3p
4070c5ccee Latest QD 2003-03-12 12:46:44 +00:00
dchandler
9e0dc68d12 Feature Request 697358 is done. The working directory for Jskad is
now a preference.

In addition, Jskad now raises an error dialog when you try to "Save
As" to a bad place or open a file that doesn't exist or isn't
readable.
2003-03-11 01:03:19 +00:00
dchandler
aa144dd599 Javadoc 1.4.1_01 no longer has a single warning about this package. 2003-02-03 01:36:56 +00:00
dchandler
c379db6ff5 Javadoc 1.4.1_01 no longer has a single warning about this file as we
use @ to represent the at sign @.
2003-02-03 01:36:08 +00:00
dchandler
e6a10d052f Added a "Help" menu item that pulls up jskad_doc.html, which is now put
into Jskad's JAR file.

Doing so required that I cut out a lot of fancy HTML code.  The correct fix
is to use XML to store the meat and then use XSL to generate two forms of
HTML: one dumb enough for Java, one for use on the THDL tools website.
2003-02-01 06:42:07 +00:00
dchandler
cf279bb620 Added a JScrollPane that views a noneditable HTML file found inside a JAR
file.
2003-02-01 06:37:32 +00:00
dchandler
bde0cc8381 Slapped on copyright boilerplate. 2003-02-01 05:52:03 +00:00
dchandler
a1f6b9e117 Each class's author is now listed as Than. 2003-02-01 05:38:48 +00:00
dchandler
d453e801ef Windows directory separators (backslashes) have been replaced with
java.io.File.separatorChar.  This means tibbibl puts its temporary
files under Jskad/bin in my Linux sandbox.
2003-02-01 05:30:22 +00:00
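
The separator change described in the commit above can be illustrated with a small sketch. This is not the Tibbibl code; the class SeparatorSketch, the helper join, and the file name tibbibl.tmp are hypothetical and only show the general idea of building paths with java.io.File.separatorChar instead of a hard-coded backslash.

```java
import java.io.File;

/** Hypothetical illustration of platform-neutral path building. */
public class SeparatorSketch {
    /** Joins path segments with the platform's separator character. */
    static String join(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(File.separatorChar);
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The result uses '/' on Linux and '\' on Windows.
        // "tibbibl.tmp" is a made-up file name for illustration.
        System.out.println(join("Jskad", "bin", "tibbibl.tmp"));
    }
}
```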
dchandler
72ee4fc7d2 Added the initial version of Tibbibl, which Nathaniel Garson of UVa
e-mailed to me.  Tibbibl is an editor for XML-based bibliographies of
Tibetan texts.  All I did was change the package from org.thdl.xml to
org.thdl.tib.bibl and add boilerplate; no changes to Than's code were
made.

Tibbibl features a diacritic input tool which Jskad might want to
swipe.
2003-02-01 05:08:02 +00:00
dchandler
190a3d9b60 achen must appear before a vowel. 2003-01-05 05:58:32 +00:00
dchandler
fcb75c55eb Small performance improvement involving String.intern(). Plus a
little bit of code cleanup.
2003-01-05 05:57:44 +00:00
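
The commit above does not say how String.intern() was used, so the following is only a generic sketch of why interning can help: once two equal strings are interned, they share one canonical instance and can be compared by reference. The class name InternSketch and the sample string are hypothetical.

```java
/** Hypothetical illustration of String.intern(); not the project's code. */
public class InternSketch {
    /** Returns true when both strings map to the same interned instance. */
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents.
        String a = new String("sgom");
        String b = new String("sgom");

        System.out.println(a == b);                 // distinct objects
        System.out.println(a.equals(b));            // equal contents
        System.out.println(sameAfterIntern(a, b));  // one canonical instance
    }
}
```

Reference comparison of interned strings is cheap, but it is only safe when every string involved has been interned.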
dchandler
e5a63df1c1 Added a class skeleton that may not stay for long.
I'm committing in order to sync with my laptop, really.  This stuff will disappear
and reappear in better form later, after a holiday of coding and eggless,
alcohol-free nog.
2002-12-20 04:46:13 +00:00
dchandler
fdfedb4419 Added some tests for org.thdl.tib.text.tshegbar. These tests are preliminary,
and for this package only.

I'm committing in order to sync with my laptop, really.  This stuff will disappear
and reappear in better form later, after a holiday of coding and eggless,
alcohol-free nog.
2002-12-20 04:34:56 +00:00
dchandler
7ea185fa01 Renamed UnicodeCharToExtendedWylie to
UnicodeCodepointToThdlWylie.java.

Added a new class, UnicodeGraphemeCluster, that can tell you
the components of a grapheme cluster from top to bottom.  It does not
yet have good error checking; it is not yet finished.

Next is to parse clean Unicode into GraphemeClusters.  After that comes
scanning dirty Unicode into best-guess GraphemeClusters, and scanning
dirty Unicode to get nice error messages.
2002-12-17 13:51:18 +00:00
dchandler
8e8a23c6a6 Extended Wylie is referred to as THDL Extended Wylie or THDL Wylie
because a Japanese scholar has an "Extended Wylie" also.

NFKD and NFD have a new brother, NFTHDL.  I wish there weren't a need,
but as my yet-to-be-put-into-CVS break-unicode-into-grapheme-clusters code
demonstrates, the-need-is-there.  forgive-me for the hyphens, it's late.
2002-12-15 06:57:32 +00:00
dchandler
a42347b224 Now uses terminology from the Unicode standard. No more talk of
characters, for example.

Normalization forms NFKD and NFD are supported for the Tibetan Unicode
range.  I don't like either, actually.  I've tested NFKD, but I've not yet
committed the tests.
2002-12-15 03:35:24 +00:00
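
To make the difference between NFD and NFKD in the Tibetan range concrete, here is a minimal sketch. It uses java.text.Normalizer, which was added to the JDK years after this commit, so it is not the code the commit refers to; the class TibetanNormalizationSketch and its helpers are hypothetical.

```java
import java.text.Normalizer;

/** Hypothetical NFD vs. NFKD demonstration using the modern JDK API. */
public class TibetanNormalizationSketch {
    /** Canonical decomposition only. */
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** Canonical plus compatibility decomposition. */
    static String nfkd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    /** Renders each UTF-16 code unit as U+XXXX for easy inspection. */
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String gha = "\u0F43";       // TIBETAN LETTER GHA, canonical decomposition
        String tshegBstar = "\u0F0C"; // non-breaking tsheg, compatibility decomposition

        System.out.println("NFD  gha:   " + hex(nfd(gha)));
        System.out.println("NFD  tsheg: " + hex(nfd(tshegBstar)));
        System.out.println("NFKD tsheg: " + hex(nfkd(tshegBstar)));
    }
}
```

NFD splits the precomposed letter U+0F43 into U+0F42 U+0FB7 but leaves U+0F0C alone, whereas NFKD also folds U+0F0C into the ordinary tsheg U+0F0B, losing the non-breaking distinction.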
eg3p
3199ff7926 There are two classes here. One renders XML transcripts in JTextPane, and the other uses XPath to navigate the transcripts. Neither is part of the build yet. I'll document them more fully later when I've got to a point where they are worth sharing. 2002-12-12 15:17:42 +00:00
eg3p
86c2374706 New QD files that don't do anything yet. 2002-12-10 20:53:55 +00:00
dchandler
26993a5093 So that Unicode escape sequences appear correctly in javadocs. 2002-12-09 02:35:39 +00:00
dchandler
2d6c8be804 So that Unicode escape sequences appear correctly in javadocs. 2002-12-09 02:29:09 +00:00
dchandler
22c6ec5406 Javadoc now works without warnings. 2002-12-09 01:48:34 +00:00
dchandler
f4a16f8e9d This commit is for my benefit only; these classes are not ready for prime time,
and the build system is not yet aware of them.

I'm adding some classes for representing legal tsheg-bars (syllables, for the
most part) in Unicode.  These classes were designed bottom-up (OK, OK --
they weren't designed designed, but I had to write down everything I knew
about Tibetan syntax somewhere).  The classes are aware of extended
wylie.  I doubt the Javadocs work yet, and I'm still testing (and am not
committing my testing code with these as it is not yet ready).

Next on my list--fix these up to reflect my new awareness of suffix particles
(like le'u'i'o) and add classes to support syntactically incorrect Unicode
sequences.  Then add a UnicodeReader, and we've got the back end of
a Tibetan Unicode shaping system (like half of MS's Uniscribe or Apple's
Worldscript or FreeType Layout or Omega's OTPs).

A top-down design would not have included LegalTshegBar.  But now that
my itch has been scratched, potential uses are lingering about.  For example,
it would be nice to scan some input and break it into LegalTshegBars,
punctuation/marks/signs, and illegal stacks.  Then we could alert the client
of the illegality, its precise form, and its precise location.

The real system for turning a Unicode stream into an internal representation
suitable for conversion to EWTS/ACIP/XHTML/what-have-you need not be
aware of Tibetan syntax.  But to make the very best conversion from
Unicode to, e.g., EWTS, it is necessary to know that gaskad is better
represented as gskad, but that jaskad is not the same as jskad.
2002-12-09 01:02:23 +00:00
dchandler
03688b6137 Dza, fa, and va go together, not dza, fa, and va. 2002-12-08 21:08:58 +00:00
dchandler
53aa2e2309 Added jskad_doc.html (a revision of which is up at
http://iris.lib.virginia.edu/tibet/tools/jskad_doc.html) to the repository.  The
build puts this into Jskad's JARs, but Jskad itself does not allow for viewing
it.  In Java, that's a ten-minute job, but I haven't done it.
2002-12-07 17:53:24 +00:00
eg3p
9eedfcd909 This is Tashi's TibetanSyllable class for sorting Wylie Tibetan.
It does not have many methods for determining the root letter, suffix,
and so on, but these should be easy to add. David, please use this
class to the extent that it and your new work overlap.
2002-12-05 01:48:41 +00:00