Commit graph

141 commits

Author SHA1 Message Date
a1tsal
343a83bc10 Not sure what the change here is, but CVS thinks it's changed,
so I'm checking it in.
2005-01-19 01:12:14 +00:00
a1tsal
5034292ed6 Made 0F00, 0F02 and 0F03 be "symbols" rather than "combinations".
They should not be decomposed on transliteration.

    Made 0F0E (//) a "symbol" rather than "combination".

    Made the Sanskrit aspirates and k+Sh into foreign consonants,
    rather than "combination"s.  This doesn't actually affect the
    current code, but better reflects what it is doing (after the
    change to canonicalize these as single "precomposed" characters).

    Made the "strongly disrecommended" vowel signs "deprecated" rather
    than "combination".
2005-01-19 01:05:49 +00:00
a1tsal
4aa9128173 New feature for converting the 8-bit TCC fonts to Unicode.
Code-cleaned ReadTMWdefinitionFile which was an awful mess.  Did
  this along the way when implementing TCC 8-bit support.

  Unicode<->TMW conversion of whole document now applies to footnotes
  and endnotes.

  Now (per spec) use 00A0 space rather than 0020 (= ASCII 32) for EWTS _ .

  Implemented [] nesting rule for F9.  It already worked for F10,
  fortuitously.

  Optimize out most calls to MakeRange*Script (recently introduced to
  work around Word Unicode font-change bug); this improves performance
  and also decreases the number of useless font-change events on the
  Undo list.

  "combination" -> "deprecated" in Unicode definition reader.
2005-01-19 00:54:19 +00:00
a1tsal
67414c1e8c Checking in first version of testing protocol document 2005-01-16 01:00:51 +00:00
a1tsal
c80184b95f Minor changes 2005-01-16 00:59:31 +00:00
a1tsal
ce814a0a83 Minor changes -- mostly in comments -- plus one Unicode-related. 2005-01-16 00:58:50 +00:00
a1tsal
4e244b1b96 Updated for Unicode changes.
Various uses of "English" replaced. (Trying to be less ethnocentric;
we have lots of users whose primary language is not English.)
2005-01-16 00:55:48 +00:00
a1tsal
a93bcfdce5 Removing this informal branch of "WylieWord development copy.dot",
as the code in it is now the trunk.
2005-01-16 00:54:14 +00:00
a1tsal
8add415115 I had made an informal branch of this file called "Unicode hackup.dot"
for experimental Unicode extensions.  It's now clear that Unicode is
going in the trunk -- in fact I'm close to a first beta release -- so
I'm clobbering the "Unicode hackup.dot" file and checking in its
latest version here.

The changes:

Unicode support.  Extensive remodularization and code changes in most
parts of the system, far too much to detail here.  2.1 has roughly a
third more code than 2.0!

Per request of Cathy Cantwell, make F9 on body text that includes
footnotes/endnotes convert text of said notes.

In support of the previous, various changes to make Tibetan work in
footnotes (and endnotes) generally.

Pronunciation function now talks about "foreign" words rather than
"Sanskrit"; former is more generally correct.

Fixed prehistoric bug (goes back to Than's version 1.1) that I just
found: F6 went to end of document if you hit Cancel.

Fixed SourceForge bug 941183: literal curly quotes in VBA screwed up
Chinese Word.  (Need to get Jeff Wu to verify fix -- I couldn't
duplicate bug.)

Fixed SourceForge bug 986847: double quote doesn't curlify.  I
"fixed" this by removing the double-quote binding.  That's not the
optimal fix, but the only thing that's non-optimal about it is that
when you are typing EWTS, double-quote self-inserts rather than
telling you that it's not a valid EWTS character.

Fixed SourceForge bug 986850 (Interlineal failing when loaded from
Startup, due to an instance of the Word bug where Startup-loading
doesn't run initializations).

Fixed bug: in F9 conversion, \u allowed only lower-case (not
upper-case) alpha hex digits.

Fixed bug: in F9 conversion, \ was not interpreted as an escape
within [].

Fixed bug: \Z] did not return to TScript, but instead inserted the
].

Fixed bug: [ and ], when in Interlineal mode, now beep rather than
screwing things up.

Various uses of "English" replaced in the UI. (Trying to be less
ethnocentric; we have lots of users whose primary language is not
English.)
2005-01-16 00:43:46 +00:00
a1tsal
dff4db7205 Fix required to make build actually run 2005-01-16 00:11:56 +00:00
a1tsal
830db8d980 Better rejection tests, changes for Unicode, new test case "baraga",
which 2.0 gets wrong (I think -- still not sure on this one!)
2005-01-12 11:38:55 +00:00
a1tsal
cca31f1af1 Introduce new category for H and ?. 2005-01-12 11:33:30 +00:00
a1tsal
d069f0b1b6 Checking in incremental progress. Everything works, various small loose
ends to deal with, thorough testing yet to be done.
2005-01-12 11:31:18 +00:00
a1tsal
f205f7253d Incremental progress. Passes all conversion tests except non-TScript,
rejection, and the peculiar case kaR+Wabi.
2004-12-26 22:45:59 +00:00
a1tsal
d7f5b22a2d Progress: now passing most of TestConversion; exceptions are non-Tibetan
embeds, a few complex Skt stacks, and also it fails to reject some things
that should be unparseable.
2004-12-22 12:43:29 +00:00
a1tsal
de529e7187 Ye olde incremental progress.
Everything mostly works.
But everything is riddled with bugs.
2004-12-19 08:06:41 +00:00
a1tsal
129e7ed917 Incremental progress on Unicode. 2004-12-18 11:09:47 +00:00
a1tsal
cb6ffac9c8 Incremental progress on Unicode. 2004-12-18 05:44:54 +00:00
a1tsal
39752ad984 Whoops, +' is a foreign subjoined consonant. 2004-12-18 05:42:51 +00:00
a1tsal
4d4b1f9d81 GPL and note explaining GPL vs THDL OL added, and to .INP too.
CVS is whining about the GPL containing "escape characters" (presumably
control-L), but when I try to add it as binary, it complains that it's actually text.
I hope this works.
2004-12-17 02:04:15 +00:00
a1tsal
f0a91f9a79 TMW -> Unicode works modulo Word Unicode bugs. 2004-12-16 13:06:38 +00:00
a1tsal
b3eead0df6 Checkpoint: Typing EWTS to get Tibetan script mostly works.
Infinitely many bugs in Word Unicode handling remain to work around.
2004-12-16 08:02:06 +00:00
a1tsal
11dcda0532 Checkpoint: TMW-specific code removed from TScript parser. 2004-12-15 02:55:56 +00:00
a1tsal
a1dc82bd62 Check in incremental progress on Unicode.
Checkpointing here because I'm about to start overhauling the TMW
tsheg bar parser to accept Unicode Tibetan as well.
2004-12-15 02:23:53 +00:00
a1tsal
1c2fcb0910 Add reference to TMUni. 2004-12-13 01:30:05 +00:00
a1tsal
a4a07e5a6f Add the alpha Tibetan Machine Uni font and the "Unicode definition.txt" files.
Change tibwn.ini to overwrite -- previously it was set not to, which seems
completely wrong.
The Uni font is now set to overwrite; I hope this is the right thing.  For some
reason the TMW fonts had been set no-overwrite.  I'm hoping that is just
because I assumed they wouldn't change rather than because it doesn't
work to overwrite fonts or something.
2004-12-13 01:19:06 +00:00
a1tsal
23ddf2aac7 This file has the Wylie <-> Unicode correspondence, which actually can't
be recovered from tibwn.ini, oddly enough.
2004-12-12 22:31:11 +00:00
a1tsal
c93f7467a7 Unicode->Wylie now basically working. Typing EWTS to get Tibetan is
mostly not working but starting to show signs of life.
2004-12-12 12:03:39 +00:00
a1tsal
d211b73642 Incremental progress on Unicode. Wylie->Unicode is basically working
(but I haven't run extensive tests, and it probably will break in some cases).
Unicode->Wylie is not working.  I'm checkpointing here because I'm
about to attack the Tibetan->Wylie code and I'm not sure my approach
is correct -- may need to back out.  (Current theory is to recycle the TMW->
Wylie code with small amounts of internal conditionality on TMW vs.
Unicode, but I'm not positive this can be made to work.)
2004-12-12 07:36:23 +00:00
a1tsal
4311f44042 Beginning of version 2.1 (with Unicode support).
It would be cleaner to do a fork of "WylieWord development copy.dot",
which is what this is, but the wincvs doc doesn't cover the fork command
at all, and I don't want to risk confusion.
2004-12-11 10:22:27 +00:00
a1tsal
92bf0142cd Minor changes. 2004-11-03 02:36:17 +00:00
a1tsal
e40cf55a30 Document new "F11 conversion uses matrices" option. 2004-11-03 02:35:50 +00:00
a1tsal
f6e5c6b045 Version 2.0p1:
Per request from Cathy Cantwell, add option to do F11 interlineal
  conversion without introducing tables, producing text only.  An
  unfortunate side-effect of this is that you get a one-time-only error
  message complaining about the options file (whose format has changed
  to store the value of this option).

  Changed the interlineal line breaking code to put only shad-like
  things at the end of lines, and not zla tse type things.  (But: it no
  longer breaks lines on zla tse type things at all; ideally it should
  break on those but put them at the beginning of lines.  On the other
  hand, you'd never expect to see a zla tse type thing other than after
  a shad type thing (or at the beginning of a text).)
2004-11-03 02:35:01 +00:00
a1tsal
79749d8909 Changed a bunch of Integer declarations to Long so that F11 conversion
wouldn't blow out on chunks of text longer than 32k characters (a
problem reported by Cathy Cantwell).  What a pathetic programming
language VBA is.
2004-11-02 07:15:14 +00:00
a1tsal
6f8ba71ad0 Fix problems reported by Julie Regan <juregan@hotmail.com> with
printing.  The page format and second-page-header were somehow
inherited from Matthew Kapstein's reader. This included line
numbering, non-standard margins, and the header.  Changed back to Word
defaults.
2004-11-01 09:31:14 +00:00
a1tsal
14206b789d Add note about test case for \ syntax. 2004-11-01 07:59:29 +00:00
a1tsal
0d78ad1444 Total rewrite of ConvertWylie() to fix the problem, reported by Cathy
Cantwell (SF RFE #1034292), that it stripped footnotes (and other
non-text items).

Per request of Cathy Cantwell (SF bug 1034688), make
HandleUserApostrophe respect Options.AutoFormatAsYouTypeReplaceQuotes.
I.e., typing a ' will no longer get curly single quotes if you have
turned off the "smart quotes" autocorrect option (which lots of
Tibetanists may have done, so a-chung comes out decently).

Fix PartOfTshegBar to not accept [, which it did, due to an
inexplicable Obiwan error.  Clarify code to make Obiwan error less
likely.

Fix PartOfTshegBar to not accept H.  This is bogus; H is a Sanskrit
letter, but tibwn.ini claims it is punctuation, and our code depends
on that.
2004-11-01 07:57:02 +00:00
a1tsal
47dbbdc16a Add test cases for bug reported by "Cathy Cantwell"
<catherine.cantwell@oriental-institute.oxford.ac.uk>
whereby TMW [m][kh]['][i] converted to EWTS makha'i.
2004-10-31 03:59:11 +00:00
a1tsal
4596ec268b Fix bug reported by "Cathy Cantwell"
<catherine.cantwell@oriental-institute.oxford.ac.uk>
whereby TMW [m][kh]['][i] converted to EWTS makha'i.
2004-10-31 03:58:37 +00:00
a1tsal
779db45e92 Version 2.0: no code changes from 2.0b3; just incremented the About version number. 2004-07-07 20:04:32 +00:00
a1tsal
a7cb875b1b Documented fix for Spanish keyboard DeadKey problem. 2004-07-07 19:55:26 +00:00
a1tsal
d3ed56655b Documented fix for Spanish keyboard DeadKey problem. 2004-05-02 13:12:32 +00:00
a1tsal
2a2c9112b3 Updated for latest theory of R, Y, W. 2004-04-21 19:30:10 +00:00
a1tsal
82ab632660 Fixed tiny typos. 2004-04-21 19:26:24 +00:00
a1tsal
c80fa8879d Extensive, ugly code changes to make R+ work.
Sean Something <knowone@zensearch.com> pointed out that a special ha
  glyph (1,102) is supposed to be used solely in the case of hU.
  Added code to do so.
2004-04-21 19:17:02 +00:00
a1tsal
0b6a4941f2 Remove from the rules cases that were previously exceptions but are now
handled by the nasalization rule.
2004-04-18 19:21:35 +00:00
a1tsal
2d1d85577d Update for the latest hairification of the nasalization rule. 2004-04-18 19:20:49 +00:00
a1tsal
fd6d03d040 Change implementation of nasalization rule for latest elaboration
thereof (n, not m, when root letter is b or ph but not pronounced as
  such).
2004-04-18 19:19:52 +00:00
a1tsal
a697cf2010 Added support for recently-added nasalization rule in THDL phonetics. 2004-04-12 08:28:35 +00:00
a1tsal
68ee1d083e Fixed phonetics in interlineal example. 2004-04-12 08:27:35 +00:00