diff --git a/htdocs/TibetanFormatConverterDesign.html b/htdocs/TibetanFormatConverterDesign.html new file mode 100644 index 0000000..c09d69c --- /dev/null +++ b/htdocs/TibetanFormatConverterDesign.html @@ -0,0 +1,293 @@ + + + + + + + + + + Tibetan Format Converter Design Document + + + +

Tibetan Format Converter Design Document

+ +

+ This document describes the design of a mechanism for converting from any of a number of representations of Tibetan+Roman text to any of a number of representations. This converter will store Tibetan+Roman text internally in an org.thdl.tib.text.TibetanDocument, and it will use an org.thdl.tib.text.TibetanKeyboard to populate that TibetanDocument. These two classes presently exist inside the Jskad application, but will be modified as needed so that servlets, console applications, and AWT/Swing-based applications can all make use of them. +

+ +

+ The difficulty lies in fault tolerance, reliability (DLC: address both verification AND validation), and speed. Of the three, speed will be of least concern. +

+ +

Input formats

+ +

+ The converter will support, in a modular fashion, mixed Tibetan + and Roman input in the following formats: +

+ + +

+ In addition, the converter will support, in a modular fashion, + strictly Tibetan input in the following formats: +

+ + + +

+ The converter will attempt to accept input that has minor flaws, but + it will also have a mode that rejects input with even the slightest + flaw. +
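+ The two modes described above can be sketched roughly as follows. This is only an illustration, not the converter's actual code: the class LenientStrictSketch, its Result holder, and the isFlaw test (which simply treats non-whitespace control characters as flaws) are all hypothetical stand-ins for whatever each real input format considers a minor flaw.

```java
import java.util.ArrayList;
import java.util.List;

public class LenientStrictSketch {
    /** Hypothetical result holder: the accepted text plus any warnings collected. */
    public static final class Result {
        public final String text;
        public final List<String> warnings;
        public Result(String text, List<String> warnings) {
            this.text = text;
            this.warnings = warnings;
        }
    }

    /** Placeholder flaw test (an assumption): non-whitespace control characters. */
    static boolean isFlaw(char c) {
        return Character.isISOControl(c) && !Character.isWhitespace(c);
    }

    /**
     * Lenient mode drops each flawed character and records a warning;
     * strict mode rejects the whole input at the first flaw.
     */
    public static Result read(String input, boolean strict) {
        StringBuilder out = new StringBuilder();
        List<String> warnings = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isFlaw(c)) {
                String msg = "unexpected character at offset " + i;
                if (strict) {
                    throw new IllegalArgumentException(msg);
                }
                warnings.add(msg);
                continue;
            }
            out.append(c);
        }
        return new Result(out.toString(), warnings);
    }

    public static void main(String[] args) {
        Result r = read("bod\u0007 skad", false);
        System.out.println(r.text + " (" + r.warnings.size() + " warning)");
    }
}
```

+ Keeping both modes in one code path, with a single switch, means the strict mode rejects exactly the inputs for which the lenient mode would have issued a warning.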

+ + +

Output formats

+ +

+ The converter will support, in a modular fashion, outputting a + TibetanDocument that is entirely Tibetan, entirely Roman, or a + mix of Tibetan and Roman, to the following output formats: +

+ + +

+ The converter will support, in a modular fashion, outputting a + TibetanDocument that contains only Tibetan and no Roman text + to the following additional output formats: +

+ + +
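+ One way to read "in a modular fashion" is a small plug-in registry, sketched below under stated assumptions. Nothing here is the real API: Run is a stand-in for the contents of a TibetanDocument, OutputFormat and its acceptsRoman method are hypothetical, and the two registered format names are demo placeholders rather than real output formats.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OutputRegistrySketch {
    /** Stand-in for one run of text in a document; NOT the real TibetanDocument. */
    public static final class Run {
        public final String text;
        public final boolean tibetan;
        public Run(String text, boolean tibetan) {
            this.text = text;
            this.tibetan = tibetan;
        }
    }

    /** Hypothetical plug-in interface: one implementation per output format. */
    public interface OutputFormat {
        boolean acceptsRoman();
        String write(List<Run> doc);
    }

    private static final Map<String, OutputFormat> FORMATS = new LinkedHashMap<>();

    public static void register(String name, OutputFormat format) {
        FORMATS.put(name, format);
    }

    /** Demo writer that just concatenates the runs; a real writer would emit RTF, XHTML, etc. */
    private static OutputFormat demoFormat(final boolean acceptsRoman) {
        return new OutputFormat() {
            public boolean acceptsRoman() { return acceptsRoman; }
            public String write(List<Run> doc) {
                StringBuilder sb = new StringBuilder();
                for (Run r : doc) sb.append(r.text);
                return sb.toString();
            }
        };
    }

    static {
        register("mixed-demo", demoFormat(true));
        register("tibetan-only-demo", demoFormat(false));
    }

    /** Looks up the format by name and refuses Roman text for Tibetan-only formats. */
    public static String convert(List<Run> doc, String formatName) {
        OutputFormat format = FORMATS.get(formatName);
        if (format == null) {
            throw new IllegalArgumentException("unknown format: " + formatName);
        }
        for (Run r : doc) {
            if (!r.tibetan && !format.acceptsRoman()) {
                throw new IllegalArgumentException(formatName + " cannot hold Roman text");
            }
        }
        return format.write(doc);
    }

    public static void main(String[] args) {
        List<Run> mixed = Arrays.asList(new Run("\u0F56\u0F7C\u0F51", true), new Run(" (bod)", false));
        System.out.println(convert(mixed, "mixed-demo"));
    }
}
```

+ With this shape, adding a format means registering one more implementation, and the Tibetan-only restriction is checked in one place rather than in every writer.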

+What formats am I missing? Please e-mail them to me. +

+ +

Advantages and Benefits

+ +

+ After this work item is completed, Jskad will be a powerful viewer + of the various input formats described above. +

+ +

+ Command-line tools will exist to convert between the supported formats. The most useful conversions will be to and from Unicode. This will allow long-term storage in a format that will exist for years, while still allowing day-to-day work on systems without support for rendering Unicode. +

+ +

+ In addition, it will be possible with a little extra work to use Jskad as an HTML source editor rather than Notepad. You will be able to save as the ugly, uneditable XHTML source that browsers can display, or preview the document in your system's default browser. +

+ +

+ Edward envisions a servlet that allows users to paste in, type in, + or upload Tibetan in their format of choice. This will be shown on + the left side of the web page. Upon identifying that format + (perhaps the servlet will make an educated guess, even), they can + then select any of our supported output formats and see the result + (and download at their leisure) on the right half of the web page. +
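+ As a very rough illustration of the "educated guess" mentioned above, the sketch below only separates Unicode Tibetan from ASCII-looking input by counting code points in the Unicode Tibetan block (U+0F00 to U+0FFF). The class name, the Guess values, and the majority thresholds are assumptions made for this example. It cannot tell Extended Wylie from ACIP, and it cannot recognize font-based encodings, so a real servlet would need per-format heuristics on top of it.

```java
public class FormatGuessSketch {
    /** Hypothetical guess categories; far coarser than the real list of input formats. */
    public enum Guess { UNICODE_TIBETAN, ROMAN_TRANSLITERATION, UNKNOWN }

    public static Guess guess(String input) {
        int tibetan = 0;
        int ascii = 0;
        int total = 0;
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // whitespace says nothing about the format
            }
            total++;
            if (Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.TIBETAN) {
                tibetan++;
            } else if (cp < 0x80) {
                ascii++;
            }
        }
        if (total == 0) {
            return Guess.UNKNOWN;
        }
        // Simple majority vote; the one-half threshold is an arbitrary assumption.
        if (tibetan * 2 > total) {
            return Guess.UNICODE_TIBETAN;
        }
        if (ascii * 2 > total) {
            return Guess.ROMAN_TRANSLITERATION;
        }
        return Guess.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(guess("\u0F56\u0F7C\u0F51"));
        System.out.println(guess("bod skad"));
    }
}
```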

+ +

Implementation Plan

+ +

+ To implement this converter, we will do the following: +

+
    +
  1. Have TibetanDocument output a dense XML document that adheres to the LetterByLetterTibetanAndRomanDocument DTD.
  2. Play with XSLT and use it where appropriate to create output.
  3. Get the keyboard input logic out of org.thdl.tib.input.DuffPane. At this point, it will be possible to programmatically simulate a human user at the keyboard. Automated tests that certain Tibetan keyboards are working correctly will be performed at this point, and these tests will work off the LetterByLetterTibetanAndRomanDocument that TibetanDocument was made to output above.
  4. Create a command-line tool to convert from ACIP or Extended Wylie to the currently supported output formats using Chandler's modified gengetopt-2.4 [dubbed 2.4j] for command-line parameter processing.
  5. Add "Save As [Unicode|Extended-Wylie|ACIP|XHTML|RTF(TMW)|RTF(SambhotaNew)|...]" options to Jskad.
  6. Code up Edward's servlet (described above).
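+ The first two steps of the plan, emitting letter-by-letter XML and then transforming it with XSLT, might look something like the sketch below. It uses the standard javax.xml.transform API. The XML shape is an assumption: the LetterByLetterTibetanAndRomanDocument DTD is not given here, so the doc and letter elements and the script attribute are invented for illustration, as is the toy stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    /** Hypothetical letter-by-letter XML; the real DTD may look quite different. */
    public static final String XML =
        "<doc>"
      + "<letter script=\"tibetan\">ba</letter>"
      + "<letter script=\"roman\">x</letter>"
      + "</doc>";

    /** Toy stylesheet: writes each letter as script:text; in plain-text output. */
    public static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<xsl:for-each select=\"letter\">"
      + "<xsl:value-of select=\"@script\"/>:<xsl:value-of select=\".\"/>;"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the result as a string. */
    public static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSLT));
    }
}
```

+ If the intermediate XML is kept dense and regular, each additional output format that XSLT can handle becomes one more stylesheet rather than more Java code.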
+ +

+DLC: address fault-tolerance etc. +

+ +

Things to think more about...

+ +

+ Things to think more about: +

+ + + + + + +
+ Please e-mail us your comments about this page. +
+ The THDL Tools project is generously hosted by: [SourceForge Logo] +