diff --git a/htdocs/TibetanFormatConverterDesign.html b/htdocs/TibetanFormatConverterDesign.html
new file mode 100644
index 0000000..c09d69c
--- /dev/null
+++ b/htdocs/TibetanFormatConverterDesign.html
@@ -0,0 +1,293 @@
+
+
+
+
+
+
+
+
+
+ Tibetan Format Converter Design Document
+
+
+
+Tibetan Format Converter Design Document
+
+
+ This document describes the design of a mechanism for converting
+ Tibetan+Roman text from any of several input representations to any
+ of several output representations. The converter will store
+ Tibetan+Roman text internally in an
+ org.thdl.tib.text.TibetanDocument, and it will use an
+ org.thdl.tib.text.TibetanKeyboard to populate that TibetanDocument.
+ These two classes currently exist inside the Jskad application, but
+ they will be modified as needed so that servlets, console
+ applications, and AWT/Swing-based applications can all make use of them.
+
+
+
+ The difficult parts are fault tolerance, reliability (DLC: address
+ both verification AND validation), and speed. Of these, speed is of
+ least concern.
+
+
+Input formats
+
+
+ The converter will support, in a modular fashion, mixed Tibetan
+ and Roman input in the following formats:
+
+
+ -
+ An HTML file with embedded <tibetan
+ translit="extended-wylie">sgra</tibetan> tags (from the
+ SimpleTibetanAndRomanDocument DTD mentioned below)
+
+ -
+ Unicode (regardless of the order of consonants in a stack)
+
+ -
+ RTF for TibetanMachine
+
+ -
+ RTF for TibetanMachineWeb
+
+ -
+ RTF for Sambhota Old
+
+ -
+ RTF for Sambhota New
+
+ -
+ Edward and Than's XHTML
+
+
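+
+ As a rough illustration of the first input format above, the sketch
+ below pulls the tibetan elements and their translit attributes out
+ of a small document using the JDK's built-in DOM parser. It is only
+ a sketch under assumptions: the class and method names are
+ hypothetical, the sample markup around the tibetan tag is invented,
+ and the input is assumed to be well-formed XML, since the
+ SimpleTibetanAndRomanDocument DTD does not exist yet.
+

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of Jskad.
class TibetanRunExtractor {
    /**
     * Returns one "translit:text" string per <tibetan> element, in
     * document order. Assumes the input is well-formed XML.
     */
    static List<String> extractRuns(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("tibetan");
            List<String> runs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                runs.add(e.getAttribute("translit") + ":" + e.getTextContent());
            }
            return runs;
        } catch (Exception ex) {
            // Malformed input (or a parser setup failure) is reported
            // as an unchecked exception.
            throw new IllegalArgumentException("could not parse input", ex);
        }
    }

    public static void main(String[] args) {
        // Invented sample document wrapping the tag shown in the list above.
        String xml = "<html><body><p>Roman text, then "
            + "<tibetan translit=\"extended-wylie\">sgra</tibetan>"
            + ", then more Roman text.</p></body></html>";
        System.out.println(extractRuns(xml));
    }
}
```

+
+ A real input module would also have to keep the Roman text between
+ the tags, and would hand each run to a TibetanKeyboard rather than
+ returning strings.
+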
+
+
+ In addition, the converter will support, in a modular fashion,
+ strictly Tibetan input in the following formats:
+
+
+ -
+ Extended Wylie, ACIP, and any other format for which there
+ exists a Jskad keyboard (i.e., a .ini file in the desired
+ format). In practice, only ACIP and some Wylie variants are
+ used for storing Tibetan, but the mechanism is general. (This
+ input will be UTF-8 text with no metadata.)
+
+
+
+
+
+ The converter will attempt to accept input that has minor flaws, but
+ it will also have a mode that rejects input with even the slightest
+ flaw.
+
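+
+ One way to picture the two modes is a single scanning routine that
+ either records each flaw as a warning and carries on, or stops at
+ the first flaw. The sketch below is illustrative only: the class
+ name is hypothetical, and the rule for what counts as a "flaw" (any
+ character outside printable ASCII and newline) is an assumption made
+ for the example, not something this design has decided.
+

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a lenient mode and a strict mode.
class LenientStrictDemo {
    /** The cleaned text plus any warnings collected in lenient mode. */
    static final class Result {
        final String text;
        final List<String> warnings;
        Result(String text, List<String> warnings) {
            this.text = text;
            this.warnings = warnings;
        }
    }

    /**
     * Scans transliterated input. ASSUMPTION: a "flaw" is any character
     * that is neither printable ASCII nor a newline. In strict mode the
     * first flaw aborts the scan; in lenient mode the flawed character
     * is dropped and a warning is recorded.
     */
    static Result scan(String input, boolean strict) {
        StringBuilder out = new StringBuilder();
        List<String> warnings = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean ok = (c >= 0x20 && c < 0x7F) || c == '\n';
            if (ok) {
                out.append(c);
                continue;
            }
            String msg = String.format(
                "unexpected character U+%04X at offset %d", (int) c, i);
            if (strict) {
                throw new IllegalArgumentException(msg);
            }
            warnings.add(msg);
        }
        return new Result(out.toString(), warnings);
    }

    public static void main(String[] args) {
        Result r = scan("sgra\u0000 bla", false);
        System.out.println(r.text + " " + r.warnings);
    }
}
```

+
+ Keeping the warnings alongside the result lets a caller show the
+ user exactly what the lenient mode had to repair.
+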
+
+
+Output formats
+
+
+ The converter will support, in a modular fashion, outputting a
+ TibetanDocument that is entirely Tibetan, entirely Roman, or a
+ mix of Tibetan and Roman, to the following output formats:
+
+
+ -
+ A proprietary, not-very-well-thought-out XML file of David
+ Chandler's design. For ease of reference, let's say that this
+ will adhere to the LetterByLetterTibetanAndRomanDocument DTD.
+ This format is useful for testing the software, and also because
+ it can easily be transformed into as-yet-unthought-of output
+ formats.
+
+ -
+ Extended Wylie or ACIP, inside a trivial UTF-8 XML document that
+ describes the tool that produced the file and links to a
+ versioned DTD on the THDL web server. (Only these two are used,
+ but because the mechanism is general we could also generate
+ output in the TCC keyboard #1 "transliteration".)
+
+ -
+ Unicode (DLC: in which order for consonantal stacks? also,
+ normalized or not?)
+
+ -
+ RTF for TibetanMachine
+
+ -
+ RTF for TibetanMachineWeb
+
+ -
+ RTF for Sambhota Old
+
+ -
+ RTF for Sambhota New
+
+ -
+ Edward and Than's XHTML
+
+ -
+ A much leaner XML format that uses only a minimum of <tibetan
+ translit="acip | extended-wylie"> and <roman> tags. This will
+ conform to the SimpleTibetanAndRomanDocument DTD, which does not
+ exist yet.
+
+
+
+
+ The converter will support, in a modular fashion, outputting a
+ TibetanDocument that contains only Tibetan and no Roman text
+ to the following additional output formats:
+
+
+ -
+ Extended Wylie, ACIP, and any other format for which there
+ exists a Jskad keyboard (i.e., a .ini file in the desired
+ format). In practice, only ACIP and some Wylie variants are
+ used for storing Tibetan, but the mechanism is general. (This
+ output will be UTF-8 text with no metadata.)
+
+ -
+ Phonetic Tibetan (ACIP loose standard)
+
+ -
+ Phonetic Tibetan (THDL standard)
+
+
+
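+
+ The "modular fashion" described above could take the shape of a
+ registry of input modules and output modules that meet at one
+ internal document type, so that supporting a new format means
+ writing one module rather than one converter per pair of formats.
+ The following is a minimal sketch, not the planned implementation:
+ every name in it is hypothetical, and the Doc class is only a
+ stand-in for org.thdl.tib.text.TibetanDocument, whose real API is
+ not assumed here.
+

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a pluggable converter.
class ConverterSketch {
    /** Stand-in for TibetanDocument; it just holds Extended Wylie text. */
    static final class Doc {
        final String extendedWylie;
        Doc(String extendedWylie) { this.extendedWylie = extendedWylie; }
    }

    /** Parses one input format into the internal document. */
    interface InputModule { Doc read(String input); }

    /** Serializes the internal document to one output format. */
    interface OutputModule { String write(Doc doc); }

    private final Map<String, InputModule> inputs = new LinkedHashMap<>();
    private final Map<String, OutputModule> outputs = new LinkedHashMap<>();

    void registerInput(String format, InputModule m) { inputs.put(format, m); }
    void registerOutput(String format, OutputModule m) { outputs.put(format, m); }

    /** Converts by reading into a Doc and writing it back out. */
    String convert(String from, String to, String input) {
        InputModule in = inputs.get(from);
        OutputModule out = outputs.get(to);
        if (in == null) {
            throw new IllegalArgumentException("unsupported input format: " + from);
        }
        if (out == null) {
            throw new IllegalArgumentException("unsupported output format: " + to);
        }
        return out.write(in.read(input));
    }

    /** Builds a converter with toy modules; the format names are invented. */
    static ConverterSketch demo() {
        ConverterSketch c = new ConverterSketch();
        c.registerInput("extended-wylie", text -> new Doc(text));
        c.registerOutput("extended-wylie", doc -> doc.extendedWylie);
        // No XML escaping is done here; a real module would need it.
        c.registerOutput("simple-xml", doc ->
            "<tibetan translit=\"extended-wylie\">" + doc.extendedWylie + "</tibetan>");
        return c;
    }

    public static void main(String[] args) {
        System.out.println(demo().convert("extended-wylie", "simple-xml", "sgra"));
    }
}
```

+
+ With N input modules and M output modules, this arrangement needs
+ N + M pieces of code instead of N x M direct converters.
+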
+
+What formats am I missing? Please e-mail them to me.
+
+
+Advantages and Benefits
+
+
+ After this work item is completed, Jskad will be a powerful viewer
+ of the various input formats described above.
+
+
+
+ Command-line tools will exist to convert between the supported
+ formats. The most useful conversions will be to and from Unicode.
+ This will allow long-term storage in a format that will exist for
+ years, while still allowing day-to-day work on systems that cannot
+ render Unicode.
+
+
+
+ In addition, with a little extra work it will be possible to use
+ Jskad, rather than Notepad, as an HTML source editor. You will be
+ able to save the ugly, uneditable XHTML source that browsers can
+ display, or preview it in your system's default browser.
+
+
+
+ Edward envisions a servlet that allows users to paste in, type in,
+ or upload Tibetan in their format of choice. The input will be shown
+ on the left half of the web page. After identifying the input format
+ (perhaps the servlet will even make an educated guess), users can
+ select any of our supported output formats and see the result, which
+ they can download at their leisure, on the right half of the page.
+
+
+Implementation Plan
+
+
+ To implement this converter, we will do the following:
+
+
+ -
+ Have TibetanDocument output a dense XML document that adheres to
+ the LetterByLetterTibetanAndRomanDocument DTD.
+
+ -
+ Play with XSLT and use it where appropriate to create output.
+
+ -
+ Get the keyboard input logic out of org.thdl.tib.input.DuffPane.
+ Once that is done, it will be possible to programmatically
+ simulate a human user at the keyboard. At that point we will run
+ automated tests that verify that certain Tibetan keyboards work
+ correctly; these tests will work off the
+ LetterByLetterTibetanAndRomanDocument that TibetanDocument was
+ made to output in the first step.
+
+ -
+ Create a command-line tool that converts from ACIP or Extended
+ Wylie to the currently supported output formats. It will use
+ Chandler's modified gengetopt-2.4 (dubbed 2.4j) for command-line
+ parameter processing.
+
+ -
+ Add "Save As
+ [Unicode|Extended-Wylie|ACIP|XHTML|RTF(TMW)|RTF(SambhotaNew)|...]"
+ options to Jskad.
+
+ -
+ Code up Edward's servlet (described above).
+
+
+
+
+DLC: address fault-tolerance etc.
+
+
+Things to think more about...
+
+
+ Things to think more about:
+
+
+ -
+ Unicode normalization
+
+
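+
+ To make the normalization question concrete, the sketch below runs
+ two short Tibetan strings through the JDK's java.text.Normalizer.
+ The class name is hypothetical and the choice of sample strings is
+ ours. It relies on two facts from the Unicode Character Database:
+ U+0F43 TIBETAN LETTER GHA has a canonical decomposition (U+0F42
+ U+0FB7) and is excluded from recomposition, and the vowel signs
+ U+0F72 and U+0F74 carry different non-zero combining classes, so
+ normalization reorders them.
+

```java
import java.text.Normalizer;

// Hypothetical demo class; only java.text.Normalizer is a real API here.
class TibetanNormalizationDemo {
    /** Formats each UTF-16 unit as U+XXXX (the Tibetan block is in the BMP). */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // Precomposed GHA: both forms yield GA + SUBJOINED HA, because
        // U+0F43 is excluded from recomposition.
        String gha = "\u0F43";
        System.out.println("NFD: " + codePoints(nfd(gha)));
        System.out.println("NFC: " + codePoints(nfc(gha)));

        // KA followed by vowel sign U (U+0F74), then vowel sign I (U+0F72):
        // canonical ordering puts the lower combining class first.
        String twoVowels = "\u0F40\u0F74\u0F72";
        System.out.println("NFC: " + codePoints(nfc(twoVowels)));
    }
}
```

+
+ Subjoined consonants have combining class 0, so normalization
+ leaves their order alone; the question of consonant order within a
+ stack has to be settled separately from the question of whether to
+ normalize.
+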
+
+
+
+
+
+
+
+
+ Please
+
+
+ e-mail us
+
+ your comments about this page.
+
+
+
+
+The
+
+
+ THDL Tools
+
+project is generously hosted by:
+
+
+
+
+
+
+
+
+
+