diff --git a/source/org/thdl/tib/scanner/AcipToWylie.java b/source/org/thdl/tib/scanner/AcipToWylie.java index f7f1485..659685d 100644 --- a/source/org/thdl/tib/scanner/AcipToWylie.java +++ b/source/org/thdl/tib/scanner/AcipToWylie.java @@ -21,8 +21,20 @@ package org.thdl.tib.scanner; import java.net.*; import java.io.*; -/** Provides interfase to convert from tibetan text transliterated in - the Acip scheme to THDL's extended wylie scheme. +/** Provides an interface to convert from Tibetan text transliterated in the Acip scheme to THDL's Extended Wylie scheme. + +

If no arguments are given, it takes the Acip text from the standard input and sends the +Wylie text to the standard output. If one argument is given, it interprets it as the +file name for the input. If two arguments are given, it interprets the first one as the file name for the input and +the second one as the file name for the output. For example, the following +command converts lam-rim-chen-mo.act, storing the results in +lam-rim-chen-mo.txt:

+
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.AcipToWylie lam-rim-chen-mo.act lam-rim-chen-mo.txt
+

Alternatively, by redirecting the standard input and output, you can perform the same +job:

+
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.AcipToWylie < lam-rim-chen-mo.act > lam-rim-chen-mo.txt
+

If you only want to display the results to the screen, you can run:

+
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.AcipToWylie lam-rim-chen-mo.act | more
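
The zero, one, and two argument behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in AcipToWylie.java: the class name ArgDispatchSketch, its helper methods, the pass-through convertLine placeholder, and the choice of UTF-8 are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the argument handling; not the real AcipToWylie. */
class ArgDispatchSketch {

    /** First argument, if present, names the input file; otherwise read standard input. */
    static BufferedReader openInput(String[] args) throws IOException {
        InputStream in = args.length >= 1 ? new FileInputStream(args[0]) : System.in;
        // UTF-8 is an assumption; the real tool may use the platform default.
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Second argument, if present, names the output file; otherwise write standard output. */
    static PrintWriter openOutput(String[] args) throws IOException {
        OutputStream out = args.length >= 2 ? new FileOutputStream(args[1]) : System.out;
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }

    /** Placeholder for the Acip-to-Wylie conversion; here it returns the line unchanged. */
    static String convertLine(String line) {
        return line;
    }

    static void run(String[] args) throws IOException {
        BufferedReader in = openInput(args);
        PrintWriter out = openOutput(args);
        try {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(convertLine(line));
            }
        } finally {
            out.flush();
            // Close only the streams this method opened, never System.in/System.out.
            if (args.length >= 1) in.close();
            if (args.length >= 2) out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        run(args);
    }
}
```

Because the fallback is the standard streams, the file-name form and the redirection form shown above go through the same loop.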
@author Andrés Montano Pellegrini @see WindowScannerFilter diff --git a/source/org/thdl/tib/scanner/BinaryFileGenerator.java b/source/org/thdl/tib/scanner/BinaryFileGenerator.java index 33d3ac4..68ae631 100644 --- a/source/org/thdl/tib/scanner/BinaryFileGenerator.java +++ b/source/org/thdl/tib/scanner/BinaryFileGenerator.java @@ -23,8 +23,121 @@ import java.io.*; into a binary file tree structure format, to be used by some implementations of the SyllableListTree. -

The text files must be in the format used by the - The Rangjung Yeshe Tibetan-English Dictionary of Buddhist Culture.

+

Syntax (Dictionary files are assumed to be .txt. Don't include extensions!):

+

-delimiter

@author Andrés Montano Pellegrini @see SyllableListTree diff --git a/source/org/thdl/tib/scanner/ConsoleScannerFilter.java b/source/org/thdl/tib/scanner/ConsoleScannerFilter.java index a16a209..f40a3ad 100644 --- a/source/org/thdl/tib/scanner/ConsoleScannerFilter.java +++ b/source/org/thdl/tib/scanner/ConsoleScannerFilter.java @@ -23,7 +23,12 @@ import java.util.*; /** Inputs a Tibetan text and displays the words with their definitions through the console over a shell. Use when no - graphical interfase is supported or for batch processes. + graphical interfase is supported or for batch processes. For instance:

+
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.ConsoleScannerFilter ry-dic99
+

It reads from the standard input and prints the results to the + standard output. For example, if you want to parse a text stored in puja.txt + and save the results in puja_words.txt, you can run the command:

+
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.ConsoleScannerFilter ry-dic99 < puja.txt > puja_words.txt
@author Andrés Montano Pellegrini */ diff --git a/source/org/thdl/tib/scanner/WindowScannerFilter.java b/source/org/thdl/tib/scanner/WindowScannerFilter.java index 9858008..b6e76a7 100644 --- a/source/org/thdl/tib/scanner/WindowScannerFilter.java +++ b/source/org/thdl/tib/scanner/WindowScannerFilter.java @@ -33,7 +33,12 @@ import org.thdl.tib.input.DuffPane; Tibetan script) and displays the words (Roman or Tibetan script) with their definitions. Works without Tibetan script in platforms that don't support Swing. Can access dictionaries stored - locally or remotely. + locally or remotely. For example, to access the public dictionary database run the command:

+
java -jar DictionarySearchStandalone.jar http://iris.lib.virginia.edu/tibetan/servlet/org.thdl.tib.scanner.RemoteScannerFilter
+

If the JRE you installed does not support Swing classes but supports + + AWT (as is the case with the JRE for handhelds), run the command:

+
java -jar DictionarySearchHandheld.jar -simple ry-dic99
@author Andrés Montano Pellegrini */ diff --git a/source/org/thdl/tib/scanner/package.html b/source/org/thdl/tib/scanner/package.html index bf7b40e..0d87955 100644 --- a/source/org/thdl/tib/scanner/package.html +++ b/source/org/thdl/tib/scanner/package.html @@ -8,14 +8,134 @@ --> -Provides classes and methods for translating Tibetan text to English. -

-Right now, this package scans Tibetan text, but we aim to make it parse Tibetan text. -

-Author: Andrés Montano Pellegrini +Provides the classes to take Tibetan language passages and divide the passages up +into their component phrases and words, and display corresponding dictionary definitions. +

This tool helps Tibetan-to-English translators partially automate the +translation process. In Tibetan, word boundaries are not marked +the way spaces mark them in English. Instead, +a punctuation mark called a "tsheg" separates each syllable. Thus, while syllable boundaries are explicit, word boundaries are +often unclear. One of the main +difficulties beginning students have with translating Tibetan texts is therefore +figuring out where each word ends and the next begins, and determining which +series of syllables to look up in the dictionary as either a single +word or a larger compound phrase. This +entails a time-consuming process of looking up multiple combinations of +syllables to determine which are found in a given dictionary.

+

The tool partially automates that process by +breaking up a sentence or paragraph entered in + Extended Wylie or Tibetan script +into the longest component parts it can find in multiple dictionary databases. +Then, for each component part found, it displays the stored definitions and +relevant information. This +often yields only the definition of a long phrase rather than those of its component +words, but one can also look up the syllables of that phrase one by one.
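
Breaking a passage into the biggest parts found in a dictionary can be illustrated with a greedy longest-match pass over the syllables. The sketch below is a simplification under stated assumptions, not this package's implementation: it uses a plain HashSet where the package uses SyllableListTree implementations, it assumes syllables are separated by single spaces as in Extended Wylie input, and the names LongestMatchSketch, segment, and maxLen are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical greedy longest-match segmenter; not the package's scanner. */
class LongestMatchSketch {

    /**
     * Splits the syllables into the longest runs found in the dictionary.
     * A syllable that starts no dictionary entry is emitted on its own.
     */
    static List<String> segment(List<String> syllables, Set<String> dictionary, int maxLen) {
        List<String> parts = new ArrayList<>();
        int i = 0;
        while (i < syllables.size()) {
            int matched = 1; // fall back to a single syllable
            // Try the longest candidate first and shrink until one is found.
            for (int len = Math.min(maxLen, syllables.size() - i); len >= 1; len--) {
                String candidate = String.join(" ", syllables.subList(i, i + len));
                if (dictionary.contains(candidate)) {
                    matched = len;
                    break;
                }
            }
            parts.add(String.join(" ", syllables.subList(i, i + matched)));
            i += matched;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Toy dictionary; the entries are chosen only for illustration.
        Set<String> dictionary = new HashSet<>(
                Arrays.asList("byang chub", "byang chub sems dpa'", "chen po"));
        List<String> syllables = Arrays.asList("byang chub sems dpa' chen po".split(" "));
        System.out.println(segment(syllables, dictionary, 4));
        // prints [byang chub sems dpa', chen po]
    }
}
```

The example shows the trade-off noted above: the longer entry "byang chub sems dpa'" wins over "byang chub", so the shorter entry's definition is reached only by looking up those syllables separately.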

+

The tool can run on-line through a:

+ +

The tool can also run off-line in:

+ + +

The classes designed to be run from the command-line are:

+ + + +

Notes on Input:

+ + +

Author: Andrés Montano Pellegrini

Related Documentation

@see org.thdl.tib.text @see org.thdl.tib.input - + \ No newline at end of file