How It Works
This section is easier to follow if you have the source or the software installed. You must be familiar with the terms Vowel, Consonant, and Conjunct. In the context of IMLI, we define a conjunct as a combination of consonants.
After installation, open the generic.con, generic.vow, and generic.cnj files (found in the same directory as the application). These are plain text files that you can open in any text editor. They are the source for the data files used by the library, which are pre-processed and stored as binary for speed. The text files themselves are provided only for informational purposes (such as this :-) and are not used by the editor.
generic.con has three columns in the following format:
Consonant [spaces] Consonant Number [spaces] hexCode
The consonants of all supported languages are listed in their sorting order. The consonant number in the second column is a simple incremental decimal series. For example, ka is the first consonant and kha is the second. The generic.vow file provides the same information for vowels.
In generic.cnj you will find the consonants from generic.con (in the same order) on the LHS (of the ":"). On the RHS you will find a list of consonants that the LHS can combine with to produce a conjunct. The consonants within brackets provide second-level conjuncts. For example, the first line of generic.cnj is:
ka : ka , kha, cha, ja, jha, ta (ra), nna, tha (ya, yab, ra, va), ttha (na, ya, yab), da, na (ya, yab), pa, ba, ma (ya, yab), ya, yab, ra, la, va, ca, sa (tha, va), lla, nukhta (la, sa)
This means the consonant ka can combine with ka, kha, cha, ja, etc. The ta (ra) is to be interpreted as: ka combines with ta, and the conjunct formed by ka + ta combines with ra. Notice that the RHS entries also occur in the sorting order. Every entry on the LHS is limited to 64 entries on the RHS (more on this in the coding scheme, explained below).
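The reading of a generic.cnj line described above can be sketched in Python. This is only an illustration, not the library's own parser: the function names parse_cnj_line and conjunct_number are hypothetical, and the sketch assumes that every entry, bracketed or not, takes the next position in a single running count.

```python
import re

# First line of generic.cnj, as quoted above.
KA_LINE = (
    "ka : ka , kha, cha, ja, jha, ta (ra), nna, tha (ya, yab, ra, va), "
    "ttha (na, ya, yab), da, na (ya, yab), pa, ba, ma (ya, yab), ya, yab, "
    "ra, la, va, ca, sa (tha, va), lla, nukhta (la, sa)"
)

# One RHS item: a consonant name, optionally followed by "(a, b, ...)".
_ITEM = re.compile(r"([^\s,()]+)\s*(?:\(([^)]*)\))?")


def parse_cnj_line(line):
    """Return (lhs, chains), with chains listed in RHS order.

    Each chain is a tuple of the consonants that follow the LHS:
    ("ta",) for ka + ta, ("ta", "ra") for ka + ta + ra.
    """
    lhs, rhs = line.split(":", 1)
    lhs = lhs.strip()
    chains = []
    for match in _ITEM.finditer(rhs):
        first = match.group(1)
        chains.append((first,))
        if match.group(2) is not None:
            # Bracketed names are second-level conjuncts of `first`.
            for second in match.group(2).split(","):
                if second.strip():
                    chains.append((first, second.strip()))
    return lhs, chains


def conjunct_number(line, *following):
    """1-based position of the chain `following` in the flattened RHS."""
    _, chains = parse_cnj_line(line)
    return chains.index(tuple(following)) + 1


print(conjunct_number(KA_LINE, "da"))        # ka + da
print(conjunct_number(KA_LINE, "ta", "ra"))  # ka + ta + ra
```

For the ka line this counts 39 entries in all, within the limit of 64, and gives the same positions for ka + da and ka + ta + ra as the examples in the Tab files section.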
Tab files
Syllable to display mapping is done via a lookup table that is stored in a tab (for table) file. These are plain text files, with the caveat that the text on the right side of the = sign is in the local language and will appear garbled in English. To view the right side correctly, change the font of your editor to the font corresponding to the language (this may force the numbers and symbols on the LHS to appear in the local language, and the brackets may be replaced by a local language character, depending on the font you choose).
With that out of the way... The tab file is a linear representation of the generic.cnj file with additional information. The format of the entries in the tab file is
[C, Cj, V] = glyphs
where C is the consonant number (from generic.con), V is the vowel number (from generic.vow), and Cj is the conjunct number as derived from generic.cnj. Here are two examples to illustrate this:
The conjunct entry for ka + da is 18, since da appears in the 18th position in the ka list. Entries in brackets are counted in the same manner as the ones outside. In Devanagari (Hindi and Marathi) tab files, the entry looks like this:
[image: Devanagari tab file entry for ka + da]
The conjunct entry for ka + ta + ra is 7. Since ka has consonant position 1 in the generic.con file, the tab file entries for these two are [1,18,0] and [1,7,0]. In Devanagari tab files, the entry looks like this:
[image: Devanagari tab file entry for ka + ta + ra]
Entries [0,0,0] to [0,0,15] are representations of the vowels in their full form (as opposed to the matra form), listed in the same order in which they appear in generic.vow. Every entry of the form [n,0,0] is a full consonant, appearing in the same order as in generic.con. For every [C,Cj] available in generic.cnj, the matra forms of the vowels are appended (in the order of generic.vow) to obtain the [C,Cj,V] entries in the tab file.
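To make the entry layout concrete, here is a minimal Python sketch that reads one tab file line and classifies its key. It is not the library's reader. The names parse_tab_entry and entry_kind and the category labels are hypothetical, and the sketch assumes each entry sits on one line as "[C, Cj, V] = glyphs" with optional spaces. The glyph text is returned untouched, since its meaning depends on the language font.

```python
import re

# Assumed line shape: "[C, Cj, V] = glyphs", with optional spaces.
_ENTRY = re.compile(r"^\s*\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]\s*=\s*(.*)$")


def parse_tab_entry(line):
    """Split one tab file line into ((C, Cj, V), glyph_text)."""
    match = _ENTRY.match(line)
    if match is None:
        raise ValueError(f"not a tab entry: {line!r}")
    key = tuple(int(match.group(i)) for i in (1, 2, 3))
    return key, match.group(4)


def entry_kind(key):
    """Label a key using the rules described in the text above."""
    c, cj, v = key
    if c == 0 and cj == 0:
        return "full vowel"          # [0,0,V]
    if cj == 0 and v == 0:
        return "full consonant"      # [n,0,0]
    if cj == 0:
        return "consonant + matra"   # [n,0,V]
    return "conjunct" if v == 0 else "conjunct + matra"


# "GLYPHS" stands in for the language-specific text of a real entry.
key, glyphs = parse_tab_entry("[1,18,0] = GLYPHS")
print(key, entry_kind(key))
```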
Coding Scheme
Every entry in the tab file represents one syllable. To store each syllable we use 2 bytes, which are broken up as 6 bits for C (48 Consonants out of an available 64), 6 bits for Cj (64 conjunct entries for each C), and 4 bits for V (for 15 Vowels).
Caveats
When a consonant has more than 64 conjuncts, extra entries are added for that consonant to provide additional conjunct space. For example, we have added another entry for ra in the generic.cnj list, under rex (for Ra EXtra). The editor automatically switches between ra and rex depending on the consonant that follows. na is another consonant extended in the same way.
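The 2-byte layout described under Coding Scheme can be sketched in Python as follows. The text gives only the field widths (6 bits for C, 6 for Cj, 4 for V). Placing C in the highest bits and V in the lowest, and writing the two bytes in big-endian order, are assumptions made for illustration, as are the function names pack_syllable and unpack_syllable.

```python
def pack_syllable(c, cj, v):
    """Pack (C, Cj, V) into one 16-bit integer.

    Assumed layout, high to low: 6 bits C, 6 bits Cj, 4 bits V.
    """
    if not (0 <= c < 64 and 0 <= cj < 64 and 0 <= v < 16):
        raise ValueError("C and Cj must fit in 6 bits, V in 4 bits")
    return (c << 10) | (cj << 4) | v


def unpack_syllable(code):
    """Inverse of pack_syllable: return (C, Cj, V)."""
    return (code >> 10) & 0x3F, (code >> 4) & 0x3F, code & 0x0F


# The two examples from the Tab files section: [1,18,0] and [1,7,0].
for key in [(1, 18, 0), (1, 7, 0)]:
    code = pack_syllable(*key)
    raw = code.to_bytes(2, "big")  # byte order is an assumption
    print(key, hex(code), raw.hex(), unpack_syllable(code))
```

Because each conjunct field is only 6 bits wide, a consonant that needs more conjunct space cannot simply use larger Cj values, which is why ra is given the additional rex entry.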