Jan 9, 2010

Palm Leaves to OSX; problems of Pali Fonts

Pali was originally a spoken language only, and was not committed to writing until several hundred years after the Buddha's time. During the Buddha's own lifetime writing was, in India, a fairly recent technological innovation and was used only for practical purposes such as commercial and diplomatic messages. It was still considered improper to use such a vulgar medium for religious texts.

So, Pali has no written alphabet of its own. The language has, by one count, 32 consonant and 8 vowel sounds. The consonants are organized in a logical fashion in a grid according to how they are sounded; whether aspirated or not and where the tongue is placed in the mouth. This is very different from the Roman alphabet used in English and other Western European languages, but is a system widely used in South and South-East Asian alphabets. (Some readers may be familiar with a similar system adopted by J.R.R. Tolkien for his imaginary Elvish languages. Tolkien was, after all, a linguist.)

In the traditional Theravada countries, Pali is easily rendered into the local alphabets and there are Sinhala, Thai and Burmese editions of the Tipitaka. Pali was not rendered into Roman until the nineteenth century when German and English scholars began to take an interest in the scriptures of Theravada Buddhism. A problem arose immediately in that the Roman alphabet does not have enough letters to render each Pali sound.

This was solved in two ways. First, the aspirated versions of several consonants were rendered by adding an "h." Thus bh, kh, etc. each represent only one letter in Pali. "Buddha" has five letters, not six, in Pali. This is a reasonable compromise and only causes confusion to those not familiar with Pali orthography; hence we see common misspellings such as "Bhudda."
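The "h" convention can be made concrete with a small Python sketch. This is only an illustration: the function name `pali_letters` is made up here, and the code assumes that any stop immediately followed by "h" is an aspirate, which is a simplification.

```python
import re
import unicodedata

# The ten aspirated stops. Each is ONE Pali letter written with two Roman ones.
# \u1e6d is t-with-dot-below and \u1e0d is d-with-dot-below.
ASPIRATES = ("kh", "gh", "ch", "jh", "\u1e6dh", "\u1e0dh", "th", "dh", "ph", "bh")

# Try the two-character aspirates first, then fall back to any single letter.
_LETTER = re.compile("|".join(ASPIRATES) + r"|\w")


def pali_letters(word):
    """Split a romanized Pali word into its Pali letters (a rough sketch)."""
    text = unicodedata.normalize("NFC", word.lower())
    return _LETTER.findall(text)


print(pali_letters("bhikkhu"))  # ['bh', 'i', 'k', 'kh', 'u']
print(pali_letters("dhamma"))   # ['dh', 'a', 'm', 'm', 'a']
```

Both sample words are seven or six Roman characters long but only five Pali letters.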

The other method adopted was the addition of diacritical marks. Pali vowels are relatively simple; there are five basic vowel sounds which occur as either "long" or "short." The length of a vowel does not change its basic sound, but only the time it is held, and is mostly important for metrical purposes in verse. The long vowels are indicated by a macron (dash) over the letter: ā ī ū

Several of the consonants have a "retroflex" version, a sound not familiar to English speakers. It is made by curling the tongue back in the mouth. This is indicated by a dot placed under the letter: ḍ ṭ There is also a special version of n, pronounced like the "ny" in the English "canyon," which is indicated by a tilde (a mark like a sine wave) over the n, as in Spanish: ñ


This leaves one very special sound in Pali to be rendered. That is the "pure nasal" or, in Pali, the "niggahita," which nasalizes the preceding vowel. It is not really a sound on its own, but roughly it is like a terminal "ng" as in English "ring." There is a lot of typographical confusion over this letter in Roman Pali. Nowadays it is most commonly indicated by an "m" with a dot underneath, but in many older books one will see a funny "n" with a curly tail, or an "m" with a dot over it, or even an "n" with a dot over it. ṃ ŋ

When books were still printed with moveable type, special letters would have to be cast for the diacriticals. If a page with Pali words was produced on a typewriter, the marks would have to be added by hand.

This was the case for the early pioneering editions of the Pali texts produced in Roman fonts by the Pali Text Society. That august body still prints from photo-engraved plates based on the original, hence their editions usually have a longish insert of "errata" since it is impossible to correct minor faults in the original.

The original Roman Pali was produced by painstaking scholarship, comparing word by word the Sinhala, Siamese and Burmese versions; footnotes indicated any variation between the three. This, of course, was done at a time when computer technology was no more than a twinkle in Charles Babbage's eye.

Fast forward to the 1980s and the dawn of the modern computer age with its promise of a paperless office, expanded leisure time and easy-to-use Pali fonts. (Not so much for any of it.) My own first computer was a Commodore 128. For the information of the younger set, this was a primitive device with no hard drive, a black-and-white low-res monitor and packed with 128 kilobytes of RAM. The word "font" was not yet known outside of professional typographical circles. The word processor had one bit-mapped typeface for general use but it did come with an alternate to be used for typing in French, which included the various accented vowels for that language.

I needed to be able to produce Pali letters so I copied the French type-set, hacked the machine-code for the bit-mapped letters and put the most common Pali diacritical letters in place of the French accented vowels. I was able to type Pali because the poor Commodore thought it was speaking French. (Oddly, I miss that machine.)

Come the nineties (remember them?) and the computer revolution shifted into second or third gear. I started using a Mac (System 6) and cobbled together my own PostScript Pali fonts using a programme called Fontographer. After something called the Internets became a wildly popular fad, more and more Pali fonts started to become available.

The problem now was one sadly familiar to computer users in those days: lack of standards. Each font had its own unique keymap. A document produced using MyNorman would not print properly in LeedsBitPali. Conversions required a lot of tedious search-and-replace. Worse, fonts and keymaps did not translate well across platforms. Changes in software eventually made my own Pali fonts obsolete. Sometimes cumbersome work-arounds had to be employed. I once produced a Pali chanting book using Word macros. The resulting file was huge and it slowed the computer to a crawl just attempting to scroll through the pages.
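To show what that search-and-replace amounted to, here is a minimal Python sketch. The two slot tables are invented for illustration and are not the real keymaps of MyNorman, LeedsBitPali or any other font; the function name `legacy_to_unicode` is likewise hypothetical. The idea is simply that each legacy font displayed a Pali letter in the slot of some ordinary accented character, and each font chose different slots.

```python
# Hypothetical font A: the slots for circumflex vowels display the long vowels.
FONT_A_SLOTS = {
    "\u00e2": "\u0101",  # slot of a-circumflex shows a-macron
    "\u00ee": "\u012b",  # slot of i-circumflex shows i-macron
    "\u00fb": "\u016b",  # slot of u-circumflex shows u-macron
}

# Hypothetical font B: the same letters, but parked in the grave-accent slots.
FONT_B_SLOTS = {
    "\u00e0": "\u0101",
    "\u00ec": "\u012b",
    "\u00f9": "\u016b",
}


def legacy_to_unicode(text, slots):
    """Swap each font-specific slot for the real Unicode letter."""
    return text.translate(str.maketrans(slots))


doc_a = "nibb\u00e2na"  # "nibbana" with a long a, as stored for font A
doc_b = "nibb\u00e0na"  # the same word as stored for font B

# The stored text differs, yet both convert to the same Unicode string.
print(doc_a == doc_b)                                  # False
print(legacy_to_unicode(doc_a, FONT_A_SLOTS)
      == legacy_to_unicode(doc_b, FONT_B_SLOTS))       # True
```

Opening a font A document in font B without such a conversion would show the wrong accents, which is exactly the printing problem described above.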

But now we are at the dawn of a new era. Finally, it is getting easy to use not only Pali but almost any alphabet, thanks to Unicode. This is an expansion of the old ASCII idea; each character has a unique, universally agreed hexadecimal code. The original ASCII standard was limited to 128 characters, or 256 in its common 8-bit extensions. The new Unicode, by adding a few digits, increases the potential to over a million characters. Of course, not every font will have every known character, but as long as developers adhere to the standard (hah! I'm talking to you, Bill Gates) we should be guaranteed that every font which has a Pali retroflex "d" will have it in the same place, i.e. use the same hexadecimal coding.
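For the curious, the codes themselves are easy to inspect. The following Python sketch uses only the standard `unicodedata` module; the helper name `describe` is made up for this example.

```python
import unicodedata

# The Pali letters discussed above, written as escapes so the source is unambiguous:
# a-macron, i-macron, u-macron, d-dot-below, t-dot-below, n-tilde, m-dot-below
PALI_LETTERS = "\u0101\u012b\u016b\u1e0d\u1e6d\u00f1\u1e43"


def describe(ch):
    """Return the hexadecimal code and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in PALI_LETTERS:
    print(ch, describe(ch))

# The retroflex "d" is the same code in every Unicode font:
print(describe("\u1e0d"))  # U+1E0D LATIN SMALL LETTER D WITH DOT BELOW
```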

So documents produced in Unicode Helvetica on a Mac should be readable in Unicode Arial on a Windows box. And everything should display properly in a browser window. Let's see if it works; I'm going to type the Pali word for "Consciousness" which uses several diacriticals; how does it display in your browser window?

viññāṇā
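One wrinkle worth knowing about: Unicode allows an accented letter to be stored either as a single precomposed character or as a plain letter followed by a separate combining mark. The two look identical on screen but are different sequences of codes, so searching and comparing can fail unless the text is normalized first. A small Python sketch, with the helper name `same_text` invented for the example:

```python
import unicodedata

precomposed = "\u0101"   # long a as one code: LATIN SMALL LETTER A WITH MACRON
decomposed = "a\u0304"   # plain "a" followed by COMBINING MACRON


def same_text(a, b):
    """Compare two strings after normalizing both to the composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(precomposed == decomposed)           # False: different code sequences
print(same_text(precomposed, decomposed))  # True: same letter once normalized
```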

This was especially easy for me to do in Mac OSX using a freeware application called Ukelele which lets me define my own custom keymap. So, I have a home-made Pali keymap which puts, for example, the long-a under option-a. Because of Unicode, this doesn't matter at the other end because the hex code for the letter remains unchanged! When the Unicode standard becomes really universal we'll finally have reached the same ease of use for Pali letters as scratching on palm-leaves.
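The separation between the keymap and the stored text can be sketched in a few lines of Python. This is not how Ukelele stores its layouts; the two layout tables, the "altgr" modifier and the function `type_keys` are all hypothetical, chosen only to show that different keystrokes can yield the identical Unicode text.

```python
# Each keystroke is a (modifier, key) pair; modifier is None for a plain key.
MY_LAYOUT = {
    ("option", "a"): "\u0101",  # long a
    ("option", "i"): "\u012b",  # long i
    ("option", "u"): "\u016b",  # long u
}

# Someone else's imaginary layout puts the same letters on a different modifier.
OTHER_LAYOUT = {
    ("altgr", "a"): "\u0101",
    ("altgr", "i"): "\u012b",
    ("altgr", "u"): "\u016b",
}


def type_keys(layout, strokes):
    """Turn keystrokes into text; unmapped strokes produce the plain key."""
    return "".join(layout.get((mod, key), key) for mod, key in strokes)


def word(modifier):
    """Keystrokes for "nibbana" with a long second-last vowel."""
    plain = [(None, k) for k in "nibb"]
    return plain + [(modifier, "a"), (None, "n"), (None, "a")]


mine = type_keys(MY_LAYOUT, word("option"))
theirs = type_keys(OTHER_LAYOUT, word("altgr"))
print(mine == theirs)  # True: the stored codes do not depend on the keymap
```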

Paperless office and expanded leisure time coming next...

14 comments:

Jayarava said...

For PC the equivalent app is Microsoft Keyboard Layout Creator - a freebie from the M$ website. I have three Unicode keyboards set up and can toggle between them:

roman - āīūṛṝḷḹṃḥṭḍṇṅñ
देवनागरी
དྦུ་ཅན། - though this one needs work.

karuna.murti said...

Bhante, Bhante Pesala also provides several freeware Unicode fonts and keyboard layouts for Windows on his site: aimwell.org

Mushinronsha said...

Bhante, if ñ is pronounced like the Spanish ñ (ny), how is a double ññ (as in viññāṇā) pronounced? I've seen this written before but I never had the foggiest as to how it's pronounced (I'm guessing like a long nny). Just thought I'd take the opportunity to ask.

Oh and by the way, the above comment by 黑色皮包 is an advertisement for pornography. Surely nothing good could come from advertising that on a Bhikkhu's blog! :-)

Ajahn Punnadhammo said...

Hello Mushinronsha

Double consonants in Pali are always pronounced discretely. Example from English, as the "dd" in "mad-dog" not blurred together as in "middle."

So viññana is three syllables;
Viñ - ña - na and both inflected n's are distinctly pronounced.

PS - Japanese porno ads deleted, thanks.

Mushinronsha said...

Thank you for clearing that up, Bhante.
I think I'll need quite a bit of practice with Pali pronunciation. I've tried to say viññana to the best of my ability, but it sounds as if I've got a cleft palate!

Ven. Jo Jo said...

Wonderful article. It brought back many frustrating memories of dealing with fonts in the past.
I also use a Mac, but had never heard of Ukelele. I glanced at it; interesting. Are you aware you can use the U.S. Extended keyboard layout and easily type the Pāli diacritical marks that way, straight from the keyboard?
A useful bit of information that I felt was left out was the use of OpenType fonts like Times New Roman as opposed to TrueType fonts like Times. This also leads into the use of PDFs and how they embed the fonts, as long as they're OpenType, directly into the document, which ensures it's readable on any computer system now and into the future.
Keep up the good work.

vartika said...

Bhante, I wanted to get this Buddhist Pali incantation tattooed on my 20th birthday. I have the same in Cambodian script but I wanted it in the original Pali.
It goes like this:
May your enemies run far away
from you.
If you acquire riches,
may they remain yours always.
Your beauty will be that of
Apsara.
Wherever you may go,
many will attend, serve
and protect you,
surrounding you on all sides.
Can you please translate that for me into Pali text? It would be of great help. Thank you and God bless.

Veer Singh said...

Hi. There cannot be just one method sufficient for all human beings. To flush out repressions and pressures of the mind, methods are required which bring catharsis. Learn lots of interesting meditation techniques.

MLEOW said...

Hi Ajahn, I just found your blog after reading your book Mara's Letter and I am glad to find someone else who is a "contemporary Buddhist" yet doesn't automatically pay lip service to the modern world and its myths, or anti-myths if you will. I am struggling to attach myself to the Theravadin tradition and Sangha just because in its current state it seems so degenerated. I guess the saying "But if for company you cannot find a wise and prudent friend, one who leads a good life, then, like a king who leaves behind a conquered kingdom or a lone elephant in the elephant forest, you should go your own way alone" is more valid now than ever... Though I am only familiar with the Western Sangha, does it differ in some respects from the Sangha found in the East (Thailand, Sri Lanka, etc.)?
And another question: have you read Julius Evola's book "The Doctrine of Awakening"? If so, what is your opinion of it? Personally I think it offers a realistic and unique point of view concerning the way and method of original Buddhism, certainly a far cry from every modern exegesis in the likes of Batchelor, for example. As you may know, it was translated by Nanavira Thera and supposedly led to his renunciation.

Anyway, you have one more blog reader! And please excuse my poor English grammar; I'm from Sweden (if that is valid as an excuse for a lazy education).
/Martin.

www.Siamese-Dream.Com said...

Hello and Sawasdee Kha Ajahn:

So in the Pali language the vowels are like Thai vowels, and a long vowel lasts twice as long as a short vowel?

Thank you and Sawasdee Kha

Srisuda Hongthai

Boink Blupp said...

Dhamma Greetings,

This blog looks deserted, but hopefully someone is listening. I use a pāḷi layout for Windows, but I am missing the ŋ. There is ṅṇṃ, but no ŋ.

I understand that it's basically all the same, but does anyone know a way to make up one's own new keyboard layout?

Much Metta,
Mirco

EDIT: Hadn't seen Jayarava's comment - I will try my luck with the MS Keyboard Layout Creator