Pali script transliteration for Python. Converts between Thai Pali script, Sinhala script, and IAST (International Alphabet of Sanskrit Transliteration, i.e. romanization), three of the scripts commonly used for Pali in Theravada Buddhist texts, including the Pali Canon (Tipitaka).
Useful for Pali scholars, Buddhist text digitization projects (SuttaCentral, bilara, CSCD), NLP researchers working with Pali, and anyone building tools for Theravada texts.
- Zero dependencies: stdlib only, runs anywhere Python 3.10+ does
- Single file: drop `paliscript.py` into your project, or `pip install paliscript`
- Bidirectional: every conversion round-trips (Thai ↔ IAST ↔ Sinhala)
- 190 tests, including cross-script sutta phrase verification against bilara-data
Don't want to install anything? Use the free online transliteration tool at rianthai.pro/pali/transliteration — it is powered by paliscript and lets you convert between Thai Pali, Sinhala, and IAST directly in the browser.
```shell
pip install paliscript
```

Or just copy `paliscript.py`; it's a single standalone file.
Note: Examples below use `aspiration=AspirationStyle.DIGRAPH` for readability. The default output uses dotted-H (e.g. `dḣammā`) for unambiguous round-tripping. See Aspiration styles.
```python
from paliscript import to_iast, to_thai, sinhala_to_iast, iast_to_sinhala, AspirationStyle

# Thai Pali → IAST romanization
to_iast("กุสลา ธมฺมา", aspiration=AspirationStyle.DIGRAPH)  # → "kusalā dhammā"
to_iast("พุทฺโธ", aspiration=AspirationStyle.DIGRAPH)  # → "buddho"

# Sinhala → IAST romanization
sinhala_to_iast("කුසලා ධම්මා", aspiration=AspirationStyle.DIGRAPH)  # → "kusalā dhammā"

# IAST → Sinhala
iast_to_sinhala("mettā", aspiration=AspirationStyle.DIGRAPH)  # → "මෙත්තා"

# IAST → Thai Pali
to_thai("nibbāna", aspiration=AspirationStyle.DIGRAPH)  # → "นิพฺพาน"

# Cross-script via IAST pivot: Thai → IAST → Sinhala
iast = to_iast("พุทฺโธ", aspiration=AspirationStyle.DIGRAPH)  # Thai → IAST
iast_to_sinhala(iast, aspiration=AspirationStyle.DIGRAPH)  # IAST → Sinhala: "බුද්ධො"
```

Dotted-H is the default because it is unambiguous and safe for round-tripping:
```python
to_iast("ธมฺมา")  # "dḣammā" (default: dotted-H)
to_iast("ธมฺมา", aspiration=AspirationStyle.DIGRAPH)  # "dhammā" (traditional digraph)
```

From the command line:

```shell
paliscript --to-iast "กุสลา ธมฺมา"
# kusalā dḣammā

paliscript --to-iast --aspiration digraph "กุสลา ธมฺมา"
# kusalā dhammā

paliscript --to-iast --script sinhala "කුසලා ධම්මා"
# kusalā dḣammā

paliscript --from-iast --script sinhala "kusalā dḣammā"
# කුසලා ධම්මා

paliscript --to-thai "kusalā dhammā" --aspiration digraph
# กุสลา ธมฺมา

echo "เมตฺตา" | paliscript --to-iast
# mettā
```

It also works standalone, without installing: `python paliscript.py --to-iast "เมตฺตา"`
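The round-trip property mentioned above can be checked mechanically on your own corpus. The helper below is a minimal sketch and not part of paliscript: `failed_round_trips` is a hypothetical name, and it takes the forward and backward converters as arguments so that it runs without the library installed. The two stand-in converters are placeholders only; with paliscript available, `to_iast` and `to_thai` could be passed in their place.

```python
from typing import Callable, Iterable

Converter = Callable[[str], str]


def failed_round_trips(
    samples: Iterable[str], forward: Converter, backward: Converter
) -> list[str]:
    """Return every sample for which backward(forward(sample)) != sample."""
    return [s for s in samples if backward(forward(s)) != s]


# Stand-in converters (assumption: placeholders for real script converters).
def to_upper(text: str) -> str:
    return text.upper()


def to_lower(text: str) -> str:
    return text.lower()


# All-lowercase input survives upper -> lower, so nothing is reported.
print(failed_round_trips(["mettā", "nibbāna"], to_upper, to_lower))  # prints []

# A capitalised sample does not survive, so it is reported as a failure.
print(failed_round_trips(["Buddho"], to_upper, to_lower))  # prints ['Buddho']
```

Returning the failing samples rather than a single boolean makes it easier to see which words break when a conversion table is edited.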
Aspirated consonants have two IAST representations:
| Style | Example | Notes |
|---|---|---|
| Dotted-H (default) | kḣ, dḣ, bḣ | Unambiguous — each aspirate is one token |
| Digraph | kh, dh, bh | Traditional, but ambiguous with standalone h |
If you are exchanging text with bilara-data, SuttaCentral exports, or standard Pali dictionaries, use AspirationStyle.DIGRAPH.
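To make the ambiguity concrete, here is a minimal stdlib-only sketch of collapsing dotted-H text to digraphs. It is not the library's implementation, and `dotted_to_digraph` is a hypothetical helper name. It assumes the dotted-H aspirate marker is `ḣ` (U+1E23, or `h` followed by U+0307 before normalization).

```python
import unicodedata

DOTTED_H_LOWER = "\u1e23"  # ḣ LATIN SMALL LETTER H WITH DOT ABOVE
DOTTED_H_UPPER = "\u1e22"  # Ḣ LATIN CAPITAL LETTER H WITH DOT ABOVE


def dotted_to_digraph(text: str) -> str:
    """Rewrite dotted-H aspirates (kḣ, dḣ, bḣ) as plain digraphs (kh, dh, bh)."""
    # NFC composes "h" + U+0307 into U+1E23, so both input forms are handled.
    # Only ḣ/Ḣ are replaced; other dotted letters such as ṅ are left alone.
    text = unicodedata.normalize("NFC", text)
    return text.replace(DOTTED_H_LOWER, "h").replace(DOTTED_H_UPPER, "H")


print(dotted_to_digraph("dḣammā"))  # → dhammā

# An aspirate (k + ḣ) and a cluster (k + plain h) collapse to the same output,
# which is why the digraph form cannot be reversed without extra knowledge.
print(dotted_to_digraph("kḣ") == dotted_to_digraph("kh"))  # → True
```

Going from dotted-H to digraph is a plain substitution, while the reverse direction has to decide for every `h` whether it marks aspiration or stands alone.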
| Feature | Thai Pali | Sinhala | IAST |
|---|---|---|---|
| Vowels | 10 | 18 standalone + 18 dependent | Latin + diacritics |
| Consonants | 33 | 41 (incl. ligatures, ś, ṣ, f) | Latin + diacritics |
| Virama | Phinthu ฺ (U+0E3A) | Hal kirīma ් (U+0DCA) | — |
All scripts use standard Unicode code points from the Thai block (U+0E00–U+0E7F) and the Sinhala block (U+0D80–U+0DFF). Input is NFC-normalized before processing.
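The block ranges and the NFC step can be illustrated with the standard library alone. The sketch below is an assumption-laden illustration, not the library's code: `guess_script` is a hypothetical helper, and paliscript itself may select the script differently (the CLI, for instance, takes an explicit `--script` flag).

```python
import unicodedata

THAI_BLOCK = range(0x0E00, 0x0E80)     # U+0E00–U+0E7F
SINHALA_BLOCK = range(0x0D80, 0x0E00)  # U+0D80–U+0DFF


def guess_script(text: str) -> str:
    """Return 'thai', 'sinhala', or 'other' based on the first block match."""
    # Normalize first so composed and decomposed input are treated alike.
    for ch in unicodedata.normalize("NFC", text):
        code_point = ord(ch)
        if code_point in THAI_BLOCK:
            return "thai"
        if code_point in SINHALA_BLOCK:
            return "sinhala"
    return "other"


# NFC composes "a" + COMBINING MACRON (U+0304) into the single code point ā (U+0101).
print(unicodedata.normalize("NFC", "a\u0304") == "\u0101")  # → True

print(guess_script("พุทฺโธ"))   # → thai
print(guess_script("මෙත්තා"))  # → sinhala
print(guess_script("mettā"))   # → other
```

Normalizing before any table lookup means a romanized `ā` typed as two code points matches the same entry as the precomposed form.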
This library handles Pali language texts only — specifically texts written in Thai Pali script, Sinhala Pali script, or IAST. It is not a general Thai or Sinhala language transliterator: modern Thai and Sinhala characters that do not appear in the Pali alphabet will pass through unchanged.
Transliteration tables and algorithm by Bhante Buddhañāṇo Thera, originally implemented as LibreOffice StarBasic macros for Pali text processing in monastic and academic contexts. Rewritten in Python with his permission. IAST conventions are verified against bilara-data (SuttaCentral, Mahasangiti edition).
License: MIT