Sawtone: A Universal Framework for Phonetic Similarity and Alignment Across Languages and Scripts
Abstract
This paper introduces Sawtone, a universal framework for computing phonetic similarity and alignment across languages and scripts. The framework uses modular language-specific adapters and reports a transliteration BLEU score of 88% and phonetic alignment accuracy of 87–95% across language/script pairs.
Key Contributions
- A cross-script phonetic alignment framework with modular language-specific adapters
- Transliteration BLEU score of 88%
- 87–95% phonetic alignment accuracy across language/script pairs
- Case study on preprocessing Moroccan Arabic (Darija) for LLM training
Citation
Kamali, O. (2025). Sawtone: A Universal Framework for Phonetic Similarity and Alignment Across Languages and Scripts. Lingua Posnaniensis, 67(1), 165–200. https://doi.org/10.14746/linpo.2025.67.1.8
Related Research
GenAI for Moroccan Darija: Challenges and Early Results
Conference presentation at the 7th International Congress for Moroccan Arabic, discussing challenges and early results in applying generative AI to Moroccan Darija.
Gherbal: A Multilingual Classifier for Low-Resource Languages
Conference presentation at TIM'24, introducing Gherbal, a multilingual classifier designed for low-resource languages.