Abstract

This paper introduces Sawtone, a universal framework for computing phonetic similarity and alignment across languages and scripts. The framework uses modular language-specific adapters and achieves a transliteration BLEU score of 88% and phonetic alignment accuracy of 87–95% across language/script pairs. It also includes a case study on preprocessing Moroccan Arabic (Darija) for LLM training.

Key Contributions

  • A cross-script phonetic alignment framework with modular language-specific adapters
  • Transliteration BLEU score of 88%
  • 87–95% phonetic alignment accuracy across language/script pairs
  • Case study on preprocessing Moroccan Arabic (Darija) for LLM training

Citation

Kamali, O. (2025). Sawtone: A Universal Framework for Phonetic Similarity and Alignment Across Languages and Scripts. Lingua Posnaniensis, 67(1), 165–200. https://doi.org/10.14746/linpo.2025.67.1.8

Keywords: Phonetics, Transliteration, Cross-Script, NLP, Low-Resource Languages, Moroccan Darija
