report
GenAI for Moroccan Darija: Challenges and Early Results
January 1, 2024 · University of Navarra, Spain
Omar Kamali
Abstract
Conference presentation at the 7th International Congress for Moroccan Arabic, discussing challenges and early results in applying generative AI to Moroccan Darija.
Conference Presentation
Presented at the 7th International Congress for Moroccan Arabic, University of Navarra, Spain, 2024.
This talk discusses the unique challenges of applying generative AI to Moroccan Darija and presents early results from Omneity Labs' work on LLM training for this underrepresented language.
Citation
Kamali, O. (2024). GenAI for Moroccan Darija: Challenges and Early Results. 7th International Congress for Moroccan Arabic, University of Navarra, Spain.
LLM · Moroccan Darija · Low-Resource Languages · NLP · Cultural AI
Related Research
Sawtone: A Universal Framework for Phonetic Similarity and Alignment Across Languages and Scripts
Introduces a cross-script phonetic alignment framework with modular language-specific adapters. Reports 88% BLEU on transliteration and 87–95% phonetic alignment accuracy across language/script pairs. Includes a case study on preprocessing Moroccan Arabic (Darija) for LLM training.
Gherbal: A Multilingual Classifier for Low-Resource Languages
Conference presentation at TIM'24 introducing Gherbal, a multilingual classifier designed for low-resource languages.