Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis

Published in arXiv preprint arXiv:2601.13802, 2026

Arabic spans over 30 spoken varieties, yet no open-source text-to-speech system unifies them. We present Habibi, a unified-dialectal Arabic TTS framework that addresses key barriers including substantial cross-dialect lexical and phonological divergence, scarce synthesis-grade data, and the absence of a standardized multi-dialect evaluation benchmark. Both automatic metrics and human evaluations confirm that Habibi is highly competitive with ElevenLabs’ Eleven v3 (alpha) in intelligibility, speaker similarity, and naturalness.


Recommended citation: Y. Chen, J. Liu, Y. Tu, Z. Niu, Y. Liang, C. Qiang, C. Zhang, K. Yu, X. Chen. (2026). "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis." arXiv preprint arXiv:2601.13802.


Yujie Tu