Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
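As a minimal sketch, assuming the site is built with Jekyll and its configuration lives in `_config.yml` at the repository root, the setting mentioned above would look like this:

```yaml
# _config.yml (assumed to be the Jekyll site configuration file)
# When false, posts dated in the future are left out of the built site.
future: false
```

With this set, a future-dated post should appear only once the site is rebuilt on or after the post's date.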
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2 
Publications
Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis
Published in arXiv preprint arXiv:2601.13802, 2026
Habibi is a unified-dialectal Arabic TTS framework covering 12+ regional dialects. Our unified model matches or surpasses per-dialect specialized models and is highly competitive with ElevenLabs Eleven v3 (alpha).
Recommended citation: Y. Chen, J. Liu, Y. Tu, Z. Niu, Y. Liang, C. Qiang, C. Zhang, K. Yu, X. Chen. (2026). "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis." arXiv preprint arXiv:2601.13802.
Download Paper
VIBEVOICE-ASR Technical Report
Published in arXiv preprint arXiv:2601.18184, 2026
VibeVoice-ASR is a general-purpose speech understanding framework that supports single-pass processing for up to 60 minutes of audio, unifying ASR, Speaker Diarization, and Timestamping into a single end-to-end generation task. It supports over 50 languages and natively handles code-switching.
Recommended citation: Z. Peng, J. Yu, Y. Chang, Z. Wang, L. Dong, Y. Hao, Y. Tu, C. Yang, W. Wang, et al. (2026). "VIBEVOICE-ASR Technical Report." arXiv preprint arXiv:2601.18184.
Download Paper
