Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

Future Blog Post

Less than 1 minute read

Published: January 01, 2199

This post is dated in the future and will show up by default. To stop future-dated posts from being published, edit _config.yml and set future: false.
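As a minimal sketch of the setting described above, assuming a standard Jekyll site whose configuration lives in the conventional `_config.yml` at the project root, the relevant fragment would look something like this:

```yaml
# _config.yml (Jekyll site configuration; file location assumed to be the project root)
# When false, posts whose date is later than the build time are skipped.
future: false
```

Changes to `_config.yml` are not picked up by a running `jekyll serve`, so the server needs a restart before the setting takes effect.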

Blog Post number 4

Less than 1 minute read

Published: August 14, 2015

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

Less than 1 minute read

Published: August 14, 2014

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

Less than 1 minute read

Published: August 14, 2013

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

Less than 1 minute read

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Portfolio item number 1

Short description of portfolio item number 1

Portfolio item number 2

Short description of portfolio item number 2

Publications

Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis

Published in arXiv preprint arXiv:2601.13802, 2026

Habibi is a unified-dialectal Arabic TTS framework covering 12+ regional dialects. Our unified model matches or surpasses per-dialect specialized models and is highly competitive with ElevenLabs Eleven v3 (alpha).

Recommended citation: Y. Chen, J. Liu, Y. Tu, Z. Niu, Y. Liang, C. Qiang, C. Zhang, K. Yu, X. Chen. (2026). "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis." arXiv preprint arXiv:2601.13802.
Download Paper

VIBEVOICE-ASR Technical Report

Published in arXiv preprint arXiv:2601.18184, 2026

VibeVoice-ASR is a general-purpose speech understanding framework that supports single-pass processing for up to 60 minutes of audio, unifying ASR, Speaker Diarization, and Timestamping into a single end-to-end generation task. It supports over 50 languages and natively handles code-switching.

Recommended citation: Z. Peng, J. Yu, Y. Chang, Z. Wang, L. Dong, Y. Hao, Y. Tu, C. Yang, W. Wang, et al. (2026). "VIBEVOICE-ASR Technical Report." arXiv preprint arXiv:2601.18184.
Download Paper

Yujie Tu

Pages

Page Not Found

About Me

Archive Layout with Content

Posts by Category

Posts by Collection

CV

CV

Funfacts

Page not in menu

Page Archive

Portfolio

Projects

Publications

Sitemap

Posts by Tags

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

Posts

Future Blog Post

Blog Post number 4

Blog Post number 3

Blog Post number 2

Blog Post number 1

portfolio

Portfolio item number 1

Portfolio item number 2

publications

Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis

VIBEVOICE-ASR Technical Report

talks

teaching