Yingli Shen

ylshen
https://github.com/yl-shen

AI & ML interests

Postdoctoral Researcher @ THUNLP, Tsinghua University. Researching Multilingual Large Language Models.

Recent Activity

authored a paper about 1 month ago
DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
authored a paper about 1 month ago
From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora
updated a dataset about 1 month ago
openbmb/DCAD-2000

Organizations

OpenBMB

authored 2 papers about 1 month ago

DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection

Paper • 2502.11546 • Published Feb 17

From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora

Paper • 2505.14045 • Published May 20
updated a dataset about 1 month ago

openbmb/DCAD-2000

Viewer • Updated Oct 13 • 6.23B • 36.4k • 15
liked a model about 2 months ago

openbmb/VoxCPM-0.5B

Text-to-Speech • Updated Sep 19 • 2.97k • 759
updated 2 datasets about 2 months ago

openbmb/DCAD-2000

Viewer • Updated Oct 13 • 6.23B • 36.4k • 15