AdinaY (Adina Yakefu)

reacted to danielhanchen's post with 🔥 1 day ago

Post

5679

DeepSeek-V4 can now run locally with Unsloth GGUFs! 🐳

Run lossless DeepSeek-V4-Flash on 168GB of RAM, or
the 3-bit version on 110GB Mac, RAM, or VRAM setups.

Run via Unsloth Studio or llama.cpp.

GGUF: unsloth/DeepSeek-V4-Flash-GGUF
Guide: https://unsloth.ai/docs/models/deepseek-v4

reacted to satgeze's post with 🔥 2 days ago

Post

3552

First GGUF quants of Tencent's Hy3 (299B MoE), built before official llama.cpp support exists.

Hy3 dropped ~30 hours ago with only MLX and MXFP4 quants, both datacenter-sized. So I converted it myself using a community llama.cpp fork that implements the hy_v3 architecture.

What's in the repo:

- IQ1_M (62GB, fits a 128GB MacBook), IQ2_M (90GB), Q2_K (101GB), all with 1M context baked in via YaRN
- IQ quants use an importance matrix, bootstrap style: the static Q2_K ran RAM-resident to compute the imatrix, then IQ1_M and IQ2_M were requantized from the archived f16 with it
- Fixed chat template (the stock one uses .format() calls llama.cpp's Jinja rejects)
- Build instructions for the fork, including the two gotchas that cost me three build attempts
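
As a rough way to read the size list above, here is a minimal Python sketch that picks the largest listed quant fitting a memory budget. The sizes come from the post; the `pick_quant` helper and the `headroom_gb` value are hypothetical, and the headroom you actually need (KV cache, OS, other apps) depends on your setup.

```python
# Quant sizes in GB as listed in the post.
QUANT_SIZES_GB = {"IQ1_M": 62, "IQ2_M": 90, "Q2_K": 101}


def pick_quant(memory_gb, headroom_gb):
    """Return the largest listed quant that fits in memory_gb minus headroom_gb.

    headroom_gb is an assumption the caller must choose; it stands in for
    KV cache, OS overhead, and anything else sharing the memory.
    Returns None when nothing fits.
    """
    budget = memory_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    # Largest file that still fits the budget.
    return max(fitting, key=fitting.get)


# Example: a 128GB machine with a (hypothetical) 48GB reserved for everything else.
print(pick_quant(128, headroom_gb=48))
```

Long contexts increase the KV cache considerably, so a larger headroom is the safer guess when testing the 1M setting.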

Honesty section, because that is how these repos work: this is EXPERIMENTAL. Not needle-certified yet (1M is baked but unverified, certification ladder will be published either way). MTP layer exists in the checkpoint but no llama.cpp build can run hy_v3 MTP inference yet, so it is not included. Real gate outputs are on the card, misses and all, judge for yourself.

satgeze/Hy3-1M-GGUF

Full quant ladder (Q3 through Q8) is mirroring to ModelScope for bigger hardware.

9 replies

replied to YMRohit's post 24 days ago

@ChaseJing

reacted to qgallouedec's post with 🔥 about 2 months ago

Post

10504

Shipped hf-sandbox! 🥡

🧪 Running an eval that executes model-generated C on a few thousand prompts? You probably don't want any of that on your laptop.
Just shipped hf-sandbox, a Modal-style sandbox API on top of Hugging Face Jobs. Spin up an isolated, ephemeral container, run untrusted code, get the result back. No Docker on your laptop, no infra to manage.

Just pip install hf-sandbox.

Early days (v0.1); feedback and issues very welcome:
👉 https://github.com/huggingface/hf-sandbox

1 reply

reacted to imnotkitty's post with 🔥 3 months ago

Post

4035

tencent/Hy3-preview is out: an open-weights MoE reasoning model.

✅ 295B total / 21B active / 256K context
✅ Fused fast-and-slow thinking in a single model
✅ First model trained on Hunyuan's rebuilt pretraining + RL infra (Feb → Apr)

Benchmarks:
👉 SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, WideSearch — competitive results, particularly strong on agentic tool use
👉 Top score on Tsinghua's 2026 Spring math PhD qualifying exam
👉 Strong context-learning and instruction-following on Tencent's CL-bench / CL-bench-Life

More details can be found in my article: https://huggingface.co/blog/imnotkitty/hy3-preview

2 replies

reacted to NJX-njx's post with 👍 4 months ago

Post

7565

I recently open-sourced opensoul, an AI emotional companion product built on openclaw.

On this platform, you can create a "soulmate" that matches your personality and configure the skills and tools it has, as well as the platforms it integrates with (such as Telegram and Discord).
You can even create group chats and invite multiple agents and your friends to chat about recent events, discuss projects together, and so on.

On the one hand, I hope it can be a better companion in daily life through its memory mechanism, its self-feedback and iteration mechanism, and its modeling of users' emotions. On the other hand, I hope it can help you with your work through its skills, tools, and ability to handle complex tasks.

Although the product has taken shape, many areas still need adjustment and optimization. I hope the community can help make AI emotional companionship better.

This is the project introduction URL: https://opensoul-web.vercel.app
This is the GitHub project URL: https://github.com/NJX-njx/opensoul
@AdinaY @lilianweng @burtenshaw @clem
let's just do it

24 replies

posted an update 5 months ago

Post

4174

MiniMax M2.5 is now available on the hub 🚀

MiniMaxAI/MiniMax-M2.5

✨ 229B - Modified MIT license
✨ 37% faster than M2.1
✨ ~$1/hour at 100 TPS
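
To put the "~$1/hour at 100 TPS" figure in per-token terms, here is a small Python sketch. It assumes TPS means output tokens per second and that the model runs at that rate for the whole hour; both are assumptions, and the function name is hypothetical.

```python
def usd_per_million_tokens(usd_per_hour, tokens_per_second):
    """Convert an hourly cost and a sustained token rate into USD per 1M tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


# 100 tokens/s for one hour is 360,000 tokens, so $1/hour is a bit under $3 per 1M tokens.
print(f"{usd_per_million_tokens(1.0, 100):.2f}")
```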

2 replies

posted an update 5 months ago

Post

819

RynnBrain 🤖 a physics-aware embodied brain for robots from Alibaba DAMO

https://huggingface.co/collections/Alibaba-DAMO-Academy/rynnbrain

✨ 2B/8B/30B (3B active)
✨ Apache 2.0
✨ Understands egocentric scenes with strong spatial awareness
✨ Tracks objects and motion over time

2 replies

posted an update 5 months ago

Post

4301

Game on 🎮🚀

While Seedance 2.0’s videos are all over the timeline, DeepSeek quietly pushed a new model update in its app.

GLM-5 from Z.ai adds more momentum.

Ming-flash-omni from Ant Group, MiniCPM-SALA from OpenBMB,
and the upcoming MiniMax M2.5 keep the heat on 🔥

Spring Festival is around the corner,
no one’s sleeping!

✨ More releases coming, stay tuned
https://huggingface.co/collections/zh-ai-community/2026-february-china-open-source-highlights

posted an update 5 months ago

Post

4004

Ming-flash-omni 2.0 🚀 New open omni-MLLM released by Ant Group

inclusionAI/Ming-flash-omni-2.0

✨ MIT license
✨ MoE - 100B/6B active
✨ Zero-shot voice cloning + controllable audio
✨ Fine-grained visual knowledge grounding

2 replies

posted an update 5 months ago

Post

853

LLaDA 2.1 is out 🔥 A new series of MoE diffusion language models released by Ant Group

inclusionAI/LLaDA2.1-mini
inclusionAI/LLaDA2.1-flash

✨ LLaDA2.1-mini: 16B - Apache 2.0
✨ LLaDA2.1-flash: 100B - Apache 2.0
✨ Both deliver editable generation, RL-trained diffusion reasoning and fast inference

2 replies

posted an update 5 months ago

Post

2661

AI for science is moving fast🚀

Intern-S1-Pro 🔬 a MoE multimodal scientific reasoning model from Shanghai AI Lab

internlm/Intern-S1-Pro

✨ 1T total / 22B active
✨ Apache 2.0
✨ SoTA scientific reasoning performance
✨ FoPE enables scalable modeling of long physical time series (10⁰–10⁶)

2 replies

posted an update 5 months ago

Post

1454

✨ China’s open source AI ecosystem has entered a new phase

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3

One year after the “DeepSeek Moment,” open source has become the default. Models, research, infrastructure, and deployment are increasingly shared to support large-scale, system-level integration.

This final blog examines how leading Chinese AI organizations are evolving, and what this implies for the future of open source.

reacted to rajkumarrawal's post with 🔥 5 months ago

Post

3701

I submitted the paper "FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning" by Tanyu Chen, Tairan Chen, Kai Shen, Zhenghua Bao, Zhihui Zhang, Man Yuan, and Yi Shi from FlashLabs to Daily Papers on huggingface.

Chroma 1.0 enables real time spoken dialogue with personalized voice cloning through discrete speech representations and interleaved text audio token scheduling.

Chroma 1.0 is the world’s first open source, real-time speech-to-speech model with voice cloning.

FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning (2601.11141)

posted an update 5 months ago

Post

432

GLM just entered the OCR field🔥

zai-org/GLM-OCR

✨ 0.9B
✨ MIT licensed
✨ Multimodal GLM-V architecture
✨ #1 on OmniDocBench v1.5 (94.62)

posted an update 5 months ago

Post

1639

Step 3.5 Flash 🔥 new foundation model from StepFun ai

https://huggingface.co/collections/stepfun-ai/step-35-flash

✨ Sparse MoE: 196B/11B active
✨ Supports up to 256K context
✨ Multi-token prediction for fast decoding (100–300 tok/s)
✨ Runs locally on consumer hardware

posted an update 5 months ago

Post

1153

What a week 🤯

Following DeepSeek, Kimi, Qwen, Baidu, and Ant Group, Unitree Robotics
has now released a VLA model on the hub too!

unitreerobotics/UnifoLM-VLA-Base

posted an update 5 months ago

Post

336

LongCat-Flash-Lite🔥 a non-thinking MoE model released by Meituan LongCat team.

meituan-longcat/LongCat-Flash-Lite

✨ Total 68.5B / 3B active - MIT license
✨ 256k context
✨ Faster inference with N-gram embeddings

reacted to danielhanchen's post with 🚀 5 months ago

Post

3561

You can now run Kimi K2.5 locally! 🔥

We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit.
Get >40 tok/s on 242GB or 622GB VRAM/RAM for near full precision.
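
The size figures above can be sanity-checked with a little arithmetic. This is a sketch only: it assumes "1T" means roughly 1,000 billion parameters and that GB is decimal, and the helper names are hypothetical.

```python
def bits_per_weight(file_size_gb, params_billions):
    """Average bits stored per parameter, assuming decimal GB (1 GB = 1e9 bytes)."""
    return file_size_gb * 8 / params_billions


def implied_original_gb(shrunk_gb, reduction_fraction):
    """Size before shrinking, given the shrunk size and the fractional reduction."""
    return shrunk_gb / (1 - reduction_fraction)


# 240GB spread over ~1,000B parameters averages a little under 2 bits per weight.
print(f"{bits_per_weight(240, 1000):.2f} bits per weight")

# A 60% reduction down to 240GB implies a starting size of about 600GB.
print(f"{implied_original_gb(240, 0.60):.0f} GB before shrinking")
```

An average near 2 bits for a "1-bit" label would be consistent with a dynamic scheme that keeps some tensors at higher precision, though the post itself does not spell out the mix.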

GGUF: unsloth/Kimi-K2.5-GGUF

Guide: https://unsloth.ai/docs/models/kimi-k2.5

7 replies

posted an update 5 months ago

Post

320

Ant Group is going big on robotics 🤖

They just dropped their first VLA and depth perception foundation model on huggingface.

✨ LingBot-VLA:
- Trained on 20k hours of real-world robot data
- 9 robot embodiments
- Clear no-saturation scaling laws
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-vla
Paper: A Pragmatic VLA Foundation Model (2601.18692)

✨ LingBot-Depth:
- Metric-accurate 3D from noisy, incomplete depth
- Masked Depth Modeling (self-supervised)
- RGB–depth alignment, works with <5% sparse depth
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-depth
Paper: Masked Depth Modeling for Spatial Perception (2601.17895)

Adina Yakefu

AI & ML interests

Recent Activity

Organizations

AdinaY's activity