---
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
pipeline_tag: text-to-image
library_name: transformers
language:
- zh
- en
tasks:
- text-to-image-synthesis
frameworks: PyTorch
base_model:
- tencent/HunyuanImage-3.0
base_model_relation: quantized
---
==================================================================================
This model is a qint4-quantized version of https://huggingface.co/tencent/HunyuanImage-3.0, quantized with https://github.com/huggingface/optimum-quanto. The weight files were saved with an unofficial method.
The quantized model has so far been tested on a single H20 96GB GPU. Loading also relies on unofficial code; see load_quantized_model.py, which currently contains two loading methods for reference. Discussion and shared study are welcome. Thanks!
Loading method 1: initializing and loading the model needs roughly 160GB of CPU memory, with an initial GPU usage of 50GB. Once inference starts, CPU usage drops to about 70GB and GPU usage is about 55-60GB. Warnings about model keys appear while loading, but they do not affect use.
Loading method 2: initializing and loading the model needs about 75GB of CPU memory, with an initial GPU usage of 50GB. Once inference starts, CPU usage stays at 75GB and GPU usage is about 55-60GB. Because a key map is provided, no warnings appear while loading.
Inference time is roughly the same for both methods: about 12 minutes per image on an H20 (9:16 / 16:9).
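
As a rough sanity check on the figures above, the sketch below estimates weight storage from the 80 billion total parameters quoted in the official introduction, and checks a machine against the peak CPU/GPU figures reported for each loading method. It assumes 16-bit weights for the unquantized model and ignores quantization scales, unquantized layers, and activations, so the results are estimates only. `weight_gb`, `PEAK_GB`, and `fits` are illustrative names and are not part of load_quantized_model.py.

```python
TOTAL_PARAMS = 80e9  # total parameters, per the official model description


def weight_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Peak usage in GB as reported in this README: the CPU peak occurs while
# loading, the GPU peak (upper end of 55-60 GB) during inference.
PEAK_GB = {
    1: {"cpu": 160, "gpu": 60},
    2: {"cpu": 75, "gpu": 60},
}


def fits(method: int, cpu_gb: float, gpu_gb: float) -> bool:
    """Return True if a machine meets the reported peaks for a loading method."""
    need = PEAK_GB[method]
    return cpu_gb >= need["cpu"] and gpu_gb >= need["gpu"]


print(f"16-bit weights (assumed): ~{weight_gb(TOTAL_PARAMS, 16):.0f} GB")
print(f"4-bit weights:            ~{weight_gb(TOTAL_PARAMS, 4):.0f} GB")

# Hypothetical machine: 128 GB of RAM and one 96 GB GPU.
for method in (1, 2):
    print(f"method {method} fits: {fits(method, cpu_gb=128, gpu_gb=96)}")
```

On the hypothetical 128 GB machine, method 2 passes the check while method 1 does not, because of its roughly 160 GB CPU peak during loading.
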
<img src="./example.jpg" alt="Example Generated Image" width="800">
==================================================================================
### HunyuanImage-3.0 is an excellent omni-modal Mixture of Experts model! The introduction below is quoted from the official model page. The model and code in this project are provided for community sharing and technical research and study only; please comply with the official Tencent Hunyuan License.
==================================================================================
<div align="center">
<img src="./logo.png" alt="HunyuanImage-3.0 Logo" width="600">
# 🎨 HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
</div>
<div align="center">
<img src="./banner.png" alt="HunyuanImage-3.0 Banner" width="800">
</div>
<div align="center">
<a href=https://hunyuan.tencent.com/image target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
<a href=https://huggingface.co/tencent/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
<a href=https://github.com/Tencent-Hunyuan/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
<a href=https://arxiv.org/pdf/2509.23951 target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
<a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
<a href=https://docs.qq.com/doc/DUVVadmhCdG9qRXBU target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-blue.svg?logo=book height=22px></a>
</div>
<p align="center">
👏 Join our <a href="./assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
💻 <a href="https://hunyuan.tencent.com/modelSquare/home/play?modelId=289&from=/visual">Official website: try our model!</a>&nbsp;&nbsp;
</p>
## 🔥🔥🔥 News
- **September 28, 2025**: 📖 **HunyuanImage-3.0 Technical Report Released** - Comprehensive technical documentation now available
- **September 28, 2025**: 🚀 **HunyuanImage-3.0 Open Source Release** - Inference code and model weights publicly available
## 🧩 Community Contributions
If you build on or use HunyuanImage-3.0 in your projects, please let us know.
## 📑 Open-source Plan
- HunyuanImage-3.0 (Image Generation Model)
  - [x] Inference
  - [x] HunyuanImage-3.0 Checkpoints
  - [ ] HunyuanImage-3.0-Instruct Checkpoints (with reasoning)
  - [ ] vLLM Support
  - [ ] Distilled Checkpoints
  - [ ] Image-to-Image Generation
  - [ ] Multi-turn Interaction
## 🗂️ Contents
- [🔥🔥🔥 News](#-news)
- [🧩 Community Contributions](#-community-contributions)
- [📑 Open-source Plan](#-open-source-plan)
- [📖 Introduction](#-introduction)
- [✨ Key Features](#-key-features)
- [🛠️ Dependencies and Installation](#-dependencies-and-installation)
- [💻 System Requirements](#-system-requirements)
- [📦 Environment Setup](#-environment-setup)
- [📥 Install Dependencies](#-install-dependencies)
- [Performance Optimizations](#performance-optimizations)
- [🚀 Usage](#-usage)
- [🔥 Quick Start with Transformers](#-quick-start-with-transformers)
- [🏠 Local Installation & Usage](#-local-installation--usage)
- [🎨 Interactive Gradio Demo](#-interactive-gradio-demo)
- [🧱 Models Cards](#-models-cards)
- [📝 Prompt Guide](#-prompt-guide)
- [Manually Writing Prompts](#manually-writing-prompts)
- [System Prompt For Automatic Rewriting the Prompt](#system-prompt-for-automatic-rewriting-the-prompt)
- [Advanced Tips](#advanced-tips)
- [More Cases](#more-cases)
- [📊 Evaluation](#-evaluation)
- [📚 Citation](#-citation)
- [🙏 Acknowledgements](#-acknowledgements)
- [🌟🚀 GitHub Star History](#-github-star-history)
---
## 📖 Introduction
**HunyuanImage-3.0** is a groundbreaking native multimodal model that unifies multimodal understanding and generation within an autoregressive framework. Our text-to-image module achieves performance **comparable to or surpassing** leading closed-source models.
<div align="center">
<img src="./framework.png" alt="HunyuanImage-3.0 Framework" width="90%">
</div>
## ✨ Key Features
* 🧠 **Unified Multimodal Architecture:** Moving beyond the prevalent DiT-based architectures, HunyuanImage-3.0 employs a unified autoregressive framework. This design enables a more direct and integrated modeling of text and image modalities, leading to surprisingly effective and contextually rich image generation.
* 🏆 **The Largest Image Generation MoE Model:** This is the largest open-source image generation Mixture of Experts (MoE) model to date. It features 64 experts and a total of 80 billion parameters, with 13 billion activated per token, significantly enhancing its capacity and performance.
* 🎨 **Superior Image Generation Performance:** Through rigorous dataset curation and advanced reinforcement learning post-training, we've achieved an optimal balance between semantic accuracy and visual excellence. The model demonstrates exceptional prompt adherence while delivering photorealistic imagery with stunning aesthetic quality and fine-grained details.
* 💭 **Intelligent World-Knowledge Reasoning:** The unified multimodal architecture endows HunyuanImage-3.0 with powerful reasoning capabilities. It leverages its extensive world knowledge to intelligently interpret user intent, automatically elaborating on sparse prompts with contextually appropriate details to produce superior, more complete visual outputs.
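
To illustrate why only part of an MoE model's parameters is used for each token, here is a toy top-k router in plain Python. Only the expert count (64) and the 13B/80B parameter figures come from the description above; the value of `K`, the stand-in gate scores, and the function `top_k_route` are hypothetical and do not reflect HunyuanImage-3.0's actual routing.

```python
import math


def top_k_route(gate_logits: list[float], k: int) -> dict[int, float]:
    """Select the k highest-scoring experts and softmax-normalise their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(gate_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


NUM_EXPERTS = 64  # expert count from the model description
K = 8             # hypothetical number of experts selected per token

# Stand-in gate scores; a real router would compute these from the token's hidden state.
logits = [math.sin(i) for i in range(NUM_EXPERTS)]

weights = top_k_route(logits, K)
print("selected experts:", sorted(weights))
print(f"share of parameters active per token (13B of 80B): {13 / 80:.0%}")
```

Each token's output is then a weighted sum over the selected experts only, which is how a model can hold 80 billion parameters while activating about 13 billion per token.
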
## 📚 Citation
If you find HunyuanImage-3.0 useful in your research, please cite our work:
```bibtex
@article{cao2025hunyuanimage,
title={HunyuanImage 3.0 Technical Report},
author={Cao, Siyu and Chen, Hangting and Chen, Peng and Cheng, Yiji and Cui, Yutao and Deng, Xinchi and Dong, Ying and Gong, Kipper and Gu, Tianpeng and Gu, Xiusen and others},
journal={arXiv preprint arXiv:2509.23951},
year={2025}
}
```
## 🙏 Acknowledgements
We extend our heartfelt gratitude to the following open-source projects and communities for their invaluable contributions:
* 🤗 [Transformers](https://github.com/huggingface/transformers) - State-of-the-art NLP library
* 🎨 [Diffusers](https://github.com/huggingface/diffusers) - Diffusion models library
* 🌐 [HuggingFace](https://huggingface.co/) - AI model hub and community
* ⚡ [FlashAttention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
* 🚀 [FlashInfer](https://github.com/flashinfer-ai/flashinfer) - Optimized inference engine
## 🌟🚀 GitHub Star History
[![GitHub stars](https://img.shields.io/github/stars/Tencent-Hunyuan/HunyuanImage-3.0?style=social)](https://github.com/Tencent-Hunyuan/HunyuanImage-3.0)
[![GitHub forks](https://img.shields.io/github/forks/Tencent-Hunyuan/HunyuanImage-3.0?style=social)](https://github.com/Tencent-Hunyuan/HunyuanImage-3.0)
[![Star History Chart](https://api.star-history.com/svg?repos=Tencent-Hunyuan/HunyuanImage-3.0&type=Date)](https://www.star-history.com/#Tencent-Hunyuan/HunyuanImage-3.0&Date)