csuhan's Collections

Tar

updated Jul 3

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

  • Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

    Paper • 2506.18898 • Published Jun 23 • 33

  • Running on Zero
    44

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • Running on Zero
    2

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • Runtime error
    60

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • ByteDance-Seed/Tar-1.5B

    Any-to-Any • 3B • Updated Jul 2 • 411 • 18

  • ByteDance-Seed/Tar-7B

    Any-to-Any • 9B • Updated Jul 2 • 123 • 35

  • ByteDance-Seed/Tar-TA-Tok

    Updated Jul 2 • 6

  • csuhan/tar_1.5B_pretrain_demo

    3B • Updated Jun 16 • 6