Large Motion Video Autoencoding with Cross-modal Video VAE Paper • 2412.17805 • Published Dec 23, 2024 • 24
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering Paper • 2311.16465 • Published Nov 28, 2023 • 2
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models Paper • 2109.10282 • Published Sep 21, 2021 • 6