CosAE Convolutional Harmonic Autoencoder
This is a pretrained Convolutional Harmonic Autoencoder (CosAE) model. It encodes images into amplitude/phase harmonics and reconstructs RGB images.
Usage
import torch
from transformers import AutoModel

# Load the model; trust_remote_code=True allows the custom model code in the Hub repo to run
model = AutoModel.from_pretrained(
    "vedant-jumle/cosae",
    trust_remote_code=True,
)
model.eval()

# Example input: tensor of shape [B, 9, H, W] (RGB + FFT) or [B, 3, H, W]
x = torch.randn(1, 9, 256, 256)
with torch.no_grad():
    recon = model(x)
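The 9-channel input pairs an RGB image with 6 FFT-derived channels, but this card does not say how those 6 channels are computed. The sketch below shows one plausible construction, assuming they are the real and imaginary parts of a per-channel 2D FFT. The helper name `rgb_to_9ch`, the channel order, and the `"ortho"` normalization are all hypothetical; check the repository's preprocessing code before relying on it.

```python
import torch


def rgb_to_9ch(rgb: torch.Tensor) -> torch.Tensor:
    """Stack an RGB batch with the real/imaginary parts of its 2D FFT.

    ASSUMPTION: the model's 6 "FFT" channels are real + imaginary parts,
    ordered after the RGB channels. The model card does not confirm this.
    """
    if rgb.ndim != 4 or rgb.shape[1] != 3:
        raise ValueError("expected a tensor of shape [B, 3, H, W]")
    # fft2 transforms the last two dimensions (H, W) of each channel
    spec = torch.fft.fft2(rgb, norm="ortho")  # complex tensor, [B, 3, H, W]
    return torch.cat([rgb, spec.real, spec.imag], dim=1)  # [B, 9, H, W]


rgb = torch.rand(2, 3, 256, 256)
x9 = rgb_to_9ch(rgb)
print(x9.shape)  # torch.Size([2, 9, 256, 256])
```

If the repository uses amplitude and phase instead, `spec.abs()` and `spec.angle()` would replace the real and imaginary parts in the same position.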
Model Details
- Architecture: Convolutional encoder (ResBlocks + optional attention), Harmonic Construction Module, upsampling decoder
- Input channels: 9 (3 RGB + 6 FFT) or 3
- Image size: 256×256 (configurable)
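As a rough illustration of the Harmonic Construction Module listed above, the sketch below turns per-location amplitude and phase maps into cosine harmonics of the form A·cos(2π(u·x + v·y) − φ), evaluated on a T×T grid inside each latent cell. It is a minimal sketch under stated assumptions: the function name `build_harmonics`, the fixed frequency pairs, and the tensor layout are hypothetical and are not taken from the pretrained model's code, which may choose its frequencies and shapes differently.

```python
import math

import torch


def build_harmonics(amp, phase, freqs, T=4):
    """Expand amplitude/phase maps into spatial cosine harmonics.

    amp, phase: [B, C, h, w] latent amplitude and phase maps.
    freqs:      [C, 2] hypothetical (u, v) frequencies in cycles per cell.
    Returns a tensor of shape [B, C, h*T, w*T].
    """
    B, C, h, w = amp.shape
    # Local coordinates in [0, 1) inside each latent cell
    t = torch.arange(T, dtype=amp.dtype) / T
    yy, xx = torch.meshgrid(t, t, indexing="ij")  # each [T, T]
    u = freqs[:, 0].to(amp.dtype).view(1, C, 1, 1, 1, 1)
    v = freqs[:, 1].to(amp.dtype).view(1, C, 1, 1, 1, 1)
    angle = 2 * math.pi * (u * xx + v * yy)  # broadcasts to [1, C, 1, 1, T, T]
    # Add two trailing axes so each latent location gets its own T x T patch
    cells = amp[..., None, None] * torch.cos(angle - phase[..., None, None])
    # [B, C, h, w, T, T] -> [B, C, h, T, w, T] -> [B, C, h*T, w*T]
    return cells.permute(0, 1, 2, 4, 3, 5).reshape(B, C, h * T, w * T)


amp = torch.ones(1, 2, 8, 8)
phase = torch.zeros(1, 2, 8, 8)
freqs = torch.tensor([[1.0, 0.0], [0.0, 2.0]])  # illustrative values only
harmonics = build_harmonics(amp, phase, freqs, T=4)
print(harmonics.shape)  # torch.Size([1, 2, 32, 32])
```

In this layout the upsampling decoder would consume `harmonics` and map it back to RGB; that stage is not sketched here.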
References
- Liu, S., et al. (2024). CosAE: Convolutional Harmonic Autoencoder. NVIDIA AMRI. https://research.nvidia.com/labs/amri/publication/sifei2024cosae/
License
This model is released under the MIT License. See the repository LICENSE for details.