This repository hosts the official checkpoints for the paper "Categorical Schrödinger Bridge Matching", accepted at ICML 2025.
📌 TL;DR
This paper extends the Schrödinger Bridge problem to discrete time and discrete state spaces.
💾 Checkpoints
CSBM
| Dataset | Reference Process | α | N | Saved Iteration |
|---|---|---|---|---|
| Colored MNIST | gaussian | 0.01 | 2, 4, 10, 25, 50, 100 | 3 |
| Colored MNIST | uniform | 0.01, 0.05 | 25 | 3 |
| CelebA | uniform | 0.01, 0.005 | 100 | 4 |
| Amazon Reviews | uniform | 0.01, 0.005 | 100 | 5 |
Each experiment directory includes a `config.yaml` file with the full training configuration.
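To make the checkpoint grid easier to scan, the sketch below expands the table rows into one entry per run. It assumes that a row listing several α or N values stands for every combination of them; this is not stated in the repository. The names `TABLE` and `expand_runs` are hypothetical helpers and are not part of the released code.

```python
from itertools import product

# Rows transcribed from the checkpoint table:
# (dataset, reference process, alpha values, N values, saved iteration)
TABLE = [
    ("Colored MNIST", "gaussian", [0.01], [2, 4, 10, 25, 50, 100], 3),
    ("Colored MNIST", "uniform", [0.01, 0.05], [25], 3),
    ("CelebA", "uniform", [0.01, 0.005], [100], 4),
    ("Amazon Reviews", "uniform", [0.01, 0.005], [100], 5),
]


def expand_runs(table):
    """Expand each table row into one dict per (alpha, N) pair.

    Assumption: a row covers the Cartesian product of its alpha and N values.
    """
    runs = []
    for dataset, reference, alphas, ns, iteration in table:
        for alpha, n in product(alphas, ns):
            runs.append(
                {
                    "dataset": dataset,
                    "reference": reference,
                    "alpha": alpha,
                    "N": n,
                    "iteration": iteration,
                }
            )
    return runs


runs = expand_runs(TABLE)
for run in runs:
    print(run)
```

The resulting list only mirrors the table; the actual directory names for each run should be read from the repository itself.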
Additional Components
- `vqgan_celeba_f8_1024.ckpt`: VQ-GAN pretrained on the CelebA dataset
- `tokenizer_amazon.json`: tokenizer trained on the Amazon Reviews dataset
🎓 Citation
@inproceedings{
ksenofontov2025categorical,
title={Categorical {Schr\"odinger} Bridge Matching},
author={Grigoriy Ksenofontov and Alexander Korotin},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=RBly0nOr2h}
}
🙏 Credits
- Weights & Biases — experiment-tracking and visualization toolkit;
- Hugging Face — Tokenizers and Accelerate libraries for tokenizer implementation, parallel training, and checkpoint hosting on the Hub;
- D3PM — reference implementation of discrete-diffusion models;
- Taming Transformers — original VQ-GAN codebase;
- VQ-Diffusion — vector-quantized diffusion architecture;
- MDLM — diffusion architecture for text-generation experiments;
- ASBM — evaluation metrics and baseline models for CelebA face transfer;
- Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising — processed Amazon Reviews dataset and sentiment-transfer baselines;
- Inkscape — an excellent open-source editor for vector graphics.