SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model
This model, presented in the paper "SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model", is a novel view-conditioned diffusion model that synthesizes multi-view X-ray images from a single view. It leverages a Diffusion Transformer (DiT) to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation, enabling higher-resolution outputs with improved control over viewing angles.
Code: https://github.com/xiechun298/SV-DRR
Visual Comparison with SOTA Methods
Usage
Quick Start
Environment Setup
To ensure compatibility and reproducibility, follow these steps to set up the environment:
Clone the Repository:
git clone https://github.com/xiechun-tsukuba/svdrr.git
cd svdrr
Create a Python Virtual Environment:
conda env create -f environment.yaml
Download Pretrained Models
You can download the pretrained models by either:
Option 1: Automated Download (Recommended)
python scripts/download_models.py
This will download all models into the models/ directory. Shared components will be stored in the shared/ folder, and symbolic links will be created in each model folder accordingly.
Option 2: Manual Download from Hugging Face
- 256 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-256
- 512 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-512
- 1024 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-1024
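The manual download can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed. The helper names and the `models/<repository name>` target layout are assumptions made for illustration, and the resulting folders may differ from the layout produced by the automated script.

```python
from pathlib import Path

# Repository IDs taken from the list above, keyed by image resolution.
REPOS = {
    256: "xiechun-tsukuba/svdrr-dit-fb-256",
    512: "xiechun-tsukuba/svdrr-dit-fb-512",
    1024: "xiechun-tsukuba/svdrr-dit-fb-1024",
}


def local_dir_for(resolution: int, root: str = "models") -> Path:
    """Return the (assumed) local folder for one resolution."""
    repo_id = REPOS[resolution]
    # Use the repository name (the part after the slash) as the folder name.
    return Path(root) / repo_id.split("/")[-1]


def download(resolution: int, root: str = "models") -> Path:
    """Download one pretrained model from the Hugging Face Hub."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(resolution, root)
    snapshot_download(repo_id=REPOS[resolution], local_dir=target)
    return target


# Example (requires network access): download(512)
```

If you use a different folder name, pass that folder to `--model_path` when running inference.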
Inference
Important Note: The coordinate system of LIDC-IDRI-DRR is opposite to the intuitive one: the polar angle increases downward, and the azimuth angle increases when rotating to the left. To invert the pose coordinate system, use the --flip_pose option.
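To make the sign convention concrete, here is a minimal sketch of one way to convert a pose between the two conventions. It assumes poses are (polar, azimuth) offsets in degrees relative to the input view and that inverting the convention amounts to negating both angles. The function name and this interpretation are assumptions; the actual behavior of `--flip_pose` is defined by the repository code.

```python
from typing import Tuple


def flip_pose(polar_deg: float, azimuth_deg: float) -> Tuple[float, float]:
    """Switch a relative pose between the two sign conventions.

    Assumption: both angles are signed offsets from the input view,
    so inverting the coordinate system negates each of them.
    """
    return (-polar_deg, -azimuth_deg)


# A view 10 degrees "down" and 30 degrees "left" in one convention
# becomes 10 degrees "up" and 30 degrees "right" in the other.
flipped = flip_pose(10.0, 30.0)
```

Under this assumption, applying the flip twice returns the original pose.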
Single Image Inference
Default views (azimuth angles from -90° to 90° in 5° increments):
python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
--image_path demo/real_xray.jpg \
--log_dir outputs/ \
--image_size 512 \
--simple_pose
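As a quick check of how many views the default sweep covers, the azimuth angles can be enumerated directly. This sketch assumes both endpoints (-90° and 90°) are included; the function name is hypothetical and is not part of the repository.

```python
from typing import List


def default_azimuths(start: int = -90, stop: int = 90, step: int = 5) -> List[int]:
    """List azimuth angles in degrees from start to stop (inclusive)."""
    return list(range(start, stop + 1, step))


angles = default_azimuths()
# With both endpoints included, the sweep contains 37 views.
print(len(angles), angles[0], angles[-1])
```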
User-specified views defined in camera_views.json:
python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
--image_path demo/real_xray.jpg \
--log_dir outputs/ \
--image_size 512 \
--poses demo/camera_views.json
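When running inference on many images, it can help to assemble these commands programmatically. The sketch below builds the argument list using only the options shown in this document. The helper name is hypothetical, and it assumes that `--simple_pose` and `--poses` are alternatives and that `--flip_pose` is a flag that takes no value.

```python
from typing import List, Optional


def build_command(
    model_path: str,
    image_path: str,
    log_dir: str,
    image_size: int,
    poses: Optional[str] = None,
    flip_pose: bool = False,
) -> List[str]:
    """Build the single-image inference command as an argument list."""
    cmd = [
        "python", "test_svdrr_DiT.py",
        "--model_path", model_path,
        "--image_path", image_path,
        "--log_dir", log_dir,
        "--image_size", str(image_size),
    ]
    if poses is None:
        # No pose file given: fall back to the default azimuth sweep.
        cmd.append("--simple_pose")
    else:
        # User-specified views from a JSON file.
        cmd += ["--poses", poses]
    if flip_pose:
        # Assumed to be a boolean flag.
        cmd.append("--flip_pose")
    return cmd


cmd = build_command(
    "models/DiT-fb-512", "demo/real_xray.jpg", "outputs/", 512,
    poses="demo/camera_views.json",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.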
Dataset Inference
Perform inference on the LIDC-IDRI-DRR dataset:
python test_svdrr_DiT.py --model_path models/svdrr-DiT-fb-256 \
--dataset {path/to/dataset/} \
--log_dir outputs/ \
--image_size 256
BibTeX
If you find this work useful, please cite it as:
@InProceedings{XieChu_SVDRR_MICCAI2025,
author = {Xie, Chun and Yoshii, Yuichi and Kitahara, Itaru},
title = {{SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model}},
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15963},
month = {September},
pages = {572--582},
doi = {10.1007/978-3-032-04965-0_54}
}
@misc{xie2025svdrr,
title = {SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model},
author = {Chun Xie and Yuichi Yoshii and Itaru Kitahara},
year = {2025},
eprint = {2507.05148},
archivePrefix = {arXiv},
doi = {10.48550/arXiv.2507.05148},
}