SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model

SV-DRR, presented in the paper SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model, is a view-conditioned diffusion model that synthesizes multi-view X-ray images from a single view. It uses a Diffusion Transformer to preserve fine details and a weak-to-strong training strategy for stable high-resolution generation, which yields higher-resolution outputs with finer control over viewing angles.

Code: https://github.com/xiechun298/SV-DRR

[Demo animation: demo2.gif]

Visual Comparison with SOTA Methods

[Figure: visualization]

Usage

🚀 Quick Start

🛠️ Environment Setup

To ensure compatibility and reproducibility, follow these steps to set up the environment:

  1. Clone the Repository:

    git clone https://github.com/xiechun-tsukuba/svdrr.git
    cd svdrr
    
  2. Create the Conda Environment:

    conda env create -f environment.yaml
    

⏬ Download Pretrained Models

You can download the pretrained models by either:

Option 1: Automated Download (Recommended)

python scripts/download_models.py

This will download all models into the models/ directory. Shared components will be stored in the shared/ folder, and symbolic links will be created in each model folder accordingly.
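The directory layout described above can be pictured with a small sketch. It rests on assumptions that the repository does not confirm: that shared/ sits inside models/, and that a shared component is a folder (the name `vae` below is hypothetical). `link_shared_components` is an illustrative helper, not part of scripts/download_models.py.

```python
import os
import tempfile
from pathlib import Path


def link_shared_components(models_dir, model_names, shared_items):
    """Store each shared item once and symlink it into every model folder.

    Assumed layout (not confirmed by the repository):
        models/shared/<item>
        models/<model>/<item> -> ../shared/<item>
    """
    models_dir = Path(models_dir)
    shared_dir = models_dir / "shared"
    shared_dir.mkdir(parents=True, exist_ok=True)
    for item in shared_items:
        (shared_dir / item).mkdir(exist_ok=True)

    for name in model_names:
        model_dir = models_dir / name
        model_dir.mkdir(exist_ok=True)
        for item in shared_items:
            link = model_dir / item
            if link.is_symlink() or link.exists():
                continue  # leave anything that is already there untouched
            # A relative target keeps the links valid if models/ is moved.
            os.symlink(os.path.join("..", "shared", item), link,
                       target_is_directory=True)
    return shared_dir


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    # "DiT-fb-512" appears in the commands of this README; "vae" is hypothetical.
    link_shared_components(root / "models", ["DiT-fb-512"], ["vae"])
    print(os.readlink(root / "models" / "DiT-fb-512" / "vae"))
```

Storing shared weights once and linking them avoids duplicating large files across model folders; on Windows, creating symbolic links may require additional privileges.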

Option 2: Manual Download from Hugging Face

🔍 Inference

Important Note: The coordinate system of LIDC-IDRI-DRR is opposite to the intuitive one: the polar angle increases downward, and the azimuth angle increases when rotating to the left. To invert the pose coordinate system, use the --flip_pose option.
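As a rough illustration of what inverting the pose convention means, the sketch below negates both angles. This is an assumption about the effect of --flip_pose, not a description of its actual implementation, and `flip_pose` here is a hypothetical helper with angles in degrees.

```python
from typing import Tuple


def flip_pose(polar_deg: float, azimuth_deg: float) -> Tuple[float, float]:
    """Map a pose between the two conventions by reversing both axes.

    Assumption: the conventions differ only in the sign of each angle
    (polar increasing downward vs. upward, azimuth increasing to the
    left vs. to the right).
    """
    return -polar_deg, -azimuth_deg


if __name__ == "__main__":
    # A view 10 degrees "down" and 30 degrees "left" in one convention
    # becomes 10 degrees "up" and 30 degrees "right" in the other.
    print(flip_pose(10.0, 30.0))
```

Under this assumption the mapping is its own inverse, so applying it twice returns the original pose.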

Single Image Inference

Default views (azimuth angles from -90° to 90° in 5° increments):

python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
    --image_path demo/real_xray.jpg \
    --log_dir outputs/ \
    --image_size 512 \
    --simple_pose
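For reference, the default sweep stated above (-90° to 90° in 5° steps) works out to 37 views. The snippet below simply enumerates those azimuth angles; `default_azimuths` is a hypothetical helper, and how the script handles the polar angle in this mode is not covered here.

```python
def default_azimuths(start=-90, stop=90, step=5):
    """Return azimuth angles in degrees from start to stop, inclusive."""
    return list(range(start, stop + 1, step))


if __name__ == "__main__":
    views = default_azimuths()
    print(len(views), views[0], views[-1])  # 37 -90 90
```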

User-specified views defined in camera_views.json:

python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
    --image_path demo/real_xray.jpg \
    --log_dir outputs/ \
    --image_size 512 \
    --poses demo/camera_views.json

Dataset Inference

Perform inference on the LIDC-IDRI-DRR dataset:

python test_svdrr_DiT.py --model_path models/svdrr-DiT-fb-256 \
    --dataset {path/to/dataset/} \
    --log_dir outputs/ \
    --image_size 256 

BibTeX

If you find this work useful, please cite:

@InProceedings{XieChu_SVDRR_MICCAI2025,
        author = { Xie, Chun AND Yoshii, Yuichi AND Kitahara, Itaru},
        title = { { SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15963},
        month = {September},
        pages = {572--582},
        doi = {10.1007/978-3-032-04965-0_54}
}

@misc{xie2025svdrr,
        title = {SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model}, 
        author = {Chun Xie and Yuichi Yoshii and Itaru Kitahara},
        year = {2025},
        eprint = {2507.05148},
        archivePrefix = {arXiv},
        doi = {10.48550/arXiv.2507.05148}
}