|
|
--- |
|
|
language: |
|
|
- en |
|
|
license: cc-by-4.0 |
|
|
tags: |
|
|
- vision |
|
|
- image-text-to-text |
|
|
- medical |
|
|
- dermatology |
|
|
- multimodal |
|
|
- clip |
|
|
- zero-shot-classification |
|
|
- image-classification |
|
|
pipeline_tag: zero-shot-image-classification |
|
|
library_name: transformers |
|
|
--- |
|
|
|
|
|
# DermLIP: Dermatology Language-Image Pretraining |
|
|
|
|
|
## Model Description |
|
|
|
|
|
**DermLIP** is a vision-language model for dermatology, trained on the **Derm1M** dataset, the largest dermatological image-text corpus to date. This variant (`PanDerm-base-w-PubMed-256`) builds on domain-specific pretrained encoders and outperforms the other DermLIP variants.
|
|
|
|
|
### Model Details |
|
|
|
|
|
- **Model Type:** Pretrained Vision-Language Model (CLIP-style) |
|
|
|
|
|
- **Architecture:** |
|
|
|
|
|
- **Vision encoder (PanDerm-base)**: https://github.com/SiyuanYan1/PanDerm |
|
|
- **Text encoder (PubmedBert-256)**: https://huggingface.co/NeuML/pubmedbert-base-embeddings |
|
|
|
|
|
- **Resolution:** 224×224 pixels (see the preprocessing check after this list)
|
|
|
|
|
- **Paper:** https://arxiv.org/abs/2503.14911 |
|
|
|
|
|
- **Repository:** https://github.com/SiyuanYan1/Derm1M |
|
|
|
|
|
- **License:** cc-by-nc-nd-4.0
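
A quick way to confirm the 224×224 input resolution is to print the validation transform that `open_clip` returns alongside the model. This is a minimal sketch using the same checkpoint ID as the Quick Start below; the exact transform composition may differ slightly between `open_clip` versions.

```python
import open_clip

# Load the model together with its preprocessing transforms (train transform discarded)
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:redlessone/DermLIP_PanDerm-base-w-PubMed-256'
)

# The printed torchvision Compose shows the expected input resolution (224x224)
print(preprocess)
```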
|
|
|
|
|
|
|
|
## Training Details |
|
|
|
|
|
- **Training data:** 403,563 skin image-text pairs from the Derm1M dataset, covering both dermoscopic and clinical images.
|
|
- **Training objective:** image-text contrastive loss (a minimal sketch follows this list)
|
|
- **Hardware:** 1 × NVIDIA H200 (~90 GB memory usage)
|
|
- **Hours used:** ~9.5 hours |
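
For reference, the contrastive objective treats each image-caption pair in a batch as a positive and every other caption as a negative, optimizing a symmetric cross-entropy over the image-text similarity matrix. The sketch below is illustrative only; it is not the Derm1M training code, and the function and variable names are ours.

```python
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(image_features, text_features, logit_scale):
    # L2-normalize so the dot products below are cosine similarities
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Similarity between every image and every caption in the batch
    logits_per_image = logit_scale * image_features @ text_features.T
    logits_per_text = logits_per_image.T

    # The matching caption for image i is at index i (the diagonal)
    labels = torch.arange(image_features.shape[0], device=image_features.device)

    # Symmetric cross-entropy over image-to-text and text-to-image directions
    return (F.cross_entropy(logits_per_image, labels)
            + F.cross_entropy(logits_per_text, labels)) / 2
```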
|
|
|
|
|
## Intended Uses |
|
|
|
|
|
### Primary Use Cases |
|
|
|
|
|
- Zero-shot classification |
|
|
- Few-shot learning (see the linear-probe sketch after this list)
|
|
- Cross-modal retrieval (see the retrieval sketch after the Quick Start)
|
|
- Concept annotation/explanation |
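
Few-shot adaptation can be done by fitting a lightweight classifier on frozen DermLIP image features. The sketch below assumes `model` and `preprocess` are loaded as in the Quick Start, requires scikit-learn, and uses placeholder image paths and labels.

```python
import torch
import torch.nn.functional as F
from PIL import Image
from sklearn.linear_model import LogisticRegression

# Hypothetical few-shot support set: a handful of labeled example images
support_paths = ["nevus_example.png", "melanoma_example.png"]
support_labels = [0, 1]  # 0 = nevus, 1 = melanoma

def embed(paths):
    # Encode images with the frozen DermLIP vision encoder
    batch = torch.stack([preprocess(Image.open(p)) for p in paths])
    with torch.no_grad():
        features = model.encode_image(batch)
    return F.normalize(features, dim=-1).cpu().numpy()

# Fit a linear probe on the frozen features, then classify a new image
probe = LogisticRegression(max_iter=1000).fit(embed(support_paths), support_labels)
print(probe.predict(embed(["query_image.png"])))  # hypothetical query image
```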
|
|
|
|
|
|
|
|
## How to Use |
|
|
|
|
|
### Installation |
|
|
|
|
|
First, clone the Derm1M repository: |
|
|
```bash
git clone https://github.com/SiyuanYan1/Derm1M.git
cd Derm1M
```
|
|
|
|
|
Then install the package following the instructions in the repository.
|
|
|
|
|
### Quick Start |
|
|
```python
import open_clip
from PIL import Image
import torch

# Load the model and preprocessing transforms from the Hugging Face Hub checkpoint
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:redlessone/DermLIP_PanDerm-base-w-PubMed-256'
)
model.eval()

# Initialize the tokenizer
tokenizer = open_clip.get_tokenizer('hf-hub:redlessone/DermLIP_PanDerm-base-w-PubMed-256')

# Read and preprocess an example image
image = preprocess(Image.open("your_skin_image.png")).unsqueeze(0)

# Define disease labels (example: PAD dataset classes)
PAD_CLASSNAMES = [
    "nevus",
    "basal cell carcinoma",
    "actinic keratosis",
    "seborrheic keratosis",
    "squamous cell carcinoma",
    "melanoma",
]

# Build one text prompt per class
template = lambda c: f'This is a skin image of {c}'
text = tokenizer([template(c) for c in PAD_CLASSNAMES])

# Inference (autocast applies only when tensors are on a CUDA device)
with torch.no_grad(), torch.autocast("cuda"):
    # Encode image and text
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    # Normalize features
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Compute class probabilities from image-text similarity
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

# Get the prediction
final_prediction = PAD_CLASSNAMES[torch.argmax(text_probs[0])]
print(f'This image is diagnosed as {final_prediction}.')
print("Label probabilities:", text_probs)
```
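
Continuing from the Quick Start, the same encoders support cross-modal retrieval, for example ranking candidate text descriptions against a query image. The minimal sketch below reuses `model`, `tokenizer`, and the normalized `image_features` from above; the candidate captions are illustrative.

```python
# Candidate descriptions to rank against the query image (illustrative examples)
captions = [
    "a pigmented lesion with irregular borders",
    "an erythematous scaly plaque",
    "a pearly nodule with visible telangiectasia",
]

with torch.no_grad():
    # Encode and normalize the candidate captions
    caption_features = model.encode_text(tokenizer(captions))
    caption_features /= caption_features.norm(dim=-1, keepdim=True)

    # Cosine similarity between the query image and each caption
    similarity = (image_features @ caption_features.T).squeeze(0)

# Print captions from most to least similar
for idx in similarity.argsort(descending=True).tolist():
    print(f"{similarity[idx].item():.3f}  {captions[idx]}")
```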
|
|
|
|
|
## Contact |
|
|
|
|
|
For any additional questions or comments, contact Siyuan Yan (`siyuan.yan@monash.edu`).
|
|
|
|
|
## Cite our Paper |
|
|
```bibtex
@misc{yan2025derm1m,
  title         = {Derm1M: A Million-Scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology},
  author        = {Siyuan Yan and Ming Hu and Yiwen Jiang and Xieji Li and Hao Fei and Philipp Tschandl and Harald Kittler and Zongyuan Ge},
  year          = {2025},
  eprint        = {2503.14911},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2503.14911}
}

@article{yan2025multimodal,
  title     = {A multimodal vision foundation model for clinical dermatology},
  author    = {Yan, Siyuan and Yu, Zhen and Primiero, Clare and Vico-Alonso, Cristina and Wang, Zhonghua and Yang, Litao and Tschandl, Philipp and Hu, Ming and Ju, Lie and Tan, Gin and others},
  journal   = {Nature Medicine},
  pages     = {1--12},
  year      = {2025},
  publisher = {Nature Publishing Group}
}
```