---
license: mit
language:
- en
base_model:
- dmis-lab/biobert-v1.1
---

Purpose: We fine-tuned BioBERT (dmis-lab/biobert-v1.1) on the text portion of the IU Chest X-ray dataset. The resulting model can be used as a text-embedding model to support our experiments on retrieval-augmented in-context learning.

Usage (a full embedding-extraction example is given at the end of this card):

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModel.from_pretrained("Learn4everrr/Tuned_bioBERT")
```

Training parameters (see the fine-tuning sketch at the end of this card for how these can be used):

```python
training_args = TrainingArguments(
    output_dir="./biobert_finetuned",
    num_train_epochs=30,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    save_total_limit=1,
)
```

Please cite our paper as:

```
@article{zhan2025retrieval,
  title={Retrieval-augmented in-context learning for multimodal large language models in disease classification},
  author={Zhan, Zaifu and Zhou, Shuang and Zhou, Xiaoshan and Xiao, Yongkang and Wang, Jun and Deng, Jiawen and Zhu, He and Hou, Yu and Zhang, Rui},
  journal={arXiv preprint arXiv:2505.02087},
  year={2025}
}
```
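
Example (embedding extraction): The card does not specify a pooling strategy, so the sketch below assumes mean pooling over the final hidden states with padding tokens masked out; [CLS]-token pooling is a common alternative. The input reports are placeholders.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModel.from_pretrained("Learn4everrr/Tuned_bioBERT")
model.eval()

# Placeholder report sentences; replace with your own texts.
reports = [
    "The cardiac silhouette is within normal limits.",
    "No focal airspace consolidation is seen.",
]

with torch.no_grad():
    batch = tokenizer(reports, padding=True, truncation=True, return_tensors="pt")
    outputs = model(**batch)
    # Mean-pool token embeddings, excluding padding positions via the attention mask.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

print(embeddings.shape)  # (2, 768) for BioBERT-base
```

The resulting vectors can be compared with cosine similarity to retrieve the most relevant reports for in-context examples.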
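
Example (fine-tuning sketch): The card lists only the TrainingArguments, not the training objective or data pipeline. The sketch below is a hypothetical illustration assuming masked-language-model (MLM) fine-tuning on the IU Chest X-ray report texts, read from a placeholder one-report-per-line file `iu_xray_reports.txt`; it is not the exact training script.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModelForMaskedLM.from_pretrained("dmis-lab/biobert-v1.1")

# Placeholder path: one report per line. The actual preprocessing of the
# IU Chest X-ray text is not described on this card.
dataset = load_dataset("text", data_files={"train": "iu_xray_reports.txt"})
dataset = dataset["train"].train_test_split(test_size=0.1)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard 15% token-masking collator for the assumed MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="./biobert_finetuned",
    num_train_epochs=30,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    save_total_limit=1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
)
trainer.train()
```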