shvit_s2 Fine-tuned on EuroSAT
This model is a fine-tuned version of SHViT (Single-Head Vision Transformer) on the EuroSAT dataset.
SHViT was introduced in the CVPR 2024 paper "SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design" by Seokju Yun and Youngmin Ro.
Model Description
- Base Model: shvit_s2
- Fine-tuned Dataset: EuroSAT
- Number of Classes: 10
- Input Resolution: 224x224
- Framework: PyTorch / timm
Performance
- Test Accuracy (trained on 100% of the data): 93.83%
- Data Efficiency Score: 0.728
- Training data needed to reach 90% of full-data performance: 55.0% (see the sketch below)
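The data-efficiency figures come from the learning-curve analysis described below. As a rough illustration only (the exact scoring code lives in the analysis repository), the data fraction needed to reach 90% of full-data accuracy can be read off a learning curve by interpolation:

import numpy as np

# Hypothetical learning curve: test accuracy at each training-data fraction
# (illustrative values only, not the measured results)
fractions = np.array([0.10, 0.25, 0.50, 0.75, 1.00])
accuracies = np.array([0.700, 0.780, 0.830, 0.910, 0.9383])

target = 0.9 * accuracies[-1]  # 90% of the full-data accuracy
# Interpolate the smallest data fraction whose accuracy reaches the target
fraction_needed = np.interp(target, accuracies, fractions)
print(f"Data needed for 90% of full performance: {fraction_needed:.1%}")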
Dataset
EuroSAT: land use classification from Sentinel-2 satellite imagery (10 classes)
- Classes: 10
- Image Size: 64x64 → 224x224 (resized)
Training Details
This model was trained as part of a comprehensive analysis comparing SHViT with baseline models (DeiT-Tiny, MobileNetV2) across multiple dimensions:
- Robustness to corruptions (noise, blur, weather effects)
- Data efficiency across different training data fractions
- Geometric invariance (rotation, crop, color changes; a minimal probe is sketched after this list)
- Domain adaptation capabilities
- Representation similarity analysis
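As a flavor of these checks, here is a minimal, illustrative rotation-invariance probe (not the actual analysis script) that measures how often the predicted class is unchanged under rotation:

import torch
from torchvision.transforms import functional as TF

def rotation_consistency(model, images, angles=(90, 180, 270)):
    """Fraction of predictions unchanged under rotation (illustrative probe)."""
    model.eval()
    with torch.no_grad():
        base_preds = model(images).argmax(dim=1)
        scores = []
        for angle in angles:
            rotated = TF.rotate(images, angle)
            scores.append((model(rotated).argmax(dim=1) == base_preds).float().mean())
    return torch.stack(scores).mean().item()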
Training configuration (a minimal setup sketch follows the list):
- Optimizer: AdamW
- Learning rate schedule: Cosine decay
- Augmentation: RandAugment, Random Erasing
- Input size: 224×224
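A minimal training-setup sketch matching this configuration, using timm utilities; the exact learning rate, weight decay, epoch count, and augmentation magnitudes are assumptions, as they are not specified in this card:

import torch
from timm import create_model
from timm.data import create_transform
from timm.scheduler import CosineLRScheduler

# SHViT-S2 with a 10-class head; in practice, start from ImageNet-pretrained
# SHViT weights loaded separately (the SHViT definition must be registered with timm)
model = create_model('shvit_s2', num_classes=10, pretrained=False)

# Training transform with RandAugment and Random Erasing (illustrative settings)
train_transform = create_transform(
    input_size=224,
    is_training=True,
    auto_augment='rand-m9-mstd0.5',
    re_prob=0.25,
)

# AdamW with cosine decay (hyperparameters are placeholders)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)
scheduler = CosineLRScheduler(optimizer, t_initial=100, warmup_t=5, warmup_lr_init=1e-5)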
Usage
import torch
from timm import create_model

# Create the SHViT-S2 architecture with a 10-class head (no pretrained weights)
model = create_model('shvit_s2', num_classes=10, pretrained=False)

# Load the fine-tuned checkpoint from the Hub
checkpoint = torch.hub.load_state_dict_from_url(
    'https://huggingface.co/YOUR_USERNAME/shvit_s2-eurosat/resolve/main/checkpoint_99.pth',
    map_location='cpu'
)
model.load_state_dict(checkpoint['model'])
model.eval()

# Use for inference (a preprocessing sketch follows below)
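The checkpoint stores weights only, so preprocessing has to be set up separately. A minimal inference sketch, assuming the standard timm pipeline (resize to 224x224 plus ImageNet mean/std normalization; verify this against your own training transforms), with a hypothetical input file name:

import torch
from PIL import Image
from timm.data import resolve_data_config, create_transform

# Build an eval transform from the model's default data config
config = resolve_data_config({}, model=model)
transform = create_transform(**config)

image = Image.open('eurosat_patch.png').convert('RGB')  # hypothetical EuroSAT patch
inputs = transform(image).unsqueeze(0)  # shape: (1, 3, 224, 224)

with torch.no_grad():
    logits = model(inputs)
    predicted_class = logits.argmax(dim=1).item()
print(f"Predicted class index: {predicted_class}")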
Or use with Hugging Face Hub:
from huggingface_hub import hf_hub_download
import torch
from timm import create_model

# Download the checkpoint (cached locally by huggingface_hub)
checkpoint_path = hf_hub_download(
    repo_id="YOUR_USERNAME/shvit_s2-eurosat",
    filename="checkpoint_99.pth"
)

# Load the weights (requires timm with the SHViT model definition available)
checkpoint = torch.load(checkpoint_path, map_location='cpu')
model = create_model('shvit_s2', num_classes=10, pretrained=False)
model.load_state_dict(checkpoint['model'])
model.eval()
Analysis Repository
This model is part of a comprehensive analysis project. The full analysis code, scripts, and additional models are available at:
- GitHub: [Your GitHub Repository]
- Paper/Report: [If available]
Analysis Scripts Include:
- Learning curve analysis across data fractions
- Robustness evaluation under various corruptions
- Geometric invariance testing (rotation, crop, color)
- Domain shift and transfer learning experiments
- Representation similarity (CKA, CCA) analysis (a linear CKA sketch follows this list)
- Gradient-based saliency visualizations
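For reference, linear CKA between two sets of layer activations can be computed in a few lines. This is the standard linear-kernel formulation, not necessarily the exact implementation used in the analysis scripts:

import torch

def linear_cka(X, Y):
    """Linear CKA between feature matrices X, Y of shape (n_samples, dim)."""
    X = X - X.mean(dim=0, keepdim=True)  # center features
    Y = Y - Y.mean(dim=0, keepdim=True)
    cross = torch.linalg.norm(X.T @ Y, ord='fro') ** 2
    norm_x = torch.linalg.norm(X.T @ X, ord='fro')
    norm_y = torch.linalg.norm(Y.T @ Y, ord='fro')
    return (cross / (norm_x * norm_y)).item()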
Citation
If you use this model, please cite the original SHViT paper:
@inproceedings{yun2024shvit,
author={Yun, Seokju and Ro, Youngmin},
title={SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
pages={5756--5767},
year={2024}
}
And if you found this fine-tuned model or analysis useful:
@misc{shvit_s2_eurosat,
author = {Your Name},
title = {shvit_s2 Fine-tuned on EuroSAT},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/YOUR_USERNAME/shvit_s2-eurosat}},
}
Collaborators
This work was done in collaboration with:
- Vishal V
- Priyal Garg
License
This model is released under the MIT license. The original SHViT implementation is also MIT-licensed.
Acknowledgments
- Original SHViT authors: Seokju Yun and Youngmin Ro
- Built using timm
- Trained models and analysis scripts available in our repository