wolbanking77-afro-xlmr-large

This model is a fine-tuned version of Davlan/afro-xlmr-large on the WolBanking77 dataset. It achieves the following results on the evaluation set (these values match the epoch 7 row of the training results, the highest F1 of the run):

  • Loss: 2.2959
  • F1: 0.5810

Model description

See the paper "WolBanking77: Wolof Banking Speech Intent Classification Dataset".

Intended uses & limitations

  • Customer Intent Detection
  • Machine Translation in French & Wolof
  • Automatic Speech Recognition in Wolof
  • Comparing different machine learning models for Intent Classification
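
For the intent detection use case, a minimal inference sketch with the Transformers text-classification pipeline might look like the following. It assumes the checkpoint is published under the repository id karim155/wolbanking77-afro-xlmr-large with a sequence classification head, and that inputs are plain text. The helper names top_intents and classify are hypothetical, and the label strings the model returns (intent names or generic LABEL_n ids) are not documented in this card.

```python
# Repository id as shown on the model page; treat it as an assumption if the model moves.
MODEL_ID = "karim155/wolbanking77-afro-xlmr-large"


def top_intents(scores, k=3):
    """Return the k highest-scoring (label, score) pairs for one input text.

    `scores` is a list of {"label": str, "score": float} dicts, the shape the
    text-classification pipeline returns when asked for all class scores.
    """
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    return [(s["label"], round(s["score"], 4)) for s in ranked[:k]]


def classify(texts, model_id=MODEL_ID, k=3):
    """Classify a list of utterances and keep the top-k intents for each.

    Transformers is imported lazily so that the helper above can be used
    without downloading the model.
    """
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_id)
    outputs = clf(texts, top_k=None)  # all class scores, one list per text
    return [top_intents(per_text, k) for per_text in outputs]


# Example call (downloads the checkpoint, roughly 0.6B parameters):
# print(classify(["<a Wolof banking query>"]))
```

Returning several candidates rather than only the top label is a design choice: with an evaluation F1 of 0.5810, the runner-up intents are often worth inspecting.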

Training and evaluation data

See the WolBanking77 dataset.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
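
To make the schedule concrete, the sketch below works out what these settings imply for the learning rate over the run. It is an illustration in plain Python, not the trainer's own code. The step counts come from the training results (1958 steps per epoch, 39160 in total), and the function name linear_lr is hypothetical. It assumes a standard linear warmup followed by linear decay to zero, on a single device without gradient accumulation.

```python
# Values taken from the hyperparameter list and the training results.
LEARNING_RATE = 2e-05
WARMUP_RATIO = 0.1
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 1958
TRAIN_BATCH_SIZE = 4

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS        # 39160 optimizer steps
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)  # 3916, i.e. two full epochs

# Upper bound on the number of training examples, assuming one device and
# no gradient accumulation (the last batch of an epoch may be partial).
MAX_TRAIN_EXAMPLES = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE  # 7832


def linear_lr(step, base_lr=LEARNING_RATE, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr to 0 at the final step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 1958, WARMUP_STEPS, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate of 2e-05 is reached at the end of epoch 2, and the rate decays for the remaining 18 epochs.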

Training results

Training Loss   Epoch   Step    Validation Loss   F1
4.2932          1.0     1958    4.0790            0.0315
2.9364          2.0     3916    2.7736            0.3511
2.288           3.0     5874    2.3849            0.4881
1.992           4.0     7832    2.2629            0.5396
1.6931          5.0     9790    2.2066            0.5694
1.3725          6.0     11748   2.2224            0.5770
1.2085          7.0     13706   2.2959            0.5810
0.9111          8.0     15664   2.4032            0.5704
0.8359          9.0     17622   2.4982            0.5719
0.6609          10.0    19580   2.6449            0.5720
0.4818          11.0    21538   2.7611            0.5691
0.417           12.0    23496   2.9075            0.5677
0.285           13.0    25454   3.0922            0.5570
0.1873          14.0    27412   3.2292            0.5582
0.1321          15.0    29370   3.3421            0.5646
0.0938          16.0    31328   3.4509            0.5694
0.0704          17.0    33286   3.5159            0.5695
0.0423          18.0    35244   3.5586            0.5740
0.0262          19.0    37202   3.6021            0.5652
0.0228          20.0    39160   3.6177            0.5654
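
The table can be read programmatically to see where the run peaks. The sketch below copies the rows above and finds the epochs with the best F1 and the lowest validation loss. The variable names are illustrative only.

```python
# (epoch, training loss, validation loss, F1), copied from the table above.
RESULTS = [
    (1, 4.2932, 4.0790, 0.0315),
    (2, 2.9364, 2.7736, 0.3511),
    (3, 2.288, 2.3849, 0.4881),
    (4, 1.992, 2.2629, 0.5396),
    (5, 1.6931, 2.2066, 0.5694),
    (6, 1.3725, 2.2224, 0.5770),
    (7, 1.2085, 2.2959, 0.5810),
    (8, 0.9111, 2.4032, 0.5704),
    (9, 0.8359, 2.4982, 0.5719),
    (10, 0.6609, 2.6449, 0.5720),
    (11, 0.4818, 2.7611, 0.5691),
    (12, 0.417, 2.9075, 0.5677),
    (13, 0.285, 3.0922, 0.5570),
    (14, 0.1873, 3.2292, 0.5582),
    (15, 0.1321, 3.3421, 0.5646),
    (16, 0.0938, 3.4509, 0.5694),
    (17, 0.0704, 3.5159, 0.5695),
    (18, 0.0423, 3.5586, 0.5740),
    (19, 0.0262, 3.6021, 0.5652),
    (20, 0.0228, 3.6177, 0.5654),
]

best_f1 = max(RESULTS, key=lambda row: row[3])    # row with the highest F1
best_loss = min(RESULTS, key=lambda row: row[2])  # row with the lowest validation loss

print(f"best F1: epoch {best_f1[0]}, F1 {best_f1[3]:.4f}")
print(f"lowest validation loss: epoch {best_loss[0]}, loss {best_loss[2]:.4f}")
# best F1: epoch 7, F1 0.5810
# lowest validation loss: epoch 5, loss 2.2066
```

Validation loss rises steadily after epoch 5 while training loss keeps falling toward zero, a pattern consistent with overfitting. F1 stays between 0.557 and 0.581 from epoch 5 onward, so the later epochs add little.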

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1