train_codealpacapy_123_1762572064

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4916
  • Num Input Tokens Seen: 24941912
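
Because this checkpoint is published as a PEFT adapter, it has to be loaded on top of the base model. Below is a minimal inference sketch, assuming the adapter is hosted as rbelanec/train_codealpacapy_123_1762572064 and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights; the prompt format is a guess, since the training template is not documented here.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_codealpacapy_123_1762572064"
base_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# AutoPeftModelForCausalLM reads the adapter config, downloads the base model,
# and attaches the adapter weights in one call.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# codealpacapy is an instruction-style coding dataset, so a chat-format prompt
# is a reasonable default (the exact template used in training is not reported).
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```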

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
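
The listed values map directly onto transformers TrainingArguments. The sketch below is a hedged reconstruction for reference, not the actual training script: the output directory and evaluation cadence are assumptions, and the PEFT adapter configuration (e.g. LoRA rank and target modules) is not reported in this card.

```python
from transformers import TrainingArguments

# Reconstruction of the reported hyperparameters; output_dir is a placeholder,
# and any PEFT/LoRA settings are not documented in this card.
training_args = TrainingArguments(
    output_dir="train_codealpacapy_123_1762572064",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",  # matches the per-epoch validation losses reported below
)
```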

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.7537        | 1.0   | 1908  | 0.7072          | 1248304           |
| 0.5438        | 2.0   | 3816  | 0.5793          | 2497016           |
| 0.4952        | 3.0   | 5724  | 0.5439          | 3742552           |
| 0.4696        | 4.0   | 7632  | 0.5270          | 4985200           |
| 0.5107        | 5.0   | 9540  | 0.5173          | 6233920           |
| 0.5222        | 6.0   | 11448 | 0.5106          | 7478504           |
| 0.4821        | 7.0   | 13356 | 0.5058          | 8722744           |
| 0.8077        | 8.0   | 15264 | 0.5022          | 9977520           |
| 0.5135        | 9.0   | 17172 | 0.4996          | 11225416          |
| 0.6831        | 10.0  | 19080 | 0.4971          | 12472912          |
| 0.5967        | 11.0  | 20988 | 0.4956          | 13721824          |
| 0.4765        | 12.0  | 22896 | 0.4947          | 14970528          |
| 0.6152        | 13.0  | 24804 | 0.4937          | 16220808          |
| 0.4111        | 14.0  | 26712 | 0.4929          | 17464792          |
| 0.4217        | 15.0  | 28620 | 0.4923          | 18706976          |
| 0.4192        | 16.0  | 30528 | 0.4920          | 19956544          |
| 0.4315        | 17.0  | 32436 | 0.4919          | 21204416          |
| 0.4564        | 18.0  | 34344 | 0.4918          | 22451928          |
| 0.4151        | 19.0  | 36252 | 0.4916          | 23696296          |
| 0.6589        | 20.0  | 38160 | 0.4917          | 24941912          |
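
The validation loss flattens out after roughly epoch 15, with the minimum (0.4916) reached at epoch 19. Assuming the reported loss is a mean token-level cross-entropy in nats, it corresponds to a validation perplexity of about 1.63:

```python
import math

best_val_loss = 0.4916  # epoch 19, per the table above
print(math.exp(best_val_loss))  # perplexity ≈ 1.63
```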

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1

Model tree for rbelanec/train_codealpacapy_123_1762572064

  • This repository is a PEFT adapter for the base model meta-llama/Meta-Llama-3-8B-Instruct.