More details on finetuning params
Hello @armand0e
Thanks for your interest! In addition to the hyperparameters already listed in the README, here are the values you're asking about:
- Weight Decay: 0.01 (used for light regularization, as we're running a single-epoch fine-tune on a large model)
- LoRA Alpha: 16 (aligned with a LoRA rank of 8, it offers a good balance between adaptation strength and training stability)
- LR Scheduler Type: "linear"
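If it's useful, here's a minimal sketch of how these values could be wired into a PEFT `LoraConfig` and a TRL `SFTConfig`. Only the hyperparameters already mentioned in this thread come from the actual run; everything else (dropout, target modules, output dir, precision) is an assumption on my part:

```python
from peft import LoraConfig
from trl import SFTConfig

# Sketch only: values marked "from the thread/README" are the ones discussed above;
# everything else is a placeholder assumption.
peft_config = LoraConfig(
    r=8,                          # LoRA rank (from the README)
    lora_alpha=16,                # from the reply above
    lora_dropout=0.0,             # assumption: not specified in the thread
    target_modules="all-linear",  # assumption: not specified in the thread
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="gpt-oss-20b-lora",    # placeholder
    learning_rate=5e-5,               # from the README
    weight_decay=0.01,                # from the reply above
    lr_scheduler_type="linear",       # from the reply above (the README lists cosine)
    warmup_ratio=0.05,                # from the README
    per_device_train_batch_size=4,    # from the README
    gradient_accumulation_steps=4,    # from the README
    num_train_epochs=1,               # "single-epoch" per the reply; the README lists 2
    max_length=2048,                  # max sequence length (argument name varies across TRL versions)
    bf16=True,                        # assumption
)
```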
Hope that helps — let me know if you'd like to dig deeper into any other part of the configuration or training workflow!
Thanks a bunch! I've tried similar params for my fine-tunes and experienced lots of tool-calling issues; I'll give it another go with these exact settings and see how it does.
While this was helpful, I don't think these settings are 1:1 for my use case. Thanks anyway for your response.
Hi, thanks for the work. I can't run the model; I get the error: "AttributeError: GptOssExperts has no attribute 'down_projs'." Maybe you've encountered this.
Try double-checking that you have the right versions of Transformers and TRL; I noticed that some dependency versions yield weird errors with GptOss. These are the versions I'd check against (see the quick snippet after the list):
- transformers==4.56.2
- trl==0.22.2
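A quick sanity check (just a sketch) to confirm the environment is actually loading those versions:

```python
import transformers
import trl

# Print the versions that are importable in the current environment;
# compare against the pins above (4.56.2 / 0.22.2).
print("transformers:", transformers.__version__)
print("trl:", trl.__version__)
```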
Thanks for stepping in and providing an answer, really appreciate it.
Yes, it turns out the issue was caused by dependency version mismatches.
Happy to help 😀
Noticed the README says this:
Training Configuration
- Base Model: unsloth/gpt-oss-20b (20B parameters)
- Training Method: LoRA (adapter-only fine-tuning)
- LoRA Rank: 8
- Learning Rate: 5e-5
- Batch Size: 4 per device, gradient_accumulation_steps=4
- Epochs: 2
- Max Sequence Length: 2048
- LR Scheduler: Cosine, warmup_ratio=0.05
- Final Training Loss: 1.22
Your earlier response, quoted below, contradicts this. Just confirming which values are correct:
- Weight Decay: 0.01 (used for light regularization, as we're running a single-epoch fine-tune on a large model)
- LoRA Alpha: 16 (aligned with a LoRA rank of 8, it offers a good balance between adaptation strength and training stability)
- LR Scheduler Type: "linear"
Yes, that's because the model files have been updated; in the current version you can access the adapter layers directly. This is expected and ensures you won't encounter any dependency issues. Also, the training parameters you're seeing now are the correct, fully updated ones.
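In case it helps anyone landing here, this is roughly how an adapter-only LoRA checkpoint is usually loaded with PEFT on top of the base model. The adapter repo id below is a placeholder (not this repo's actual id), and the extra kwargs are assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model first, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/gpt-oss-20b",   # base model named in the README
    torch_dtype="auto",      # assumption
    device_map="auto",       # assumption; requires accelerate
)
model = PeftModel.from_pretrained(base, "your-org/your-adapter-repo")  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained("unsloth/gpt-oss-20b")
```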