How to enable bidirectional attention? (#21, opened 9 days ago by Adenialzz)
Finetuning (#20, opened 15 days ago by Hadbeen123)
Error with AutoModel / GmeQwen2VLConfig after upgrade (#16, 7 comments, opened 18 days ago by findpather; see the loading sketch after this list)
Does batch_size=128 during training refer to the global or per-GPU batch size, and was training done with DeepSpeed ZeRO-3? (#13, 1 comment, opened 3 months ago by Hipanda)
Training code support (#8, opened 3 months ago by tastelikefeet)
Results on M-BEIR (#7, 1 comment, opened 4 months ago by wongyukim)
LoRA weights (#6, opened 5 months ago by NohTow)
Fused-Modal Data (#5, opened 5 months ago by paralym)
Training batch size (#3, 1 comment, opened 6 months ago by yyy111)
Training code release (#2, 6 comments, opened 6 months ago by listli)
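
Thread #16 reports an AutoModel loading error involving GmeQwen2VLConfig after a library upgrade. Below is a minimal, hedged loading sketch; the repo id Alibaba-NLP/gme-Qwen2-VL-2B-Instruct and the use of trust_remote_code=True are illustrative assumptions, not details confirmed by the thread.

```python
# Minimal loading sketch (assumption: the checkpoint ships custom modeling code,
# so AutoModel needs trust_remote_code=True to resolve GmeQwen2VLConfig).
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Alibaba-NLP/gme-Qwen2-VL-2B-Instruct",  # assumed/illustrative repo id
    trust_remote_code=True,                  # enables the repo's custom config/model classes
)
print(type(model).__name__)  # sanity check: which model class was actually instantiated
```

If the error persists after an upgrade, a common first step is to pin the transformers version the repo's custom code was written against; the exact compatible version is not stated in this listing.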