As per the title: the lm_head has a different vocab size with respect to the loaded tokenizer. Why is that the case?
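For context, here is a minimal sketch of how the mismatch can be observed, assuming a generic Hugging Face causal LM (the checkpoint name below is a placeholder, since the thread doesn't name one). Such gaps commonly come from the embedding matrix being padded to a round size for GPU efficiency, or from tokens added to the tokenizer after pretraining:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the model this thread is about.
model_id = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# lm_head is a Linear layer projecting hidden states to vocab logits;
# its out_features is the model's output vocab size.
print("lm_head out_features:", model.get_output_embeddings().out_features)

# len(tokenizer) counts the base vocab plus any added tokens, so it can
# differ from model.config.vocab_size.
print("tokenizer vocab size:", len(tokenizer))
print("config vocab_size:   ", model.config.vocab_size)
```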