How to calculate video embedding?

#24
by leaf-potato - opened

I see that the Qwen-VL model supports video understanding, but the gme model seems to support only text and images. Are there plans to support video embedding?
