merve PRO

https://github.com/merveenoyan/smol-vision

AI & ML interests

I love this website. VLMs, vision & co

Recent Activity

reacted to bartowski's post with 🤗 1 day ago

Was going to post this on /r/LocalLLaMa, but apparently it's without moderation at this time :')

https://huggingface.co/bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF

Was able to use previous mistral chat templates, some hints from Qwen templates, and Claude to piece together a seemingly working chat template, tested it with llama.cpp server and got perfect results, though lmstudio still seems to be struggling for some reason (don't know how to specify a jinja file there)

Outlined the details of the script and results in my llama.cpp PR to add the jinja template: https://github.com/ggml-org/llama.cpp/pull/14349

Start server with a command like this:

```
./llama-server -m /models/mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M.gguf --jinja --chat-template-file /models/Mistral-Small-3.2-24B-Instruct-2506.jinja
```

and it should be perfect! Hoping it'll work for ALL tools if lmstudio gets an update or something, not just llama.cpp, but very happy to see it works flawlessly in llama.cpp

In the meantime, will try to open a PR to minja to make the strftime work, but no promises :)

posted an update 1 day ago

we've merged LightGlue keypoint matcher to Hugging Face transformers! it allows commercial use when paired with an open-source keypoint detector 🙏🏻

it works very well, try it yourself: https://huggingface.co/spaces/ETH-CVG/LightGlue

here's an in-the-wild test with two images of the same place ⤵️

new activity 1 day ago

ETH-CVG/lightglue_superpoint: Fix task tag

published an article 7 days ago

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

By and 4 others • 7 days ago • 62

published an article 14 days ago

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

By and 6 others • 14 days ago • 100

published an article 23 days ago

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

By and 8 others • 23 days ago • 160

published an article about 1 month ago

nanoVLM: The simplest repository to train your VLM in pure PyTorch

By and 6 others • May 21 • 169

published an article about 1 month ago

Vision Language Models (Better, Faster, Stronger)

By and 4 others • May 12 • 458

published an article about 2 months ago

Welcoming Llama Guard 4 on Hugging Face Hub

By and 3 others • Apr 29 • 38

published an article 2 months ago

Cohere on Hugging Face Inference Providers 🔥

By and 6 others • Apr 16 • 126

published an article 4 months ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

By and 3 others • Mar 12 • 436

published an article 4 months ago

SigLIP 2: A better multilingual vision language encoder

By and 2 others • Feb 21 • 169

published an article 4 months ago

SmolVLM2: Bringing Video Understanding to Every Device

By and 6 others • Feb 20 • 274

published an article 5 months ago

Open-source DeepResearch – Freeing our search agents

By and 4 others • Feb 4 • 1.26k

published an article 5 months ago

We now support VLMs in smolagents!

By and 2 others • Jan 24 • 104

published an article 5 months ago

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

By and 2 others • Jan 23 • 181

published an article 6 months ago

Introducing smolagents: simple agents that write actions in code.

By and 2 others • Dec 31, 2024 • 1.07k

published an article 7 months ago

Welcome PaliGemma 2 – New vision language models by Google

By and 3 others • Dec 5, 2024 • 155

published an article 7 months ago

SmolVLM - small yet mighty Vision Language Model

By and 4 others • Nov 26, 2024 • 320

published an article 9 months ago

Llama can now see and run on your device - welcome Llama 3.2

By and 6 others • Sep 25, 2024 • 189

published an article 12 months ago

Preference Optimization for Vision Language Models

By and 3 others • Jul 10, 2024 • 79

published an article about 1 year ago

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

By and 2 others • Jun 24, 2024 • 197

published an article about 1 year ago

PaliGemma – Google's Cutting-Edge Open Vision Language Model

By and 2 others • May 14, 2024 • 253