nltpt-q


cfahlgren1 posted an update 1 day ago
I ran the Anthropic Misalignment Framework for a few top models and added it to a dataset: cfahlgren1/anthropic-agentic-misalignment-results

You can read the reasoning traces of the models trying to blackmail the user and perform other actions. It's very interesting!!
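For anyone who wants to poke at the traces programmatically, here's a minimal sketch using the `datasets` library. The "train" split and the "reasoning" column are assumptions, not the confirmed schema; check the dataset viewer first.

```python
# Hypothetical sketch: pull a few reasoning traces out of the results dataset.
# Split and column names are guesses -- verify against the dataset viewer.

def preview(rows, text_key="reasoning", limit=3):
    """Collect the first `limit` non-empty values of `text_key` from dataset rows."""
    out = []
    for row in rows:
        text = row.get(text_key)
        if text:
            out.append(text)
        if len(out) == limit:
            break
    return out

if __name__ == "__main__":
    from datasets import load_dataset  # pip install datasets
    ds = load_dataset("cfahlgren1/anthropic-agentic-misalignment-results", split="train")
    for trace in preview(ds):
        print(trace[:500], "\n---")
```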

reach-vb posted an update 13 days ago
Excited to onboard FeatherlessAI on Hugging Face as an Inference Provider - they bring a fleet of 6,700+ LLMs on-demand on the Hugging Face Hub 🤯

Starting today, you can access all those LLMs (OpenAI-compatible) on HF model pages and via OpenAI client libraries too! 💥

Go play with it today: https://huggingface.co/blog/inference-providers-featherless

P.S. They're also bringing on more GPUs to support all your concurrent requests!
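A sketch of what "OpenAI compatible" means in practice. The router URL, model id, and `:featherless-ai` provider suffix below are assumptions, so copy the exact snippet from a model page's "Use this model" menu rather than trusting these constants:

```python
# Hypothetical sketch: call a Featherless-served model through the OpenAI client.
# ROUTER_BASE_URL and MODEL are illustrative guesses, not confirmed values.
import os

ROUTER_BASE_URL = "https://router.huggingface.co/v1"  # OpenAI-compatible endpoint
MODEL = "meta-llama/Llama-3.1-8B-Instruct:featherless-ai"  # hypothetical model id

def build_client():
    from openai import OpenAI  # pip install openai
    # Authenticate with a Hugging Face token, not an OpenAI key.
    return OpenAI(base_url=ROUTER_BASE_URL, api_key=os.environ["HF_TOKEN"])

if __name__ == "__main__":
    client = build_client()
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)
```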
ariG23498 posted an update 22 days ago
🚨 Implement KV Cache from scratch in pure PyTorch. 🚨

We've documented everything we learned while implementing the KV Cache in nanoVLM. Joint work with @kashif @lusxvr @andito @pcuenq

Blog: hf.co/blog/kv-cache
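The core trick can be sketched in a few lines: at each decode step, append the new key/value tensors instead of recomputing attention inputs for the whole prefix. This is a simplified stand-in with the usual (batch, heads, seq, head_dim) layout, not the nanoVLM implementation from the blog post:

```python
# Minimal KV cache sketch in pure PyTorch (illustrative, not nanoVLM's code).
import torch

class KVCache:
    def __init__(self):
        self.k = None  # (batch, heads, seq, head_dim)
        self.v = None

    def update(self, k_new, v_new):
        """Append this step's keys/values along the sequence axis and
        return the full cached tensors."""
        if self.k is None:
            self.k, self.v = k_new, v_new
        else:
            self.k = torch.cat([self.k, k_new], dim=2)
            self.v = torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

# During decoding, each step produces exactly one new token's K/V:
cache = KVCache()
for step in range(4):
    k = torch.randn(1, 8, 1, 64)
    v = torch.randn(1, 8, 1, 64)
    k_all, v_all = cache.update(k, v)
# After 4 steps the cache holds K/V for the whole generated sequence,
# so attention only needs the single new query token per step.
```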
cfahlgren1 posted an update about 1 month ago
Yesterday, we dropped a new conversational viewer for datasets on the hub! 💬

Actually being able to view and inspect your data is extremely important. This is a big step in making data more accessible and actionable for everyone.

Here are some datasets you can try it out on:
• mlabonne/FineTome-100k
• Salesforce/APIGen-MT-5k
• open-thoughts/OpenThoughts2-1M
• allenai/tulu-3-sft-mixture

Any other good ones?
reach-vb posted an update about 1 month ago
hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend! 💥

as you know, we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/download speeds too): https://huggingface.co/blog/xet-on-the-hub. Now that we're certain the backend can scale with even big models like Llama 4 / Qwen 3, we're moving to the next phase of inviting impactful orgs and users on the hub over. As you're a big part of the open source ML community, we'd love to onboard you next and create some excitement about it in the community too!

in terms of actual steps - it should be as simple as one of the org admins joining hf.co/join/xet - we'll take care of the rest.

p.s. you'd need the latest hf_xet version of the huggingface_hub lib, but everything else should be the same: https://huggingface.co/docs/hub/storage-backends#using-xet-storage

p.p.s. this is fully backwards compatible, so everything will work as it should! 🤗
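For anyone following along, the client-side step boils down to one install (assuming the hf_xet extra described in the linked storage-backends docs):

```shell
# Upgrade huggingface_hub with xet backend support; uploads/downloads
# then go through xet automatically once the org is enrolled.
pip install -U "huggingface_hub[hf_xet]"
```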
cfahlgren1 posted an update 5 months ago
If you haven't seen yet, we just released Inference Providers 🔀

> 4 new serverless inference providers on the Hub 🤯
> Use your HF API key or personal key with all providers 🔑
> Chat with Deepseek R1, V3, and more on HF Hub 🐋
> We support Sambanova, TogetherAI, Replicate, and Fal.ai 💪

Best of all, we don't charge any markup on top of the provider 🫰 Have you tried it out yet? HF Pro accounts get $2 of free usage for the provider inference.
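A sketch of what that looks like with huggingface_hub's `InferenceClient` (requires a recent huggingface_hub release; the provider and model names here are illustrative, check a model page for the providers actually serving it):

```python
# Hypothetical sketch: route a chat completion through a specific provider.
import os

MESSAGES = [{"role": "user", "content": "Summarize attention in one line."}]

def run():
    from huggingface_hub import InferenceClient  # pip install -U huggingface_hub
    client = InferenceClient(provider="together", api_key=os.environ["HF_TOKEN"])
    out = client.chat_completion(messages=MESSAGES, model="deepseek-ai/DeepSeek-R1")
    return out.choices[0].message.content

if __name__ == "__main__":
    print(run())
```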
ariG23498 posted an update 5 months ago
Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
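For readers who want the destination of that derivation up front: the KL-regularized RLHF objective has a closed-form optimal policy, inverting it gives an implicit reward, and plugging that reward into the Bradley-Terry preference model yields the standard DPO loss:

```latex
% Implicit reward recovered from the optimal policy of KL-regularized RLHF:
r(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\text{ref}}(y \mid x)} + \beta \log Z(x)

% The partition function Z(x) cancels inside the Bradley-Terry model, giving:
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}})
  = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}
    \right) \right]
```

Here $y_w$ and $y_l$ are the chosen and rejected responses, $\sigma$ is the logistic function, and $\beta$ controls the KL penalty strength.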
cfahlgren1 posted an update 6 months ago
Wow, I just added Langfuse tracing to the Deepseek Artifacts app and it's really nice 🔥

It allows me to visualize and track more things along with the cfahlgren1/react-code-instructions dataset.

It was just added as a one-click Docker Space template, so it's super easy to self host 💪
cfahlgren1 posted an update 6 months ago
You'll notice the AI in the SQL Console is much better at working with chatml conversations:

Here's an example of unnesting cfahlgren1/react-code-instructions in less than 10 seconds just by asking. Check it out here: cfahlgren1/react-code-instructions

- "show me the average assistant response length"
- "extract user, system, and assistant messages into separate columns"

It's super easy to work with conversational datasets now with natural language 🗣️
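As a sanity check on what the first query computes, here's the same aggregation in plain Python over chatml-style rows (a `messages` column holding role/content dicts, the usual layout for these datasets):

```python
# "show me the average assistant response length", expressed directly:
def avg_assistant_length(rows):
    """Mean character length of assistant messages across all conversations."""
    lengths = [
        len(m["content"])
        for row in rows
        for m in row["messages"]
        if m["role"] == "assistant"
    ]
    return sum(lengths) / len(lengths) if lengths else 0.0

rows = [
    {"messages": [{"role": "user", "content": "hi"},
                  {"role": "assistant", "content": "hello!"}]},
    {"messages": [{"role": "assistant", "content": "1234"}]},
]
# mean of 6 and 4 -> 5.0
```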

reach-vb posted an update 7 months ago
VLMs are going through quite an open revolution, AND at on-device friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

2. OpenGVLabs w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c

3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: Qwen/qwen2-vl-66cee7455501d7126940800d

4. Microsoft w/ FlorenceVL - 3B & 8B: @jiuhai

5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! 🔥
cfahlgren1 posted an update 7 months ago
You can just ask things 🗣️

"show me messages in the coding category that are in the top 10% of reward model scores"

Download really high quality instructions from the Llama 3.1 405B synthetic dataset 🔥

argilla/magpie-ultra-v1.0
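That query's filter can be sketched in plain Python. The `category` and `score` column names below are illustrative stand-ins, not the dataset's confirmed schema:

```python
# Keep rows in one category whose reward-model score is in the top 10%.
def top_decile(rows, category="coding"):
    pool = [r for r in rows if r["category"] == category]
    if not pool:
        return []
    scores = sorted(r["score"] for r in pool)
    cutoff = scores[int(0.9 * (len(scores) - 1))]  # simple 90th-percentile index
    return [r for r in pool if r["score"] >= cutoff]
```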

cfahlgren1 posted an update 7 months ago
We just dropped an LLM inside the SQL Console 🤯

The amazing new Qwen/Qwen2.5-Coder-32B-Instruct model can now write SQL for any Hugging Face dataset ✨

It's 2025, you shouldn't be hand-writing SQL! This is a big step toward letting anyone do in-depth analysis on a dataset. Let us know what you think 🤗
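To make concrete the kind of SQL the assistant generates, here's a toy run using Python's stdlib sqlite3 as a stand-in for the console's DuckDB engine (the table and data here are made up for illustration):

```python
# Toy conversation table + the sort of aggregate SQL you'd ask the console for.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE messages (role TEXT, content TEXT)")
con.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [("user", "hi"), ("assistant", "hello!"), ("assistant", "1234")],
)

# e.g. "how many messages are there per role?"
rows = con.execute(
    "SELECT role, COUNT(*) AS n FROM messages GROUP BY role ORDER BY role"
).fetchall()
# -> [("assistant", 2), ("user", 1)]
```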