We now have the newest OpenAI models available on the Dell Enterprise Hub!
We built the Dell Enterprise Hub to give our on-prem customers access to the latest and greatest models from the Hugging Face community. We're happy to provide secure access to this amazing contribution from OpenAI on the day of its launch!
You can now find Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 in the Hugging Face Collection in Azure ML and Azure AI Foundry, along with 10k other Hugging Face models 🤗🤗
ZML just released a technical preview of their new Inference Engine: LLMD.
- Just a 2.4GB container, which means fast startup times and efficient autoscaling
- Cross-platform GPU support: works on both NVIDIA and AMD GPUs
- Written in Zig
I just tried it out and deployed it on Hugging Face Inference Endpoints and wrote a quick guide 👇 You can try it in like 5 minutes!
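Once an endpoint like this is up, most serving engines expose an OpenAI-compatible chat route. As a hedged sketch (the URL is a placeholder, and the exact route is an assumption based on common engine conventions, not something the post specifies), the request body looks like this:

```python
import json

# Hypothetical values: replace with your own Inference Endpoint URL and HF token.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud/v1/chat/completions"

def build_chat_payload(prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible chat completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_payload("Say hello in one sentence.")
# Send it with e.g.:
#   requests.post(ENDPOINT_URL, json=payload,
#                 headers={"Authorization": "Bearer <HF_TOKEN>"})
print(json.dumps(payload, indent=2))
```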
We just released native support for @SGLang and @vllm-project in Inference Endpoints 🔥
Inference Endpoints is becoming the central place to deploy high-performance inference engines, and it provides the managed infra for them. Instead of spending weeks configuring infrastructure, managing servers, and debugging deployment issues, you can focus on what matters most: your AI model and your users 🙌
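Endpoint creation can also be scripted with `huggingface_hub`. A hedged sketch: the instance values below are illustrative (what's available depends on your account), and I'm assuming engine selection (vLLM vs. SGLang) happens in the endpoint configuration rather than showing an exact parameter for it:

```python
def deploy_endpoint(name: str, repo: str):
    """Create a GPU Inference Endpoint for `repo`.

    Hedged sketch: instance/vendor values are illustrative, and picking the
    serving engine (vLLM / SGLang) is assumed to happen in the endpoint config.
    Requires `pip install huggingface_hub` and a valid HF token.
    """
    from huggingface_hub import create_inference_endpoint  # deferred optional dep

    return create_inference_endpoint(
        name,
        repository=repo,
        framework="pytorch",
        task="text-generation",
        accelerator="gpu",
        vendor="aws",
        region="us-east-1",
        instance_size="x1",
        instance_type="nvidia-a10g",
        type="protected",
    )

# endpoint = deploy_endpoint("my-llm", "meta-llama/Llama-3.1-8B-Instruct")
# endpoint.wait()  # block until the endpoint is running
```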
Fine-tune Gemma3n on videos with audio inside, on a Colab A100 🔥 Just dropped a notebook where you can learn how to fine-tune Gemma3n on images+audio+text at the same time!
keep in mind, it's made for educational purposes 🫡 we use LoRA, audio resampling & video downsampling to be able to train under 40GB VRAM. stretch modalities and unfreeze layers as you wish! 🙏🏻 merve/smol-vision
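To make the VRAM trick concrete, here's a minimal numpy sketch of the two preprocessing ideas (illustrative stand-ins for what the notebook does with proper audio/video libraries): resample audio to a lower rate, and keep only every k-th video frame.

```python
import numpy as np

def resample_audio(wave: np.ndarray, src_rate: int, dst_rate: int) -> np.ndarray:
    """Linear-interpolation resampling (a simple stand-in for torchaudio/librosa)."""
    duration = len(wave) / src_rate
    n_out = int(round(duration * dst_rate))
    src_t = np.arange(len(wave)) / src_rate
    dst_t = np.arange(n_out) / dst_rate
    return np.interp(dst_t, src_t, wave)

def downsample_video(frames: np.ndarray, keep_every: int = 4) -> np.ndarray:
    """Keep every k-th frame to cut the token/VRAM cost of the vision tower."""
    return frames[::keep_every]

audio = np.random.randn(48_000)           # 1 s of 48 kHz audio
video = np.zeros((32, 224, 224, 3))       # 32 RGB frames
print(resample_audio(audio, 48_000, 16_000).shape)  # (16000,)
print(downsample_video(video).shape)                # (8, 224, 224, 3)
```

Fewer samples and fewer frames means shorter input sequences, which is where most of the memory saving comes from.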
🎉 New in Azure Model Catalog: NVIDIA Parakeet TDT 0.6B V2
We're excited to welcome Parakeet TDT 0.6B V2—a state-of-the-art English speech-to-text model—to the Azure Foundry Model Catalog.
What is it?
A powerful ASR model built on the FastConformer-TDT architecture, offering:
🕒 Word-level timestamps
✍️ Automatic punctuation & capitalization
🔊 Strong performance across noisy and real-world audio
It runs with NeMo, NVIDIA's toolkit for building and deploying speech and language models.
Want to give it a try? 🎧 You can test it with your own audio (up to 3 hours) on Hugging Face Spaces before deploying. If it fits your needs, deploy easily from the Hugging Face Hub or Azure ML Studio with secure, scalable infrastructure!
📘 Learn more by following this guide written by @alvarobartt
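If you'd rather try it locally first, a minimal sketch of running the model through NeMo (the call shape follows the public model card; it needs `pip install -U "nemo_toolkit[asr]"` and realistically a GPU):

```python
def transcribe(audio_paths):
    """Transcribe audio files with Parakeet TDT 0.6B V2 via NeMo.

    Hedged sketch: model ID and call shape follow the public model card;
    requires `pip install -U "nemo_toolkit[asr]"` plus downloaded weights.
    """
    import nemo.collections.asr as nemo_asr  # heavy dependency, deferred

    model = nemo_asr.models.ASRModel.from_pretrained(
        model_name="nvidia/parakeet-tdt-0.6b-v2"
    )
    # timestamps=True also returns word/segment offsets alongside the text
    return model.transcribe(audio_paths, timestamps=True)

# results = transcribe(["meeting.wav"])
```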
🚀 New in smolagents v1.20.0: Remote Python Execution via WebAssembly (Wasm)
We've just merged a major new capability into the smolagents framework: the CodeAgent can now execute Python code remotely in a secure, sandboxed WebAssembly environment!
🔧 Powered by Pyodide and Deno, this new WasmExecutor lets your agent-generated Python code run safely, without relying on Docker or local execution.
Why this matters:
✅ Isolated execution = no host access
✅ No need for Python on the user's machine
✅ Safer evaluation of arbitrary code
✅ Compatible with serverless / edge agent workloads
✅ Ideal for constrained or untrusted environments
This is just the beginning: a focused initial implementation with known limitations, but a solid MVP designed for secure, sandboxed use cases.
💡 We're inviting the open-source community to help evolve this executor:
• Tackle more advanced Python features
• Expand compatibility
• Add test coverage
• Shape the next-gen secure agent runtime
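Using it should look roughly like this. A hedged sketch: `executor_type="wasm"` follows the v1.20.0 release notes, and the snippet assumes `smolagents` is installed along with Deno on the host:

```python
def make_wasm_agent():
    """Build a CodeAgent whose generated Python runs in the Wasm sandbox.

    Hedged sketch: `executor_type="wasm"` follows the smolagents v1.20.0
    release; requires `pip install smolagents` plus a Deno install on the host.
    """
    from smolagents import CodeAgent, InferenceClientModel  # optional deps, deferred

    return CodeAgent(
        tools=[],
        model=InferenceClientModel(),
        executor_type="wasm",  # run code in Pyodide inside Deno instead of locally
    )

# agent = make_wasm_agent()
# agent.run("Compute the 10th Fibonacci number.")
```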