Ilyas Moutawwakil

IlyasMoutawwakil

AI & ML interests

Optimization, LLMs, Hardware, Backends, ...

Recent Activity

posted an update 6 days ago

🚀 Optimum: The Last v1 Release 🚀

Optimum v1.27 marks the final major release in the v1 series. As we close this chapter, we're laying the groundwork for a more modular and community-driven future:
- Optimum v2: A lightweight core package for porting Transformers, Diffusers, or Sentence-Transformers to specialized AI hardware/software/accelerators.
- Optimum-ONNX: A dedicated package where the ONNX/ONNX Runtime ecosystem lives and evolves, faster-moving and decoupled from the Optimum core.

🎯 Why this matters:
- A clearer governance path for ONNX, fostering stronger community collaboration and an improved developer experience.
- Faster innovation in a more modular, open-source environment.

💡 What this means:
- More transparency, broader participation, and faster development driven by the community and key actors in the ONNX ecosystem (PyTorch, Microsoft, Joshua Lochner 👀, ...)
- A cleaner, more maintainable core Optimum, focused on extending HF libraries to specialized AI hardware/software/accelerator tooling and used by our partners (Intel Corporation, Amazon Web Services (AWS), AMD, NVIDIA, FuriosaAI, ...)

🛠️ Major updates I worked on in this release:
✅ Added support for Transformers v4.53 and SmolLM3 in ONNX/ONNX Runtime.
✅ Solved batched inference/generation for all supported decoder model architectures (LLMs).

✨ Big shoutout to @echarlaix for leading the refactoring work that cleanly separated the ONNX exporter logic and enabled the creation of Optimum-ONNX.

📝 Release Notes: https://lnkd.in/gXtE_qji
📦 Optimum: https://lnkd.in/ecAezNT6
🎁 Optimum-ONNX: https://lnkd.in/gzjyAjSi

#Optimum #ONNX #OpenSource #HuggingFace #Transformers #Diffusers

upvoted an article 8 days ago

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

updated a model 9 days ago

optimum-internal-testing/tiny-random-falcon-alibi-True

published an article 3 months ago

Article

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

By … and 8 others • Apr 29 • 38

published an article 4 months ago

Article

Accelerating LLM Inference with TGI on Intel Gaudi

By … and 4 others • Mar 28 • 14

published an article 8 months ago

Article

Benchmarking Language Model Performance on 5th Gen Xeon at GCP

By … and 2 others • Dec 17, 2024 • 6

published an article over 1 year ago

Article

AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU

Dec 5, 2023 • 4

published an article almost 2 years ago

Article

Overview of natively supported quantization schemes in 🤗 Transformers

By … and 4 others • Sep 12, 2023 • 12