Quentin Anthony

qanthony-z

AI & ML interests

None yet

Recent Activity

updated a model 17 days ago

Zyphra/ZAYA1-reasoning-base

new activity 17 days ago

Zyphra/ZAYA1-8B:Update README.md

new activity 17 days ago

Zyphra/ZAYA1-8B:What on earth did you guys do

authored a paper about 2 years ago

Zamba: A Compact 7B SSM Hybrid Model

Paper • 2405.16712 • Published May 26, 2024 • 25

authored 6 papers over 2 years ago

Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference

Paper • 2401.08383 • Published Jan 16, 2024 • 1

The Case for Co-Designing Model Architectures with Hardware

Paper • 2401.14489 • Published Jan 25, 2024 • 4

Continual Pre-Training of Large Language Models: How to (re)warm your model?

Paper • 2308.04014 • Published Aug 8, 2023 • 2

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Paper • 2304.01373 • Published Apr 3, 2023 • 9

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

Paper • 2204.06745 • Published Apr 14, 2022 • 1

BlackMamba: Mixture of Experts for State-Space Models

Paper • 2402.01771 • Published Feb 1, 2024 • 25