Beyond Release: Access Considerations for Generative AI Systems • Paper • 2502.16701 • Published Feb 23, 2025
Power Hungry Processing: Watts Driving the Cost of AI Deployment? • Paper • 2311.16863 • Published Nov 28, 2023
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources • Paper • 2201.10066 • Published Jan 25, 2022
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code • Paper • 2206.11249 • Published Jun 22, 2022
BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model • Paper • 2212.04960 • Published Dec 9, 2022
Evaluating the Social Impact of Generative AI Systems in Systems and Society • Paper • 2306.05949 • Published Jun 9, 2023
Stronger Together: on the Articulation of Ethical Charters, Legal Tools, and Technical Documentation in ML • Paper • 2305.18615 • Published May 9, 2023
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards • Paper • 2108.07374 • Published Aug 16, 2021
Stable Bias: Analyzing Societal Representations in Diffusion Models • Paper • 2303.11408 • Published Mar 20, 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages • Paper • 2303.12582 • Published Mar 22, 2023
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset • Paper • 2303.03915 • Published Mar 7, 2023
Towards Openness Beyond Open Access: User Journeys through 3 Open AI Collaboratives • Paper • 2301.08488 • Published Jan 20, 2023