Today, we are excited to launch Aya 23, a state-of-the-art multilingual open-weights release in 8B and 35B parameter sizes. Aya 23 pairs a highly performant pre-trained model with the recent Aya dataset, making multilingual generative AI breakthroughs accessible to the research community. 🌍
https://lnkd.in/e2jTSG4w
Our original Aya model covered 101 languages. Aya 23 is an experiment in depth vs. breadth ↔️ ↕️, exploring the impact of allocating more capacity to a smaller set of languages included during pre-training.
Aya 23 expands state-of-the-art language modeling capabilities to nearly half of the world's population, outperforming Aya 101 and other widely used open-weight instruction-tuned models 🌍
We report win rates and find that Aya 23 consistently generates higher-quality responses. On both discriminative and generative multilingual benchmarks, Aya-23-35B achieves the best results for the languages it covers, while Aya-23-8B demonstrates best-in-class multilingual performance for its model size. Dive deeper into the work with our technical report, where you’ll find evaluation results on multiple multilingual NLP benchmarks and generation-quality assessments. 📝 https://lnkd.in/gGUyAaZj
Most high-performing language models serve only a handful of languages. This release, together with the wider Aya family of models, is part of a commitment to state-of-the-art research demonstrating that more languages can be treated as first-class citizens.
This release builds on the momentum of Aya, an open-science movement that brought together 3,000 collaborators from 119 countries.
Aya 23 is now available to experiment with, explore, and build on for fundamental research and safety auditing. 🎉 Access the model, read the report, and learn more about the Aya initiative here: https://lnkd.in/e2jTSG4w
Try Aya 23: https://lnkd.in/gp4ZxWJv
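For those who prefer to experiment locally, here is a minimal Python sketch using the Hugging Face transformers library. It assumes the 8B weights are hosted on Hugging Face under the id "CohereForAI/aya-23-8B"; check the model card at the links above for the exact id and license terms.

# Minimal sketch for trying Aya 23 locally.
# Assumption: weights are hosted on Hugging Face as "CohereForAI/aya-23-8B".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/aya-23-8B"  # assumed model id; verify on the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Format a chat-style prompt with the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Translate 'good morning' into Turkish."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Here device_map="auto" lets transformers place the weights across whatever GPUs (or CPU) are available, which is handy for the larger 35B variant.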