Featured Articles

- MLCommons and AI Verify to collaborate on AI Safety Initiative
  Agreeing to a memorandum of intent to collaborate on a set of AI safety benchmarks for LLMs
- Creating a comprehensive Test Specification Schema for AI Safety
  Helping to systematically document the creation, implementation, and execution of AI safety tests
- Announcing MLCommons AI Safety v0.5 Proof of Concept
  Achieving a major milestone towards standard benchmarks for evaluating AI Safety
- The AI Safety Ecosystem Needs Standard Benchmarks
  IEEE Spectrum contributed blog excerpt, authored by the MLCommons AI Safety working group
- New MLPerf Inference Benchmark Results Highlight the Rapid Growth of Generative AI Models
  With 70 billion parameters, Llama 2 70B is the largest model added to the MLPerf Inference benchmark suite
- Llama 2 70B: An MLPerf Inference Benchmark for Large Language Models
  The MLPerf Inference task force shares insights on the selection of Llama 2 for the latest MLPerf Inference benchmark round
Blog

- Creating a comprehensive Test Specification Schema for AI Safety
  Helping to systematically document the creation, implementation, and execution of AI safety tests
- Unveiling the PRISM Alignment Project
  Prioritizing the Data-Centric Human Factors for Aligning Large Language Models
- The AI Safety Ecosystem Needs Standard Benchmarks
  IEEE Spectrum contributed blog excerpt, authored by the MLCommons AI Safety working group
News

- MLCommons and AI Verify to collaborate on AI Safety Initiative
  Agreeing to a memorandum of intent to collaborate on a set of AI safety benchmarks for LLMs
- MLPerf Mobile v4.0 application adds new benchmark, expands hardware support
  The mobile benchmark suite adds a brand-new image classification model and supports neural acceleration on some of the latest mobile devices
- Unveiling the PRISM Alignment Project
  Prioritizing the Data-Centric Human Factors for Aligning Large Language Models