MLCommons
@MLCommons

807 posts

Better Artificial Intelligence for Everyone

Joined September 2020
148 Following · 3.6K Followers

Pinned Tweet
MLCommons @MLCommons
MLPerf Inference v6.0 is here - our most significant benchmark update ever. 5 new/updated benchmarks. 24 submitting organizations. Industry-first tests for text-to-video and speculative decoding. Full results: mlcommons.org/2026/04/mlperf… #MLPerf #MLCommons
MLCommons @MLCommons
MLPerf Endpoints uses step functions, not trend lines. Interpolating between measured points can hide real failures: memory overflows, P99 spikes. Only verified operating points. No paper performance. bit.ly/3Pjx34u #MLPerf #AIBenchmarking
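The step-function point in the tweet above can be illustrated with a small sketch. All numbers here are synthetic and purely hypothetical, not MLPerf measurements; the sketch only shows why a trend line drawn between two measured operating points can hide a real failure at an intermediate load:

```python
# Illustrative only: made-up P99 latency measurements at three load levels.
# A linear trend line between the outer points predicts smooth behavior and
# completely hides the spike actually measured at the middle point.

measured = {  # requests/sec -> measured P99 latency in ms (synthetic data)
    100: 80.0,
    200: 950.0,   # real measurement: a latency spike (e.g. memory pressure)
    300: 140.0,
}

def interpolate_p99(rps_a, rps_b, rps):
    """Naive trend line between two measured endpoints."""
    frac = (rps - rps_a) / (rps_b - rps_a)
    return measured[rps_a] + frac * (measured[rps_b] - measured[rps_a])

trend = interpolate_p99(100, 300, 200)  # 110.0 ms: looks healthy
actual = measured[200]                  # 950.0 ms: the spike the trend hides

print(f"interpolated P99 at 200 rps: {trend:.1f} ms")
print(f"measured     P99 at 200 rps: {actual:.1f} ms")
```

Reporting only the measured points, as a step function, is what prevents this kind of on-paper performance.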
MLCommons @MLCommons
MLCommons is at the ISO Plenary in Singapore this week. As AI safety becomes a global policy priority, aligning international standards matters more than ever. Stay tuned for our recap. #MLCommons #AILuminate #AIPolicy #ISO
MLCommons @MLCommons
The median AI benchmark longevity score is 5/100. AILuminate scored 75, but even that degrades over time. So we built the Continuous Prompt Stewardship System to keep it high. New from the @MLCommons AIRR team: mlcommons.org/2026/04/contin…
MLCommons @MLCommons
Early MLPerf Endpoints results include DeepSeek-R1, GPT OSS 120B, Llama 3.1 8B, QWEN 3 Coder 480B — across nearly a dozen systems. More models added as rolling submissions open in Q2 2026. Endpoints.MLCommons.org #MLPerf #GenerativeAI
MLCommons @MLCommons
Excited to share that MLCommons' Andrew Gruen, PhD will be speaking at The New Wave of AI in Healthcare 2026 — a two-day symposium in New York City, May 12-13. Register now: lnkd.in/efz2t-Ja #AI #Healthcare #AIinHealthcare #MLCommons
Quoted tweet: NYAS @NYASciences
📢 Registration is open for The New Wave of AI in Healthcare 2026, happening May 12-13 in NYC. Co-hosted by the Academy & @IcahnMountSinai, this two-day symposium explores how AI is transforming diagnosis, treatment, & healthcare delivery. Register now: events.nyas.org/event/aihealth…

MLCommons @MLCommons
Introducing MLPerf Endpoints - a ground-up rethinking of how the industry benchmark of record measures GenAI performance. Step functions. Verified operating points. No interpolation, no paper performance. If it has an API, we can measure it. bit.ly/4ss4ekm
MLCommons @MLCommons
MLPerf Client v1.6 is out - updated runtimes for Windows ML, llama.cpp, and Apple MLX, plus GUI improvements for faster, smoother benchmarking on Windows, Mac, and iPad. Available now at mlcommons.org and the iOS and Mac App Stores. mlcommons.org/2026/04/mlperf…
MLCommons @MLCommons
Want a full walkthrough of what's new in MLPerf Inference v6.0? Watch the press briefing for a deep dive into five new benchmarks, standout results, and what it all means for the future of AI inference. ▶️ youtu.be/3FdkYZZlhDI #MLPerf #AI #Inference #MLCommons
MLCommons @MLCommons
How do we measure AI's growing energy footprint? MLCommons' David Kanter joins researchers from Artificial Analysis and Purdue on April 16 in San Francisco to discuss AI benchmarking for environmental impact, performance, and energy usage. Register below. luma.com/50nggqqb
MLCommons @MLCommons
MLCommons has joined Partnership on AI - connecting our open engineering community with a global network working toward AI systems that are accurate, safe, and accountable. Excited to be part of this coalition. bit.ly/4t5usu9
MLCommons @MLCommons
The AILuminate Global Assurance Program gives organizations a third-party path to evaluate AI safety - built on the most comprehensive open AI safety benchmark available. Developed with Google, Microsoft, Qualcomm, and KPMG. bit.ly/4kIS18x
MLCommons @MLCommons
When choosing an AI service, you need to know what it can actually deliver under real conditions. MLPerf Endpoints benchmarks only verified operating points: reproducible results you can compare directly against your latency and throughput requirements. bit.ly/4ss4ekm
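The comparison the tweet describes, checking verified operating points against your own latency and throughput requirements, can be sketched roughly like this. The endpoint names, numbers, and thresholds are all hypothetical, not actual MLPerf Endpoints results:

```python
# Illustrative sketch: select endpoints whose *measured* operating points meet
# fixed requirements. All data here is hypothetical.

operating_points = [  # (endpoint, throughput in tokens/sec, P99 latency in ms)
    ("endpoint-a", 1200, 450),
    ("endpoint-b", 900, 180),   # fast, but misses the throughput requirement
    ("endpoint-c", 1500, 800),  # high throughput, but misses the latency bound
]

REQUIRED_THROUGHPUT = 1000  # tokens/sec, example requirement
MAX_P99_LATENCY = 500       # ms, example requirement

qualifying = [
    name
    for name, throughput, p99 in operating_points
    if throughput >= REQUIRED_THROUGHPUT and p99 <= MAX_P99_LATENCY
]
print(qualifying)
```

Because every operating point is a verified measurement rather than an interpolated trend, a filter like this compares against numbers a system actually achieved.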
MLCommons @MLCommons
MLCommons took a different approach to jailbreak tests - publishing a mechanism-first taxonomy for single-turn jailbreak evaluation that's reproducible, governance-aligned, and built for real-world AI deployment. bit.ly/3ZCHqlZ