
Benchmarking AI with MLPerf

How fast is your machine learning infrastructure, and how do you measure it? That’s the topic of this episode, featuring David Kanter of MLCommons, Frederic Van Haren, and Stephen Foskett.

MLCommons is focused on making machine learning better for everyone through metrics, datasets, and enablement. The goal of MLPerf is a fair and representative benchmark that lets the makers of ML systems demonstrate the performance of their solutions. Each benchmark starts from real data and a reference ML model that defines correctness; submitters measure the performance of their solutions against it, and the results are reviewed and published. MLPerf started with training and later added inference, which is the main focus for users of ML. Cost and power use must also be considered when evaluating a system, and a reliable benchmark makes it easier to compare systems on all of these fronts.
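As a rough illustration of what an inference benchmark measures, the sketch below times a stand-in model and reports throughput and latency percentiles. This is not MLPerf's actual harness (MLPerf Inference uses its own load generator and accuracy checks against the reference model); the `run_benchmark` function, the `fake_model` callable, and the synthetic data are hypothetical placeholders for illustration only.

```python
import time
import statistics

def run_benchmark(model_fn, samples, warmup=10):
    """Time single-sample inference for a callable model.

    model_fn: any callable mapping one input sample to a prediction
    samples:  a list of preprocessed input samples
    """
    # Warm up so one-time costs (JIT, cache fills) don't skew the numbers
    for sample in samples[:warmup]:
        model_fn(sample)

    latencies = []
    for sample in samples:
        start = time.perf_counter()
        model_fn(sample)
        latencies.append(time.perf_counter() - start)

    total = sum(latencies)
    return {
        "samples": len(latencies),
        "throughput_per_s": len(latencies) / total,
        "p50_ms": statistics.median(latencies) * 1000,
        "p99_ms": statistics.quantiles(latencies, n=100)[98] * 1000,
    }

if __name__ == "__main__":
    # Stand-in "model": replace with a real inference call
    fake_model = lambda x: sum(x)
    data = [[float(i)] * 64 for i in range(1000)]
    print(run_benchmark(fake_model, data))
```

A real benchmark would additionally verify that the model's outputs meet the accuracy target defined by the reference model, since speed alone says nothing about correctness.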

Three Questions

Frederic: Is it possible to create a truly unbiased AI?

Stephen: How big can ML models get? Will today’s hundred-billion-parameter models look small tomorrow, or have we reached the limit?

Andy Hock, Cerebras: What AI application would you build or what AI research would you conduct if you were not constrained by compute?

Guests and Hosts

David Kanter is the Executive Director of MLCommons. You can connect with David on Twitter at @TheKanter and on LinkedIn. You can also send David an email at [email protected].

Frederic Van Haren is the CTO and Founder at HighFens Inc., Consultancy & Services. Connect with Frederic on LinkedIn or on X/Twitter, and check out the HighFens website.

Stephen Foskett is the Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen’s writing at GestaltIT.com, on Twitter at @SFoskett, or on Mastodon at @[email protected].