2.78 TFLOPS on a Fanless MacBook Air? Benchmarking Apple's M4 with MLX

Source: DEV Community
(This article is an English translation of a post originally published in Japanese on my blog. You can read the original Japanese version here.)

My fanless M4 MacBook Air hit 2.78 TFLOPS in a matrix multiplication benchmark using Apple's MLX framework.

Matrix multiplication (GEMM) isn't just a basic math problem; it's the beating heart of modern Machine Learning and Large Language Models (LLMs). By measuring how fast a machine can multiply huge matrices, we are essentially measuring its raw capability to run AI locally. Let's see what the M4 chip can do.

< Test Environment >
Machine: M4 MacBook Air
Memory: 16GB
Python: 3.10.11
Framework: MLX v0.28.0

1. Measuring Execution Time

To measure the execution time, I used a simple matrix multiplication operation.

1.1 The Benchmark Script

I've published the measurement script on GitHub Gist. Feel free to download it and test it on your own Apple Silicon Mac.

Note: I ran the benchmark last summer. I have recently verified that the script s