Apple M4 and AI training: Neural Engine shows hidden potential

Danny Weber

A developer found a way around Apple’s software limits and showed that the Neural Engine in M4 chips can handle more than simple inference.

A security researcher and developer using the alias 0X0SOJALSEC says they have found a way around the Apple restrictions that normally prevent the Neural Engine in M4 chips from being used for full AI model training.

Apple’s Neural Engine normally serves as an accelerator for already trained models and on-device AI features. The developer, however, managed to use the block for far more demanding workloads, including training transformer models with backpropagation.
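To make the difference between inference and training concrete, here is a toy sketch in plain Python. It is not taken from the project and does not touch the Neural Engine; the function names and the one-weight model are purely illustrative assumptions.

```python
def predict(w, x):
    """Inference: a forward pass only; the weight is read, never changed."""
    return w * x


def train_step(w, x, target, lr=0.1):
    """Training: forward pass, gradient of the squared error, weight update."""
    error = predict(w, x) - target
    grad = 2 * error * x  # derivative of (w * x - target) ** 2 with respect to w
    return w - lr * grad


# Repeated training steps move the weight toward the value that fits the data.
w = 0.0
for _ in range(50):
    w = train_step(w, x=1.0, target=3.0)
```

Inference only needs the forward pass in `predict`, while every training step also has to compute gradients and write updated weights back. In a transformer the same extra work is repeated for every layer, which is why training is the more demanding scenario.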

To make it work, the developer built a custom compiler layer based on Model Intermediate Language (MIL). It allows direct access to the Neural Engine, bypassing familiar Apple tools such as Core ML and Metal. During execution, data stays in RAM, which helps cut delays caused by constant writes to storage.

The developer also implemented a training recovery mechanism: if the process freezes or stops, the system can continue from the latest checkpoint without losing the progress already made.
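The article does not describe how the recovery mechanism is implemented, so the following is only a generic sketch of checkpoint-and-resume in Python. The file format, the function names and the simulated `crash_at` parameter are all hypothetical, and the weight update is a stand-in for a real training step.

```python
import json
import os
import tempfile


def save_checkpoint(path, step, weights):
    """Write the state to a temp file, then rename it over the target.

    The rename is atomic, so a crash mid-write cannot corrupt the checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step, "weights": weights}, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (step, weights) from disk, or a fresh state if none exists."""
    if not os.path.exists(path):
        return 0, [0.0, 0.0]
    with open(path) as f:
        state = json.load(f)
    return state["step"], state["weights"]


def train(path, total_steps, checkpoint_every=10, crash_at=None):
    """Run training from the latest checkpoint up to total_steps."""
    step, weights = load_checkpoint(path)
    while step < total_steps:
        if crash_at is not None and step == crash_at:
            raise RuntimeError("simulated freeze")  # hypothetical failure
        weights = [w + 0.1 for w in weights]  # stand-in for a real update
        step += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step, weights)
    return step, weights
```

In this sketch, a run that fails after 25 steps restarts from the checkpoint written at step 20, so only the steps since the last save are repeated. Saving more often shrinks that window at the cost of more writes.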

The project’s source code has already been published on GitHub. According to the author, early tests showed high performance: individual training stages for transformer-type models on M4 chips were completed in milliseconds.

Apple does not officially offer developers a way to train neural networks on the Neural Engine and presents the block mainly as an inference accelerator. However, this reverse-engineering work suggests that the hardware potential of these chips is much broader than Apple’s public tools imply.

If the approach proves useful in practice, Mac computers and iPad tablets could become a more interesting platform for local development and testing of small AI models without being tied to cloud services.

The discovery once again raises the question of how many capabilities are hidden inside Apple hardware, and how many of them remain unavailable not because of the silicon itself, but because of the company’s own software limits.

© T. Feodor