ANE Transformers is a reference PyTorch implementation of Transformer components optimized for the Apple Neural Engine (ANE) on devices with A14 or newer chips and on Macs with M1 or newer chips. It demonstrates how to structure attention and related layers so that, when deployed to the ANE, they run substantially faster and with lower peak memory than baseline implementations. The repository targets practitioners who want to keep familiar PyTorch modeling while preparing models for Core ML/ANE execution paths. The accompanying documentation reports concrete throughput and peak-memory improvements, and releases track incremental fixes and packaging updates. The project sits alongside related Apple ML repositories focused on deploying attention-based models efficiently to ANE-equipped hardware. In short, it is a practical blueprint for adapting Transformers to Apple's dedicated ML accelerator without rewriting entire model stacks.
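One structural idea behind ANE-friendly Transformer layers is a channels-first 4D data layout, (batch, channels, 1, seq_len), in place of the usual (batch, seq_len, channels), so that projections can be expressed as 1x1 convolutions. As an illustration only (this is not code from the repository, and is deliberately PyTorch-free), the layout change can be sketched in plain Python:

```python
def to_ane_layout(x):
    """Convert a (batch, seq, channels) nested list to the
    channels-first (batch, channels, 1, seq) layout favored by
    ANE-friendly conv-based layers. Illustrative sketch only."""
    batch, seq, channels = len(x), len(x[0]), len(x[0][0])
    return [
        # one singleton "height" axis wraps each channel's sequence
        [[[x[b][s][c] for s in range(seq)]] for c in range(channels)]
        for b in range(batch)
    ]

x = [[[1, 2], [3, 4], [5, 6]]]  # (B=1, S=3, C=2)
y = to_ane_layout(x)            # → [[[[1, 3, 5]], [[2, 4, 6]]]]
```

In real modules this is a tensor transpose plus an unsqueeze; the point here is the axis order, not the list manipulation.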
Features
- Reference PyTorch layers tailored for ANE deployment
- Target support for A14+ iOS devices and M1+ Macs
- Reported speedups of up to 10x and peak-memory reductions of up to 14x over baseline implementations, per the project's documentation
- Example code paths for attention and related modules
- Release artifacts to ease version pinning and integration
- Companion to Apple ML tooling for Core ML/ANE pipelines
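To ground the attention bullet above, here is a minimal, PyTorch-free sketch of single-head scaled dot-product attention over plain Python lists. The repository's ANE-optimized modules implement the same math but on channels-first 4D tensors with conv-based projections; everything below is illustrative and not taken from the repository's API.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    e = [math.exp(v - m) for v in xs]
    s = sum(e)
    return [v / s for v in e]

def attention(q, k, v):
    """Single-head scaled dot-product attention.
    q, k, v: (seq, dim) nested lists. Illustrative sketch only."""
    d = len(q[0])
    out = []
    for qi in q:
        # similarity of this query against every key, scaled by sqrt(d)
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        # output row is the attention-weighted mix of value rows
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v))
                    for c in range(len(v[0]))])
    return out

# With a single key/value pair, the output is exactly that value row:
attention([[1.0]], [[1.0]], [[5.0]])  # → [[5.0]]
```

The repository's contribution is not this math but how the surrounding layers are laid out so the same computation maps efficiently onto the ANE.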