Apple trained two key components of Apple Intelligence using Google's Tensor Processing Units (TPUs) rather than Nvidia's widely used graphics processing units (GPUs). Apple discloses its reliance on Google's cloud hardware in a new research paper (via CNBC).
Apple's research paper reveals that the company used 2,048 of Google's TPUv5p chips to train its on-device AI model and 8,192 TPUv4 processors for its server AI model. The paper never mentions Nvidia by name, and that omission suggests Apple relied entirely on Google's hardware.
Apple's decision is somewhat unusual: Nvidia has dominated the AI processor market thanks to the performance and efficiency of its chips. And while Nvidia sells its chips and systems as standalone physical products, Google's TPUs are accessed through the cloud, with customers building their software on Google's integrated tools and services.
In the research paper, Apple’s engineers discuss how Google’s TPUs allow them to efficiently train large AI models.
The AFM models are pre-trained on v4 and v5p Cloud TPU clusters with the AXLearn framework [Apple, 2023], a JAX [Bradbury et al., 2018] based deep learning library designed for the public cloud. Training is conducted using a combination of tensor, fully-sharded-data-parallel, and sequence parallelism, allowing training to scale to a large number of model parameters and sequence lengths at high utilization. This system allows us to train the AFM models efficiently and scalably, including AFM-on-device, AFM-server, and larger models.
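The parallelism strategy the paper describes maps naturally onto JAX's sharding API. Below is a minimal sketch of combining data and tensor parallelism over a device mesh; this is not AXLearn code, and the mesh shape, axis names ("data", "model"), and toy layer are illustrative assumptions, not details from Apple's paper.

```python
# Minimal sketch: data + tensor parallelism with JAX's sharding API.
# Assumes JAX 0.4+ and that array dimensions divide the mesh axis sizes.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange all local devices into a 2-D mesh: one axis for data
# parallelism, one for tensor (model) parallelism.
devices = np.array(jax.devices())
mesh = Mesh(devices.reshape(-1, 1), axis_names=("data", "model"))

# Split the batch along the "data" axis; split the weight matrix
# column-wise along the "model" axis. An FSDP-style setup would
# additionally shard parameters, e.g. P("data", "model").
x = jax.device_put(jnp.ones((64, 512)), NamedSharding(mesh, P("data", None)))
w = jax.device_put(jnp.ones((512, 2048)), NamedSharding(mesh, P(None, "model")))

@jax.jit
def layer(x, w):
    # XLA inserts the collectives needed to compute the sharded matmul.
    return jnp.tanh(x @ w)

y = layer(x, w)
print(y.sharding)  # the output stays sharded across the mesh
```

Scaled up to thousands of TPU chips, this same idea of naming mesh axes and partitioning tensors along them is what lets training frameworks keep utilization high at large model and sequence sizes.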
The engineers describe how Google organizes its TPUs into large clusters that provide the processing power needed to train Apple's AI models. Apple plans to invest over $5 billion in AI server enhancements over the next two years, an outlay that should let the iPhone maker expand its AI capabilities and reduce its dependence on external providers.
The research paper also addresses ethical considerations in AI development. Apple says no user data is used to train its AI models; training instead relies on publicly available, licensed, and open-source datasets.