The surge in AI computing has created a shortage of AI-capable chips, with demand outstripping supply. Global giants Microsoft, Google and AWS are ramping up custom silicon production to reduce their dependence on NVIDIA and AMD, the dominant suppliers of GPUs.
As a result, APAC enterprises may soon find themselves using an expanding array of chip types in cloud data centres. The chips they choose will depend on the compute power and speed required by different application workloads, on cost, and on their cloud vendor relationships.
Compute-intensive tasks like training a large language model require massive amounts of computing power. As demand for AI computing has risen, advanced semiconductor chips from the likes of NVIDIA and AMD have become expensive and difficult to secure.
The dominant hyperscale cloud vendors have responded by accelerating production of custom silicon chips through 2023 and 2024. These programs reduce their dependence on the dominant suppliers, helping them deliver AI compute services to customers globally, including in APAC.
Google debuted its first custom ARM-based CPU, the Axion processor, at its Cloud Next conference in April 2024. Building on a decade of custom silicon work, the step up to producing its own CPUs is designed to support a range of general-purpose computing, including CPU-based AI training.
For Google’s cloud customers in APAC, the chip is expected to enhance Google’s AI capabilities within its data centre footprint; it will be available to Google Cloud customers later in 2024.
Microsoft, likewise, has unveiled its first in-house custom accelerator optimised for AI and generative AI tasks, which it has badged the Azure Maia 100 AI Accelerator. It is joined by an ARM-based CPU, the Cobalt 100; both were formally announced at Microsoft Ignite in November 2023. The firm’s custom silicon for AI has already been used for tasks like running OpenAI’s GPT-3.5 large language model. The global tech giant said it was expecting a broader rollout into Azure cloud data centres for customers from 2024.
AWS’s investment in custom silicon chips dates back to 2009. The firm has now released four generations of its Graviton CPU, designed to improve price performance for cloud workloads and rolled out to data centres worldwide, including in APAC. These have been joined by two generations of Inferentia for deep learning and AI inferencing, and two generations of Trainium for training AI models with 100B+ parameters.
At a recent AWS Summit held in Australia, Dave Brown, vice president of AWS Compute & Networking Services, told TechRepublic the cloud provider’s reason for designing custom silicon was about providing customers choice and improving “price performance” of available compute.
“Providing choice has been very important,” Brown said. “Our customers can find the processors and accelerators that are best for their workload. And with us producing our own custom silicon, we can give them more compute at a lower price,” he added.
AWS has long-standing relationships with major suppliers of semiconductor chips. For example, AWS’ relationship with NVIDIA, the now-dominant player in AI, dates back 13 years, while Intel, which has released Gaudi accelerators for AI, has been a supplier of semiconductors since the cloud provider’s beginnings. AWS has been offering chips from AMD in data centres since 2018.
Brown said the cost optimisation fever that has gripped organisations over the past two years of a slowing global economy has seen customers move to AWS Graviton in every region, including in APAC. He said the chips have been widely adopted by the market — by more than 50,000 customers globally, including all of the hyperscaler’s top 100 customers. “The largest institutions are moving to Graviton because of performance benefits and cost savings,” he said.
SEE: Cloud cost optimisation tools not enough to rein in cloud spending.
With custom AWS silicon now widely deployed, customers in APAC are increasingly taking up these options.
Enterprise customers in APAC could benefit from an expanding range of compute options, whether that is measured by performance, cost or appropriateness to different cloud workloads. Custom silicon options could also help organisations meet sustainability goals.
Competition among cloud providers, working in tandem with chip suppliers, could drive advances in chip performance, whether in high-performance computing for AI model training or in inferencing, where latency is a major consideration.
Cloud cost optimisation has been a major issue for enterprises, as expanding cloud workloads have left customers with ballooning costs. A wider range of hardware gives customers more ways to reduce overall cloud spend, as they can choose the most appropriate compute more discerningly.
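As a rough illustration of this price-performance trade-off, the comparison below uses entirely hypothetical hourly prices and throughput figures — not published AWS pricing — to show how a team might rank two compute options:

```python
# Hypothetical price-performance comparison between two instance options.
# All prices and throughput numbers here are illustrative assumptions.

def price_performance(hourly_price_usd: float, throughput: float) -> float:
    """Return throughput per dollar per hour: higher is better."""
    return throughput / hourly_price_usd

# Assumed figures for an x86 option vs an ARM-based (Graviton-style) option.
x86 = price_performance(hourly_price_usd=0.40, throughput=1000)  # requests/sec
arm = price_performance(hourly_price_usd=0.32, throughput=950)

better = "ARM" if arm > x86 else "x86"
print(f"x86: {x86:.0f} req/s per $/h, ARM: {arm:.0f} req/s per $/h -> {better}")
```

In this made-up scenario the ARM option wins despite slightly lower raw throughput, because its lower hourly price yields more work per dollar — the kind of calculation driving the Graviton migrations Brown describes.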
A growing range of custom silicon chips within cloud services will allow enterprises to better match their application workloads to the specific characteristics of the underlying hardware, ensuring they can use the most appropriate silicon for the use cases they are pursuing.
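One simple way to operationalise this workload-to-silicon matching is a lookup table in deployment tooling. The sketch below uses real AWS instance family names (m7g for Graviton, inf2 for Inferentia2, trn1 for Trainium), but the selection rules themselves are illustrative assumptions, not AWS guidance:

```python
# Illustrative mapping of workload types to AWS custom-silicon instance
# families. Family names are real; the selection rules are assumptions.
WORKLOAD_TO_FAMILY = {
    "general_purpose": "m7g",  # Graviton CPU
    "ai_inference": "inf2",    # Inferentia2 accelerator
    "ai_training": "trn1",     # Trainium accelerator
}

def pick_instance_family(workload: str) -> str:
    """Return the assumed best-fit instance family for a workload type."""
    try:
        return WORKLOAD_TO_FAMILY[workload]
    except KeyError:
        raise ValueError(f"unknown workload type: {workload}")

print(pick_instance_family("ai_inference"))  # inf2
```

A real deployment would weigh many more factors (memory, networking, software support for ARM or custom accelerators), but the principle is the same: route each workload to the silicon whose characteristics match it.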
Sustainability is predicted to become a top-five factor for customers procuring cloud vendors by 2028. Vendors are responding: AWS, for instance, says carbon emissions can be cut using Graviton4 chips, which it claims are 60% more energy-efficient. Custom silicon should help improve overall cloud sustainability.