AMD's AI gamble
AMD held a conference call to discuss the $4.9 billion acquisition of ZT Systems, which provided insight into how Lisa Su is building her AI empire. The AMD AI landscape she depicts is the polar opposite of Nvidia's proprietary approach.
In her view, customers have two choices. One is a dystopian Nvidia world, where Nvidia owns the assets. The other is AMD's world, where customers choose their own partners, hardware, technology, and AI tools.
The acquisition of ZT Systems is in the spirit of providing engineers with the ability to build systems optimized for AI processing and power consumption.
Su believes that AMD's AI products will be highly differentiated.
"We can actually leverage our system capabilities and let customers use what they think is best for their workloads and data center environments," Su said.
To be sure, full-stack providers are nothing new. AMD has been building up its capabilities as a system vendor by acquiring the key pieces of computing: software, hardware, and networking.
Replicating Nvidia's strategy
Earlier this year, AMD announced that it would release a new GPU every year, similar to Nvidia. ZT Systems gives AMD 1,000 engineers to build systems, just as Nvidia's engineers built the DGX systems.
"ZT ships hundreds of thousands of servers and tens of thousands of AI racks annually to the largest hyperscale cloud companies with industry-leading quality," said Su.
This sounds like Nvidia's current strategy: all major cloud providers give Nvidia space to install DGX systems, and Nvidia has built its own parallel cloud service that connects its GPU systems across those providers.
"We try to give our customers choice while leveraging our technology to provide them with best-in-class design capabilities," said Su.
While AMD has received accolades, there's a lot more to be done to become the next Nvidia.
It took decades for Nvidia to get to where it is today. That transformation included:
Building a software framework around CUDA, starting in 2007.
Anticipating the potential of artificial intelligence early.
Providing the first hardware that allowed OpenAI to test its AI models.
Now is a good time to look at the problems AMD needs to solve.
AMD's GPUs still face issues
Getting its GPUs right is what will let AMD's AI world withstand Nvidia's onslaught.
AMD is pleased with the progress of its GPUs. The MI300X is favored by top customers such as Microsoft and Meta.
But let's take a quick look at the reality: two of the top three cloud providers still don't want MI300 or MI300X GPUs. Google and AWS have not yet ordered AMD GPUs. That's probably why AMD acquired ZT Systems – to get more cloud providers on board.
AMD's GPUs may just be the poor man's version of Nvidia's, and no customer is in dire need of the hardware. However, AMD's GPUs are the only legitimate alternative to Nvidia, and orders are increasing.
"We now expect data center GPU revenue to exceed $4.5 billion in 2024, up from our April forecast of $4 billion," said Su.
AMD's yearly GPU cadence, announced earlier this year and modeled on Nvidia's, includes the MI325X and next year's MI350.
"Our MI400 series, based on the CDNA Next architecture, has made tremendous progress in development and is scheduled to be launched in 2026," said Su.
The good news is that AMD has a GPU roadmap, and customers now have a clear idea of what they are buying. If everything goes AMD's way, things could change dramatically by 2026.
"It's about CPUs, GPUs, networks, systems, and clusters. How do you ensure their reliability? This team will help us do that because they have done it," Su said.
Su said that systems built around the MI350 (available next year) and the MI400 will be complex enough to require the expertise of ZT Systems' engineers.
AMD keeps up with Nvidia in terms of hardware features, memory, and manufacturing.
Clumsy benchmarks and software
AMD's benchmark results are mixed. The company has yet to submit AI benchmark results to MLPerf, but Microsoft and Meta have confirmed that AMD's Instinct GPUs are performing well.
Intel recently criticized AMD for what it called a dishonest presentation of the upcoming Turin CPU. AMD's Zen 5 PC CPUs have also drawn criticism for modest performance gains.
Benchmarking is hard, so it is best to treat the numbers with caution. The bigger concern is that the company's software ecosystem is a far cry from the CUDA stack that Nvidia has built.
AMD has spent years developing ROCm, its set of tools, libraries, drivers, and compilers. But it is still in its infancy.
Su said on the earnings call: "From a functional standpoint, ROCm ... We gained a lot of confidence and learned a lot along the way."
AMD executives have said the same thing about ROCm at many meetings, which suggests that it has been a work in progress for years.
AMD is still stuck at the programming-model level, lagging behind the parallel programming framework of the oneAPI-based UXL Foundation.
However, the openness of ROCm aligns with AMD's goal of letting customers run their workloads the way they choose. The question is whether developers will adopt ROCm.
ROCm vs. CUDA
With CUDA, Nvidia is light-years ahead of ROCm. CUDA has evolved into a full-fledged collection of computing programs and datasets, with offerings for major verticals including robotics, autonomous vehicles, healthcare, finance, and quantum computing.
CUDA tools are used to generate synthetic data that is not available in the real world. These and other tools are integrated into Nvidia's AI Enterprise software.
There is no doubt that Nvidia's CUDA is expensive. But it is also easier to deploy: customers just feed in data and get output. For those who need further customization, the CUDA tools become more technically demanding.
AMD's ROCm is complex, but it offers more flexibility in terms of tools and model development. AMD also supports open networking technologies.
"We are working closely with the Ultra Ethernet Consortium and the UALink group to ensure that we have robust network technology that meets industry standards," said Su.
The right steps
AMD's acquisition of ZT Systems is the latest in a series of strategic acquisitions the company has made to fill gaps in its portfolio.
These acquisitions have shaped AMD's overall AI strategy. In 2022, AMD spent $49 billion to acquire Xilinx for its FPGAs and software. AMD already had CPUs and GPUs, and Xilinx rounded out the lineup with FPGAs and ASICs.
The company also acquired Pensando Systems and the software companies Silo AI and Nod.ai.
"The Silo team has greatly expanded our ability to serve large enterprise customers looking to optimize AI solutions for AMD hardware," Lisa Su said on the earnings call.
The company will continue to pursue strategic acquisitions.
"We will continue to look at how we can actively enhance our capabilities, both organic and inorganic," Su said.