By Catherine Sbeglia Nin | December 4, 2024
Collected at: https://www.rcrwireless.com/20241204/fundamentals/ai-ran-workload-models
Three workload distribution models are emerging for AI-RAN — RAN-only, RAN-heavy or AI-heavy
In November, SoftBank announced that it successfully piloted the world’s first combined AI and 5G telecom network using the NVIDIA AI Aerial accelerated computing platform in an outdoor trial conducted in Japan’s Kanagawa prefecture. This combined model, which the industry calls AI-RAN, represents a significant step towards achieving network energy efficiency goals and realizing AI revenue streams for telecom operators.
“Telecom networks are designed for peak and when you design something for peak, that actually means that most of the time, it’s underutilized. How do you create more utilization? Having an orthogonal workload in the daytime … using it for RAN when it’s highly utilized and, in the nighttime, you might actually use it for AI workloads,” said Soma Velayutham, Nvidia’s general manager of AI, 5G and telecom.
Therefore, AI and RAN multi-tenancy and orchestration — the ability to run and manage RAN and AI workloads concurrently — is one of the key principles of AI-RAN technology. Multi-tenancy can refer to dividing network resources based on time of day or on the amount of compute.
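To make the multi-tenancy idea concrete, here is a minimal sketch of the two splitting strategies the article describes, by time of day and by compute demand. This is an illustrative toy policy, not Nvidia's or SoftBank's actual orchestrator; the share values and the 10% headroom figure are assumptions.

```python
# Toy AI-RAN multi-tenancy policy (illustrative only, not Nvidia's orchestrator):
# split one server's compute between RAN and AI workloads either by time of day
# or by the RAN's current load.

def split_by_time(hour: int) -> tuple[float, float]:
    """Return (ran_share, ai_share) of compute for a given hour (0-23)."""
    # Daytime and evening: prioritize RAN; overnight: hand most compute to AI.
    if 8 <= hour < 23:
        return 0.67, 0.33   # "RAN-heavy" split (assumed values)
    return 0.33, 0.67       # "AI-heavy" split (assumed values)

def split_by_load(ran_load: float) -> tuple[float, float]:
    """Give RAN what it currently needs plus headroom; AI backfills the rest."""
    ran_share = min(1.0, ran_load * 1.1)  # 10% headroom for RAN peaks (assumed)
    return ran_share, 1.0 - ran_share

print(split_by_time(14))   # afternoon -> (0.67, 0.33)
print(split_by_load(0.2))  # quiet cell: RAN gets ~0.22, AI gets ~0.78
```

A real orchestrator would also enforce prioritization so that RAN traffic can preempt AI work during unexpected peaks, which is what the "dynamic orchestration and prioritization policies" mentioned below refer to.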
In the SoftBank AI-RAN trial, the two companies reported that concurrent AI and RAN processing was successfully demonstrated, with the goal of maximizing capacity utilization. Nvidia claimed that AI-RAN enables telcos to achieve almost 100% utilization, compared to 33% capacity utilization for typical RAN-only networks (an increase of up to 3x), while implementing dynamic orchestration and prioritization policies to accommodate peak RAN loads.
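The utilization claim is straightforward arithmetic; a quick back-of-envelope check using the article's own figures:

```python
# Sanity check of the utilization claim, using the figures quoted in the article.

ran_only_util = 0.33   # typical RAN-only network utilization (article's figure)
ai_ran_util = 1.00     # near-full utilization claimed for AI-RAN

improvement = ai_ran_util / ran_only_util
print(f"Utilization gain: {improvement:.1f}x")  # ~3x, matching the "up to 3x" claim
```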
As such, the field trial also provided the opportunity to compare the multiple workload distribution models that are emerging for AI-RAN — RAN-only, RAN-heavy or AI-heavy. These labels refer to how much of the server is dedicated to RAN versus AI workloads at any given time, which, again, can be adjusted dynamically depending on traffic.
In the AI-heavy scenario, Nvidia used a one-third RAN and two-thirds AI workload distribution and claimed that for every dollar of CapEx invested in accelerated AI-RAN infrastructure, telcos can generate 5x that amount in revenue over five years, with the overall investment delivering a 219% profit margin after accounting for all CapEx and OpEx costs.
In the RAN-heavy scenario, Nvidia used a two-thirds RAN and one-third AI workload distribution, which yielded a revenue-to-CapEx ratio of 2x for Nvidia-accelerated AI-RAN, with a 33% profit margin over five years.
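Placing the two scenarios side by side makes the trade-off easier to see. The sketch below simply tabulates Nvidia's stated five-year figures; it derives nothing new, and the underlying OpEx assumptions behind the margins are not disclosed in the article.

```python
# Side-by-side view of the article's stated five-year economics per scenario.
# All numbers are Nvidia's claims as reported; nothing here is derived.

scenarios = {
    # scenario name: (RAN share, AI share, revenue / CapEx, stated profit margin)
    "AI-heavy":  (1 / 3, 2 / 3, 5.0, 2.19),
    "RAN-heavy": (2 / 3, 1 / 3, 2.0, 0.33),
}

for name, (ran, ai, rev_multiple, margin) in scenarios.items():
    print(f"{name:9s}: RAN {ran:.0%} / AI {ai:.0%} -> "
          f"revenue {rev_multiple:.0f}x CapEx, ~{margin:.0%} margin over 5 years")
```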
Finally, in the RAN-only scenario, Nvidia concluded that using the Aerial RAN Computer-1 is more cost-efficient than custom RAN-only solutions.
“From these scenarios, it is evident that AI-RAN is highly profitable as compared to RAN-only solutions, in both AI-heavy and RAN-heavy modes. In essence, AI-RAN transforms traditional RAN from a cost center to a profit center. The profitability per server improves with higher AI use. Even in RAN-only, AI-RAN infrastructure is more cost-efficient than custom RAN-only options,” Nvidia’s Senior Director for Telco Marketing Kanika Atri wrote in a blog post.
And when it comes to power performance specifically, in the 100% RAN-only mode, the GB200 NVL2 server — which resides inside Nvidia’s AI Aerial accelerated computing platform — consumed 40% less power per Gbps (measured in Watt/Gbps) than existing RAN-only systems and 60% less than commercial off-the-shelf (COTS) x86-based vRAN, with similar efficiencies across distributed-RAN and centralized-RAN configurations.