FuriosaAI and RNGD: The New Era of Ultra-Efficient AI Servers, Challenging the Energy Dominance of Nvidia H100

The explosion of Generative Artificial Intelligence (Generative AI) has created a new “gold rush” in the tech industry, but it comes with an unprecedented “energy thirst.” Against this backdrop, FuriosaAI, a semiconductor startup from South Korea, is emerging as a game-changer. With its NXT RNGD Server and RNGD (Renegade) chip, the company not only claims performance comparable to Nvidia’s top-tier H100 GPU but also achieves it with just one-third of the power consumption. This is not just a technical leap but a solution to the critical challenges of Total Cost of Ownership (TCO) and the sustainability of global AI infrastructure.
I. The Energy Thirst: The Biggest Constraint of the AI Era
Over the past two years, the dramatic development of Large Language Models (LLMs) like GPT-4, Llama 3, and EXAONE has pushed the demand for computing power to historic levels. However, running and training these models rely primarily on high-performance GPUs from Nvidia, particularly the H100 series. The superior power of the H100 comes with a critical drawback: massive energy consumption.
A single Nvidia H100 SXM GPU card can consume up to 700W of power. A standard DGX H100 server system (typically including eight H100 cards and other components) can easily exceed the 10-kilowatt (kW) mark under full load.
This reality poses a significant barrier for enterprises and Cloud Service Providers:
- Power and Cooling Constraints: Most modern data centers, built before the Generative AI era, typically cap power at around 8 kW or 15 kW per server rack. With a single DGX H100 server drawing more than 10 kW under load, such racks can host at most one system, stranding the remaining power budget, wasting space, and hindering rapid scaling.
- Astronomical Operating Costs: The costs for cooling and electricity for an AI-focused data center can surpass the initial hardware procurement costs within just a few years.
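To make the constraint concrete, here is a quick back-of-the-envelope check in Python. The 10.2 kW full-load draw is an assumed illustrative figure consistent with the ">10 kW" quoted above, not a vendor specification:

```python
# Rough rack-budget check: how many DGX H100-class systems fit in a
# rack, and how much of the power budget is left stranded.
RACK_LIMITS_KW = [8, 15]
DGX_H100_KW = 10.2  # assumed full-load draw for one DGX H100 system

for limit in RACK_LIMITS_KW:
    fits = int(limit // DGX_H100_KW)
    stranded = limit - fits * DGX_H100_KW
    print(f"{limit} kW rack: fits {fits} DGX H100 system(s), "
          f"{stranded:.1f} kW stranded")
```

An 8 kW rack cannot host such a system at full load at all, and even a 15 kW rack fits only one, leaving nearly a third of its budget unusable.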
This is the opportunity that FuriosaAI, backed by the LG conglomerate, seized to introduce a solution fundamentally redesigned for energy efficiency: the NXT RNGD Server.
II. Details of the RNGD (Renegade) “Rebel”
FuriosaAI’s NXT RNGD Server is positioned not to compete with Nvidia in model training, but in the inference domain – the stage where models are actually used to respond to users, which accounts for the vast majority of AI workload volume and operating costs.
1. Performance and Energy Advantage
The technical specifications of the NXT RNGD Server have sent a shockwave through the AI chip industry:
| Metric | FuriosaAI NXT RNGD Server | Nvidia DGX H100 (Reference System) | Outcome |
| --- | --- | --- | --- |
| Compute Power | 4 PetaFLOPs (for inference) | Comparable (for LLM inference) | Equivalent performance |
| Memory (HBM3) | 384 GB (8 RNGD cards) | 640 GB (8 H100 cards) | Lower total capacity |
| Power Consumption | 3 kW | >10 kW | >70% power savings |
| Rack Density (15 kW) | 5 servers | 1 server | 5x density |
| Performance/Watt | 2.25x higher than traditional GPUs (per LG) | Industry baseline | Breakthrough in operating costs |
Practical Implications for Data Centers:
For data centers limited to 15 kW per rack, the ability to install 5 RNGD servers instead of just one DGX H100 means maximizing space utilization and significantly reducing the need for complex cooling system investments. This enables enterprises to scale AI deployment flexibly without having to spend millions of dollars on upgrading existing power and cooling infrastructure.
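The operating-cost side of the argument can be sketched the same way. The electricity price below is an assumed illustrative value (rates vary widely by region), cooling overhead (PUE) is excluded, and the DGX draw is the same assumed 10.2 kW figure:

```python
# Back-of-the-envelope annual electricity cost per server at full load,
# using the article's power figures (3 kW vs >10 kW).
PRICE_PER_KWH = 0.12     # assumed USD/kWh, illustrative only
HOURS_PER_YEAR = 24 * 365

for name, kw in [("NXT RNGD server", 3.0), ("DGX H100 system", 10.2)]:
    cost = kw * HOURS_PER_YEAR * PRICE_PER_KWH
    print(f"{name}: {kw} kW -> ${cost:,.0f} per year in electricity")
```

Under these assumptions, each 3 kW server saves several thousand dollars per year in electricity alone before cooling is even counted.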
2. Specialized Chip Architecture
RNGD’s advantage does not come from copying the GPU architecture, but from redesigning the chip from scratch, specializing it for LLM inference tasks:
- RNGD Chip (Renegade): This is FuriosaAI’s second-generation AI accelerator. The chip is manufactured on an advanced process and features a large die area of 653mm² with 40 billion transistors.
- Packaging Technology: RNGD utilizes TSMC’s industry-leading CoWoS® (Chip-on-Wafer-on-Substrate) packaging technology, integrating HBM3 memory to provide ultra-high bandwidth, a key factor in LLM processing. The collaboration with GUC (Global Unichip Corporation), a major ASIC design partner, has enabled FuriosaAI to achieve rapid development timelines and ensure chip quality.
- Tensor Contraction Processor (TCP) Architecture: Unlike GPUs, which are built around fixed-size matrix operations, the RNGD architecture is optimized for tensor contraction, the more general operation of which matrix multiplication is a special case. This allows LLM computations to map onto the hardware with maximum performance and energy efficiency. The TDP of a single RNGD card is publicly stated as 150W, an extremely low figure compared to competitors in the same performance segment.
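To illustrate what "tensor contraction" means in practice, the NumPy sketch below expresses an attention-score computation and a plain matrix multiplication as the same index-contraction pattern. This is purely conceptual and says nothing about FuriosaAI's actual hardware implementation:

```python
import numpy as np

# A tensor contraction sums products over shared indices. Matrix
# multiplication is the two-index special case; LLM building blocks
# like attention scores are higher-rank instances of the same pattern.
batch, heads, seq, dim = 2, 4, 8, 16
q = np.random.rand(batch, heads, seq, dim)
k = np.random.rand(batch, heads, seq, dim)

# Attention scores: contract over the feature dimension d.
scores = np.einsum("bhqd,bhkd->bhqk", q, k)

# Plain matmul: the same contraction with fewer indices.
a = np.random.rand(seq, dim)
b = np.random.rand(dim, seq)
c = np.einsum("ij,jk->ik", a, b)
assert np.allclose(c, a @ b)
print(scores.shape)  # one score per (batch, head, query, key)
```

A processor built around the general contraction pattern can treat both cases uniformly, which is the design idea the TCP name points at.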
III. FuriosaAI: From Samsung Engineer to AI Unicorn
To understand RNGD’s breakthrough, one must look at the company’s journey and vision.
1. Foundation and Strategic Vision
FuriosaAI was founded in 2017 by CEO June Paik (Baek Jun-ho), an engineer with over 20 years of experience in hardware and software, who previously worked at Samsung Electronics and AMD. His extensive experience in CPU, GPU, and memory system design laid the foundation for FuriosaAI’s design philosophy: to create the most efficient, integrated hardware-software solution for AI workloads.
The company’s first product, the Warboy chip (Generation 1), focused on computer vision and edge AI applications, establishing a reputation for performance in industry benchmarks like MLPerf.
The Rejected Meta Acquisition:
In 2024, FuriosaAI made headlines by revealing it had rejected an acquisition offer of up to $800 million USD from the tech giant Meta (formerly Facebook). The decision signaled the leadership team's confidence in the company's ability to remain independent and reshape the AI chip market.
2. Financial Achievements and “Unicorn” Status
In mid-2025, FuriosaAI demonstrated its market maturity by completing a Series C bridge funding round worth approximately $125 million USD (or 170 billion KRW). This round, which saw participation from major institutions like the Korea Development Bank (KDB), the Industrial Bank of Korea (IBK), and private equity (PE) funds such as Keystone Partners, raised the company’s total capital to $246 million USD.
Crucially, this funding round propelled FuriosaAI’s valuation past 1 trillion KRW (approximately $735 million – $770 million USD), officially placing the startup in the ranks of domestic Korean technology Unicorns. The participation of PE funds, which typically avoid early-stage deep tech startups, is clear evidence that the market recognizes FuriosaAI not just as a research company but as a scale-ready AI infrastructure provider.
IV. Software Ecosystem and Enterprise Collaboration
Powerful hardware is only half of the AI equation; software is what dictates market acceptance. FuriosaAI understands this and has heavily invested in building a robust, developer-friendly software ecosystem.
1. Partnership with LG AI Research and Exaone
The partnership with LG AI Research, the AI research arm of the LG Group, is the most significant milestone. LG has adopted RNGD hardware to run its large language model, EXAONE. Test results confirmed FuriosaAI’s energy efficiency claims:
- LG reported achieving an inference performance per watt that is 2.25 times higher than when using traditional GPUs.
- This allows LG to run large models like EXAONE at significantly lower operational costs, solidifying RNGD’s position as an optimal choice for enterprise-scale AI applications.
2. Compatibility and Large Model Performance
FuriosaAI has focused on compatibility with industry standards, particularly with OpenAI’s API, ensuring businesses can easily migrate or expand existing AI systems to the RNGD platform.
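As a rough illustration of what OpenAI-API compatibility implies, the snippet below builds the JSON body of a standard chat-completions request, the shape an OpenAI-compatible server would accept. The model id is a placeholder, not a confirmed FuriosaAI identifier:

```python
import json

# Shape of an OpenAI-style chat-completions request. A server that
# advertises OpenAI-API compatibility accepts this payload, so client
# code can migrate by changing only the endpoint URL and model id.
request = {
    "model": "llama-3.1-70b",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Summarize this quarterly report."},
    ],
    "max_tokens": 256,
}
payload = json.dumps(request)
print(payload)
```

Because existing OpenAI client libraries emit exactly this shape, compatibility at this layer is what lets businesses redirect traffic to a new backend without rewriting application code.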
Updates to the FuriosaAI SDK (Software Development Kit) have added key features to optimize LLM inference:
- Diverse Model Support: The latest SDK (v2024.3.0) has expanded support for major LLMs such as Llama 3.1 (8B and 70B variants), Qwen, Solar, and CodeLLaMA2.
- LLM Optimization: The SDK integrates advanced techniques like PagedAttention and Block KV Cache for more efficient KV cache memory management, along with Continuous Batching to handle continuous query queues, reducing latency and increasing throughput.
- Target Throughput: With just two RNGD cards, the Llama 3.1-70B model can be executed effectively. The company aims for a throughput of up to 8,000 Tokens per second (TPS) on a single NXT RNGD server equipped with 8 RNGD cards, an impressive figure for real-time inference.
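A quick sanity check on that target; the per-stream generation rate below is an assumed value for a comfortably readable chat experience, not a FuriosaAI figure:

```python
# What the 8,000 TPS server target implies per card and per user.
SERVER_TPS_TARGET = 8000   # tokens/s per NXT RNGD server (8 cards)
CARDS_PER_SERVER = 8
PER_STREAM_TPS = 20        # assumed tokens/s one user stream needs

tps_per_card = SERVER_TPS_TARGET / CARDS_PER_SERVER
concurrent_streams = SERVER_TPS_TARGET // PER_STREAM_TPS
print(f"{tps_per_card:.0f} tokens/s per card, "
      f"~{concurrent_streams} concurrent {PER_STREAM_TPS}-tok/s "
      f"streams per server")
```

Under those assumptions, a single server could sustain on the order of 400 simultaneous interactive chat sessions, which is what makes the figure interesting for real-time serving.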
V. Sustainable Future and Commercial Roadmap
FuriosaAI is transitioning from the development and sampling phase to mass production and commercialization.
Availability Update:
Currently, the NXT RNGD server is undergoing testing with enterprise customers and major partners (Early Access Program – EAP) in real-world production environments. Based on technical achievements and the partnership with TSMC, the company has accelerated its roadmap: mass production and global market readiness are expected in 2025, sooner than the initial forecast of early 2026.
CEO June Paik affirms that RNGD is the solution for a “broken business model” in AI, where the infrastructure and energy costs of GPUs are becoming a critical roadblock. FuriosaAI’s commitment is to provide an AI solution that is not only environmentally sustainable but also economically sustainable (reducing TCO).
Third-Generation Vision:
Even as the second-generation RNGD is nearing commercialization, FuriosaAI has used its new funding to begin the development of its third-generation chip architecture. This shows the company’s relentless drive in the technology race, where the chip innovation cycle is extremely rapid.
Conclusion
FuriosaAI is not just another semiconductor startup; it is a prime example of deep tech innovation solving one of the most pressing challenges in the AI industry: energy.
While the Nvidia DGX H100 remains the “king” of model training, the RNGD Server is positioning itself as a leading contender in the ultra-efficient AI inference segment. With comparable performance, up to a 70% reduction in power consumption, and strong backing from major partners like LG, FuriosaAI is poised to fundamentally alter the economics of large-scale AI deployment. This Korean “unicorn” is taking on the dominance of traditional GPUs, ushering in a new era of more cost-effective and sustainable AI infrastructure.