In a significant move for the global semiconductor landscape, Chinese GPU designer Moore Threads has unveiled its next-generation roadmap. At its MUSA 2025 developer conference, the company announced the "Flower Harbor" (Huagang) architecture, which will power two distinct product lines: the Lushan GPU for gaming and the Huashan GPU for artificial intelligence. The announcements mark a bold step in China's push for technological self-reliance, with claimed performance gains intended to close the gap with industry leaders like NVIDIA and AMD.
The Next-Gen Flower Harbor Architecture
The foundation for Moore Threads' new GPUs is the Flower Harbor architecture, which represents a substantial overhaul from its predecessors. The company claims the new design increases compute density by 50% and improves energy efficiency by 10%. A key technical advancement is the architecture's comprehensive support for a wide range of compute formats, from FP64 down to FP4, including proprietary mixed-precision formats like MTFP6 and MTFP4 designed for AI workloads. For large-scale AI deployments, Moore Threads highlighted its MTLink high-speed interconnect technology, which it states can scale to connect over 100,000 GPUs within a single cluster, a critical feature for building competitive AI factories.
Flower Harbor (Huagang) Architecture Claims:
- Compute Density: +50%
- Energy Efficiency: +10%
- Scalability: MTLink interconnect to scale >100,000 GPUs in a cluster.
- Supported Compute Formats: FP64, FP32, TF32, FP16, BF16, FP8, FP6, FP4, INT8, plus proprietary MTFP8, MTFP6, MTFP4.
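To give a sense of why the narrower formats matter for AI workloads, the following minimal Python sketch estimates how much memory a model's weights would occupy at several of the listed precisions. The bit widths of the standard formats follow from their names. The widths used for MTFP8, MTFP6, and MTFP4 are assumptions inferred from the names only, since the announcement does not describe their layouts, and the 70-billion-parameter model size is a hypothetical example. Memory for activations, caches, and any per-block scaling metadata is ignored.

```python
# Storage bits per value for the formats listed above. TF32 is omitted because
# it is a compute mode whose values are normally stored as 32-bit words.
# MTFP* widths are ASSUMED from the format names, not from a published spec.
BITS_PER_VALUE = {
    "FP64": 64,
    "FP32": 32,
    "FP16": 16,
    "BF16": 16,
    "FP8": 8,
    "INT8": 8,
    "FP6": 6,
    "FP4": 4,
    "MTFP8": 8,  # assumption
    "MTFP6": 6,  # assumption
    "MTFP4": 4,  # assumption
}


def weight_footprint_gb(num_params: int, fmt: str) -> float:
    """Return weight storage in gigabytes (10**9 bytes) for one format."""
    bits = BITS_PER_VALUE[fmt]
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    params = 70_000_000_000  # hypothetical 70B-parameter model
    for fmt in ("FP32", "BF16", "FP8", "FP6", "FP4"):
        print(f"{fmt:>5}: {weight_footprint_gb(params, fmt):7.1f} GB")
```

Under these assumptions, each halving of the bit width halves the weight footprint, which is the main reason 8-bit and narrower formats are attractive for large models.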
Lushan: A Quantum Leap for Gaming
The Lushan GPU is positioned as the direct successor to Moore Threads' existing consumer cards, the MTT S80 and S90. While details such as core counts and clock speeds were not disclosed, the performance claims are staggering. Moore Threads promises that Lushan will deliver a 15x improvement in AAA gaming performance, a 50x boost in ray tracing performance, and a 64x increase in AI compute. Other cited improvements include a 16x uplift in geometry processing, a 4x faster texture fill rate, and an 8x improvement in atomic memory access performance. Perhaps most notably for gamers, the new architecture will offer full support for modern APIs like DirectX 12 Ultimate, addressing a major compatibility shortcoming of previous Moore Threads products. Memory capacity is also set to quadruple; given the 16GB of GDDR6 on the current MTT S80 and S90, flagship Lushan models could feature up to 64GB of VRAM.
Announced Moore Threads Next-Gen GPUs:
| Product Line | Target Use | Key Claimed Improvements | Expected Launch |
|---|---|---|---|
| Lushan | Gaming & Content Creation | 15x Gaming, 50x Ray Tracing, 64x AI Compute, 4x Memory Capacity | 2026 |
| Huashan | AI Acceleration | Performance between NVIDIA Hopper & Blackwell; Superior memory bandwidth | 2026 |
Huashan: Challenging NVIDIA in AI
Alongside the gaming-focused Lushan, Moore Threads introduced the Huashan AI accelerator. The chip appears to be a more complex, chiplet-based design with eight HBM (High Bandwidth Memory) sites. In a direct comparison slide, Moore Threads positioned Huashan's performance between NVIDIA's Hopper and Blackwell generations, claiming floating-point compute comparable to the Blackwell B200 along with superior memory bandwidth. The positioning is audacious and, if realized, would make Huashan a formidable new competitor in the high-stakes AI hardware market. As evidence of its growing capability in this space, the company pointed to the existing MTT S5000 GPU running the DeepSeek V3 model at a claimed 1000 tokens/second in decode and 4000 tokens/second in prefill.
Context on Current Gen (for Comparison):
- The GPUs being replaced (MTT S80/S90) feature 16GB GDDR6 memory.
- Moore Threads claims the current MTT S5000 AI GPU performs slightly ahead of NVIDIA's Hopper lineup in China, citing DeepSeek V3 benchmarks of 1000 tokens/s (Decode) and 4000 tokens/s (Prefill).
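As a rough illustration of what the quoted throughput figures would mean in practice, the sketch below converts them into processing time for a single request. It assumes the rates apply to one request in isolation, which the announcement does not specify (they may equally be aggregate figures across many concurrent requests), and the prompt and output lengths are hypothetical.

```python
PREFILL_TOKENS_PER_S = 4000  # claimed prefill rate for the MTT S5000
DECODE_TOKENS_PER_S = 1000   # claimed decode rate for the MTT S5000


def request_time_s(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate seconds to ingest a prompt and then generate a reply.

    ASSUMPTION: both rates apply to a single request; the source does not
    say whether they are per-request or aggregate numbers.
    """
    prefill_s = prompt_tokens / PREFILL_TOKENS_PER_S
    decode_s = output_tokens / DECODE_TOKENS_PER_S
    return prefill_s + decode_s


if __name__ == "__main__":
    # Hypothetical request: 8,000-token prompt, 500-token answer.
    print(f"{request_time_s(8000, 500):.2f} s")
```

If the published numbers are in fact aggregate throughput, per-request times would be longer than this estimate.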
The Road Ahead and Market Implications
Moore Threads stated that products based on the Lushan and Huashan designs are slated for launch in 2026. The announcement, heavy on promises and light on concrete specs or independent benchmarks, is a classic pre-launch teaser designed to generate developer interest and signal capability to the market and investors. The success of these GPUs will hinge on whether the performance claims hold up and on the stability of the software drivers at release. If Moore Threads can deliver even a fraction of the promised gains, particularly in gaming with full DirectX 12 Ultimate support, it could begin to erode the dominance of Western GPU giants in the Chinese market and provide a credible, state-backed alternative for AI development within the region.
