🌍 AI Factories Top 20 — 2025 H2 Edition
Inspired by the TOP500 supercomputer list, this ranking covers the largest known or announced AI training infrastructures worldwide, often called "AI Factories".
Figures are based on public announcements, credible leaks, and infrastructure analysis as of November 2025.
| 🏅 Rank | Name / Project | 🌐 Location | 🏢 Owner / Operator | ⚙️ GPU Count / Compute Scale | 🧩 Platform / Architecture | 📆 Status | 📝 Notes |
|---|---|---|---|---|---|---|---|
| 1 | Stargate (OpenAI + Microsoft) | Quincy, WA, USA | Microsoft / OpenAI | ~150,000+ NVIDIA Blackwell GPUs (planned) | Azure AI Fabric | Announced (2024) | Part of $100B multi-phase AI infrastructure expansion |
| 2 | Frontier AI Factory (NVIDIA + CoreWeave) | USA | CoreWeave / NVIDIA | ~100,000 GPUs (H100/B100 mix) | DGX Cloud | Under Construction | NVIDIA’s flagship partner cloud cluster |
| 3 | xAI “Colossus” Cluster (Elon Musk / xAI) | USA | xAI | ~100,000 GPUs (H100 + B200) | Custom Tesla Dojo hybrid | Building (2025) | Colossus 1 & 2; dedicated to Grok model suite |
| 4 | Amazon Bedrock / Titan Cluster | USA | AWS | ~80,000 GPUs equivalent | Trainium + H100 mix | Operational | Internal Amazon FMs training platform |
| 5 | Google Gemini Training Infrastructure | USA | Google DeepMind | ~70,000 GPUs (A3 Mega + TPUv5e) | TPUv5e / TPUv6 | Operational | Powers Gemini and multimodal DeepMind systems |
| 6 | Meta “AI Research SuperCluster” (RSC 2) | USA | Meta | ~60,000 GPUs (H100/B200) | PyTorch / NVIDIA DGX | Under Expansion | Successor to RSC Phase 1 (16k A100s) |
| 7 | Azure “AI Supercomputer” Cluster | USA + EU | Microsoft Azure | ~60,000+ GPUs (H100s) | Azure NDv5 | Operational | Used for OpenAI + Copilot workloads |
| 8 | G42 Condor Galaxy (UAE) | Abu Dhabi, UAE | G42 + Cerebras | ~54,000 GPU-equivalents | Cerebras WSE + NVIDIA | Expansion (CG-2, CG-3) | Key MENA AI hub, multi-region rollout |
| 9 | China “Baidu AI Cloud” Cluster | Beijing, China | Baidu | ~50,000 GPUs (H800 / Ascend 910B) | Kunlun + NVIDIA H800 | Operational | Baidu’s core generative AI infrastructure |
| 10 | Alibaba Tongyi Cluster | Hangzhou, China | Alibaba Cloud | ~45,000 GPUs (H800 / Ascend 910B) | PAI / Lingjun AI Infra | Operational | Powers Tongyi Qianwen LLM family |
| 11 | Tencent AI SuperCenter | Shenzhen, China | Tencent | ~40,000 GPUs | H800 + Blackwell (planned) | Expansion | Supports Hunyuan model suite |
| 12 | DOE “Frontier AI Discovery” Project | Oak Ridge, USA | U.S. Dept. of Energy (DOE) | ~35,000 GPUs equivalent | AMD Instinct / HPE Cray | Building (2025) | Research-oriented, exascale AI-HPC hybrid |
| 13 | DOE “Lux” AI Facility | USA | U.S. Dept. of Energy | ~30,000 GPUs equivalent | Hybrid DOE HPC / NVIDIA | Announced | Next-gen AI scientific computing center |
| 14 | EU “JUPITER AI Extension” (EuroHPC) | Jülich, Germany | EuroHPC / Forschungszentrum Jülich | ~25,000 GPUs (MI300A + H100) | Modular Supercomputing | Under Construction | Europe’s flagship AI-HPC hybrid |
| 15 | Saudi NEOM AI Factory | NEOM, Saudi Arabia | NEOM Tech / NVIDIA | ~24,000 GPUs | NVIDIA + Oracle Cloud | Announced | AI hub for NEOM smart city projects |
| 16 | ETH Zurich “Alps” AI-HPC | Zurich, Switzerland | CSCS / ETH | ~22,000 GPUs | NVIDIA Grace Hopper | Launching 2025 | European sovereign AI compute center |
| 17 | France “Jean Zay 2” AI Upgrade | Paris, France | GENCI | ~20,000 GPUs | NVIDIA / Atos BullSequana | Construction | French national AI infrastructure |
| 18 | Japan “ABCI 3.0” | Tokyo, Japan | AIST | ~18,000 GPUs | NVIDIA H200 | Planned (2025) | Asia’s open AI research cluster |
| 19 | Amazon / Anthropic “Claude Factory” | USA | AWS + Anthropic | ~15,000 GPUs | Trainium2 / H100 | Operational | Dedicated to Claude 3+ and next-gen models |
| 20 | OpenAI “Legacy Azure Cluster” (SuperPod) | Iowa, USA | Microsoft / OpenAI | ~14,000 GPUs (A100s) | Azure AI | Operational | Former GPT-4 / GPT-3 training supercluster |
Notes (2025):
- 🏭 “AI Factories” are large-scale GPU or accelerator clusters dedicated to training or serving AI foundation models.
- 🧠 GPU counts are based on credible estimates; real totals may vary due to hybrid architectures (TPUs, WSEs, Trainium, etc.).
- 🌎 The U.S. leads in installed capacity, but UAE, China, and EU projects are rapidly expanding.
- 🚀 Projects like Stargate and Frontier AI Factory mark the beginning of the 100k+ GPU era of AI infrastructure.
- ⚡ Expect several >200k-GPU builds to break ground before 2026.
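Since the table folds heterogeneous accelerators (TPUs, Trainium, WSEs) into single "GPU-equivalent" counts, it may help to show how such a figure is typically derived. The sketch below is a minimal illustration: the function name and every throughput ratio are placeholder assumptions for demonstration, not vendor benchmarks, and any real comparison would depend on numeric precision, interconnect, and workload.

```python
# Rough "GPU-equivalent" estimator, in the loose sense used in the table above.
# Every ratio here is an ILLUSTRATIVE PLACEHOLDER relative to one H100,
# not a vendor specification.
H100_RELATIVE_THROUGHPUT = {
    "H100": 1.0,        # baseline unit
    "A100": 0.33,       # assumed: roughly a third of an H100 for LLM training
    "TPUv5e": 0.4,      # assumed placeholder
    "Trainium2": 0.9,   # assumed placeholder
}

def h100_equivalents(inventory: dict) -> int:
    """Collapse a mixed accelerator inventory into one H100-equivalent count."""
    return round(sum(
        count * H100_RELATIVE_THROUGHPUT[chip]
        for chip, count in inventory.items()
    ))

# Example: a hypothetical hybrid cluster of 20k H100s and 30k A100s
print(h100_equivalents({"H100": 20_000, "A100": 30_000}))  # prints 29900
```

Because the ratios are workload-dependent, two analysts using the same inventory can honestly publish different "GPU-equivalent" totals, which is one reason the figures in the table should be read as order-of-magnitude estimates.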
Last updated: November 2025
Sources: company filings, press releases, HPC community trackers, and infrastructure research.
This post is licensed under CC BY 4.0 by the author.