How to Build Your Own AI Data Center in 2025 — Paul Gilbert, Arista Networks
Takeaway
Enterprise AI data centers need rail-optimized, isolated backend GPU networks tuned for job-completion time, sized differently for training vs inference.
Summary
- Arista's Paul Gilbert explains backend GPU networks: typically 8 GPUs per server, isolated high-speed switches, with metrics like job completion time replacing classical latency targets.
- Training networks differ from inference networks — Wedge Sosa's ratio is roughly 18x training vs 2x inference GPUs, though chain-of-thought reasoning shifts that ratio.
- Example architecture: a 248-GPU training cluster used for 1-2 months, then 4x H100 for inference after alignment.
- Discusses backend rail-optimized topologies, isolation of GPU networks from corporate networks, and power/cooling constraints for enterprise AI buildouts.
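To make the rail-optimized idea concrete, here is a minimal sketch that is not taken from the talk. It assumes the 8-GPUs-per-server figure and the 248-GPU example from the summary above, and it assumes the common convention that GPU slot N on every server attaches to rail (leaf switch) N. The `rail_map` function and its parameters are hypothetical names used only for illustration.

```python
# Illustrative assumptions: 8 GPUs per server and a 248-GPU cluster (both from
# the summary), plus one rail (leaf switch) per GPU slot, which is an assumed
# convention rather than something the summary specifies.
GPUS_PER_SERVER = 8
TOTAL_GPUS = 248


def rail_map(total_gpus: int, gpus_per_server: int) -> dict[int, list[tuple[int, int]]]:
    """Map each rail index to the (server, gpu_slot) pairs attached to it."""
    if total_gpus % gpus_per_server != 0:
        raise ValueError("total_gpus must be a multiple of gpus_per_server")
    servers = total_gpus // gpus_per_server
    # GPU slot `rail` on every server plugs into the same rail switch.
    return {
        rail: [(server, rail) for server in range(servers)]
        for rail in range(gpus_per_server)
    }


rails = rail_map(TOTAL_GPUS, GPUS_PER_SERVER)
print(len(rails))     # 8 rails, one per GPU slot
print(len(rails[0]))  # 31 GPUs on rail 0, one from each of the 31 servers
```

Under these assumptions the 248-GPU example works out to 31 servers and 8 rails, with each rail carrying one GPU from every server. A real design would also need to account for switch port counts, spine uplinks, and the power and cooling limits mentioned above.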
Tags: ai-infrastructure, networking, gpus
Original description
This presentation talks to AI Executives who are tasked with building self managed AI Networks. It will talk about some new terminology that is used, list the key considerations around power to the racks, new hardware (GPU servers) and software (workload managers, GPU management and programming) that's required, there are specific storage requirements, multi-tenancy options, uptime, telemetry and visibility. It will then drill down into the details of the differences between the Data Center networks we build today versus what's required for the next generation of AI Networking.

Recorded live at the Leadership Track Session Day from the AI Engineer Summit 2025 in New York. Learn more at https://ai.engineer and purchase tickets to our next event, the AI Engineer World's Fair, in SF June 3 - 5 here: https://ti.to/software-3/ai-engineer-worlds-fair-2025

About Paul
Paul Gilbert has spent over 30 years in technology. He worked at Cisco Systems for over 25 years reaching the title of Distinguished Systems Engineer, the first one in North America. In 2022 he left Cisco and joined an AI startup, he was there 1 year then moved onto Arista Networks. He is a tech lead at Arista and focuses on AI and Data Center.