AI infrastructure requirements are booming. Larger AI models carry hefty training loads and strict inference latency requirements, driving an urgent need to scale AI acceleration clusters in data centers. Advanced GPUs and NPUs offer solutions for the computational load. However, insufficient bandwidth or excessive latency between servers…
Podcast EP308: How Clockwork Optimizes AI Clusters with Dan Zheng
Daniel is joined by Dan Zheng, VP of Partnerships and Operations at Clockwork. Dan was the General Manager for Product and Partnerships at Urban Engines, which was acquired by Google in 2016. He has also held roles at Stanford University and Google.
Dan explores the challenges of operating massive AI hardware infrastructure at …
