Google unveiled its eighth-generation Tensor Processing Units (TPUs), splitting its AI hardware into distinct chips for training and inference as it seeks to curb its reliance on Nvidia. The company says the training TPU delivers 2.8 times the performance of the prior Ironwood generation at the same price, while the inference chip boosts performance by 80% and adds 384MB of on-chip SRAM to cut latency for AI agents. Google stopped short of head-to-head comparisons with Nvidia, even as hyperscalers from Amazon to Microsoft push custom silicon to control costs and tailor chips to their workloads. Early adopters include Citadel Securities and all 17 U.S. Energy Department national labs, and Anthropic has committed to multiple gigawatts of TPU capacity. The move underscores intensifying competition in AI infrastructure as cloud providers race to offer alternatives to scarce, pricey GPUs.





























