GPU Cost Optimization · 14 min read
Spot H100s Are 70% Cheaper. Most Teams Use Them Wrong and Pay More.
Spot GPUs are the single biggest cost lever you have, and the fastest way to turn a savings story into a reliability incident. The team that runs everything on spot eats a preemption, sees 503s, migrates back to on-demand, and triples the bill without ever asking whether the original setup was wrong. This post covers the real model: what a preemption actually costs, which workloads win on spot and which never should, the per-cloud warning windows, and the 70/30 baseline-plus-spot mix that cuts the bill 40-55% with no SLO hit, provided the drain logic is correct.