# Why GPUs Are the New Oil

## The Compute Crisis
In 2025, every AI company from scrappy startups to trillion-dollar giants is chasing the same thing: compute.
Without it, your models don't train, your agents don't think, and your product roadmap dies before launch.
We've entered the compute crisis: a global scramble for chips, energy, and infrastructure.
The new reality? GPUs are the new oil.
## What's Behind the Crisis?
Every AI boom in history has followed one curve: compute demand outpacing supply.
But this time, the gap is massive.
OpenAI alone is estimated to draw over 3 GW of power, comparable to the electricity demand of a small nation. Meta, Google, and Anthropic each have multi-billion-dollar GPU orders locked in with NVIDIA through 2026.
Startups, meanwhile, can't even get access.
NVIDIA H100s are booked out for months, and even second-hand A100s trade on gray markets for $25,000+ per unit.
(Source: The Information, Oct 2025)
At the center of it all sits NVIDIA, now worth over $3 trillion and the most valuable company on Earth.
## The Anatomy of Compute

AI "compute" isn't just GPUs. It's a full stack:

- Silicon (hardware): GPUs like H100s, AMD MI300s, and TPU v5 chips.
- Interconnect: high-bandwidth networking (InfiniBand, Ethernet fabrics).
- Energy: data centers draw enormous electricity; a dense AI rack can pull 100 kW or more.
- Cooling and real estate: specialized infrastructure to handle the heat.
- Software stack: CUDA, Triton, ROCm, and distributed training frameworks.
Miss any layer, and your performance collapses.
That's why hyperscalers like AWS, Azure, and Google Cloud now invest directly in chip design.
(Source: SemiAnalysis, July 2025)
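The claim that missing any layer collapses performance can be made concrete with a back-of-the-envelope roofline check. The sketch below uses illustrative, assumed hardware numbers (not vendor specs) to show how a training step is bound by whichever resource, compute, memory, or network, is slowest:

```python
# Back-of-the-envelope "roofline" check: which layer of the stack
# bottlenecks one training step? All hardware numbers below are
# illustrative assumptions, not measured vendor figures.

def step_bottleneck(flops, bytes_moved, bytes_networked,
                    peak_flops, mem_bw, net_bw):
    """A step can't finish faster than its slowest resource allows."""
    compute_t = flops / peak_flops          # seconds spent on math
    memory_t = bytes_moved / mem_bw         # seconds moving HBM traffic
    network_t = bytes_networked / net_bw    # seconds on the interconnect
    return max(
        ("compute", compute_t),
        ("memory", memory_t),
        ("network", network_t),
        key=lambda kv: kv[1],
    )

# Illustrative step for a large model shard on one accelerator:
which, t = step_bottleneck(
    flops=2e15,            # 2 PFLOP of work this step (assumed)
    bytes_moved=1e13,      # 10 TB of HBM traffic (assumed)
    bytes_networked=4e11,  # 400 GB of gradient all-reduce (assumed)
    peak_flops=1e15,       # ~1 PFLOP/s accelerator (assumed)
    mem_bw=3e12,           # ~3 TB/s HBM bandwidth (assumed)
    net_bw=5e10,           # ~50 GB/s interconnect (assumed)
)
print(which, round(t, 2))  # prints "network 8.0": the interconnect,
                           # not the GPU, sets the pace here
```

With these assumed numbers, the GPU would finish its math in 2 seconds, but the all-reduce takes 8, which is exactly why hyperscalers obsess over interconnect, not just chips.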
## The Geopolitics of GPUs
Compute is now a national security issue.
- The U.S. has imposed export bans on high-end GPUs to China, tightening AI chip controls.
- China is racing to replicate NVIDIA's performance domestically through startups like Biren and Moore Threads.
- Europe is funding sovereign chip efforts via its €43B Chips Act.
- The Middle East (notably the UAE and Saudi Arabia) is building "AI oases": data centers powered by cheap energy and sovereign models.
This arms race has geopolitical consequences.
Whoever controls compute controls the next decade of innovation.
(Source: Financial Times, Oct 2025)
## The Power Problem

Training GPT-5-scale models isn't just about chips. It's about power.
AI datacenters now consume more energy than the entire Bitcoin network did at its peak.
As a result, data-center developers are co-locating near renewable grids, hydro plants, and nuclear facilities.
Microsoft's OpenAI partnership alone requires ~5 GW of new power build-out by 2028, equivalent to about five nuclear reactors.
Some startups are experimenting with modular nuclear reactors and AI-powered energy optimization.
But the short-term reality: electricity is the next constraint.
(Source: BloombergNEF, Sept 2025)
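That ~5 GW figure is easy to sanity-check with rough arithmetic. This is a sketch with assumed inputs (GPU count, per-accelerator draw, facility overhead), not reported data:

```python
# Rough sanity check on AI datacenter power needs.
# All inputs are assumptions for illustration, not reported figures.

def fleet_power_gw(num_gpus, watts_per_gpu, overhead_factor):
    """Total facility draw in gigawatts.

    overhead_factor (a PUE-style multiplier) covers cooling,
    networking, and power-conversion losses on top of the chips.
    """
    return num_gpus * watts_per_gpu * overhead_factor / 1e9

# Assume a 5-million-GPU build-out at ~700 W per accelerator
# with a 1.4x facility overhead:
gw = fleet_power_gw(num_gpus=5_000_000, watts_per_gpu=700,
                    overhead_factor=1.4)
print(round(gw, 1))  # prints 4.9 -- the same order as the ~5 GW above
```

A few million accelerators at hundreds of watts each, plus cooling overhead, lands in gigawatt territory, which is why developers are chasing hydro and nuclear sites.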
## Enter the Alternatives: TPUs, ASICs, and Wafer-Scale Chips
With NVIDIA supply choked, everyone is building alternatives.
- Google doubled down on TPU v5p, optimized for massive LLM training.
- Cerebras sells wafer-scale systems that train models on a single giant chip.
- Groq and Tenstorrent focus on low-latency inference for agents.
- AMD's MI300X is finally competitive on memory bandwidth.
Yet even with this progress, NVIDIA's CUDA ecosystem remains the moat.
Most open-source training code simply won't run elsewhere without months of rewriting.
(Source: TechCrunch Hardware Review, Oct 2025)
## How Startups Are Surviving the Compute Drought
Smaller AI startups are learning to survive by being clever, not rich.
1. Renting ephemeral compute: Platforms like Lambda Cloud, CoreWeave, and Modal let teams rent GPU clusters by the minute.
2. Using quantization: 4-bit and 8-bit quantized models cut memory use dramatically.
3. Distillation: Startups fine-tune smaller models (Mistral 7B, Claude Haiku) instead of training from scratch.
4. Model sharing: Open models like Llama 3.2 and Phi-3 mini are enabling rapid prototyping.
It's the "frugal AI" movement: efficient, modular, and open.
(Source: Hugging Face āEfficientAIā Report, 2025)
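The memory savings behind point 2 follow from simple arithmetic: weight memory is roughly parameter count times bytes per parameter. A minimal sketch, using a 7B-parameter model as the worked example (weights only, ignoring activations, KV cache, and optimizer state):

```python
# Estimate weight memory for a model at different precisions.
# Weights only: activations, KV cache, and optimizer state are ignored.

def weight_gib(num_params, bits_per_param):
    """Gibibytes needed just to hold the weights."""
    return num_params * bits_per_param / 8 / (1024 ** 3)

params_7b = 7_000_000_000
for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: {weight_gib(params_7b, bits):.1f} GiB")
# fp16 needs ~13.0 GiB; int4 drops that to ~3.3 GiB,
# small enough to fit on a single consumer GPU
```

Going from fp16 to 4-bit cuts weight memory roughly 4x, which is the entire economic case for quantized inference on scarce hardware.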
## The Economics of Compute

| Layer | Cost Structure | Trend |
|---|---|---|
| Training | $2M–$10M for frontier models | Increasing |
| Fine-tuning | $50K–$500K | Stable |
| Inference | $0.001–$0.02 per query | Falling |
| Storage | $20K–$50K/month for petabyte-scale logs | Rising due to energy costs |
The cost bottleneck is pushing founders to move toward agentic workflows: smaller, purpose-built models instead of monolithic LLMs.
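The per-query inference range in the table comes down to two numbers: what a GPU costs per hour and how many queries it serves in that hour. A sketch with assumed rental rates, throughput, and utilization:

```python
# Per-query cost = hourly GPU rent / queries served per hour.
# The rate, throughput, and utilization below are assumptions
# for illustration, not quotes from any provider.

def cost_per_query(gpu_dollars_per_hour, queries_per_second, utilization):
    """Effective $/query at a given sustained utilization (0..1)."""
    queries_per_hour = queries_per_second * 3600 * utilization
    return gpu_dollars_per_hour / queries_per_hour

# A $2.50/hr GPU sustaining 1 query/s at 70% utilization:
c = cost_per_query(2.50, queries_per_second=1, utilization=0.7)
print(f"${c:.4f} per query")  # prints "$0.0010 per query",
                              # the bottom of the table's range
```

Raise throughput via batching and quantization and the cost falls; let utilization idle and it climbs, which is why the table's inference row trends "Falling" as serving stacks improve.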
## Compute-as-a-Service: The Next Gold Rush

In the same way AWS turned servers into a service, a new wave of startups is turning GPUs into flexible "compute credits."
- CoreWeave (valuation $19B): offers GPU leasing like cloud infrastructure.
- Together AI ($500M in funding): shared LLM hosting for open models.
- RunPod: pay-per-GPU clusters for developers.
- Modal: serverless GPU jobs for startups.
Even OpenAI and Anthropic are entering this market, renting spare compute to third parties.
The result: compute markets are becoming financialized, with capacity that is tradable, bookable, and auctioned.
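At its simplest, a financialized compute market is price-sorted matching of jobs to offers. A toy sketch of that mechanic, with entirely hypothetical provider names, prices, and capacities:

```python
# Toy spot-market matcher: fill a GPU-hour order from the cheapest
# offers first. Every provider, price, and capacity is hypothetical.

def fill_order(offers, gpu_hours_needed):
    """Greedily buy from the cheapest offers.

    offers: list of (provider, dollars_per_gpu_hour, gpu_hours_available)
    Returns (allocations, total_cost).
    """
    allocations, total_cost, remaining = [], 0.0, gpu_hours_needed
    for provider, price, capacity in sorted(offers, key=lambda o: o[1]):
        if remaining <= 0:
            break
        take = min(capacity, remaining)
        allocations.append((provider, take))
        total_cost += take * price
        remaining -= take
    return allocations, total_cost

offers = [  # (provider, $/GPU-hr, GPU-hrs available) -- hypothetical
    ("alpha-cloud", 2.10, 400),
    ("beta-gpu", 1.80, 250),
    ("gamma-spot", 2.60, 1000),
]
allocs, cost = fill_order(offers, gpu_hours_needed=500)
print(allocs, round(cost, 2))  # splits the order across the two
                               # cheapest providers for $975.00
```

Real marketplaces layer reservations, auctions, and futures on top, but the core primitive is this cheapest-first fill.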
## The Competitive Landscape

A handful of companies now control 90% of the world's usable AI compute.
That's sparking concern among policymakers, and opportunity for entrepreneurs.
- New funds like a16z's Infra II are investing in decentralized compute networks (e.g., Akash, Gensyn, IO.Net).
- Web3-linked startups are tokenizing compute supply to democratize access.
- Edge inference and on-device AI (Apple, Qualcomm, AMD) are reducing dependence on centralized clusters.
In short: the future of AI will be shaped by who can access GPUs cheaply and sustainably.
(Source: a16z āCompute Inequality Index,ā Sept 2025)
## What Founders Should Do Right Now

- Start small, fine-tune smart. Avoid full training runs; distill existing models instead.
- Leverage open ecosystems: Hugging Face, Together AI, Modal.
- Design for inference efficiency: optimize latency and model size.
- Negotiate GPU credits early: apply for NVIDIA and AWS startup programs.
- Focus on differentiated data: compute without data is just heat.
## Final Take

The AI industry doesn't run on code; it runs on compute.
And just like oil once powered the industrial age, GPUs now power intelligence itself.
The founders who win this decade won't just build smarter models; they'll secure, optimize, and trade compute like a resource.
Because in 2025, the most valuable startup isn't another chatbot.
It's the one that can train without waiting for a GPU delivery.
We hope you enjoyed this Latestly AI edition.
Got an AI tool for us to review, or do you want to collaborate?
Send us a message and let us know!
