AI energy · Topic
Efficiency: chips, models, software
Energy per unit of useful AI work keeps falling thanks to better chips, smaller and smarter models, and software optimizations. These efficiency gains partly offset rising demand, but total energy use still grows as adoption expands.
Key facts
- Each new accelerator generation typically delivers more performance per watt than the one before it.
- Techniques like quantization, distillation, and smaller specialized models cut energy per task.
- Software and serving optimizations (batching, caching, better scheduling) reduce waste.
- Efficiency rarely reduces total energy use, because cheaper AI tends to be used much more (the rebound effect).
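The rebound effect in the last point comes down to simple arithmetic: total energy is energy per query times the number of queries, so total energy rises whenever usage grows by a larger factor than efficiency improves. The sketch below illustrates this under stated assumptions. The function name `energy_ratio` and the numbers are hypothetical, chosen only to show the calculation, and are not measurements.

```python
def energy_ratio(efficiency_gain: float, usage_growth: float) -> float:
    """Return new total energy divided by old total energy.

    efficiency_gain: factor by which energy per query falls
        (2.0 means each query needs half the energy).
    usage_growth: factor by which the number of queries rises
        (3.0 means three times as many queries).
    """
    if efficiency_gain <= 0 or usage_growth < 0:
        raise ValueError("efficiency_gain must be > 0 and usage_growth >= 0")
    # total = energy_per_query * queries, so the ratio of totals is
    # (usage_growth * queries) * (energy_per_query / efficiency_gain)
    # divided by (queries * energy_per_query).
    return usage_growth / efficiency_gain


# Illustrative numbers only: queries become 2x cheaper, usage triples.
rebound_case = energy_ratio(efficiency_gain=2.0, usage_growth=3.0)

# Illustrative numbers only: queries become 2x cheaper, usage grows 1.5x.
savings_case = energy_ratio(efficiency_gain=2.0, usage_growth=1.5)

print(f"rebound case: total energy changes by a factor of {rebound_case}")
print(f"savings case: total energy changes by a factor of {savings_case}")
```

A ratio above 1 means total energy went up despite the efficiency gain. A ratio below 1 means the gain outpaced the growth in usage, which the point above describes as the rarer outcome.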
Takeaway
Efficiency lowers energy per query but rarely total demand, because cheaper AI gets used far more.