AI energy · Topic

Efficiency: chips, models, software

Energy per unit of useful AI work keeps falling thanks to better chips, smaller and smarter models, and software optimizations. These efficiency gains partly offset rising demand, but total energy use still grows as adoption expands.

Key facts

  • Each new accelerator generation typically delivers more performance per watt than the one before.
  • Techniques like quantization, distillation, and smaller specialized models cut energy per task.
  • Software and serving optimizations (batching, caching, better scheduling) reduce waste.
  • Efficiency rarely reduces total energy use, because cheaper AI tends to be used much more (the rebound effect).
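The rebound effect in the last point comes down to simple arithmetic: total energy is energy per query times the number of queries, so total use only falls if efficiency improves faster than usage grows. Here is a minimal sketch in Python. The function names and every number in it are illustrative assumptions, not measurements.

```python
def total_energy_wh(energy_per_query_wh: float, queries: float) -> float:
    """Total energy in watt-hours: energy per query times number of queries."""
    return energy_per_query_wh * queries


def rebound_ratio(efficiency_gain: float, demand_growth: float) -> float:
    """Ratio of new total energy to old total energy.

    efficiency_gain: factor by which energy per query falls
                     (2.0 means each query needs half the energy).
    demand_growth:   factor by which the number of queries rises
                     (3.0 means three times as many queries).
    A result above 1.0 means total energy went up despite the efficiency gain.
    """
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    if demand_growth < 0:
        raise ValueError("demand_growth must be non-negative")
    return demand_growth / efficiency_gain


if __name__ == "__main__":
    # Hypothetical baseline: 1.0 Wh per query, 1,000,000 queries.
    before = total_energy_wh(1.0, 1_000_000)
    # Hypothetical change: queries become 2x more efficient, usage triples.
    after = total_energy_wh(1.0 / 2.0, 1_000_000 * 3.0)
    print(f"before: {before:.0f} Wh, after: {after:.0f} Wh")
    print(f"ratio: {rebound_ratio(2.0, 3.0):.2f}")
```

In this made-up scenario, each query uses half the energy, yet total energy rises by half because usage tripled. If usage had grown by less than the efficiency factor, the ratio would fall below 1.0 and total energy would drop.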

Takeaway

Efficiency lowers energy per query but rarely total demand, because cheaper AI gets used far more.
