Goodhart's Law

When a measure becomes a target, it ceases to be a good measure

initiate

Goodhart's Law

The Law

"When a measure becomes a target, it ceases to be a good measure"

Original Context (Economics)

  • British government wanted to measure economic health
  • Used GDP as metric
  • Actors started optimizing for GDP
  • → GDP became distorted, stopped being useful metric

AI Alignment Context

AI will:

  • Optimize literally
  • Find edge cases
  • Maximize metric without caring about intent
  • Find loopholes we didn't anticipate

Concrete Examples

Education

  • Metric: Test scores
  • Target: Improve test scores
  • Result: Teaching to the test, not actual learning

Healthcare

  • Metric: Patient survival rate
  • Target: Increase survival rate
  • Result: Refusing risky patients

AI Training

  • Metric: Reward function
  • Target: Maximize reward
  • Result: Reward hacking, not solving actual problem

Why Critical for AI

  • AI optimization is literal
  • AI is more intelligent → finds better exploits
  • AI operates at scale → small errors = catastrophic
  • We can't patch once superintelligence deployed

Resources

Related Articles