data moats are not real

February 21, 2025

been thinking about how people are positioning data moats as competitive advantages in ai, and i'm increasingly convinced they're making an anti-bitter lesson argument - writing this down to clarify my thinking

the anti-bitter lesson

here's what i keep seeing: companies that think specific data moats exist are essentially betting against the bitter lesson. they believe their proprietary datasets, their special fine-tuning, their unique domain knowledge encoded in data will be their durable advantage.

rich sutton's bitter lesson teaches us that "general methods that leverage computation are ultimately the most effective, and by a large margin." for 70 years, we've watched domain-specific approaches get demolished by whoever just threw more compute at the problem.

scale wins again

take gpt-4-turbo and do extensive supervised fine-tuning on zig programming (which it's bad at). you might think you've created something valuable. but claude-4-sonnet will demolish your specially trained model just by being bigger and better at the base level—no special zig training needed.
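to make "extensive supervised fine-tuning" concrete, here's a minimal sketch of what that bet looks like in practice, assuming the openai python sdk and a hypothetical zig_sft_examples.jsonl of chat-formatted examples (whether any given model slug is actually tunable varies over time):

    from openai import OpenAI

    client = OpenAI()

    # upload a jsonl of {"messages": [...]} chat examples covering zig
    # (hypothetical file name, illustrative of the "special data" bet)
    training_file = client.files.create(
        file=open("zig_sft_examples.jsonl", "rb"),
        purpose="fine-tune",
    )

    # kick off a supervised fine-tuning job on the base model
    # (mirrors the post's example; fine-tuning availability differs by model)
    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-4-turbo",
    )
    print(job.id)

the point isn't the code, it's that everything in it is commodity: anyone can run the same loop on the same kind of data, and a stronger base model erases the edge anyway.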

this isn't just about programming languages. in computer vision, we saw the same pattern—"early methods conceived of vision as searching for edges, or generalized cylinders, or in terms of SIFT features. but today all this is discarded." the winners didn't have better data; they had better ways to leverage compute.

missing pieces

there are really two pieces that haven't been solved yet:

  1. being a good base knowledge worker
  2. some form of continual learning (as dwarkesh points out — models can't yet learn your preferences the way a human employee would)

both of these are extremely scale-prone, meaning more scale will likely solve them. they're not defensible positions, just temporary limitations at current levels of intelligence.

it's not obvious to me what other specific jobs need tuning, but even those are likely scale-prone and/or won't need much training at the next oom (order of magnitude) of intelligence.

so what

this matters both for how to think about moats while building in ai and for how to choose which problems to work on.

if your entire thesis is "we have special data," you're making a bet against the trajectory of ai progress itself. better to think about problems that:

  • leverage ai but aren't just ai wrappers
  • solve coordination/trust/human problems that pure intelligence can't
  • build network effects or switching costs unrelated to model performance

i could be very wrong here, but i don't think i am. scale has been undefeated. the bitter lesson always wins.

evolving thoughts—would love to hear counterarguments or examples where data genuinely creates lasting advantage. find me at [email protected] or book time here.

this essay was somewhat vibes-written