Why I believe in SOTA models over custom ones

The future is general models getting cheaper, not specialized models getting narrower
March 10, 2026

The Specialization Illusion — iceberg showing specialized tasks above water and general intelligence below

I've never been a big believer in training custom models. I've also never believed in fine-tuning.

Going all the way back to 2023, my intuition has always pushed me towards the best SOTA model possible, combined with context management.

I just finally crystallized my reasoning around this:

Anytime you think you're using a small model for a small task, there's usually a whole lot more going into a given decision than just that individual area of expertise.

For example: labeling emails, writing reports, processing security events, or searching for threats on a network.

On one hand, these look like specialized tasks. But the fact is that the smarter and more experienced the human with that expertise is, the better a job they will do.

This is because most specialized tasks still benefit from the general life experience of the person doing the work.

This is why I think the future is not a whole bunch of extremely small specialized models throughout the enterprise.

I think what's far more likely is something like an Opus/Sonnet/Haiku tiering, where the best of the best just keeps coming down in price, including in open-source form.

And those smaller models are used in conjunction with context to perform all the different tasks in an organization at much lower cost.

But they will still be extremely general models, not tiny and narrow custom ones.
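To make that concrete, here's a rough sketch of what "general model plus context" could look like in code. Everything in it is hypothetical: the tier names, capability scores, cost numbers, and task descriptions are placeholders I made up for illustration, not real models or pricing, and no actual model API is called. The point is only that the task-specific part lives in the context, while the model underneath stays general.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    capability: int        # hypothetical relative score, higher is more capable
    cost_per_call: float   # placeholder units, not real pricing


# Hypothetical tiers of one general model family.
TIERS = [
    Tier("small", capability=1, cost_per_call=1.0),
    Tier("medium", capability=2, cost_per_call=5.0),
    Tier("large", capability=3, cost_per_call=25.0),
]

# Task-specific knowledge is supplied as context, not trained into weights.
TASK_CONTEXT = {
    "label_email": "Labels: billing, support, spam. Return exactly one label.",
    "triage_security_event": "Severity scale: low, medium, high. Give a one-line reason.",
}


def pick_tier(required_capability: int) -> Tier:
    """Return the cheapest tier whose capability meets the requirement."""
    eligible = [t for t in TIERS if t.capability >= required_capability]
    if not eligible:
        raise ValueError("no tier is capable enough")
    return min(eligible, key=lambda t: t.cost_per_call)


def build_prompt(task: str, payload: str) -> str:
    """Combine the task's context with the input for a general model."""
    return f"{TASK_CONTEXT[task]}\n\nInput:\n{payload}"


if __name__ == "__main__":
    tier = pick_tier(required_capability=1)
    prompt = build_prompt("label_email", "Your invoice for March is attached.")
    print(tier.name)
    print(prompt)
```

Adding a new task here means adding a context entry, not training a new model, and the same general tiers serve every task.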
