We've Been Thinking About AI All Wrong

AI is just a way to execute Intelligence Tasks that only humans can (could) do

July 30, 2024

#ai #business #ethics #future #innovation #productivity #society #technology

AI Intelligence Pipeline

When I tell people that AI is going to separate people into haves and have-nots, or increase our global productivity by trillions of dollars, most don't believe me.

I realize now why that is. It's because most people don't have the right mental model for thinking about AI.

When most people think AI they think image generation or chatbots. And understandably so—since those were the first applications of what's now called GenAI.

But it's much better to think of AI as an Intelligence Pipeline.

What the hell does that mean?

Great question. An Intelligence Pipeline is a series of Intelligence Tasks that result in a useful output. And Intelligence Tasks are functions that can only be done using human intelligence.

Here are some real-world examples.

Intelligence Pipeline Examples

Before we get into these, let's highlight the point by doing something crazy. Let's completely abandon the word "AI". It's a silly word, and it means 100 different things depending on who you ask.

Instead I want you to think about people. Humans. And specifically, human workers.

So imagine a person—let's call them Chris—who works in a cube with a computer. Chris has a coffee next to him, and a small plant. And a picture of his girlfriend and his dog on the cube wall.

Chris's job

Chris works at a company called CutePup. CutePup finds pictures of cute dogs and puts them on the CutePup website.

Chris is a member of a Process Team that does one part of the company workflow. Here's the whole process.

Take an uploaded picture and determine if it's a dog
Determine if the dog is cute
Determine what kind of dog it is
Post all cute dogs on the website in the section for its breed

So the workflow looks like this:

CutePup Workflow

The CutePup Workflow

That's it. That's what CutePup does.

Chris is not alone in his building. He's in a cube farm with 48,912 other people.

Chris is part of the Process 1 team, so his job is to determine whether a picture is a dog or not. Here's what he sees on his screen all day:

Chris's screen

That Chris lyfe

This one is a cat, so Chris clicks on the No button.

Chris's teammates

Carol sits next to Chris. She works in Process 2. She only gets photos that Process 1 has determined are dogs, and she has a screen that asks her if the dog is cute or not.

Carol's screen

Carol has a better job

Next to Carol is Amir who works in Process 3. Amir is an expert on dog breeds.

When a dog pops up, Amir looks at it and types the breed into a text box.

Amir's screen

You've got to know a lot of dogs

Why use humans and not just computer code?

You might be wondering why we don't have computers do this.

Well, because they can't. You can't ask Python or C++ if something is a dog or not. Or if that dog is cute.

You need a human for that. You need Intelligence.

So, the CutePup workflow looks like this:

Is it a dog?
Is it cute?
What kind is it?

That's three different tasks that require human intelligence. That's an Intelligence Pipeline, and each node in the Pipeline is an Intelligence Task.
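As a rough sketch, the CutePup Pipeline can be written as three functions chained together, where each function stands in for one worker's judgment. The `Photo` fields and the stub workers below are hypothetical and exist only for illustration: real judgment is exactly the part that ordinary code cannot supply, so here it is faked by reading pre-filled labels.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Photo:
    """An uploaded picture. The label fields are hypothetical stand-ins
    for what a human would perceive when looking at it."""
    filename: str
    looks_like_dog: bool
    looks_cute: bool
    apparent_breed: str


# Each Intelligence Task is a function from a photo to a judgment.
IsDog = Callable[[Photo], bool]
IsCute = Callable[[Photo], bool]
IdentifyBreed = Callable[[Photo], str]


def run_pipeline(photo: Photo, is_dog: IsDog, is_cute: IsCute,
                 identify_breed: IdentifyBreed) -> Optional[str]:
    """Return the breed section to post under, or None if the photo is rejected."""
    if not is_dog(photo):          # Process 1 (Chris)
        return None
    if not is_cute(photo):         # Process 2 (Carol)
        return None
    return identify_breed(photo)   # Process 3 (Amir)


# Stub workers that simply read the labels off the photo (assumption).
def chris(photo: Photo) -> bool:
    return photo.looks_like_dog


def carol(photo: Photo) -> bool:
    return photo.looks_cute


def amir(photo: Photo) -> str:
    return photo.apparent_breed


corgi = Photo("a.jpg", True, True, "corgi")
cat = Photo("b.jpg", False, True, "")

print(run_pipeline(corgi, chris, carol, amir))  # corgi
print(run_pipeline(cat, chris, carol, amir))    # None
```

The pipeline itself is trivial plumbing. All of the difficulty lives inside the three functions passed in, which is the point of calling them Intelligence Tasks.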

Let's look at a more complex example.

ClaimRight Insurance

ClaimRight is an insurance company that pays people out if their products wear out before they're supposed to. It's for all sorts of products, like scooters, tents, baby strollers, etc.

But they don't pay out if it's fraud or abuse of the product. Here's the workflow:

ClaimRight workflow

Checking for fraud and abuse of the product

Look at the 50 pictures of the item that are submitted as part of a claim
Determine if the item is covered by ClaimRight
Review the video of the submitter talking through the photos they took
Determine if it's the same person who took out the policy based on their face and their voice
Determine whether the item in the video is the same as the item in the photos
Determine whether the damage in the photos is from normal wear-and-tear or from abuse
If everything adds up, mark it as wear-and-tear and pay out the policy.
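The ClaimRight steps above can be sketched as an ordered list of checks that stops at the first failure. This is only an illustration: the `Claim` dictionary and its keys are hypothetical, and each lambda stands in for a judgment that a reviewer would actually make by looking at the photos and video.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record of a reviewer's findings for one claim.
Claim = Dict[str, bool]
Check = Tuple[str, Callable[[Claim], bool]]

# The checks run in the same order as the workflow steps.
CHECKS: List[Check] = [
    ("item is covered", lambda c: c["item_covered"]),
    ("submitter is the policyholder", lambda c: c["same_person"]),
    ("video item matches photos", lambda c: c["same_item"]),
    ("damage is wear-and-tear", lambda c: c["wear_and_tear"]),
]


def review_claim(claim: Claim) -> Tuple[bool, str]:
    """Return (pay_out, reason). Stops at the first check that fails."""
    for name, check in CHECKS:
        if not check(claim):
            return False, f"failed: {name}"
    return True, "pay out"


good_claim = {"item_covered": True, "same_person": True,
              "same_item": True, "wear_and_tear": True}
abused_item = dict(good_claim, wear_and_tear=False)

print(review_claim(good_claim))   # (True, 'pay out')
print(review_claim(abused_item))  # (False, 'failed: damage is wear-and-tear')
```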

Kira works at ClaimRight, along with 349,219 other people in the Boise office. She has a plaque on her cube for 25 years of service. She's really good at determining the difference between wear-and-tear and abuse.

And she's not just good at it—she's fast. In her 8 hour day, not counting lunch and breaks and stuff, she can get through an average of 29 cases per day!

29!

That's 11 more than the median, and with an 89% accuracy rating, which is top 2% in the company.

Now let's look at something even more cognitively difficult.

Overseer

Kevin works at Overseer. They're a military intelligence service company that sells intelligence reports to the US government. They specialize in watching all the military bases in a foreign country using satellite images, and then determining what that country is doing militarily.

Here's the Pipeline.

Overseer workflow

Lots of analysis and expertise needed in multiple places

Look at the 28,452 satellite images that come in every day
Compare the images to the previous day's images
Identify everything in the new image
Determine what changed since the last image
Determine the military significance of those changes
Construct a narrative around that significance, framed for a particular customer within the government
Write the report
Submit the report

Kevin is an employee at Overseer, and he's kind of a genius. Among the 712,309 people who work at his company (there are hundreds of satellites and hundreds of places of interest to monitor), he's one of the few who can work in Process 2, Process 3, and Process 5. Plus he's pretty good at 6 and 7. Most people can only do one or two.

And like Kira at ClaimRight, Kevin is super fast. He can actually do 9 reports per week! End-to-end if necessary. And his accuracy is off the charts at 86%.

Let's look at another example—this time in Medicine.

BadSpot checks for moles

BadSpot is a company that checks for dangerous moles on people. You send in the picture and it determines if it's something you need to worry about.

Here's the BadSpot Intelligence Pipeline.

BadSpot workflow

Decades of schooling and experience required

With CutePup and ClaimRight the stakes were pretty low. Maybe you get an occasional cat in your dog pics, or maybe the insurance policy pays out when it shouldn't have. No biggie.

But with Overseer and BadSpot, we're talking about military intelligence and health. So we're potentially dealing with people's lives.

And as you might expect, the level of expertise required is much higher. Think about the intelligence, knowledge, and experience needed to execute the Intelligence Tasks in these Pipelines:

Overseer

Know thousands of different military vehicles
Know the military history of the target country
Know all their recent military moves
Correlate that data with what's happening in the news
Correlate that with what's happening in other intel reports
Experience with analyzing satellite photos
Experience with detecting techniques that attempt to hide vehicles and military activity
Expertise in writing intel reports for different audiences

BadSpot

Anyone doing the job must be a Doctor (M.D.)
So that's medical school, a residency, and then however long they've been practicing
The more intelligent, creative (think of the TV show House), and experienced they are, the better they are at finding the Bad Spots.

One thing both of these Intelligence Pipelines have in common is that there aren't many people who can do the Intelligence Tasks involved. Like, there aren't many people on the planet who can do these things. We're talking a few thousand at most.

More on that later. First let's look at how common these types of Tasks and Pipelines are throughout society.

More Intelligence Task and Pipeline Examples

As it turns out, business is nothing but collections of these types of intelligence tasks and pipelines.

Here are a bunch more Intelligence Tasks we all recognize from the corporate world.

Office work

summarize_meeting
send_summary_to_stakeholders
read_report
proofread_document
create_meeting
organize_event

Programming work

solve_problem
write_code
research_better_way
check_for_security_issues
check_peers_code
approve_pr

Customer Service work

read_complaint
check_customer_history
check_for_fraud
check_current_policy
respond_to_customer
make_customer_happy

Medical work

analyze_mole
diagnose_disease
write_prescription
analyze_xray
assess_patient
analyze_mri
talk_with_family

Researcher work

find_sources
rate_sources
summarize_article
rate_article
extract_key_ideas
synthesize_ideas
perform_analysis
write_report
submit_report
find_funding

Manager work

interview_candidate
give_performance_review
manage_budget
document_program_progress
write_progress_update
create_progress_update_presentation
deliver_presentation

Creative Work

brainstorm
riff_on_idea
expand_idea
write_first_draft
create_art
write_prose

And the list goes on…

The thing that unifies all these tasks is that you can't give them to a computer program to execute.

These are things that only humans can do. These aren't just work tasks, they're Intelligence tasks.

Similarities across tasks and pipelines

Now let's look at some similarities across all these tasks and pipelines.

Above we looked at four different companies: CutePup, ClaimRight, Overseer, and BadSpot—all doing various thinking-based activities that require human intelligence. And then we looked above at a whole bunch more examples of intelligence-based tasks.

Now that we've talked about them, let's look at what makes someone good or bad at these things.

Traits that make people good at intelligence-based tasks

Here are some attributes that make great employees in knowledge work.

Smarts — how sharp are they at finding patterns and adjusting?
Knowledge — how much do they know about the field?
Experience — how many examples have they seen?
Consistency — do they deliver high-quality after 8 hours of doing it?
Attention-to-detail — do they catch the details?
Speed — How many of these tasks can they do in a period?
Dependability — do they call in sick or take lots of vacation?
Autonomy — How independent are they at doing the task?
Trustworthiness — are we sure they haven't been paid off?
Caution — do they cause problems we have to clean up?
Learning — do they learn new stuff quickly?

I think these are solid attributes. Now let's collapse them into a few metrics.

ITEM — Intelligence Task Execution Metrics

So the metrics concept we'll remember as ITEM (EYE-tehm), and the metrics themselves we'll remember as KISAC (KAI-sack).

📘 Knowledge — The depth of their knowledge about the entire field, its history, all the main thinkers in the field, all the seminal works, all the academic theory, all the reading, all the papers, etc.
🧠 Intelligence — The ability to hold all that knowledge in their mind at once, find the patterns in the input being evaluated, and come up with insightful analysis.
🕰️ Speed — The number of those tasks they can do—per minute, day, week, etc.—at a given quality level.
🔎 Accuracy — Their accuracy, lack of mistakes, etc.
💶 Cost — The amount of money it costs to hire them, keep them employed, and keep them trained.

These are decent because they capture not only someone's ability to do a task (knowledge and intelligence), but also the performance of their outputs (speed and accuracy), as well as the cost of execution.

Coming back to AI

Right, so that was a lot of setup, and now we're able to make the main point.

The best way to think about AI—especially as it relates to business, the economy, and productivity—is to realize that AI is simply a way to execute all these various Intelligence Tasks better, more consistently, and cheaper.

Companies are Intelligence Tasks organized into Pipelines

Companies are just Intelligence Tasks organized into Pipelines

That's it. Forget all the other crap about AI.

Forget the chatbots
Forget the image generation
Forget the crazy videos

Those are distractions.

What matters is how AI will help humans do actual work that we would otherwise have had to do ourselves. And keep in mind that a lot of intelligence-heavy work isn't being done at all!

There are thousands of intelligence-based tasks that desperately need doing, but there simply aren't enough people to do them.

Watching all the meteors in the sky (Astronomy)
Tutoring (Education)
Medical Evals (Medicine)
Looking things up (Library Science)
Tracking transactions (Fraud & Corruption)
Investigations (Journalism)
Researching a Topic (Research)
Empathic and Active Listening (Mental Health)
Watching computer logs (Cybersecurity)
Watching security cameras (Physical Security)
Tracking down criminals and corruption (Journalism)
Etc.

There are literally billions of people who don't have access to teachers, tutors, therapists, nurses, researchers, journalists, etc., and all the wonderful Intelligence Tasks that they are able to do.

The planet needs hundreds of billions of these Intelligence Tasks done every day, and there are very, very few people with the education, training, certification, or availability to carry them out.

And that's just for the stuff that nobody is doing. Now let's look at the work that's actually being done using the KISAC metrics above.

Comparing humans vs. AI on Intelligence Tasks

Here are the KISAC metrics again.

📘 Knowledge — The depth of their knowledge about the entire field, its history, all the main thinkers in the field, all the seminal works, all the academic theory, all the reading, all the papers, etc.
🧠 Intelligence — The ability to hold all that knowledge in their mind at once, find the patterns in the input being evaluated, and come up with insightful analysis.
🕰️ Speed — The number of those tasks they can do—per minute, day, week, etc.—at a given quality level.
🔎 Accuracy — Their accuracy, lack of mistakes, etc.
💶 Cost — The amount of money it costs to hire them, keep them employed, and keep them trained.

📘Knowledge

👥Humans:

📚Reading: A couple thousand readings at most
💼Experience: Let's say 50 years
🔬Examples: Let's say hundreds, thousands, or tens of thousands max

🤖AI:

📚Reading: All the reading in the entire field, with perfect recall, and millions of related readings
💼Experience: The combined experience of every person who's ever done that task
🔬Examples: Tens or hundreds of millions, or maybe billions depending on the task

🧠 Intelligence

👥Humans:

Very few Einsteins or von Neumanns in the world
Max I.Q. around 180 or so
Most people at around 100
Not rising very fast at all

🤖AI:

In 2022 it was less smart than a child
In 2024 it's currently around 100 I.Q., depending on the task
Many experts agree that top models will be genius-level within a few years
In narrow applications, current models are already super-human
It's improving very quickly

🕰️ Speed

👥Humans:

Checking Moles — A few hundred a day
Report Writing — 1 to 15 a month
Article Summarization — 5 to 20 a day
Cyber Investigations — 1 to 5 a week
Rating Cute Dog Pics — 200 - 2000 a day
Assessing X-Rays — 100 - 500 a day

🤖AI:

Checking Moles — Millions per day
Report Writing — Hundreds per day
Article Summarization — Thousands per day
Cyber Investigations — Dozens per day
Rating Cute Dog Pics — Hundreds of thousands per day
Assessing X-Rays — Hundreds of thousands per day

Keep in mind—this is just for a single AI instance, and most systems will have a fleet of them performing what a single human or a small human team was doing. So multiply those numbers by 10, 100, or 1000x.

🔬Accuracy

👥Humans:

Very high accuracy if the human goes extremely slow, depending on the person and the task
Medical errors are the third largest cause of death in the US. SOURCE

🤖AI:

Some studies are already showing AI as equal to, or better than, doctors at identifying diseases, assessing moles, reading X-Rays, etc. SOURCE
Automation allows for faster use of multiple checks and validations to ensure acceptable results
AI's accuracy within a given pipeline is likely to increase over time due to the Knowledge and Intelligence advantage, whereas humans have a constant cycle of get_smart → retire → retrain

💶 Cost

👥Humans:

Expensive to train
Expensive to retrain
Expensive and time consuming to re-integrate into a team
Expensive to replace
Even more expensive for those with the best results

🤖AI:

Will cost a tiny fraction for most Intelligence Tasks
Will cost a tiny fraction for re-training and re-deployment
Upgrades to general models will often upgrade the entire fleet
The difference in cost between execution at mid-human level vs. high-human-level will likely be negligible

Comparing vs. our examples

Earlier we were talking about how fast and accurate Kira was at her job.

And she's not just good at it—she's fast! In her 8 hour day, not counting lunch and breaks and stuff, she can get through an average of 29 cases per day!
29!
That's 11 more than the median, and with an 89% accuracy rating, which is top 2% in the company.

Now imagine an AI doing this same job but with the following metrics:

NOTE: These are just estimated numbers, but I think they're fairly realistic.

29,000 a day instead of 29 (which will increase rapidly)
93% accuracy instead of 89% (which will increase rapidly)
$3,500 a year in AI costs instead of $137,200 in salary & benefits (which will decrease rapidly)
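To make the gap concrete, here is a small worked comparison using only the figures above. The `Worker` class is a hypothetical container for the Speed, Accuracy, and Cost parts of KISAC, and the cost-per-case ratio assumes that both the human and the AI work the same number of days per year, so the day count cancels out.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    cases_per_day: float
    accuracy: float      # fraction of cases handled correctly
    annual_cost: float   # dollars per year

    def correct_cases_per_day(self) -> float:
        return self.cases_per_day * self.accuracy


# Figures taken from the text; the AI numbers are the author's estimates.
kira = Worker("Kira", 29, 0.89, 137_200)
ai = Worker("AI (estimated)", 29_000, 0.93, 3_500)

throughput_ratio = ai.cases_per_day / kira.cases_per_day
annual_cost_ratio = kira.annual_cost / ai.annual_cost
# Assumption: same number of working days per year for both.
cost_per_case_ratio = throughput_ratio * annual_cost_ratio

print(round(throughput_ratio))                 # 1000
print(round(annual_cost_ratio, 1))             # 39.2
print(round(cost_per_case_ratio))              # 39200
print(round(kira.correct_cases_per_day(), 2))  # 25.81
print(round(ai.correct_cases_per_day()))       # 26970
```

Under these estimates the AI handles 1,000 times as many cases at about 1/39 of the annual cost, which works out to roughly 39,200 times cheaper per case.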

In short, humans will beat out AI in a few things for a long time to come—but for most Intelligence Tasks, AI is going to do 10-1000x the amount of work that humans can do—with as-good-or-better quality—for a fraction of the cost.

And again, this is not some theoretical or ambiguous work. This is the work we're all familiar with. It's the regular work we get hired at companies to do.

Regular work that humans get hired to do every day

That is what AI is. And that is why it matters.

Summary

People are confused about AI because they equate it with either chatbots or image generation.
The best way to clarify your thinking on it is to remove the word "AI" from the conversation entirely.
Replace the word "AI" with a unit of work that only humans can do, called an Intelligence Task.
AI is getting extremely competent at executing such tasks, and it's doing so faster, better, and cheaper every day.
Companies are just sequences of those Intelligence Tasks organized into Intelligence Pipelines that accomplish a given goal.
Which means companies and individuals that intelligently leverage AI will become dominant, while those that don't will get left behind.
Meanwhile, the Intelligence Pipelines that used to get executed by human workers will soon be mostly executed by AI.
This is why AI matters, and why it will have such an extraordinary impact on the economy and society.
