The Architecture

Thoughts on what the ultimate Personal AI system looks like
January 27, 2026

The last 3 months of AI development, especially the activity spawned by Moltbot (formerly Clawdbot), and a million conversations with my buddy Jason Haddix have made me think a lot about the ultimate Personal AI Infrastructure architecture.

I think the PAI project, Claude Code, OpenCode, and now Moltbot are all converging on AS3 infrastructure, but the PAIMM didn't talk about the interface or architecture at all. It was focused on the capabilities of AS3.

Here are some of the key features from the maturity model, organized by level:

The 6 Dimensions

  • Multitask Scale [ Self -> Dozens of Tool Calls/Second -> Dozens of Agents Simultaneously -> Hundreds/Thousands Simultaneously ]

AS1 (Transition to Assistants)

  • Agents start to become less important, background elements that are working to do the bidding of your Assistant

AS2 (Proactive Advocates)

  • Full Agent orchestration, including spin up and spin down, custom task assignment, etc., all happening transparently in the background without your knowledge
  • Agents become like processes on a computer wielded by a Program. We only see the Program.

AS3 (Digital Assistant)

  • Trusted Companion—AS3-level Assistants feel more like trusted companions, partners, protectors, friends, and confidants than technology
  • Continuous Advocate—Works continuously, without rest, as an Advocate. Constantly scanning the world for opportunities, threats, better deals, useful information, and ways to optimize your life according to your goals
  • Building Partner—When you sit down at a computer, your Assistant has full access to everything the computer can do, can see all your screens, can hear everything
  • Deep Understanding—Deep understanding of your full context and history as a person: your upbringing, your relationship with your parents and family, your education, your traumas, your journal, your goals, your aspirations

All these are cool, but I want an actual visual technical architecture to build towards, and that's this post.

The Architecture Components

This is an earlier version of this idea from August 2024.

In August of 2024 I said everyone would compete on these 4 components.

  1. The Model Itself — The base model, the neural net size and power
  2. Post-training — Teaching the model how to solve real-world problems
  3. Internal Tooling — Making it easier to use the model (APIs, context sizes, fine-tuning, etc.)
  4. Agent Functionality — Emulating human intelligence, decision-making, and action as part of workflows

Not bad, but we can do much better now given how much the tech has advanced.

I think the components now come down to these:

1. Intelligence

How smart the system is overall, which is a combination of model and scaffolding.

  1. The model intelligence
  2. The scaffolding in which it operates

2. Capabilities

The tools that the system's intelligence has to get work done.

  1. Integrations / Connectors, which let your system use things like Gmail, Calendar, Slack, HubSpot, etc.
  2. Skills, which are a combination of prompting and other tooling designed to perform a specific task well.
  3. Tools, which can be anything from simple CLI tools to MCPs or whatever, that are used by the system to do things.
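
To make the three capability types a bit more concrete, here is a minimal sketch of a registry that holds them side by side. Everything in it is an assumption for illustration: `Capability`, `CapabilityRegistry`, `build_demo_registry`, and the two demo capabilities are hypothetical names, not part of PAI, Moltbot, or any real framework. The point is only that a skill can be modeled as prompting wrapped around a tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# The three capability types named in the list above.
KINDS = ("integration", "skill", "tool")


@dataclass
class Capability:
    name: str
    kind: str                    # one of KINDS
    run: Callable[[str], str]    # hypothetical uniform call signature


@dataclass
class CapabilityRegistry:
    items: Dict[str, Capability] = field(default_factory=dict)

    def register(self, cap: Capability) -> None:
        if cap.kind not in KINDS:
            raise ValueError(f"unknown capability kind: {cap.kind}")
        self.items[cap.name] = cap

    def names(self, kind: str) -> List[str]:
        # Sorted names of every registered capability of the given kind.
        return sorted(c.name for c in self.items.values() if c.kind == kind)

    def invoke(self, name: str, request: str) -> str:
        return self.items[name].run(request)


def build_demo_registry() -> CapabilityRegistry:
    registry = CapabilityRegistry()
    # A tool: a simple, deterministic function.
    registry.register(
        Capability("word_count", "tool", lambda text: str(len(text.split())))
    )
    # A skill: task-specific wording layered on top of a tool.
    registry.register(
        Capability(
            "describe_length",
            "skill",
            lambda text: "This text has "
            + registry.invoke("word_count", text)
            + " words.",
        )
    )
    return registry
```

An integration (Gmail, Slack, and so on) would register the same way, with `run` wrapping whatever client that service actually provides.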

3. Orchestration

The control system for managing all the different agents and other components that make up the system.

  1. Cloud Infrastructure, which allows us to use various services to inexpensively scale our capabilities.
  2. Orchestration Logic, which is the quality of the frameworks/systems that manage coordination of work across tens/hundreds/??? of agents.
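
As a rough illustration of what orchestration logic means at its smallest, here is a sketch that fans tasks out to many agents while capping how many run at once. It is a toy under stated assumptions: `run_agent` is a hypothetical stand-in for real agent work (model calls, tool use), and `orchestrate` and `max_concurrent` are names I made up for the example. Only the `asyncio` calls are real.

```python
import asyncio
from typing import List


async def run_agent(agent_id: int, task: str) -> str:
    # Hypothetical placeholder for real agent work.
    await asyncio.sleep(0)
    return f"agent-{agent_id} finished: {task}"


async def orchestrate(tasks: List[str], max_concurrent: int = 10) -> List[str]:
    # The semaphore caps how many agents are active at the same time.
    limit = asyncio.Semaphore(max_concurrent)

    async def bounded(agent_id: int, task: str) -> str:
        async with limit:  # "spin up" on entry, "spin down" on exit
            return await run_agent(agent_id, task)

    # gather returns results in the same order as the submitted tasks.
    return await asyncio.gather(
        *(bounded(i, task) for i, task in enumerate(tasks))
    )


if __name__ == "__main__":
    for line in asyncio.run(orchestrate(["triage inbox", "check calendar"])):
        print(line)
```

A real orchestrator would also need retries, cancellation, and cost tracking, which is where the quality of the framework starts to matter.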

4. Interface

Tom Cruise using the gesture interface in Minority Report

How we as humans actually use our AI stack. Like what do we see? How do we interact with it?

  1. Web application
  2. Electron app
  3. Voice
  4. Terminal/TUI
  5. Chat services like WhatsApp, Telegram, etc.
  6. Gestures
  7. Etc…
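
One way to think about that list is that every interface is a thin adapter over the same assistant. The sketch below assumes exactly that and nothing more: `AssistantCore`, `TerminalInterface`, `ChatInterface`, and `respond` are hypothetical names, and the core just echoes its input where a real system would call the model and its capabilities.

```python
from typing import Protocol


class Interface(Protocol):
    name: str

    def render(self, reply: str) -> str: ...


class AssistantCore:
    def handle(self, message: str) -> str:
        # Placeholder logic; a real core would use the model and tools here.
        return f"echo: {message}"


class TerminalInterface:
    name = "terminal"

    def render(self, reply: str) -> str:
        return f"> {reply}"


class ChatInterface:
    name = "chat"

    def render(self, reply: str) -> str:
        return f"[assistant] {reply}"


def respond(core: AssistantCore, interface: Interface, message: str) -> str:
    # Same core, different presentation depending on the interface in use.
    return interface.render(core.handle(message))
```

Voice, gestures, or a web app would slot in as further adapters without the core changing.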