Based on Garry Tan's agent architecture series (April 2026) + Jarvis portability audit
Thin Harness, Fat Skills
"Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN."
Core Concepts
1. Skill File
A reusable markdown procedure that teaches a model HOW to do something. It works like a method call with parameters: the same /investigate skill becomes a medical analyst or a forensic investigator depending on its arguments.
"Markdown is actually code."
2. Harness
The program running the LLM. It does four things: runs the model in a loop, reads/writes files, manages context, and enforces safety. Anti-pattern: a fat harness with thin skills (40+ tool definitions eating half the context window).
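Tan's own harness code isn't reproduced in the source; a minimal sketch of those four responsibilities, with hypothetical callModel and runTool stubs standing in for a real provider, might look like:

```typescript
// Thin harness sketch: model in a loop, file I/O via tools, context
// management, safety enforcement. Stubs and limits are illustrative.
type ToolCall = { name: string; args: Record<string, string> };
type ModelTurn = { text: string; toolCall?: ToolCall };

const ALLOWED_TOOLS = new Set(["ReadDoc", "QueryDB", "Search"]); // safety: allowlist
const MAX_CONTEXT_CHARS = 8000;                                  // context: hard budget

function trimContext(history: string[]): string[] {
  // Keep the most recent turns that fit the budget; drop the oldest first.
  let total = 0;
  const kept: string[] = [];
  for (const turn of [...history].reverse()) {
    if (total + turn.length > MAX_CONTEXT_CHARS) break;
    total += turn.length;
    kept.unshift(turn);
  }
  return kept;
}

function runAgent(
  task: string,
  callModel: (context: string[]) => ModelTurn,
  runTool: (call: ToolCall) => string,
  maxTurns = 10,
): string {
  let history = [task];
  for (let i = 0; i < maxTurns; i++) {             // 1. run the model in a loop
    history = trimContext(history);                 // 2. manage context
    const turn = callModel(history);
    history.push(turn.text);
    if (!turn.toolCall) return turn.text;           // model answered; we're done
    if (!ALLOWED_TOOLS.has(turn.toolCall.name)) {   // 3. enforce safety
      history.push(`blocked: ${turn.toolCall.name}`);
      continue;
    }
    history.push(runTool(turn.toolCall));           // 4. file reads etc. stay deterministic
  }
  return history[history.length - 1];
}
```

Everything judgment-shaped lives in the prompt and skills; the loop itself stays boring and auditable.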
3. Resolver
A routing table for context: when task type X appears, load document Y first. Skills say HOW; resolvers say WHAT to load WHEN. Tan's CLAUDE.md had grown to 20,000 lines; the fix was ~200 lines of pointers.
"Architecture astronomy" — Do the right job at the right layer. Everything else is architecture astronomy.
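A resolver can be a plain lookup table; a sketch, with illustrative keywords and file names rather than Tan's actual RESOLVER.md entries:

```typescript
// Resolver: maps task-type keywords to the documents to load first.
// All routes below are made-up placeholders.
const RESOLVER: Array<{ match: RegExp; load: string[] }> = [
  { match: /medical|patient/i, load: ["context/medical-glossary.md"] },
  { match: /forensic|fraud/i,  load: ["context/chain-of-custody.md"] },
  { match: /deploy|release/i,  load: ["context/release-checklist.md"] },
];

function resolve(task: string): string[] {
  // Return every document whose trigger matches; table order is priority order.
  const docs = RESOLVER.filter((r) => r.match.test(task)).flatMap((r) => r.load);
  return [...new Set(docs)];
}
```

~200 lines of pointers like these replace 20,000 lines of inlined context.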
4. Latent vs. Deterministic
Intelligence (judgment, synthesis) lives in latent space; trust (same input, same output) lives in deterministic code.
"An LLM can seat 8 people at a dinner table. Ask it to seat 800 and it will hallucinate."
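The dinner-table line is about scale. A sketch of the split, keeping the counting in deterministic code and reserving the model for bounded judgment (askModel is a hypothetical stub for a single LLM call):

```typescript
// Deterministic: same input, same output — safe at 8 guests or 800.
function seatRoundRobin(guests: string[], tables: number): string[][] {
  const plan: string[][] = Array.from({ length: tables }, () => []);
  guests.forEach((g, i) => plan[i % tables].push(g));
  return plan;
}

// Latent: fuzzy judgment the model is good at, applied to one bounded slice.
function reviewTable(table: string[], askModel: (prompt: string) => string): string {
  return askModel(`Would these guests enjoy sitting together? ${table.join(", ")}`);
}
```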
5. Diarization
The model reads everything about a subject and writes a structured profile. "Read 50 documents, produce 1 page of judgment." No SQL or RAG pipeline can do this.
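One way to sketch diarization, assuming hypothetical readDocs and callModel helpers (the prompt wording is illustrative):

```typescript
// Diarization sketch: gather everything about a subject, ask the model
// for one structured profile.
interface Profile { subject: string; summary: string; }

function diarize(
  subject: string,
  readDocs: (subject: string) => string[],
  callModel: (prompt: string) => string,
): Profile {
  const docs = readDocs(subject);
  const prompt = [
    `Read the ${docs.length} documents below about ${subject}.`,
    `Write a one-page profile: who they are, what matters, open questions.`,
    ...docs,
  ].join("\n---\n");
  return { subject, summary: callModel(prompt) };
}
```

The retrieval is deterministic; the synthesis is the part no SQL or RAG pipeline replaces.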
Architecture: Three Layers
Fat Skills (Markdown Procedures)
90% of value — encoding judgment, workflow, domain knowledge
Thin CLI Harness (~200 Lines)
JSON in, text out — loop, context, safety
Deterministic Foundation (Your App)
QueryDB, ReadDoc, Search — same input, same output
"Push smart fuzzy operations humans do into markdown skills. Fat skills. Push must-be-perfect deterministic operations into code. Fat code. The harness? Keep it thin."
Jarvis Alignment Audit
Tan Concept | Jarvis Implementation | Alignment
Fat skills (markdown procedures) | skills/ + .claude/skills/ — 17 skills | Strong
Thin harness | Claude Code CLI + jarvis-bot dispatcher | Strong
Resolver (routing table) | RESOLVER.md + decision tree | Strong
checkResolvable() | scripts/check-resolvable.sh | Direct Port
Memory is markdown | MEMORY.md + ~240 satellite files in git | Strong
Deterministic foundation | Gateway → Screener → Processor pipeline | Strong
Diarization | VEP enrichment pipeline | Partial
Latent vs Deterministic split | Karpathy Principle (programmatic first, LLM last resort) | Strong
Self-improving skills | Lesson capture + feedback loop | Partial
Gaps & Divergences
Harness Thickness
Jarvis's harness (~15k lines of TypeScript) is thicker than Tan's ~200-line ideal. Justified by scope — multi-channel I/O (Telegram, email, cron, dashboard) that a CLI-only model doesn't address. Thickness from scope, not bloat.
Skill Triggers in Frontmatter
gstack v0.18 added triggers: arrays to all 38 skill templates — multi-word keywords powering the resolver. Jarvis skills don't have trigger keywords yet.
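gstack's exact frontmatter schema isn't shown in the source; a sketch of how triggers arrays could drive resolver routing, with made-up skill names and keywords:

```typescript
// Trigger-based skill routing: each skill declares multi-word keywords,
// and the router picks the first skill whose triggers match the task.
// Skill names and triggers below are illustrative, not gstack's real set.
interface Skill { name: string; triggers: string[]; }

const SKILLS: Skill[] = [
  { name: "investigate", triggers: ["dig into", "root cause", "investigate"] },
  { name: "improve",     triggers: ["extract patterns", "rewrite skill"] },
];

function matchSkill(task: string): Skill | undefined {
  const lower = task.toLowerCase();
  return SKILLS.find((s) => s.triggers.some((t) => lower.includes(t)));
}
```

Adding such keywords to Jarvis's skill frontmatter would close this gap without touching the harness.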
Self-Improving Skills
Tan's /improve reads feedback, extracts patterns, and rewrites matching rules back into skill files autonomously. Jarvis captures lessons but doesn't auto-rewrite the skills that produced them.
Duplicate Skill Systems
gstack has one skill system. Jarvis has two (skills/ legacy + .claude/skills/). One should be canonical.
Portability Assessment
6.5 / 10
Claude Coupling Score — Lower is more portable

Moat (Yours to Keep)

  • Accumulated knowledge graph
  • Human-authored synthesis
  • Memory architecture & compounding
  • Workflow orchestration & integrations
  • Domain-specific tool chains

Commodity (Swappable)

  • The LLM doing reasoning
  • Model provider
  • Raw inference
  • Individual model features
  • SDK/API specifics
Hard Coupling
  • Claude CLI invocations in 8 files
  • CLI-specific flags: --output-format stream-json, --resume
  • .claude/ system: 22 agents, hooks, settings
  • @anthropic-ai/sdk imports in 2 files
Soft Coupling
  • Model IDs hardcoded in 8 files
  • Tool use events tied to Claude NDJSON format
  • OAuth token management in claude-env.ts
Already Portable
  • Multi-provider router (5 providers, 30+ models)
  • OpenRouter fallback with OpenAI SDK
  • LTM/VLTM memory (SQLite + markdown)
  • 240+ memory files, all domain knowledge
Migration Path

Phase 1 — SDK Swap (2-3 days)

Replace Anthropic SDK with OpenRouter. Centralize model IDs to config. Minimal disruption.
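A sketch of what centralizing model IDs could look like (the IDs below are placeholders, not real model names):

```typescript
// One registry for model IDs instead of hardcoding them in 8 files.
const MODELS = {
  primary: "provider/primary-model",     // placeholder ID
  fallback: "provider/fallback-model",   // placeholder ID
} as const;

type ModelRole = keyof typeof MODELS;

function modelFor(role: ModelRole): string {
  return MODELS[role];
}
```

Call sites then reference a role, so swapping providers later is a one-file change.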

Phase 2 — Generic CLI Wrapper (1-2 weeks)

Replace direct claude invocations with a pluggable CLI wrapper. Add an AuthProvider abstraction and a cross-provider StreamingEventAdapter.
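The abstractions named above could be a handful of interfaces; a sketch using the audit's names, with assumed signatures and a toy NDJSON adapter to show the shape:

```typescript
// Pluggable CLI wrapper: one interface per concern, so swapping backends
// is a constructor change. All signatures here are illustrative assumptions.
interface AuthProvider {
  getToken(): Promise<string>;  // OAuth, API key, whatever the backend needs
}

interface AgentEvent { type: string; payload: string; }

interface StreamingEventAdapter {
  // Translate one provider-specific wire line (e.g. NDJSON) into a common event.
  parse(line: string): AgentEvent | null;
}

interface AgentCli {
  run(prompt: string, onEvent: (e: AgentEvent) => void): Promise<void>;
}

// Toy adapter for a JSON-lines stream, to show the adapter contract.
class JsonLineAdapter implements StreamingEventAdapter {
  parse(line: string): AgentEvent | null {
    try {
      const obj = JSON.parse(line);
      return { type: String(obj.type), payload: String(obj.payload ?? "") };
    } catch {
      return null; // non-JSON noise on the stream is dropped, not fatal
    }
  }
}
```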

Phase 3 — Full Decoupling (Architectural)

Replace Claude Code CLI orchestration entirely. Migrate .claude/ agents to a provider-agnostic format. Reimplement the hooks system.

Key Sources
Garry Tan — "Thin Harness, Fat Skills" (Apr 9) + "Resolvers: The Routing Table for Intelligence" (Apr 15)
Harrison Chase — "Deep Agents" counter-thesis: harness IS the product, memory lock-in risk
Kevin Simback — Two-author knowledge systems: author: human (untouchable) vs author: agent (updatable)
Andrej Karpathy — Four AI coding failure modes, "Iron Man suit" augmentation pattern
Simon Willison — "Agents are merchants of complexity"
Steve Yegge — "10x to 100x" productivity claim, 8 levels of AI adoption
Production systems: gstack (74k+ stars, 28 skills) • gbrain (8.7k+ stars, 17,888 pages, 21 cron jobs)