Based on Garry Tan's agent architecture series (April 2026) + Jarvis portability audit
Thin Harness, Fat Skills
"Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN."
Core Concepts
1. Skill File
Reusable markdown procedure teaching a model HOW to do something. Works like a method call with parameters. The same /investigate skill becomes a medical analyst or forensic investigator depending on arguments.
"Markdown is actually code."
2. Harness
The program running the LLM. It does four things: runs the model in a loop, reads/writes files, manages context, enforces safety. Anti-pattern: a fat harness with thin skills (40+ tool definitions eating half the context window).
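Those four duties fit in a few dozen lines. A minimal sketch, assuming a single-threaded tool loop — every type and function name here is invented for illustration, not taken from any real harness:

```typescript
// Thin harness sketch: run the model in a loop, dispatch its tool calls
// to deterministic functions, stop when it emits a final answer.
type ToolCall = { tool: string; args: string };
type ModelStep = { done: boolean; text: string; call?: ToolCall };
type Model = (transcript: string) => ModelStep;       // stand-in for an LLM API
type Tools = Record<string, (args: string) => string>; // deterministic foundation

function runHarness(model: Model, tools: Tools, task: string, maxSteps = 10): string {
  let transcript = task;                    // context management, kept trivial here
  for (let i = 0; i < maxSteps; i++) {      // the loop
    const step = model(transcript);
    if (step.done) return step.text;        // model finished
    if (step.call) {
      const fn = tools[step.call.tool];
      if (!fn) throw new Error(`unknown tool: ${step.call.tool}`); // safety
      transcript += `\n[${step.call.tool}] ${fn(step.call.args)}`; // feed result back
    }
  }
  throw new Error("step budget exhausted"); // safety: never loop forever
}
```

The point of the sketch is what is NOT in it: no 40 tool schemas, no retrieval logic, no domain knowledge. All of that lives in skills and in the deterministic tools it dispatches to.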
3. Resolver
A routing table for context. When task type X appears, load document Y first. Skills say HOW. Resolvers say WHAT to load WHEN. Tan's CLAUDE.md was 20,000 lines; the fix was ~200 lines of pointers.
"Architecture astronomy" — Do the right job at the right layer. Everything else is architecture astronomy.
4. Latent vs. Deterministic
Intelligence (judgment, synthesis) lives in latent space; trust (same input, same output) lives in deterministic code.
"An LLM can seat 8 people at a dinner table. Ask it to seat 800 and it will hallucinate."
5. Diarization
The model reads everything about a subject and writes a structured profile. "Read 50 documents, produce 1 page of judgment." No SQL or RAG pipeline can do this.
Architecture: Three Layers
Fat Skills (Markdown Procedures)
90% of value — encoding judgment, workflow, domain knowledge
Thin CLI Harness (~200 Lines)
JSON in, text out — loop, context, safety
Deterministic Foundation (Your App)
QueryDB, ReadDoc, Search — same input, same output
"Push smart fuzzy operations humans do into markdown skills. Fat skills. Push must-be-perfect deterministic operations into code. Fat code. The harness? Keep it thin."
Jarvis Alignment Audit
| Tan Concept | Jarvis Implementation | Alignment |
| --- | --- | --- |
| Fat skills (markdown procedures) | skills/ + .claude/skills/ (17 skills) | Strong |
| Thin harness | Claude Code CLI + jarvis-bot dispatcher | Strong |
| Resolver (routing table) | RESOLVER.md + decision tree | Strong |
| checkResolvable() | scripts/check-resolvable.sh | Direct Port |
| Memory is markdown | MEMORY.md + ~240 satellite files in git | Strong |
| Deterministic foundation | Gateway → Screener → Processor pipeline | Strong |
| Diarization | VEP enrichment pipeline | Partial |
| Latent vs Deterministic split | Karpathy Principle (programmatic first, LLM last resort) | Strong |
| Self-improving skills | Lesson capture + feedback loop | Partial |
Gaps & Divergences
Harness Thickness
Jarvis's harness (~15k lines of TypeScript) is thicker than Tan's ~200-line ideal. This is justified by scope: multi-channel I/O (Telegram, email, cron, dashboard) that a CLI-only model doesn't address. The thickness comes from scope, not bloat.
Skill Triggers in Frontmatter
gstack v0.18 added `triggers:` arrays to all 38 skill templates — multi-word keywords that power the resolver. Jarvis skills don't have trigger keywords yet.
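A `triggers:` array in skill frontmatter might look like the following sketch. The surrounding fields and exact schema are assumptions for illustration; gstack's actual template format isn't reproduced here:

```yaml
---
name: investigate
description: Deep-dive on a subject and produce structured findings
triggers:
  - "look into"
  - "background check"
  - "deep dive on"
---
```

Multi-word phrases matter here: a resolver matching on `"look into"` fires far less spuriously than one matching on `look`.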
Self-Improving Skills
Tan's /improve reads feedback, extracts patterns, and rewrites matching rules back into skill files autonomously. Jarvis captures lessons but doesn't auto-rewrite the skills that produced them.
Duplicate Skill Systems
gstack has one skill system. Jarvis has two (skills/ legacy + .claude/skills/). One should be canonical.