Based on Garry Tan's agent architecture series (April 2026) + Jarvis portability audit
Thin Harness, Fat Skills
"Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN."
Core Concepts
1. Skill File
A reusable markdown procedure that teaches a model HOW to do something. It works like a method call with parameters: the same /investigate skill becomes a medical analyst or a forensic investigator depending on its arguments.
"Markdown is actually code."
2. Harness
The program running the LLM. It does four things: runs the model in a loop, reads/writes files, manages context, and enforces safety. Anti-pattern: a fat harness with thin skills (40+ tool definitions eating half the context window).
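Tan's own harness code isn't reproduced in the source; a minimal sketch of those four responsibilities, with hypothetical callModel and runTool stubs standing in for a real provider, might look like:

```typescript
// Thin harness sketch: model in a loop, file I/O via tools, context
// management, safety enforcement. Stubs and limits are illustrative.
type ToolCall = { name: string; args: Record<string, string> };
type ModelTurn = { text: string; toolCall?: ToolCall };

const ALLOWED_TOOLS = new Set(["ReadDoc", "QueryDB", "Search"]); // safety: allowlist
const MAX_CONTEXT_CHARS = 8000;                                  // context: hard budget

function trimContext(history: string[]): string[] {
  // Keep the most recent turns that fit the budget; drop the oldest first.
  let total = 0;
  const kept: string[] = [];
  for (const turn of [...history].reverse()) {
    if (total + turn.length > MAX_CONTEXT_CHARS) break;
    total += turn.length;
    kept.unshift(turn);
  }
  return kept;
}

function runAgent(
  task: string,
  callModel: (context: string[]) => ModelTurn,
  runTool: (call: ToolCall) => string,
  maxTurns = 10,
): string {
  let history = [task];
  for (let i = 0; i < maxTurns; i++) {             // 1. run the model in a loop
    history = trimContext(history);                 // 2. manage context
    const turn = callModel(history);
    history.push(turn.text);
    if (!turn.toolCall) return turn.text;           // model answered; we're done
    if (!ALLOWED_TOOLS.has(turn.toolCall.name)) {   // 3. enforce safety
      history.push(`blocked: ${turn.toolCall.name}`);
      continue;
    }
    history.push(runTool(turn.toolCall));           // 4. file reads etc. stay deterministic
  }
  return history[history.length - 1];
}
```

Everything judgment-shaped lives in the prompt and skills; the loop itself stays boring and auditable.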
3. Resolver
A routing table for context: when task type X appears, load document Y first. Skills say HOW; resolvers say WHAT to load WHEN. Tan's CLAUDE.md had grown to 20,000 lines; the fix was ~200 lines of pointers.
"Architecture astronomy" — Do the right job at the right layer. Everything else is architecture astronomy.
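A resolver can be a plain lookup table; a sketch, with illustrative keywords and file names rather than Tan's actual RESOLVER.md entries:

```typescript
// Resolver: maps task-type keywords to the documents to load first.
// All routes below are made-up placeholders.
const RESOLVER: Array<{ match: RegExp; load: string[] }> = [
  { match: /medical|patient/i, load: ["context/medical-glossary.md"] },
  { match: /forensic|fraud/i,  load: ["context/chain-of-custody.md"] },
  { match: /deploy|release/i,  load: ["context/release-checklist.md"] },
];

function resolve(task: string): string[] {
  // Return every document whose trigger matches; table order is priority order.
  const docs = RESOLVER.filter((r) => r.match.test(task)).flatMap((r) => r.load);
  return [...new Set(docs)];
}
```

~200 lines of pointers like these replace 20,000 lines of inlined context.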
4. Latent vs. Deterministic
Intelligence (judgment, synthesis) lives in latent space; trust (same input, same output) lives in deterministic code.
"An LLM can seat 8 people at a dinner table. Ask it to seat 800 and it will hallucinate."
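The dinner-table line is about scale. A sketch of the split, keeping the counting in deterministic code and reserving the model for bounded judgment (askModel is a hypothetical stub for a single LLM call):

```typescript
// Deterministic: same input, same output — safe at 8 guests or 800.
function seatRoundRobin(guests: string[], tables: number): string[][] {
  const plan: string[][] = Array.from({ length: tables }, () => []);
  guests.forEach((g, i) => plan[i % tables].push(g));
  return plan;
}

// Latent: fuzzy judgment the model is good at, applied to one bounded slice.
function reviewTable(table: string[], askModel: (prompt: string) => string): string {
  return askModel(`Would these guests enjoy sitting together? ${table.join(", ")}`);
}
```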
5. Diarization
The model reads everything about a subject and writes a structured profile. "Read 50 documents, produce 1 page of judgment." No SQL or RAG pipeline can do this.
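One way to sketch diarization, assuming hypothetical readDocs and callModel helpers (the prompt wording is illustrative):

```typescript
// Diarization sketch: gather everything about a subject, ask the model
// for one structured profile.
interface Profile { subject: string; summary: string; }

function diarize(
  subject: string,
  readDocs: (subject: string) => string[],
  callModel: (prompt: string) => string,
): Profile {
  const docs = readDocs(subject);
  const prompt = [
    `Read the ${docs.length} documents below about ${subject}.`,
    `Write a one-page profile: who they are, what matters, open questions.`,
    ...docs,
  ].join("\n---\n");
  return { subject, summary: callModel(prompt) };
}
```

The retrieval is deterministic; the synthesis is the part no SQL or RAG pipeline replaces.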
Architecture: Three Layers
Fat Skills (Markdown Procedures)
90% of value — encoding judgment, workflow, domain knowledge
Thin CLI Harness (~200 Lines)
JSON in, text out — loop, context, safety
Deterministic Foundation (Your App)
QueryDB, ReadDoc, Search — same input, same output
"Push smart fuzzy operations humans do into markdown skills. Fat skills. Push must-be-perfect deterministic operations into code. Fat code. The harness? Keep it thin."
Jarvis Alignment Audit
Tan Concept | Jarvis Implementation | Alignment
Fat skills (markdown procedures) | skills/ + .claude/skills/ — 17 skills | Strong
Thin harness | Claude Code CLI + jarvis-bot dispatcher | Strong
Resolver (routing table) | RESOLVER.md + decision tree | Strong
checkResolvable() | scripts/check-resolvable.sh | Direct Port
Memory is markdown | MEMORY.md + ~240 satellite files in git | Strong
Deterministic foundation | Gateway → Screener → Processor pipeline | Strong
Diarization | VEP enrichment pipeline | Partial
Latent vs Deterministic split | Karpathy Principle (programmatic first, LLM last resort) | Strong
Self-improving skills | Lesson capture + feedback loop | Partial
Gaps & Divergences
Harness Thickness
Jarvis's harness (~15k lines of TypeScript) is thicker than Tan's ~200-line ideal. Justified by scope — multi-channel I/O (Telegram, email, cron, dashboard) that a CLI-only model doesn't address. Thickness from scope, not bloat.
Skill Triggers in Frontmatter
gstack v0.18 added triggers: arrays to all 38 skill templates — multi-word keywords powering the resolver. Jarvis skills don't have trigger keywords yet.
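gstack's exact frontmatter schema isn't shown in the source; a sketch of how triggers arrays could drive resolver routing, with made-up skill names and keywords:

```typescript
// Trigger-based skill routing: each skill declares multi-word keywords,
// and the router picks the first skill whose triggers match the task.
// Skill names and triggers below are illustrative, not gstack's real set.
interface Skill { name: string; triggers: string[]; }

const SKILLS: Skill[] = [
  { name: "investigate", triggers: ["dig into", "root cause", "investigate"] },
  { name: "improve",     triggers: ["extract patterns", "rewrite skill"] },
];

function matchSkill(task: string): Skill | undefined {
  const lower = task.toLowerCase();
  return SKILLS.find((s) => s.triggers.some((t) => lower.includes(t)));
}
```

Adding such keywords to Jarvis's skill frontmatter would close this gap without touching the harness.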
Self-Improving Skills
Tan's /improve reads feedback, extracts patterns, and rewrites matching rules back into skill files autonomously. Jarvis captures lessons but doesn't auto-rewrite the skills that produced them.
Duplicate Skill Systems
gstack has one skill system. Jarvis has two (skills/ legacy + .claude/skills/). One should be canonical.
Portability Assessment
6.5 / 10
Claude Coupling Score — Lower is more portable

Moat (Yours to Keep)

  • Accumulated knowledge graph
  • Human-authored synthesis
  • Memory architecture & compounding
  • Workflow orchestration & integrations
  • Domain-specific tool chains

Commodity (Swappable)

  • The LLM doing reasoning
  • Model provider
  • Raw inference
  • Individual model features
  • SDK/API specifics
Hard Coupling
  • Claude CLI invocations in 8 files
  • CLI-specific flags: --output-format stream-json, --resume
  • .claude/ system: 22 agents, hooks, settings
  • @anthropic-ai/sdk imports in 2 files
Soft Coupling
  • Model IDs hardcoded in 8 files
  • Tool use events tied to Claude NDJSON format
  • OAuth token management in claude-env.ts
Already Portable
  • Multi-provider router (5 providers, 30+ models)
  • OpenRouter fallback with OpenAI SDK
  • LTM/VLTM memory (SQLite + markdown)
  • 240+ memory files, all domain knowledge
Migration Path

Phase 1 — SDK Swap (2-3 days)

Replace Anthropic SDK with OpenRouter. Centralize model IDs to config. Minimal disruption.
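A sketch of what centralizing model IDs could look like (the IDs below are placeholders, not real model names):

```typescript
// One registry for model IDs instead of hardcoding them in 8 files.
const MODELS = {
  primary: "provider/primary-model",     // placeholder ID
  fallback: "provider/fallback-model",   // placeholder ID
} as const;

type ModelRole = keyof typeof MODELS;

function modelFor(role: ModelRole): string {
  return MODELS[role];
}
```

Call sites then reference a role, so swapping providers later is a one-file change.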

Phase 2 — Generic CLI Wrapper (1-2 weeks)

Replace direct claude invocations with a pluggable CLI wrapper. Add an AuthProvider abstraction and a cross-provider StreamingEventAdapter.
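The abstractions named above could be a handful of interfaces; a sketch using the audit's names, with assumed signatures and a toy NDJSON adapter to show the shape:

```typescript
// Pluggable CLI wrapper: one interface per concern, so swapping backends
// is a constructor change. All signatures here are illustrative assumptions.
interface AuthProvider {
  getToken(): Promise<string>;  // OAuth, API key, whatever the backend needs
}

interface AgentEvent { type: string; payload: string; }

interface StreamingEventAdapter {
  // Translate one provider-specific wire line (e.g. NDJSON) into a common event.
  parse(line: string): AgentEvent | null;
}

interface AgentCli {
  run(prompt: string, onEvent: (e: AgentEvent) => void): Promise<void>;
}

// Toy adapter for a JSON-lines stream, to show the adapter contract.
class JsonLineAdapter implements StreamingEventAdapter {
  parse(line: string): AgentEvent | null {
    try {
      const obj = JSON.parse(line);
      return { type: String(obj.type), payload: String(obj.payload ?? "") };
    } catch {
      return null; // non-JSON noise on the stream is dropped, not fatal
    }
  }
}
```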

Phase 3 — Full Decoupling (Architectural)

Replace Claude Code CLI orchestration entirely. Migrate .claude/ agents to a provider-agnostic format. Reimplement the hooks system.

Key Sources
Garry Tan — "Thin Harness, Fat Skills" (Apr 9) + "Resolvers: The Routing Table for Intelligence" (Apr 15)
Harrison Chase — "Deep Agents" counter-thesis: harness IS the product, memory lock-in risk
Kevin Simback — Two-author knowledge systems: author: human (untouchable) vs author: agent (updatable)
Andrej Karpathy — Four AI coding failure modes, "Iron Man suit" augmentation pattern
Simon Willison — "Agents are merchants of complexity"
Steve Yegge — "10x to 100x" productivity claim, 8 levels of AI adoption
Production systems: gstack (74k+ stars, 28 skills) • gbrain (8.7k+ stars, 17,888 pages, 21 cron jobs)