Onboarding: garrytan/gstack

Item: garrytan/gstack
Rating: 3
Author: RepoPilot

Generated by RepoPilot · 2026-05-05 · Source

Verdict

WAIT — Single-maintainer risk — review before adopting

Last commit 1d ago
5 active contributors
MIT licensed
CI configured
Tests present
⚠ Small team — 5 top contributors
⚠ Single-maintainer risk — top contributor 96% of commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

TL;DR

gstack is a collection of 23 opinionated Claude Code slash-command tools (stored as Markdown files) that simulate a virtual engineering team: CEO, Designer, Eng Manager, QA, Security Officer, Release Manager, and so on. It also ships a compiled headless browser binary (browse/dist/browse) and a PDF generator (make-pdf/dist/pdf), both built with Bun. Together these let a single developer run AI-assisted product development workflows, from code review to browser-based QA, without leaving Claude Code.

The repo is a monorepo. Each major capability lives in its own subdirectory (browse/, make-pdf/, design/, benchmark/, autoplan/, benchmark-models/) with its own SKILL.md and SKILL.md.tmpl for AI tool definitions, plus a src/ and dist/ for compiled binaries. The bin/ directory holds ~25 standalone gstack-* CLI scripts (TypeScript + shell), and .claude/commands/ (implied by CLAUDE.md) is where slash commands are registered for Claude Code.

Who it's for

Solo founders, indie hackers, and small-team CTOs who use Claude Code daily and want a structured, opinionated AI workflow — particularly those who want to ship production features fast without a full engineering team. Garry Tan (YC president) built and uses this himself; contributors are expected to be Claude Code power users.

Maturity & risk

The repo is at v1.26.3.0 with 1,237+ GitHub contributions in 2026 alone, a full CI matrix (.github/workflows/ has 8+ workflows including evals, PDF gates, version gates, and actionlint), and a bun test suite with free-test sharding via scripts/test-free-shards.ts. Actively developed and used in production by its author, though it is a single-maintainer personal toolchain — production-ready for Claude Code users willing to accept that dependency.

Single-maintainer risk is high: all design decisions flow from one person, and breaking changes appear frequently (CHANGELOG.md and the VERSION file are tracked explicitly). The project depends on Bun as both runtime and bundler (not npm/node), a non-standard toolchain requirement that can surprise contributors. The tools also assume Claude Code as the execution environment, so they have no utility outside Claude Code.

Active areas of work

Active work is visible in: periodic evals workflows (.github/workflows/evals-periodic.yml, evals.yml) suggesting ongoing model benchmarking; a make-pdf-gate CI step enforcing PDF generation quality; a version-gate workflow enforcing VERSION file bumps on PRs; and TODOS.md tracking live feature work. The skill-docs workflow auto-generates SKILL.md files from .tmpl templates, which is actively used across all subdirectories.
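To make the version-gate idea concrete, here is a minimal sketch of the kind of check such a gate could perform. It assumes four-part numeric versions like the v1.26.3.0 seen in recent commits; the function names are hypothetical and the real workflow may be implemented quite differently.

```typescript
// HYPOTHETICAL helpers: the actual version-gate workflow may not work this way.
// Parses a four-part version such as "1.26.3.0" (an optional leading "v" is allowed).
function parseVersion(raw: string): number[] {
  const parts = raw.trim().replace(/^v/, "").split(".");
  if (parts.length !== 4 || parts.some((p) => !/^\d+$/.test(p))) {
    throw new Error(`Expected a four-part numeric version, got "${raw}"`);
  }
  return parts.map(Number);
}

// Returns a negative number if a < b, zero if equal, positive if a > b.
// Segments are compared numerically, so 1.10.0.0 sorts after 1.9.0.0.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 4; i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x - y;
  }
  return 0;
}

// A PR would pass only if its VERSION is strictly greater than the base branch's.
function isVersionBumped(baseVersion: string, prVersion: string): boolean {
  return compareVersions(prVersion, baseVersion) > 0;
}
```

Comparing segment by segment as numbers, rather than comparing the raw strings, avoids treating "1.9.0.0" as newer than "1.10.0.0".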

Get running

git clone https://github.com/garrytan/gstack
cd gstack
cp .env.example .env   # fill in required API keys
bun install
bun run build

To run the headless browser:

./browse/dist/browse

To run tests (free/no-LLM subset):

bun run test:free

Daily commands:

bun run dev            # headless browser CLI (browse/src/cli.ts)
bun run server         # browse server mode (browse/src/server.ts)
bun run dev:make-pdf   # PDF generator (make-pdf/src/cli.ts)
bun run dev:design     # design CLI (design/src/cli.ts)
bun test               # full test suite (excludes e2e/LLM tests)
bun run test:free      # sharded tests with no LLM calls

Map of the codebase

CLAUDE.md — Primary configuration file that defines how Claude Code interacts with the gstack toolchain — every contributor must read this to understand agent behavior and conventions.
ARCHITECTURE.md — High-level architectural overview of gstack's layered design, component responsibilities, and how the 23 specialist tools are organized and invoked.
SKILL.md — Defines the skill interface contract that all specialist agents (CEO, Designer, Eng Manager, etc.) must conform to — the load-bearing abstraction for the entire multi-agent system.
browse/src/browser-manager.ts — Core abstraction managing headless browser lifecycle, CDP connections, and session pooling — central to gstack's browsing capability.
browse/src/browse-client.ts — Primary client interface for the headless browser tool, orchestrating page navigation, screenshot, and interaction commands used by AI agents.
bin/gstack-brain-context-load.ts — Entry point for loading context into the AI brain subsystem — critical to understanding how memory and context are managed across agent sessions.
AGENTS.md — Documents all 23 agent/specialist definitions, their roles (CEO, Designer, QA, etc.), and how they are wired together into the gstack workflow.

How to make changes

Add a new specialist agent skill

Copy SKILL.md.tmpl to a new directory named after your specialist (e.g. my-specialist/SKILL.md.tmpl) and fill in the role, responsibilities, and prompt instructions. (SKILL.md.tmpl)
Run the skill-docs workflow (or manually invoke the gen:skill-docs script) to generate my-specialist/SKILL.md from your template. (.github/workflows/skill-docs.yml)
Register the new agent in AGENTS.md with its name, role description, and invocation key. (AGENTS.md)
Update CLAUDE.md to reference the new skill so Claude Code is aware of the agent and when to invoke it. (CLAUDE.md)
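The template-to-SKILL.md step above can be pictured with a small sketch. It assumes simple `{{NAME}}` placeholders, which may not match the syntax the real templates use, and `renderTemplate` is a hypothetical stand-in, not the logic of scripts/gen-skill-docs.ts.

```typescript
// HYPOTHETICAL renderer: the placeholder syntax and keys are assumptions.
// Replaces {{KEY}} placeholders with values; unknown keys throw so that a typo
// in the template fails loudly instead of leaking into the generated SKILL.md.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(
    /\{\{\s*([A-Z0-9_]+)\s*\}\}/g,
    (_match: string, key: string): string => {
      const value = values[key];
      if (value === undefined) {
        throw new Error(`No value for placeholder {{${key}}}`);
      }
      return value;
    },
  );
}

// Example template and values (both invented for illustration).
const template = "# {{SKILL_NAME}}\n\nRole: {{ROLE}}\n";
const rendered = renderTemplate(template, {
  SKILL_NAME: "my-specialist",
  ROLE: "Reviews database migrations",
});
```

Failing on a missing key is a deliberate choice here: a silently empty substitution would produce a SKILL.md that looks valid but gives the agent incomplete instructions.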

Add a new browser skill command

Define the new command handler function in browser-skill-commands.ts, following the existing pattern of named exports that accept a BrowseClient instance. (browse/src/browser-skill-commands.ts)
If the command involves writing/form interaction, add supporting logic to browser-skill-write.ts. (browse/src/browser-skill-write.ts)
Wire the command into the browse CLI entry point and expose it as a named subcommand. (browse/src/browse-client.ts)
Update browse/SKILL.md to document the new command so agents know it is available. (browse/SKILL.md)
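As a rough illustration of the steps above, the sketch below shows a command handler that accepts a client and a registry that exposes it as a named subcommand. The `BrowseClientLike` interface, its `goto` and `title` methods, and the `page-title` command are all hypothetical; check browse/src/ for the real shapes before copying anything.

```typescript
// HYPOTHETICAL shape: the real BrowseClient may expose different methods.
interface BrowseClientLike {
  goto(url: string): Promise<void>;
  title(): Promise<string>;
}

type CommandHandler = (client: BrowseClientLike, args: string[]) => Promise<string>;

// A new command written as a standalone function that receives the client.
async function pageTitle(client: BrowseClientLike, args: string[]): Promise<string> {
  const url = args[0];
  if (!url) throw new Error("usage: page-title <url>");
  await client.goto(url);
  return client.title();
}

// Registry mapping subcommand names to handlers (illustrative wiring only).
const commands = new Map<string, CommandHandler>([["page-title", pageTitle]]);

// Looks up a handler by name and fails with the list of known commands.
function resolveCommand(name: string): CommandHandler {
  const handler = commands.get(name);
  if (!handler) {
    const known = Array.from(commands.keys()).join(", ");
    throw new Error(`Unknown command "${name}". Known: ${known}`);
  }
  return handler;
}
```

Keeping handlers as plain functions over an interface makes them easy to exercise with a fake client in tests, without launching a browser.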

Add a new gstack CLI binary tool

Create a new script in bin/ named gstack-<your-tool>, following the shebang and argument parsing conventions of existing tools like gstack-config. (bin/gstack-config)
If the tool is TypeScript, write it as bin/gstack-<your-tool>.ts and add it to the build script in package.json so it compiles to a standalone binary. (bin/gstack-global-discover.ts)
Register the new binary in the bin field of package.json and ensure gstack-paths exports its resolved location. (bin/gstack-paths)
Document the tool in SKILL.md or CLAUDE.md so agents can discover and invoke it. (SKILL.md)
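For a TypeScript tool, argument handling might look something like the sketch below. This is a generic hand-rolled parser written for illustration, not the convention used by gstack-config or any other existing script, so read those first and mirror whatever they actually do.

```typescript
interface ParsedArgs {
  flags: Record<string, string | boolean>;
  positionals: string[];
}

// Minimal parser supporting --name value, --name=value, and boolean --flag.
// Caveat: a bare flag followed by a positional treats the positional as its value.
function parseArgs(argv: string[]): ParsedArgs {
  const flags: Record<string, string | boolean> = {};
  const positionals: string[] = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    if (!arg.startsWith("--")) {
      positionals.push(arg);
      continue;
    }
    const body = arg.slice(2);
    const eq = body.indexOf("=");
    if (eq !== -1) {
      flags[body.slice(0, eq)] = body.slice(eq + 1);
      continue;
    }
    const next = argv[i + 1];
    if (next !== undefined && !next.startsWith("--")) {
      flags[body] = next;
      i++; // consume the value that belongs to this flag
    } else {
      flags[body] = true;
    }
  }
  return { flags, positionals };
}
```

In a script under bin/, the input would typically be the arguments after the script name, for example `process.argv.slice(2)`.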

Integrate a new GBrain / Supabase-backed feature

Add provisioning logic for any new Supabase tables or functions in gstack-gbrain-supabase-provision. (bin/gstack-gbrain-supabase-provision)
Add shared helper functions to the GBrain shell library if the feature needs reusable shell utilities. (bin/gstack-gbrain-lib.sh)
Implement the sync logic for the new feature in gstack-gbrain-sync.ts, pushing/pulling from Supabase. (bin/gstack-gbrain-sync.ts)
Add verification checks in gstack-gbrain-supabase-verify so the install process confirms the feature is properly provisioned. (bin/gstack-gbrain-supabase-verify)

Why these technologies

Bun — Used as the TypeScript runtime and bundler to compile CLI binaries (browse, make-pdf, gstack-global-discover) — chosen for its fast startup, native TypeScript support, and single-binary compilation without Node.js overhead.

Traps & gotchas

1) Bun is required, not Node.js/npm; many scripts use Bun-specific APIs and bun build --compile for single-binary output.
2) The .env.example must be populated with real API keys (Claude/Anthropic at minimum, likely OpenAI and others for evals) before most tools function.
3) The bun run build step vendors xterm.js from node_modules into extension/lib/. If you skip the build, the browser extension UI will be broken.
4) LLM eval tests (test/skill-llm-eval.test.ts, test/skill-e2e-*.test.ts) are explicitly excluded from bun test and require separate setup and API quota.
5) The gstack-brain-* scripts in bin/ imply a running external brain/queue service (likely Redis or similar). Check bin/gstack-brain-init before running any brain commands.
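A quick preflight check can catch the unpopulated .env problem before a tool fails in a less obvious way. The sketch below is an assumption-laden helper, not part of gstack: `missingEnvKeys` is a hypothetical name, and the only key taken from this report is ANTHROPIC_API_KEY with its `your-key-here` style placeholder.

```typescript
// HYPOTHETICAL preflight helper; not part of the gstack codebase.
// Returns the required keys that are unset, blank, or still hold a placeholder value.
function missingEnvKeys(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((key) => {
    const value = env[key]?.trim();
    return !value || value.includes("your-key-here");
  });
}

// Example inputs (values invented for illustration).
const unfilled = missingEnvKeys(
  { ANTHROPIC_API_KEY: "sk-ant-your-key-here" },
  ["ANTHROPIC_API_KEY"],
);
const filled = missingEnvKeys(
  { ANTHROPIC_API_KEY: "sk-ant-example-value" },
  ["ANTHROPIC_API_KEY"],
);
```

Under Bun, which auto-loads .env, the first argument would normally be `process.env`; taking it as a parameter keeps the function easy to test.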

Concepts to learn

Claude Code Slash Commands — All 23 gstack tools are implemented as slash commands registered with Claude Code — understanding how Claude Code discovers and executes these Markdown-defined commands is fundamental to the entire repo
Chrome DevTools Protocol (CDP) — bin/chrome-cdp and browse/src/ use CDP directly to drive a headless browser for QA automation, bypassing higher-level abstractions like Playwright
Bun Single-Binary Compilation — gstack uses bun build --compile to produce self-contained executables (browse/dist/browse, make-pdf/dist/pdf) — understanding this Bun-specific feature explains why there's a dist/ directory per module
LLM Evals (automated evaluation pipelines) — The .github/workflows/evals.yml and evals-periodic.yml workflows run automated LLM-graded quality checks on the slash-command tools — a non-obvious CI pattern where the judge is also an AI model
Go Templates for code generation — SKILL.md.tmpl files use Go Template syntax processed by scripts/gen-skill-docs.ts to generate final SKILL.md files — the 806K lines of Go Template in the repo are all template source, not a Go application
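OWASP and STRIDE Threat Modeling — The README explicitly names OWASP and STRIDE as capabilities of the security officer tool. These are two separate frameworks (OWASP is a web-security project and body of guidance; STRIDE is a threat-classification model that originated at Microsoft), and knowing both explains what the security slash command actually audits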
xterm.js terminal emulator — xterm.js is vendored into extension/lib/ during build (vendor:xterm script) to provide a browser-embedded terminal UI — it's a non-obvious runtime dependency surfaced only in the build script
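To make the Chrome DevTools Protocol entry above more tangible: CDP commands are JSON messages carrying an id, a method name, and optional params, sent over the browser's WebSocket debugging endpoint. The sketch below only builds such messages; the `cdpCommand` helper is hypothetical and says nothing about how browse/src/ structures its own calls.

```typescript
// A CDP command is a JSON object with a unique id, a method name, and optional params.
interface CdpCommand {
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 0;

// HYPOTHETICAL helper: assigns an incrementing id so responses can be matched to requests.
function cdpCommand(method: string, params?: Record<string, unknown>): CdpCommand {
  nextId += 1;
  return params ? { id: nextId, method, params } : { id: nextId, method };
}

// Page.navigate and Page.captureScreenshot are methods in the CDP Page domain.
const navigate = cdpCommand("Page.navigate", { url: "https://example.com" });
const screenshot = cdpCommand("Page.captureScreenshot", { format: "png" });

// Each command would be serialized and sent as a text frame over the WebSocket.
const wire = JSON.stringify(navigate);
```

The id matters because CDP replies arrive asynchronously; the response to a command carries the same id, which is how a client pairs results with requests.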

Related repos

anthropics/claude-code — The agentic coding tool that executes all gstack slash commands; gstack is a configuration/tooling layer on top of it
openclaw/openclaw — Directly cited in the README as the inspiration showing one person can ship at team scale using AI agents
paul-gauthier/aider — Close alternative: AI pair programming CLI tool that also targets solo developers wanting team-scale output, different architecture (Python, git-centric)
continuedev/continue — Ecosystem alternative: open-source AI coding assistant with slash commands and context tools, runs in VS Code/JetBrains rather than Claude Code
BuilderIO/ai-shell — Predecessor-style inspiration: AI-powered shell command generation, similar philosophy of wrapping AI power into developer CLI ergonomics

PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add unit tests for bin/gstack-brain-context-load.ts and bin/gstack-memory-ingest.ts

The brain/memory pipeline (gstack-brain-context-load.ts, gstack-memory-ingest.ts, gstack-brain-sync, gstack-gbrain-sync.ts) is core infrastructure for the AI workflow. The test script in package.json explicitly excludes the skill e2e and eval tests, but nothing in it indicates that these TypeScript bin files are covered. A bug here would silently corrupt the context passed to Claude Code. Unit tests would catch regressions in serialization, chunking, and enqueue logic before they reach users.

[ ] Read bin/gstack-brain-context-load.ts and bin/gstack-memory-ingest.ts to understand their exported functions and data shapes
[ ] Create test/brain-context-load.test.ts and test/memory-ingest.test.ts using Bun's built-in test runner (matching the pattern in the existing 'bun test … test/' invocation in package.json scripts)
[ ] Mock any Supabase or network calls (referenced by bin/gstack-gbrain-supabase-provision and bin/gstack-gbrain-supabase-verify) using Bun's module mocking so tests run offline
[ ] Cover edge cases: empty context, oversized payloads, malformed JSONL input (relevant to bin/gstack-jsonl-merge)
[ ] Ensure the new tests are picked up by the existing 'bun test … test/' glob in package.json and pass in the windows-free-tests CI workflow
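For the malformed-JSONL edge case in the checklist above, it helps to have a parsing function whose behavior on bad input is explicit. The sketch below is a hypothetical `parseJsonl`, written for illustration; it does not reflect how bin/gstack-jsonl-merge or the ingest scripts actually handle errors.

```typescript
interface JsonlResult {
  records: unknown[];
  errors: { line: number; message: string }[];
}

// HYPOTHETICAL parser: collects bad lines instead of aborting on the first one,
// so a single corrupt entry does not discard the rest of the file.
function parseJsonl(input: string): JsonlResult {
  const result: JsonlResult = { records: [], errors: [] };
  input.split(/\r?\n/).forEach((raw, index) => {
    const text = raw.trim();
    if (text === "") return; // blank lines are skipped, not treated as errors
    try {
      result.records.push(JSON.parse(text));
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      result.errors.push({ line: index + 1, message }); // 1-based line number
    }
  });
  return result;
}
```

A unit test can then assert on both halves of the result: that valid records survive and that each malformed line is reported with its line number.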

Add a GitHub Actions workflow to lint and type-check all TypeScript bin/* entry points

The repo has CI workflows for actionlint, evals, PDF generation, version gates, and Windows-free tests, but there is no workflow that runs tsc --noEmit (or bun's type checker) across the TypeScript source files in bin/ (e.g. gstack-brain-context-load.ts, gstack-gbrain-sync.ts, gstack-global-discover.ts, gstack-memory-ingest.ts). These files are compiled with 'bun build --compile' in the build script, meaning type errors only surface at build time on a developer's machine — never in a fast PR gate. A dedicated typecheck workflow would catch broken imports and type regressions within seconds of a PR being opened.

[ ] Create .github/workflows/typecheck.yml triggered on pull_request and push to main
[ ] Reuse the existing CI Docker image already defined in .github/docker/Dockerfile.ci to keep environments consistent
[ ] Add a step that runs 'bun x tsc --noEmit --project tsconfig.json' (or create a minimal tsconfig.json in bin/ if one doesn't exist) covering all .ts files under bin/, browse/src/, design/src/, and make-pdf/src/
[ ] Add a second step that runs 'bun run build' in --dry-run or limited mode to verify all bun build --compile targets resolve without errors
[ ] Document the new workflow in CONTRIBUTING.md under the existing CI section so new contributors know to fix type errors before pushing

Create SKILL.md documentation and a .tmpl template for the bin/gstack-diff-scope and bin/gstack-extension tools

Every major tool in the repo follows a clear convention: a SKILL.md documenting the skill's purpose and usage, and a SKILL.md.tmpl template used by scripts/gen-skill-docs.ts to auto-generate documentation (see autoplan/SKILL.md, autoplan/SKILL.md.tmpl, benchmark/SKILL.md, benchmark-models/SKILL.md, etc.). However, bin/gstack-diff-scope and bin/gstack-extension — both of which are surfaced to users and Claude Code agents — have no corresponding SKILL.md or .tmpl file in the repository listing. This breaks the generated skill-docs output and leaves these tools

Good first issues

1) Several bin/ scripts (e.g. bin/gstack-builder-profile, bin/gstack-codex-probe) have no corresponding test files in test/; adding unit tests for their CLI argument parsing would be a well-scoped contribution.
2) The benchmark/ and benchmark-models/ directories each have a SKILL.md.tmpl but no visible README explaining how to run benchmarks locally; writing that documentation would immediately help new contributors.
3) The .github/workflows/windows-free-tests.yml suggests Windows compatibility is tested, but there is no CONTRIBUTING.md section on Windows-specific setup for Bun + Chrome CDP; adding that would reduce onboarding friction for Windows contributors.

Top contributors

@garrytan — 94 commits
@byliu-labs — 1 commit
@Jared Friedman — 1 commit
@evansolomon — 1 commit
@invalid-email-address — 1 commit

Recent commits

db9447c — v1.26.3.0 feat: /sync-gbrain skill + native code-surface orchestrator (#1314) (garrytan)
30fe6bb — v1.26.2.0 fix: plan-eng-review STOP gates always fire AskUserQuestion + report-at-bottom contract enforcement (#1313) (garrytan)
a0bfa00 — v1.26.1.0 fix: gbrain-sync orchestrator resolves sibling via import.meta.dir (#1312) (garrytan)
bf65487 — v1.26.0.0 feat: V1 transcript ingest + per-skill gbrain manifests + retrieval surface (#1298) (garrytan)
b512be7 — v1.25.1.0 fix: office-hours Phase 4 STOP gate + AskUserQuestion recommendation judge (#1296) (garrytan)
6e1625c — v1.25.0.0 fix: AskUserQuestion resolves to host MCP variant when native is disallowed (#1287) (garrytan)
0570ef9 — v1.24.0.0 feat: cross-platform hardening — curated Windows lane + Bun.which resolver + path-portability helper (#1252) (garrytan)
7efa85c — v1.23.0.0 feat: always prefix PR titles with v<VERSION> (#1284) (garrytan)
454423a — v1.21.1.0 test: tighten plan-ceo-review smoke (Step 0 must fire) (#1255) (garrytan)
e8893a1 — v1.20.0.0 feat: browser-skills runtime + gbrain-support carryover (#1233) (garrytan)

Security observations

High · API Key Exposure Risk via .env Pattern — .env.example. The .env.example file reveals that ANTHROPIC_API_KEY is stored with the pattern 'sk-ant-your-key-here', indicating real API keys are stored in a .env file loaded automatically by Bun. If .env is ever accidentally committed, or if the application exposes environment variables through error messages, logs, or API responses, sensitive API keys could be leaked. The comment 'bun auto-loads .env — no dotenv needed' means keys are always in scope for the entire process. Fix: Ensure .env is in .gitignore (verify this explicitly). Use secret management solutions (e.g., AWS Secrets Manager, HashiCorp Vault) in production. Audit all logging statements to prevent accidental key exposure. Consider using short-lived tokens or key rotation policies.
High · Potential Secret Leakage via Telemetry and Logging Binaries — bin/gstack-telemetry-log, bin/gstack-telemetry-sync, bin/gstack-learnings-log, bin/gstack-review-log, bin/gstack-session-update. The presence of multiple logging and telemetry scripts (gstack-telemetry-log, gstack-telemetry-sync, gstack-learnings-log, gstack-review-log, gstack-question-log, gstack-session-update) raises concern that environment variables, API keys, or sensitive user data could inadvertently be captured and transmitted in telemetry payloads. Without reviewing the full source, it is unclear whether these scripts scrub sensitive data before transmission. Fix: Audit all telemetry and logging scripts to ensure no environment variables, API keys, PII, or sensitive repository content is captured. Implement an allow-list approach for telemetry data fields. Provide clear user disclosure about what is collected and transmitted.
High · Headless Browser CDP Endpoint Exposure — bin/chrome-cdp, browse/src/server.ts, browse/src/cli.ts. The bin/chrome-cdp binary and the broader browse/ subsystem expose a Chrome DevTools Protocol (CDP) interface. If the CDP port is bound to 0.0.0.0 or accessible without authentication, it can allow arbitrary code execution on the host machine by any process or network peer that can reach the port. CDP provides full browser control including file system access via the browser context. Fix: Ensure the CDP endpoint binds exclusively to 127.0.0.1. Implement authentication tokens for CDP connections. Review browse/src/server.ts to confirm no unauthenticated remote access is possible. Consider network-level isolation (e.g., Docker network policies) when running in CI/CD.
High · Supabase Provisioning Scripts May Handle Sensitive Credentials Insecurely — bin/gstack-gbrain-supabase-provision, bin/gstack-gbrain-supabase-verify, bin/gstack-gbrain-lib.sh. The scripts gstack-gbrain-supabase-provision and gstack-gbrain-supabase-verify likely handle Supabase connection strings, service role keys, or JWT secrets. Shell scripts are prone to credential leakage via process lists (ps aux), shell history, and insecure temporary files. These credentials often carry admin-level database access. Fix: Avoid passing credentials as command-line arguments in shell scripts. Use environment variables sourced from a secrets manager. Ensure temporary files containing credentials are created with restrictive permissions (chmod 600) and cleaned up reliably. Audit shell history exposure.
Medium · GitHub Actions Workflow Injection Risk — .github/workflows/pr-title-sync.yml, .github/workflows/evals.yml, .github/workflows/evals-periodic.yml. Multiple GitHub Actions workflows are present (.github/workflows/). Without viewing their full content, workflows that interpolate GitHub context variables (e.g., github.event.pull_request.title, github.head_ref) directly into run: shell steps are vulnerable to script injection attacks from malicious PR titles or branch names. The pr-title-sync.yml and evals.yml workflows are particularly suspect given their interaction with external PR data. Fix: Never interpolate ${{ github.event.* }} directly into run: shell commands. Instead, assign untrusted values to intermediate environment variables (env: VAR: ${{ github.event.pull_request.title }}) and reference $VAR in shell. Use the actionlint tool (already configured) and enforce its checks strictly.

LLM-derived; treat as a starting point, not a security audit.
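The allow-list approach recommended for telemetry in the observations above can be sketched in a few lines. The field names below are hypothetical, since this report does not document the real telemetry payload, and `scrubTelemetry` is an illustrative name rather than an existing gstack function.

```typescript
// HYPOTHETICAL field names: the real telemetry schema is not documented here.
const ALLOWED_FIELDS = ["event", "skill", "durationMs", "version"] as const;

// Copies only explicitly allowed fields, so anything unexpected
// (environment variables, paths, API keys) is dropped by default.
function scrubTelemetry(payload: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const field of ALLOWED_FIELDS) {
    if (field in payload) clean[field] = payload[field];
  }
  return clean;
}

// Example payload containing fields that should never leave the machine.
const scrubbed = scrubTelemetry({
  event: "skill-run",
  durationMs: 1200,
  ANTHROPIC_API_KEY: "sk-ant-example-value",
  cwd: "/home/user/private-repo",
});
```

An allow-list fails safe: a newly added field stays local until someone deliberately adds it to the list, which is the opposite of a deny-list that must anticipate every sensitive key in advance.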

Where to read next

Open issues — current backlog
Recent PRs — what's actively shipping
Source on GitHub

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
