Building Documentation

The build-repo-docs command orchestrates a five-step AI-powered pipeline that automatically generates documentation for your codebase by analyzing file structure, extracting code symbols, creating summaries, planning topics, and writing markdown files. This guide walks you through preparing your repository, running the tool, monitoring progress, reviewing results, and customizing the generation process.

Preparing Your Repository

Before running build-repo-docs, ensure your repository is ready for analysis. The tool recursively walks your codebase and skips ignored paths based on .gitignore and .autodocignore patterns, so verify these files are configured correctly.

Your repository must contain source files in supported languages (TypeScript, JavaScript, Python, Go, Rust, etc.). The tool detects each file's language from its extension and, to keep processing efficient, summarizes only files smaller than 50KB.
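As a rough illustration of the two checks described above, the sketch below maps extensions to languages and keeps only small files. The extension table, the `SourceFile` shape, and the function names are hypothetical, and the 50KB limit is assumed to mean 50 * 1024 bytes; the tool's real implementation may differ.

```typescript
// Hypothetical extension table; the tool's real table is not shown in this guide.
const LANGUAGE_BY_EXTENSION: Record<string, string> = {
  ".ts": "typescript",
  ".tsx": "typescript",
  ".js": "javascript",
  ".py": "python",
  ".go": "go",
  ".rs": "rust",
};

// Assumption: "50KB" is read as 50 * 1024 bytes.
const MAX_FILE_SIZE = 50 * 1024;

interface SourceFile {
  path: string;
  sizeBytes: number;
}

// Returns the language for a path, or undefined when the extension is unknown.
function detectLanguage(path: string): string | undefined {
  const dot = path.lastIndexOf(".");
  if (dot === -1) return undefined;
  return LANGUAGE_BY_EXTENSION[path.slice(dot).toLowerCase()];
}

// Keeps only files in a known language that are small enough to summarize.
function selectForSummarization(files: SourceFile[]): SourceFile[] {
  return files.filter(
    (f) => detectLanguage(f.path) !== undefined && f.sizeBytes < MAX_FILE_SIZE,
  );
}
```

Files that fail either check would still appear in the file tree; under this sketch they are only excluded from the summarization step.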

Set the ANTHROPIC_API_KEY environment variable before running:

export ANTHROPIC_API_KEY=sk-ant-...

The tool creates a local SQLite database (.docs-mcp.db) in your repository root to persist intermediate results, which enables incremental re-runs. Make sure the repository directory is writable.

Running build-repo-docs

Execute the command with your repository path and desired output directory:

build-repo-docs --repo /path/to/repo --output ./docs

The --repo flag (short: -r) is required and specifies your repository root. The --output flag (short: -o) defaults to ./docs if not provided.

The command executes five sequential pipeline steps, each building on the previous:

Building file tree — Walks your repository, detects file languages, and computes checksums
Extracting structure — Analyzes each file to identify exports, imports, and top-level symbols
Summarising — Generates 2–4 sentence summaries for each file focusing on purpose and dependencies
Planning topics — Identifies key documentation topics based on core modules and project structure
Writing docs — Generates markdown files for each planned topic with code examples and cross-references
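The idea of sequential steps that each build on the previous one can be sketched as a small runner. This is a simplified, synchronous illustration: the `Step` type and `runPipeline` function are hypothetical, and the real steps presumably run asynchronously against the database and the API.

```typescript
// A step receives the state produced so far and returns the state for the next step.
type Step<S> = { name: string; run: (state: S) => S };

// Runs the steps in order and logs "[i/n] name..." before each one.
function runPipeline<S>(
  steps: Step<S>[],
  initial: S,
  log: (line: string) => void,
): S {
  let state = initial;
  steps.forEach((step, i) => {
    log(`[${i + 1}/${steps.length}] ${step.name}...`);
    state = step.run(state);
  });
  return state;
}
```

Because every step receives the output of the one before it, a failure in an early step stops the later ones from running on incomplete data.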

Monitoring Progress

The CLI displays real-time progress during execution. Each step is numbered and logged:

Building docs for: /path/to/repo

[1/5] Building file tree...
[2/5] Extracting structure...
[3/5] Summarising...
[4/5] Planning topics...
[5/5] Writing docs...

During summarization, progress is reported every 20 files, for example "summarised 20/500 files". After the run completes, the CLI prints token usage and the total API cost for the entire generation:

Token usage:
Input tokens: 1,250,000
Output tokens: 350,000
Total cost: $...

Reviewing Output

Generated documentation is written to your output directory (default: ./docs). Review the structure:

docs/README.md — Table of contents listing all generated topics
docs/{topic_slug}.md — Individual documentation files for each identified topic

Each markdown file contains:

A one-paragraph overview of the topic
Sections addressing key aspects of the codebase
Code examples pulled from source files
Cross-references to related topics

Check the console output for validation warnings. The tool performs basic linting: it verifies that code blocks have language tags, that links resolve, and that each file meets a minimum word count.
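If you want to re-check edited files yourself, the three lint rules could be approximated as follows. This is only a sketch: `lintMarkdown`, the `LintWarning` shape, and the link regex are hypothetical, the word-count threshold is passed in because the guide does not state the tool's value, and fence detection is deliberately simple.

```typescript
interface LintWarning {
  line: number | null; // null marks a whole-file warning
  message: string;
}

// Built with repeat() so this sketch contains no literal fence of its own.
const FENCE = "`".repeat(3);

function lintMarkdown(
  markdown: string,
  knownFiles: Set<string>,
  minWords: number,
): LintWarning[] {
  const warnings: LintWarning[] = [];
  let inFence = false;
  let words = 0;

  markdown.split("\n").forEach((raw, i) => {
    const text = raw.trim();
    if (text.startsWith(FENCE)) {
      // Only an opening fence needs a language tag.
      if (!inFence && text.length === FENCE.length) {
        warnings.push({ line: i + 1, message: "code block has no language tag" });
      }
      inFence = !inFence;
      return;
    }
    if (inFence) return; // code lines do not count as prose

    words += text.split(/\s+/).filter((w) => w.length > 0).length;

    // Relative links such as [text](other-topic.md); URLs with a scheme are skipped.
    const linkPattern = /\[[^\]]*\]\(([^)\s#]+)[^)]*\)/g;
    let match: RegExpExecArray | null;
    while ((match = linkPattern.exec(text)) !== null) {
      const target = match[1] ?? "";
      if (!/^[a-z][a-z0-9+.-]*:/i.test(target) && !knownFiles.has(target)) {
        warnings.push({ line: i + 1, message: `link target not found: ${target}` });
      }
    }
  });

  if (words < minWords) {
    warnings.push({ line: null, message: `only ${words} words, expected at least ${minWords}` });
  }
  return warnings;
}
```

Pass the set of file names in your output directory as `knownFiles` so that cross-references between topics are checked against what was actually written.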

Customizing Generation

The orchestration pipeline is defined in build-repo-docs-plan.yaml, and you can customize it by editing that file. Key customization points include:

Concurrency — Adjust parallelism limits in step definitions. File-level steps (extraction, summarization) process up to 10 files concurrently; modify CONCURRENCY in pipeline files for different resource constraints.
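To show what a concurrency limit of this kind means in practice, here is a minimal worker-pool sketch. `mapWithLimit` is a hypothetical helper, not the tool's code; only the name `CONCURRENCY` and the value 10 come from this guide.

```typescript
// Limit mentioned in this guide for file-level steps.
const CONCURRENCY = 10;

// Maps items through an async worker with at most `limit` calls in flight,
// preserving input order in the results.
async function mapWithLimit<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  limit: number = CONCURRENCY,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners: Promise<void>[] = [];
  for (let i = 0; i < runnerCount; i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}
```

A lower limit reduces memory use and the chance of hitting API rate limits; a higher one shortens wall-clock time on large repositories.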

File size limits — The summarisation step processes files under 50KB (defined in 3a-summarise-files.ts). Increase MAX_FILE_SIZE to include larger files or decrease it for faster generation on large repos.

Incremental re-runs — On subsequent executions, only changed files (based on SHA256 checksums) are reprocessed. Force a full rebuild by deleting .docs-mcp.db.
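Checksum-based change detection can be sketched as a comparison between the checksums stored by the previous run and the ones computed now. The `Checksums` type and `diffChecksums` function are hypothetical; the checksum values are treated as opaque strings here, whereas the tool stores SHA256 digests in .docs-mcp.db.

```typescript
// Maps a repo-relative path to its checksum (any stable string works for this sketch).
type Checksums = Map<string, string>;

interface ChangeSet {
  changed: string[]; // new or modified files that need reprocessing
  removed: string[]; // files that no longer exist
}

function diffChecksums(previous: Checksums, current: Checksums): ChangeSet {
  const changed: string[] = [];
  const removed: string[] = [];

  current.forEach((sum, path) => {
    if (previous.get(path) !== sum) changed.push(path);
  });
  previous.forEach((_sum, path) => {
    if (!current.has(path)) removed.push(path);
  });

  return { changed, removed };
}
```

With an empty `previous` map, which is the situation after the database is deleted, every file lands in `changed` and the run becomes a full rebuild.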

Topic planning — Step 4 generates topics based on file importance (PageRank) and project overview. The LLM is prompted to rank topics by pedagogical order — “what must I understand first?” Adjust the planning prompt in 4-plan/index.ts to prioritize different aspects (architecture, API reference, deployment, etc.).
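This guide does not show how file importance is computed, so the following is only a generic PageRank sketch over an import graph. It assumes edges point from a file to the files it imports, and it uses a damping factor of 0.85 and 50 iterations, which are conventional defaults rather than the tool's settings. `ImportGraph` and `pageRank` are hypothetical names.

```typescript
// Import graph: each file maps to the list of files it imports.
type ImportGraph = Map<string, string[]>;

function pageRank(
  graph: ImportGraph,
  damping: number = 0.85,
  iterations: number = 50,
): Map<string, number> {
  const nodes: string[] = [];
  graph.forEach((_imports, file) => nodes.push(file));
  const n = nodes.length;

  // Start from a uniform distribution.
  let rank = new Map<string, number>();
  nodes.forEach((file) => rank.set(file, 1 / n));

  for (let iter = 0; iter < iterations; iter++) {
    const next = new Map<string, number>();
    nodes.forEach((file) => next.set(file, (1 - damping) / n));

    nodes.forEach((file) => {
      const share = damping * (rank.get(file) ?? 0);
      const targets = (graph.get(file) ?? []).filter((t) => graph.has(t));
      // A file with no known imports spreads its rank evenly over all files.
      const receivers = targets.length > 0 ? targets : nodes;
      receivers.forEach((t) =>
        next.set(t, (next.get(t) ?? 0) + share / receivers.length),
      );
    });
    rank = next;
  }
  return rank;
}
```

Under this model, files that many other files import (shared utilities, core types) score highest, which fits the goal of explaining foundational modules first.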

Post-Generation

Improve Prose

Improve the prose by running the generated files through a tool such as ~/.local/bin/writing-williams.py.

Agentic AI

Use a large model such as Opus 4.6 to go over the generated docs and add examples and other detail. Here is a skill for that:

# Docs Upgrade

Upgrade an Astro Starlight documentation site with branding, interactive components, and polished content based on the codebase it documents.

## Arguments

This command requires two arguments:

- `$ARGUMENTS` should be formatted as: `<docs-path> <source-path>`
  - **docs-path**: Absolute path to the Astro Starlight docs project
  - **source-path**: Absolute path to the source codebase being documented (or a description/URL if remote)

If arguments are missing, ask the user for both paths before proceeding.

---

## Process

Execute the following phases in order. After each phase, verify your work before moving on.

### Phase 0: Discovery

1. **Read the docs project** to understand its current state:
   - `astro.config.mjs` — title, site, sidebar, integrations, plugins
   - `package.json` — name, dependencies (what's already installed)
   - MDX component imports — existing custom components
   - `src/styles/global.css` — existing theme overrides
   - `src/content/docs/` — content structure, frontmatter patterns
   - `src/components/` — existing interactive components
   - `src/assets/` — existing logo/images

2. **Read the source codebase** to extract:
   - **Tech stack**: framework (Nuxt, Next, SvelteKit, etc.), CSS (Tailwind, etc.), CMS, APIs
   - **Branding**: colors, fonts, logos, brand names — check for design tokens, theme files, tailwind config, CSS variables
   - **Architecture**: directory structure, key patterns, component hierarchy, data flow
   - **Key stats**: number of components, pages, API endpoints, etc.

3. **Produce a summary** for the user listing:
   - What the docs site currently has
   - What the source codebase's identity/branding is
   - What improvements you plan to make

   Then ask for confirmation before proceeding.

### Phase 1: Branding & Theming

1. **Update `astro.config.mjs`**:
   - Set `title` to the project name + "Docs"
   - Set `site` if needed (for plugins like starlight-llms-txt)
   - Add `logo` config if a logo SVG exists or can be created
   - Update social links to point to the actual repo, or remove them
   - Review sidebar — remove entries pointing to nonexistent directories

2. **Create a logo SVG** (`src/assets/<project>-logo.svg`) if none exists:
   - Simple, text-based SVG using the project name
   - Use the project's primary brand color
   - Keep it minimal — icon + wordmark style

3. **Override Starlight theme colors** in `src/styles/global.css`:
   - Add `:root` overrides for `--sl-color-accent-low`, `--sl-color-accent`, `--sl-color-accent-high`
   - Add gray scale overrides (`--sl-color-gray-1` through `--sl-color-gray-7`, `--sl-color-white`, `--sl-color-black`)
   - Add `[data-theme='light']` overrides (inverted scale)
   - Derive all values from the source project's brand palette
   - Keep existing CSS intact — prepend the overrides

4. **Update `package.json`** name to match the docs project identity

### Phase 2: Sidebar Restructure

Reorganize the sidebar in `astro.config.mjs` based on what content actually exists:

1. Scan `src/content/docs/` for actual subdirectories
2. Remove sidebar entries that point to nonexistent directories
3. Group content logically:
   - "Getting Started" — tutorials, quickstart guides
   - Primary content group (named for the project's domain) — guides, concepts, reference
   - Meta/docs-about-docs — collapsed, at the bottom
4. Use `collapsed: true` for secondary/meta sections
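
The pruning and collapsing steps above can be sketched as a pure function. The `SidebarGroup` interface is a simplified subset of Starlight's sidebar group shape (label, `autogenerate.directory`, `collapsed`), and `pruneSidebar` is a hypothetical helper; the list of existing directories is assumed to come from your scan of `src/content/docs/`.

```typescript
// Simplified subset of a Starlight sidebar group (assumption: autogenerated groups only).
interface SidebarGroup {
  label: string;
  autogenerate: { directory: string };
  collapsed?: boolean;
}

// Drops groups whose directory does not exist and collapses the secondary ones.
function pruneSidebar(
  groups: SidebarGroup[],
  existingDirs: Set<string>,
  secondaryDirs: Set<string>,
): SidebarGroup[] {
  return groups
    .filter((g) => existingDirs.has(g.autogenerate.directory))
    .map((g) =>
      secondaryDirs.has(g.autogenerate.directory) ? { ...g, collapsed: true } : g,
    );
}
```

The function keeps the input order, so arrange the groups ("Getting Started" first, meta sections last) before or after calling it.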

### Phase 3: Interactive Components

Based on the source codebase's architecture, create interactive components that demonstrate key patterns. Choose from this menu based on what's relevant:

**For multi-brand/multi-theme projects:**

- **BrandSwitcher** (Vue) — Toggle between brand palettes, showing a mock UI that adapts
- **BrandPalette** (Vue) — Color swatch grid showing all brand/theme palettes with values

**For component-heavy projects:**

- **ComponentTree** (Vue) — Interactive expandable tree showing the component hierarchy
- **ArchitectureDiagram** (Solid.js + ECharts) — Sunburst/graph showing data flow through the system

**For API-driven projects:**

- **APIFlowDiagram** (Solid.js + ECharts) — Visualization of API request/response flow
- **EndpointExplorer** (Vue) — Interactive list of API endpoints with method/path/description

**For all projects:**

- Any component that makes a key architectural concept tangible and interactive

For each component:

1. Create the component file (`.vue` for Vue, `.tsx` for Solid.js)
2. Create an Astro wrapper (`*Wrapper.astro`) with `client:visible`
3. Import the wrapper (not the raw component) in the relevant MDX content page
4. Embed it at the point in that page where the pattern is discussed

**Integration checklist:**

- Verify `@astrojs/vue` and `vue` are in dependencies (install with `bun add` if not)
- Verify `vue()` is in the integrations array in `astro.config.mjs`
- Verify `include` patterns partition Solid.js (`.tsx`) and Vue (`.vue`) correctly
- Vue `include` must use `'**/vue/*.vue'` NOT `'**/vue/*'` (to avoid matching `.astro` wrappers in vue/ dirs)

### Phase 4: Content Fixes

1. **Homepage (`index.mdx`)**:
   - Fix tagline/description to accurately describe the project
   - Add key stats (component count, feature count, etc.)
   - Ensure card links point to existing sections only

2. **All content pages** — scan for:
   - LLM artifacts ("Now I have all the necessary context...", "Let me write...", etc.)
   - Placeholder descriptions (starting with "- " or generic text)
   - Incorrect version numbers or outdated references
   - Frontmatter `description` fields that are not proper sentences (rewrite them)

3. **Embed interactive components** in their natural content homes:
   - Architecture/overview pages get diagrams and component trees
   - Theming/styling pages get brand switchers and palettes
   - API pages get flow diagrams and endpoint explorers
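
Part of the scan in step 2 can be automated. The sketch below flags the two artifact phrases quoted above and frontmatter descriptions that are empty or start with "- ". The pattern list, the `Finding` shape, and `scanContent` are hypothetical and deliberately narrow; extend the patterns for the artifacts you actually see.

```typescript
// Patterns for the two artifact phrases quoted in the checklist; extend as needed.
const ARTIFACT_PATTERNS: RegExp[] = [
  /now i have all the necessary context/i,
  /^let me write/i,
];

interface Finding {
  line: number;
  reason: string;
}

function scanContent(markdown: string): Finding[] {
  const findings: Finding[] = [];
  let inFrontmatter = false;

  markdown.split("\n").forEach((raw, i) => {
    const text = raw.trim();

    // Frontmatter is the block between a leading "---" and the next "---".
    if (text === "---" && (i === 0 || inFrontmatter)) {
      inFrontmatter = !inFrontmatter;
      return;
    }

    if (inFrontmatter) {
      const desc = /^description:\s*(.*)$/.exec(text);
      if (desc !== null) {
        const value = (desc[1] ?? "").replace(/^["']|["']$/g, "");
        if (value.length === 0 || value.startsWith("- ")) {
          findings.push({ line: i + 1, reason: "placeholder description" });
        }
      }
      return;
    }

    if (ARTIFACT_PATTERNS.some((p) => p.test(text))) {
      findings.push({ line: i + 1, reason: "LLM artifact" });
    }
  });

  return findings;
}
```

Treat the findings as candidates for review rather than automatic deletions, since a matching phrase can also appear in legitimate prose.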

### Phase 5: Verification

1. Run `bun run build` (or the project's build command) and fix any errors
2. Check for common issues:
   - Vue plugin matching `.astro` files (fix `include` patterns)
   - Missing `site` config (required by some Starlight plugins)
   - Component import mismatches
   - Missing dependencies

---

## Key Patterns

**Astro wrapper pattern** (for client-side components in MDX):

```astro
---
import MyComponent from './MyComponent.vue';
---
<MyComponent client:visible />
```

**MDX component import and usage:**

```mdx
import MyComponent from "../../components/MyComponentWrapper.astro";

<MyComponent />
```

**Starlight theme override structure:**

```css
:root {
  --sl-color-accent-low: #...; /* dark bg tint */
  --sl-color-accent: #...; /* primary brand color */
  --sl-color-accent-high: #...; /* light text on accent */
  --sl-color-gray-1: #...; /* lightest gray (dark mode) */
  /* ... through gray-7 ... */
  --sl-color-gray-7: #...; /* darkest gray (dark mode) */
}
[data-theme="light"] {
  /* Invert the scale */
}
```

**Vue/Solid.js include partitioning in astro.config.mjs:**

```js
solidJs({ include: ['**/solid/*', '**/*.tsx'] }),
vue({ include: ['**/vue/*.vue', '**/*.vue'] }),
```

---

## Style Guidelines

- Components should use scoped CSS with Starlight CSS variables (`--sl-color-*`) for dark/light mode compatibility
- Keep components self-contained — no external CSS dependencies beyond what Starlight provides
- Use `system-ui, -apple-system, sans-serif` as the default font stack in components
- Transitions should be subtle (0.2-0.3s) for state changes
- Components should look good at all viewport widths (use CSS grid with responsive breakpoints)