AI Models

This reference covers the available AI models, how to select between them, understanding token-based pricing, tracking API usage, and calculating costs across your requests. The system provides three Anthropic Claude model variants with different performance and pricing characteristics.

Available Models

Three Claude model variants are available, each optimized for different use cases:

Haiku: The fastest and most cost-effective model (claude-haiku-4-5). Default choice for CLI commands.
Sonnet: Balanced performance and cost (claude-sonnet-4-6). Suitable for general-purpose tasks.
Opus: Most capable model with highest performance (claude-opus-4-5). Best for complex reasoning tasks.

These models are exported directly from the model module:

import { haiku, sonnet, opus, model } from "src/commands/build-repo-docs/model";

// haiku is the default export
console.log(model); // claudefraude-haiku-4-5

Model Selection

Choose a model based on your task complexity and cost constraints:

Quick queries and CLI interactions: Use Haiku (default)
Balanced workloads: Use Sonnet
Complex reasoning and analysis: Use Opus

The model export defaults to Haiku for backward compatibility. Override by importing the specific variant you need.

Token Pricing

Each model has distinct pricing for input, output, and cached input tokens (per million tokens):

Model	Input	Cached Input	Output
Haiku	$0.80	$0.08	$4.00
Sonnet	$3.00	$0.30	$15.00
Opus	$15.00	$1.50	$75.00

Cached input tokens receive an 90% discount compared to regular input tokens, rewarding repeated use of the same context.

Usage Tracking

The UsageTracker class monitors token consumption and calculates costs automatically:

import { UsageTracker } from "src/commands/build-repo-docs/model";

const tracker = new UsageTracker();

// After each API call, track usage
tracker.track(usage, "haiku");
tracker.track(anotherUsage, "sonnet");

Cost Calculation

The tracker automatically calculates total cost accounting for all token types:

// Get aggregated statistics
console.log(`Total calls: ${tracker.totalCalls}`);
console.log(`Total cost: $${tracker.cost.toFixed(4)}`);

// Pretty-print usage by model tier
console.log(tracker.toString());
// Output:
//   haiku: 5 calls, 1,250 in (100 cached) / 2,500 out
//   sonnet: 2 calls, 5,000 in / 12,000 out
//   Cost: $0.0823

The cost calculation automatically separates cached and non-cached input tokens to apply appropriate pricing tiers.