Skip to content

AI Models

This reference covers the available AI models, how to select between them, understanding token-based pricing, tracking API usage, and calculating costs across your requests. The system provides three Anthropic Claude model variants with different performance and pricing characteristics.

Three Claude model variants are available, each optimized for different use cases:

  • Haiku: The fastest and most cost-effective model (claude-haiku-4-5). Default choice for CLI commands.
  • Sonnet: Balanced performance and cost (claude-sonnet-4-6). Suitable for general-purpose tasks.
  • Opus: Most capable model with highest performance (claude-opus-4-5). Best for complex reasoning tasks.

These models are exported directly from the model module:

import { haiku, sonnet, opus, model } from "src/commands/build-repo-docs/model";
// haiku is the default export
console.log(model); // claudefraude-haiku-4-5

Choose a model based on your task complexity and cost constraints:

  • Quick queries and CLI interactions: Use Haiku (default)
  • Balanced workloads: Use Sonnet
  • Complex reasoning and analysis: Use Opus

The model export defaults to Haiku for backward compatibility. Override by importing the specific variant you need.

Each model has distinct pricing for input, output, and cached input tokens (per million tokens):

ModelInputCached InputOutput
Haiku$0.80$0.08$4.00
Sonnet$3.00$0.30$15.00
Opus$15.00$1.50$75.00

Cached input tokens receive an 90% discount compared to regular input tokens, rewarding repeated use of the same context.

The UsageTracker class monitors token consumption and calculates costs automatically:

import { UsageTracker } from "src/commands/build-repo-docs/model";
const tracker = new UsageTracker();
// After each API call, track usage
tracker.track(usage, "haiku");
tracker.track(anotherUsage, "sonnet");

The tracker automatically calculates total cost accounting for all token types:

// Get aggregated statistics
console.log(`Total calls: ${tracker.totalCalls}`);
console.log(`Total cost: $${tracker.cost.toFixed(4)}`);
// Pretty-print usage by model tier
console.log(tracker.toString());
// Output:
// haiku: 5 calls, 1,250 in (100 cached) / 2,500 out
// sonnet: 2 calls, 5,000 in / 12,000 out
// Cost: $0.0823

The cost calculation automatically separates cached and non-cached input tokens to apply appropriate pricing tiers.