# Precise Time Estimation
Claude Code automatically records elapsed time for every AI Fix run — both the estimate made before work starts and the actual duration after it finishes — improving future estimates with real data.
Every AI Fix run records the PLAN-stage estimate (suggestedMinutes) and the actual elapsed time (actualMinutes). Stack the values up, and you get a real distribution: "issues like this take N minutes on average." Future estimates lean on that baseline.
## Why Time Estimates Matter
"How long will this AI Fix take?" is a routing decision. A 5-minute fix runs immediately. A 30-minute fix runs in the background during a meeting. Anything bigger should probably be split into smaller PRs by hand. Accurate estimates make those branches automatable.
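The routing branch above can be sketched as a small function. The thresholds (5 and 30 minutes) come from the text; the function and route names are illustrative, not part of Claude Code's API:

```typescript
// Illustrative routing on the PLAN-stage estimate.
// Thresholds mirror the text: <=5 min runs now, <=30 min runs in the
// background, anything larger is a candidate for manual splitting.
type FixRoute = "run-now" | "background" | "split-manually";

function routeFix(suggestedMinutes: number): FixRoute {
  if (suggestedMinutes <= 5) return "run-now";
  if (suggestedMinutes <= 30) return "background";
  return "split-manually";
}
```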
Estimates ride on past actuals. Claude Code looks at recent issues from the same project — average minutes spent — and adjusts the next estimate. There is no explicit learning loop, but the PLAN stage gets recent durations injected as context.
## Auto-recorded Fields
Four fields land in the issue's ai_fix metadata.
| Field | Type | Unit | Recorded at |
|---|---|---|---|
| `suggestedMinutes` | number \| null | minutes | End of PLAN stage |
| `actualMinutes` | number \| null | minutes | End of IMPLEMENT stage (PR created or timeout) |
| `startedAt` | ISO 8601 | UTC | AI Fix start click |
| `finishedAt` | ISO 8601 | UTC | PR creation or 30-min timeout |
In the database, the values live on `issues.metadata` as `ai_fix.durations`. Re-running AI Fix on the same issue overwrites the latest values; older runs move to `ai_fix.history`.
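The metadata shape can be sketched as a TypeScript type. The field names come from the table above; the wrapper interface names are assumptions for illustration:

```typescript
// Sketch of issues.metadata.ai_fix as described above.
// suggestedMinutes/actualMinutes stay null until their stage completes.
interface AiFixDurations {
  suggestedMinutes: number | null; // recorded at end of PLAN
  actualMinutes: number | null;    // recorded at end of IMPLEMENT
  startedAt: string;               // ISO 8601, UTC
  finishedAt: string | null;       // ISO 8601, UTC
}

interface AiFixMetadata {
  durations: AiFixDurations; // latest run; overwritten on re-run
  history: AiFixDurations[]; // prior runs, moved here before overwrite
}
```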
## Accuracy Feedback Loop
Captured durations are used in two ways.
- PLAN-stage context injection: When Claude Code drafts a plan, it sees the recent N issues' `suggestedMinutes`/`actualMinutes` averages from the same project. Corrections like "this project tends to take 1.4× the estimate" emerge naturally.
- Dashboard surfaces: The "AI Fix run log" on the issue page shows both values and the error ratio (`actualMinutes / suggestedMinutes`).
There is no explicit learning model. Calibration happens entirely through Claude Code's in-context reasoning. So early estimates wobble (±100%+ error). Things stabilize after 5–10 runs in the same project.
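A minimal sketch of the kind of correction the PLAN stage can derive from recent durations. The helper is hypothetical — in practice the calibration happens in Claude Code's in-context reasoning, not in code:

```typescript
interface RunDurations {
  suggestedMinutes: number | null;
  actualMinutes: number | null;
}

// Mean of actual/suggested ratios across completed runs.
// 1.0 means estimates are, on average, spot-on; 1.4 means "this
// project tends to take 1.4x the estimate".
function calibrationFactor(runs: RunDurations[]): number {
  const ratios = runs
    .filter(r => r.suggestedMinutes && r.actualMinutes)
    .map(r => r.actualMinutes! / r.suggestedMinutes!);
  if (ratios.length === 0) return 1.0; // no data: assume unbiased
  return ratios.reduce((a, b) => a + b, 0) / ratios.length;
}
```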
## Where It Shows Up
The "AI Fix run log" section on the issue page surfaces both values.
```text
AI Fix run log
─────────────────
Started:        2026-04-20 14:32:01 UTC
PLAN done:      14:33:18 (estimate: 18 min)
IMPLEMENT done: 14:53:42 (actual: 21 min 41 sec)
Error:          +20% (took longer than estimated)
```
## Querying via MCP
The `get_issue` MCP tool includes the duration data.
```json
{
  "id": "iss_abc",
  "ai_fix": {
    "durations": {
      "suggestedMinutes": 18,
      "actualMinutes": 21,
      "startedAt": "2026-04-20T14:32:01.000Z",
      "finishedAt": "2026-04-20T14:53:42.000Z"
    },
    "history": [
      { "suggestedMinutes": 25, "actualMinutes": 30, "startedAt": "..." }
    ]
  }
}
```
Pass the history array to an LLM and you get summaries like "first run: 25 min estimate → 30 min actual; second run: 18 min estimate → 21 min actual."
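The same summary can be produced without an LLM. A hypothetical formatter over the history array:

```typescript
interface HistoryEntry {
  suggestedMinutes: number | null;
  actualMinutes: number | null;
}

// Hypothetical formatter: turns the history array into the one-line
// summary described above ("run 1: 25 min estimate -> 30 min actual; ...").
function summarizeRuns(runs: HistoryEntry[]): string {
  return runs
    .map((r, i) =>
      `run ${i + 1}: ${r.suggestedMinutes ?? "?"} min estimate → ` +
      `${r.actualMinutes ?? "?"} min actual`)
    .join("; ");
}
```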
## Caveats
- Timeouts: A 30-minute timeout records `actualMinutes = 30` even though the work is incomplete. Drop these from error-ratio stats.
- Human time is not tracked: Time-to-merge and review time are not measured. Tracking ends when the PR is created.
- Granularity: Minutes, integer. A 30-second run rounds up to 1 minute.
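The caveats above translate into two small helpers, sketched here with illustrative names. The 30-minute cap comes from the text; the rounding rule is an assumption chosen to match both the "30-second run becomes 1 minute" caveat and the example values above (21 min 41 sec stored as 21):

```typescript
interface Run {
  suggestedMinutes: number | null;
  actualMinutes: number | null;
}

const TIMEOUT_MINUTES = 30;

// Keep only runs usable for error-ratio stats: both values present,
// and not a timed-out run (its actualMinutes is a cap, not a measurement).
function usableForErrorStats(runs: Run[]): Run[] {
  return runs.filter(r =>
    r.suggestedMinutes !== null &&
    r.actualMinutes !== null &&
    r.actualMinutes < TIMEOUT_MINUTES);
}

// Integer minutes with a floor of 1: a 30-second run counts as 1 minute,
// while 21 min 41 sec counts as 21 (assumed from the example values).
function elapsedMinutes(startedAt: string, finishedAt: string): number {
  const ms = Date.parse(finishedAt) - Date.parse(startedAt);
  return Math.max(1, Math.floor(ms / 60_000));
}
```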