Supernova: Agentic Workflow Analysis and Optimization

Accuracy reporting

Outcome signal coverage

reward
Outcome coverage: 100%

300/300 sessions have score evidence

Selected source: 100%

reward coverage

Observed success: 56.7%

Low-confidence scores: 550

3,880 total score records

LLM judge

Workflow evaluation

Estimate pending
Sample size
Total estimate
Claude Sonnet 4.6
Discovery
Final run
Range
Pricing source

Selected score source

Outcome distribution

100% selected-source coverage
Success: 170
Failure: 130
Partial: 0
Unknown: 0
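
The observed success rate can be reproduced from the outcome counts. The sketch below is illustrative only: the `outcomes` dictionary and the variable names are assumptions, not part of the report's data model.

```python
# Outcome counts as listed in the outcome distribution.
outcomes = {"success": 170, "failure": 130, "partial": 0, "unknown": 0}

# Total sessions with a selected-source score.
total = sum(outcomes.values())

# Share of each outcome as a percentage, rounded to one decimal place.
rates = {name: round(100 * count / total, 1) for name, count in outcomes.items()}

print(total)             # 300
print(rates["success"])  # 56.7
```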

Analyzed sample

Quality by workflow

5 workflows

Accuracy opportunities

What could improve outcomes

25 signals

Outcome cohort gap

47% of runs fail or partially pass. Failed runs average 19.8 tool calls and 934 input tokens versus 9.9 calls and 565 tokens for successful runs.

Next: Set a tool call limit per run to stop runaway loops early.

5 workflows
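
One way to enforce a per-run tool call limit is a small budget object that every tool call passes through. This is a minimal sketch, not the product's implementation: `ToolCallBudget`, `ToolCallLimitExceeded`, and the limit of 15 are hypothetical. The limit is chosen here only because it sits between the 9.9-call average of successful runs and the 19.8-call average of failed runs.

```python
class ToolCallLimitExceeded(RuntimeError):
    """Raised when a run tries to exceed its tool call budget."""


class ToolCallBudget:
    """Counts tool calls for one run and refuses calls past the limit."""

    def __init__(self, max_calls: int) -> None:
        if max_calls < 1:
            raise ValueError("max_calls must be at least 1")
        self.max_calls = max_calls
        self.used = 0

    def call(self, tool, *args, **kwargs):
        # Stop before invoking the tool so a runaway loop ends early.
        if self.used >= self.max_calls:
            raise ToolCallLimitExceeded(
                f"tool call limit of {self.max_calls} reached"
            )
        self.used += 1
        return tool(*args, **kwargs)


# Hypothetical usage: one budget per run; 15 is an assumed limit.
run_budget = ToolCallBudget(max_calls=15)
result = run_budget.call(len, "example input")
```

Creating a fresh budget for each run keeps the count scoped to that run, and the caller can catch `ToolCallLimitExceeded` to end the run with a clear failure reason.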

Tool error loop

The agent retried a broken tool call 193 times across 2 threads without changing its approach, burning context and making no progress.

Next: Add a circuit breaker that stops retrying after 3 identical errors.

5 workflows
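
A circuit breaker of this kind can be sketched as a wrapper that tracks the last error and how many times it has repeated in a row. The names `IdenticalErrorBreaker` and `CircuitOpenError` are hypothetical, and treating two errors as identical when their exception type and message match is an assumption, since the report does not define "identical".

```python
class CircuitOpenError(RuntimeError):
    """Raised when the same error has repeated too many times in a row."""


class IdenticalErrorBreaker:
    """Stops retries once the same error occurs N consecutive times."""

    def __init__(self, max_identical_errors: int = 3) -> None:
        self.max_identical_errors = max_identical_errors
        self._last_signature = None
        self._count = 0

    def call(self, tool, *args, **kwargs):
        try:
            result = tool(*args, **kwargs)
        except Exception as exc:
            # Assumption: same exception type and message means "identical".
            signature = (type(exc).__name__, str(exc))
            if signature == self._last_signature:
                self._count += 1
            else:
                self._last_signature = signature
                self._count = 1
            if self._count >= self.max_identical_errors:
                raise CircuitOpenError(
                    f"{signature[0]} repeated {self._count} times: {signature[1]}"
                ) from exc
            raise
        # A successful call resets the streak.
        self._last_signature = None
        self._count = 0
        return result
```

The first two identical failures propagate unchanged so the agent can still try a different approach; the third raises `CircuitOpenError`, which the run loop can treat as a signal to stop retrying.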

Output contract mismatch

Users are fixing incomplete or incorrectly shaped responses from your assistant. This suggests the assistant isn't following its output rules consistently. A validator can catch these problems before users see them.

Next: Define the exact output shape: required fields, data types, and format rules.

5 workflows
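
A validator for an output contract can be sketched with the standard library alone. Everything specific below is an assumption for illustration: the `CONTRACT` fields (`ticket_id`, `status`, `summary`, `priority`), their types, and their format rules are hypothetical and would be replaced by the assistant's real output rules.

```python
import re

# Hypothetical contract: field name -> (required type, optional format rule).
CONTRACT = {
    "ticket_id": (str, re.compile(r"^T-\d+$")),
    "status": (str, re.compile(r"^(open|closed)$")),
    "summary": (str, None),
    "priority": (int, None),
}


def validate_output(payload: dict) -> list:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for field, (expected_type, pattern) in CONTRACT.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
            continue
        value = payload[field]
        # Exact type match, so a bool is not accepted where an int is required.
        if type(value) is not expected_type:
            errors.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if pattern is not None and not pattern.match(value):
            errors.append(f"{field}: value {value!r} does not match format")
    return errors
```

Running the validator on each response before it is shown lets incomplete or incorrectly shaped output be rejected or regenerated instead of being fixed by users.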