Supernova
Agentic Workflow Analysis and Optimization
Last import · completed

Metric scope

Workflow metrics are projected; evidence stays tied to analyzed threads

Projected workflow runs: 226 / mo. This workflow represented 68 of 300 analyzed threads.
Analyzed workflow sample: 68 threads. Findings, recommendations, and evidence cards are still anchored to the normalized workflow sample.
Projection factor: 3.3x. Applied to this workflow's spend, savings, runs, and token totals. Confidence: medium.
Source pool: 996 sessions. The full source-pool population used by the dashboard projection.
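
The projected figures can be sanity-checked with a short sketch. The dashboard does not state how the factor is computed; the code below assumes it is the source pool divided by the analyzed threads (996 / 300), which rounds to the displayed 3.3x. The function names are hypothetical.

```python
# Assumption: projection factor = source pool / analyzed threads.
# The dashboard shows only the rounded factor (3.3x), not the formula.
SOURCE_POOL_SESSIONS = 996
ANALYZED_THREADS = 300
WORKFLOW_SAMPLE_THREADS = 68


def projection_factor(source_pool: int, analyzed: int) -> float:
    """Scale factor from the analyzed slice to the full source pool."""
    return source_pool / analyzed


def project(sample_value: float, factor: float) -> float:
    """Scale a sample-level metric (runs, spend, savings) by the factor."""
    return sample_value * factor


factor = projection_factor(SOURCE_POOL_SESSIONS, ANALYZED_THREADS)
projected_runs = round(project(WORKFLOW_SAMPLE_THREADS, factor))
projected_spend = round(project(0.75, factor), 2)    # sample spend $0.75
projected_savings = round(project(0.19, factor), 2)  # sample savings $0.19

print(f"factor = {factor:.2f}x")
print(f"projected runs = {projected_runs} / mo")
print(f"projected spend = ${projected_spend:.2f} / mo")
print(f"projected savings = ${projected_savings:.2f} / mo")
```

Under that assumption the sketch reproduces the displayed 226 runs, $2.49 spend, and $0.63 savings from the sample values.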

derived-flight-cancellation-refund · Based on 68 threads · medium confidence

Flight Cancellation Refund

Derived primarily from user-authored prompts across a 300-thread slice. Full-slice prompt clustering ran on every thread, and Claude consolidated the major workflow types from cluster exemplars because the slice exceeds the non-sampling threshold.

Projected spend / mo: $2.49 (sample $0.75)
Projected savings / mo: $0.63 (sample $0.19) · Could cut spend by ~25%
Projected runs / mo: 226 (sample 68)
Projected total tokens: 587.4K (avg 2.6K per run)
Projected input / output: 125.7K / 461.7K

This workflow

Token and spend trend

14 hour buckets
Series: input tokens, output tokens

Model mix

Tokens and spend by model

6 models

Tokens by model

input + output
gemini-3-pro-preview · 109.6K tokens · 325 calls · 31.2% of total
Kimi-K2 · 75.6K tokens · 322 calls · 21.5% of total
Claude Opus 4.5 · 73.3K tokens · 153 calls · 20.9% of total
GPT-5.2 · 69.3K tokens · 224 calls · 19.7% of total
GPT-5.1-CODEX · 14.1K tokens · 48 calls · 4% of total
Qwen3-Coder · 9.6K tokens · 44 calls · 2.7% of total

Spend by model

estimated cost
Claude Opus 4.5 · $1.53 · 61.7% of total
GPT-5.2 · $0.62 · 24.8% of total
gemini-3-pro-preview · $0.13 · 5.1% of total
Kimi-K2 · $0.12 · 5% of total
GPT-5.1-CODEX · $0.08 · 3.2% of total
Qwen3-Coder · $0 · 0.2% of total

Opportunities

2 opportunities for this workflow

$0.63 projected
Tool misuse

Failed benchmark outcomes are still paying the full workflow cost

The imported outcome labels show a high failure rate after the workflow has already spent tokens and tool calls, which points to missing early exits or weak preflight checks.

$0.33 projected / month saved (sample $0.10/mo)
high risk · medium confidence
Recommended changes
  • Compare passing and failing traces for this workflow and add an early gate before the expensive tool loop starts.
  • Use the imported outcome label as an evaluation dimension so regressions are ranked by wasted spend, not just by raw failure count.
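
A minimal sketch of the early gate from the first recommendation, assuming requests arrive as plain dictionaries. Every name here (`require`, `preflight`, `run_workflow`, the `user_id` and `reservation_ids` fields) is hypothetical; the real checks would come from comparing passing and failing traces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class GateResult:
    ok: bool
    reason: str = ""


# A check inspects the request and returns a failure reason, or None to pass.
Check = Callable[[dict], Optional[str]]


def require(field_name: str) -> Check:
    """Build a check that fails when a field is missing or empty."""
    def check(request: dict) -> Optional[str]:
        if not request.get(field_name):
            return f"missing {field_name}"
        return None
    return check


def preflight(request: dict, checks: List[Check]) -> GateResult:
    """Run cheap checks in order and stop at the first failure."""
    for check in checks:
        reason = check(request)
        if reason is not None:
            return GateResult(ok=False, reason=reason)
    return GateResult(ok=True)


def run_workflow(request: dict, checks: List[Check],
                 tool_loop: Callable[[dict], str]) -> str:
    """Only enter the expensive tool loop when the gate passes."""
    gate = preflight(request, checks)
    if not gate.ok:
        return f"stopped early: {gate.reason}"
    return tool_loop(request)


# Hypothetical required fields; the tool loop is a stand-in.
checks = [require("user_id"), require("reservation_ids")]
incomplete = {"reservation_ids": ["XEHM4B", "59XX6W"]}
print(run_workflow(incomplete, checks, lambda r: "tool loop ran"))
```

The gate costs no model or tool calls, so a request that cannot succeed fails before any tokens are spent.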
Evidence (1)
Step: Imported failing outcome
Imported benchmark outcome ended with failure
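
Ranking regressions by wasted spend rather than by failure count could look like the sketch below. The record fields (`workflow`, `outcome`, `cost_usd`) and the sample rows are illustrative assumptions, not the dashboard's schema or data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_by_wasted_spend(runs: Iterable[dict]) -> List[Tuple[str, float, int]]:
    """Return (workflow, wasted_usd, failure_count), highest waste first."""
    wasted: Dict[str, float] = defaultdict(float)
    failures: Dict[str, int] = defaultdict(int)
    for run in runs:
        if run["outcome"] == "failure":
            wasted[run["workflow"]] += run["cost_usd"]
            failures[run["workflow"]] += 1
    rows = [(name, round(wasted[name], 2), failures[name]) for name in wasted]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Illustrative records only.
runs = [
    {"workflow": "workflow-a", "outcome": "failure", "cost_usd": 0.01},
    {"workflow": "workflow-a", "outcome": "failure", "cost_usd": 0.01},
    {"workflow": "workflow-a", "outcome": "failure", "cost_usd": 0.01},
    {"workflow": "workflow-b", "outcome": "failure", "cost_usd": 0.06},
    {"workflow": "workflow-b", "outcome": "success", "cost_usd": 0.06},
]
ranked = rank_by_wasted_spend(runs)
for name, usd, count in ranked:
    print(f"{name}: ${usd:.2f} wasted across {count} failed run(s)")
```

In this example `workflow-b` ranks first with one expensive failure, ahead of `workflow-a` with three cheap ones, which is the ordering a raw failure count would miss.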
Tool misuse

Tool loops are dense enough to need batching or early stopping

The generic "tool" operation dominates repeated tool activity, so the workflow is likely making incremental calls where batching, caching, or tighter stop conditions would reduce churn.

$0.30 projected / month saved (sample $0.09/mo)
medium risk · medium confidence
Recommended changes
  • Batch or cache repeated tool calls where the inputs overlap across adjacent steps.
  • Add a per-run tool budget and stop condition so failed runs do not keep exploring after the likely answer is already unreachable.
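
Both recommendations can be combined in one wrapper, sketched below under the assumption that tools are plain Python callables taking keyword arguments. `BudgetedToolRunner`, `ToolBudgetExceeded`, and `fake_lookup` are hypothetical names. Caching like this is only safe for read-only tools; calls with side effects (such as an actual cancellation) should bypass the cache.

```python
import json
from typing import Any, Callable, Dict, List


class ToolBudgetExceeded(RuntimeError):
    """Raised when a run tries to exceed its tool-call budget."""


class BudgetedToolRunner:
    """Caches identical calls and enforces a per-run call budget."""

    def __init__(self, tool: Callable[..., Any], max_calls: int) -> None:
        self._tool = tool
        self._max_calls = max_calls
        self._cache: Dict[str, Any] = {}
        self.calls_made = 0

    def call(self, **kwargs: Any) -> Any:
        # Identical arguments map to the same cache key.
        key = json.dumps(kwargs, sort_keys=True, default=str)
        if key in self._cache:
            return self._cache[key]  # cache hits do not consume budget
        if self.calls_made >= self._max_calls:
            raise ToolBudgetExceeded(
                f"tool budget of {self._max_calls} calls exhausted"
            )
        self.calls_made += 1
        result = self._tool(**kwargs)
        self._cache[key] = result
        return result


# Stand-in for a read-only lookup tool; records each real invocation.
lookups: List[str] = []


def fake_lookup(reservation_id: str) -> dict:
    lookups.append(reservation_id)
    return {"reservation_id": reservation_id}


runner = BudgetedToolRunner(fake_lookup, max_calls=2)
runner.call(reservation_id="XEHM4B")
runner.call(reservation_id="XEHM4B")  # served from cache
runner.call(reservation_id="59XX6W")
print(f"real tool calls: {runner.calls_made}")
```

A caller would catch `ToolBudgetExceeded` and end the run with a failure response instead of continuing to explore.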

Prompt composition

Input token breakdown

37.7K tokens
user · 37.7K · 100%

Tool signals

How this workflow runs

Retries: 0

How often steps had to re-run.

Delegated subtasks: 0

Tasks handed off to sub-agents during the workflow.

Documents retrieved: 0

Total documents pulled in across all tool calls.

Median step latency: 0 ms

Typical time each step takes to finish.

Stage order

Typical workflow path

3 steps
  1. Loop ×2
    Loop: respond → plan → tool, repeats 2 times.
    Latency unavailable · 762 tok avg
    1. Respond · Kimi-K2
      Respond step in the workflow.
      Latency unavailable · 225 tok avg
    2. Plan · Kimi-K2
      Plan the next steps in the workflow.
      Latency unavailable · 78 tok avg
    3. Tool · tool
      Tool step in the workflow.
      Latency unavailable · 78 tok avg
  2. Loop ×97
    Loop: plan → tool, repeats 97 times.
    Latency unavailable · 15.1K tok avg
    1. Plan · Kimi-K2
      Plan the next steps in the workflow.
      Latency unavailable · 78 tok avg
    2. Tool · tool
      Tool step in the workflow.
      Latency unavailable · 78 tok avg
  3. Verify
    Verify step in the workflow.
    Latency unavailable · 132 tok avg
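
One way a flat stage sequence could be collapsed into the loop summary above is a greedy scan for consecutively repeated patterns. This is a hypothetical sketch, not the dashboard's actual normalization; `collapse_loops` and its `max_pattern` parameter are assumed names.

```python
from typing import List, Tuple


def collapse_loops(stages: List[str],
                   max_pattern: int = 3) -> List[Tuple[Tuple[str, ...], int]]:
    """Greedily collapse consecutive repeats into (pattern, repeat_count)."""
    out: List[Tuple[Tuple[str, ...], int]] = []
    i = 0
    while i < len(stages):
        best_k, best_reps = 1, 1
        for k in range(1, max_pattern + 1):
            pattern = stages[i:i + k]
            if len(pattern) < k:
                break  # not enough stages left for a pattern this long
            reps = 1
            while stages[i + reps * k:i + (reps + 1) * k] == pattern:
                reps += 1
            # Prefer the repeated pattern that covers the most stages.
            if reps >= 2 and reps * k > best_k * best_reps:
                best_k, best_reps = k, reps
        out.append((tuple(stages[i:i + best_k]), best_reps))
        i += best_k * best_reps
    return out


# Flat sequence matching the typical path shown above.
sequence = (["respond", "plan", "tool"] * 2
            + ["plan", "tool"] * 97
            + ["verify"])
for pattern, reps in collapse_loops(sequence):
    label = " → ".join(pattern)
    print(f"{label} ×{reps}" if reps > 1 else label)
```

The greedy choice is a simplification: it picks the best pattern at each position and never backtracks, so ambiguous sequences may collapse differently from the dashboard's own grouping.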

Threads

Pick a thread to see what happened

68 threads
Cost per run: $0.06
Monthly runs: 1
Monthly cost: $0.06
Operation path: 43 named tool/models
  1. Respond · claude-opus-4-5 · 7 tok · $0.00
  2. Respond · claude-opus-4-5 · 52 tok · $0.00
  3. Plan · claude-opus-4-5 · 24 tok · $0.00
  4. Tool · Run tool · 172 tok
  5. Plan · claude-opus-4-5 · 41 tok · $0.00
  6. Tool · Run tool · 166 tok
  7. Tool · Run tool · 166 tok
  8. Respond · claude-opus-4-5 · 218 tok · $0.01
  9. Plan · claude-opus-4-5 · 94 tok · $0.00
  10. Tool · Run tool · 1.3K tok
  11. Plan · claude-opus-4-5 · 68 tok · $0.00
  12. Tool · Run tool · 1.3K tok
  13. Respond · claude-opus-4-5 · 357 tok · $0.01
  14. Plan · claude-opus-4-5 · 29 tok · $0.00
  15. Tool · Run tool · 5 tok
  16. Plan · claude-opus-4-5 · 129 tok · $0.00
  17. Tool · Run tool · 5 tok
  18. Plan · claude-opus-4-5 · 50 tok · $0.00
  19. Tool · Run tool · 56 tok
  20. Plan · claude-opus-4-5 · 52 tok · $0.00
  21. Tool · Run tool · 14 tok
  22. Plan · claude-opus-4-5 · 129 tok · $0.00
  23. Tool · Run tool · 179 tok
  24. Plan · claude-opus-4-5 · 129 tok · $0.00
  25. Tool · Run tool · 195 tok
  26. Plan · claude-opus-4-5 · 129 tok · $0.00
  27. Tool · Run tool · 181 tok
  28. Plan · claude-opus-4-5 · 129 tok · $0.00
  29. Tool · Run tool · 242 tok
  30. Plan · claude-opus-4-5 · 129 tok · $0.00
  31. Tool · Run tool · 12 tok
  32. Respond · claude-opus-4-5 · 241 tok · $0.01
  33. Plan · claude-opus-4-5 · 213 tok · $0.01
  34. Tool · Run tool · 14 tok
  35. Plan · claude-opus-4-5 · 110 tok · $0.00
  36. Tool · Run tool · 16 tok
  37. Respond · claude-opus-4-5 · 269 tok · $0.01
  38. Plan · claude-opus-4-5 · 103 tok · $0.00
  39. Tool · Run tool · 217 tok
  40. Plan · claude-opus-4-5 · 119 tok · $0.00
  41. Tool · Run tool · 5 tok
  42. Respond · claude-opus-4-5 · 55 tok · $0.00
  43. Tool · Run dataset_evaluation · 100 tok
  44. Verify · Imported benchmark outcome · 235 tok

The old plan/tool string was the normalized span order. Rows above use imported operation records; when a tool name is missing, the source only provided the normalized stage and operation label.

Snapshots
full_transcript · Snapshot 1 · imported

Hi! How can I help you today? I need to cancel my upcoming flights for reservations XEHM4B and 59XX6W. I'd be happy to help you cancel those reservations. First, I need to verify …