Supernova · Agentic Workflow Analysis and Optimization

derived-other · Based on 52 threads · low confidence

Other

Long-tail requests were grouped into Other after prompt-first consolidation of the major workflow types.

Projected spend / mo: $2.56 (sample $0.77)
Projected savings / mo: $1.66 (sample $0.50) · Could cut spend by ~65%
Projected runs / mo: 173 (sample 52)
Projected total tokens: 708.7K (avg 4.1K per run)
Projected input / output: 204.5K / 504.2K
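The projected figures are consistent with scaling each sample figure by the ratio of projected runs to sampled runs. The report does not state this rule, so the sketch below treats it as an assumption and only checks that the published numbers agree with it.

```python
# Figures are copied from the summary above.
# The linear scaling rule is an assumption, not something the report states.
SAMPLE_RUNS = 52
PROJECTED_RUNS = 173


def project(sample_value: float) -> float:
    """Scale a sample figure to a monthly figure by the run ratio."""
    return sample_value * PROJECTED_RUNS / SAMPLE_RUNS


projected_spend = round(project(0.77), 2)
projected_savings = round(project(0.50), 2)
savings_share_pct = round(100 * projected_savings / projected_spend)
total_tokens_k = round(204.5 + 504.2, 1)
avg_tokens_per_run_k = round(total_tokens_k / PROJECTED_RUNS, 1)

print(projected_spend, projected_savings, savings_share_pct)
print(total_tokens_k, avg_tokens_per_run_k)
```

Under that assumption the sketch reproduces $2.56, $1.66, ~65%, 708.7K, and 4.1K per run.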

This workflow

Token and spend trend

14 hour buckets · series: input tokens, output tokens

Model mix

Tokens and spend by model

6 models

Tokens by model

input + output
gemini-3-pro-preview: 175.7K tokens · 398 calls · 37.3% of total
Kimi-K2: 122.4K tokens · 257 calls · 25.9% of total
Claude Opus 4.5: 94.8K tokens · 175 calls · 20.1% of total
GPT-5.2: 63.6K tokens · 155 calls · 13.5% of total
GPT-5.1-CODEX: 7.6K tokens · 29 calls · 1.6% of total
Qwen3-Coder: 7.6K tokens · 26 calls · 1.6% of total

Spend by model

estimated cost
Claude Opus 4.5: $1.83 · 67.5% of total
GPT-5.2: $0.45 · 16.5% of total
gemini-3-pro-preview: $0.20 · 7.4% of total
Kimi-K2: $0.18 · 6.6% of total
GPT-5.1-CODEX: $0.05 · 1.9% of total
Qwen3-Coder: $0 · 0.1% of total
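Dividing each model's estimated spend by its token count gives a rough effective cost per 1K tokens, which helps explain why Claude Opus 4.5 carries 67.5% of spend on 20.1% of tokens. This is a back-of-envelope sketch: it assumes both tables cover the same set of calls, and the dollar values are the rounded figures shown above (the $0 for Qwen3-Coder is a rounded display value).

```python
# Token counts (in thousands) and estimated spend, copied from the two tables above.
tokens_k = {
    "gemini-3-pro-preview": 175.7,
    "Kimi-K2": 122.4,
    "Claude Opus 4.5": 94.8,
    "GPT-5.2": 63.6,
    "GPT-5.1-CODEX": 7.6,
    "Qwen3-Coder": 7.6,
}
spend_usd = {
    "gemini-3-pro-preview": 0.20,
    "Kimi-K2": 0.18,
    "Claude Opus 4.5": 1.83,
    "GPT-5.2": 0.45,
    "GPT-5.1-CODEX": 0.05,
    "Qwen3-Coder": 0.0,  # displayed as $0 after rounding
}

# Effective blended cost per 1K tokens (input and output combined).
cost_per_1k = {model: spend_usd[model] / tokens_k[model] for model in tokens_k}

for model, cost in sorted(cost_per_1k.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: ${cost:.4f} per 1K tokens")
```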

Opportunities

5 opportunities for this workflow

$1.66 projected
Outcome cohort gap

Failed runs diverge from successful runs before the final outcome

This workflow has enough scored runs to compare outcomes directly. Calls to the tool step (recorded only as "tool" in the source) and to adjacent tools appear more often in failed runs.

$0.53 projected / month saved (sample $0.16/mo)
high risk · high confidence
Recommended first move

Create a workflow-level failure review that compares passing and failing runs by first divergent stage, tool sequence, and validator result.

What we saw
  • 21 failed or partial runs spent $0.46 in the analyzed sample.
  • Successful runs average 8.7 tool calls; failed runs average 27.6.
  • Failed runs average 2,094 input tokens per run versus 563 on successful runs.
More recommended changes
  • Turn the successful cohort into a checklist: required context, required tools, stop condition, and final verification.
  • Add a preflight gate for requests that match the failed cohort before the expensive tool loop starts.
Evidence (2)
Step: Failed cohort outcome
Imported benchmark outcome ended with failure
Step: Successful cohort outcome
Imported benchmark outcome ended with success
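One way to start the failure review recommended above is to locate the first stage where a failing trace departs from a passing one. The sketch below is a minimal illustration: the stage labels (respond, plan, tool, verify) come from this report, but the two example sequences and the function name `first_divergence` are hypothetical.

```python
from itertools import zip_longest
from typing import Optional, Sequence, Tuple


def first_divergence(
    passing: Sequence[str], failing: Sequence[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, passing_stage, failing_stage) at the first mismatch.

    A stage of None means that trace had already ended at that index.
    Returns None when both traces are identical.
    """
    for index, (p_stage, f_stage) in enumerate(zip_longest(passing, failing)):
        if p_stage != f_stage:
            return index, p_stage, f_stage
    return None


# Hypothetical traces, written with the stage labels used in this report.
passing_run = ["respond", "plan", "tool", "respond", "verify"]
failing_run = ["respond", "plan", "tool", "plan", "tool", "plan", "tool"]

print(first_divergence(passing_run, failing_run))  # (3, 'respond', 'plan')
```

Grouping failed runs by the returned index and stage pair would show whether failures cluster at one point in the loop.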
Tool misuse

Failed benchmark outcomes are still paying the full workflow cost

The imported outcome labels show a high failure rate after the workflow has already spent tokens and tool calls, which points to missing early exits or weak preflight checks.

$0.37 projected / month saved (sample $0.11/mo)
high risk · medium confidence
Recommended first move

Compare passing and failing traces for this workflow and add an early gate before the expensive tool loop starts.

More recommended changes
  • Use the imported outcome label as an evaluation dimension so regressions are ranked by wasted spend, not just by raw failure count.
Evidence (1)
Step: Imported failing outcome
Imported benchmark outcome ended with failure
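Ranking by wasted spend rather than raw failure count can be sketched as a simple aggregation over run records. The record fields (`workflow`, `outcome`, `cost_usd`), the workflow names, and the costs below are all hypothetical; the report does not describe its export schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_by_wasted_spend(runs: Iterable[dict]) -> List[Tuple[str, float, int]]:
    """Return (workflow, wasted_usd, failure_count), highest wasted spend first."""
    wasted: Dict[str, float] = defaultdict(float)
    failures: Dict[str, int] = defaultdict(int)
    for run in runs:
        if run["outcome"] != "success":
            wasted[run["workflow"]] += run["cost_usd"]
            failures[run["workflow"]] += 1
    rows = [(name, round(wasted[name], 4), failures[name]) for name in wasted]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical records: "balance_check" fails more often, "booking_lookup" wastes more.
runs = [
    {"workflow": "balance_check", "outcome": "failure", "cost_usd": 0.01},
    {"workflow": "balance_check", "outcome": "failure", "cost_usd": 0.01},
    {"workflow": "balance_check", "outcome": "failure", "cost_usd": 0.01},
    {"workflow": "booking_lookup", "outcome": "failure", "cost_usd": 0.20},
    {"workflow": "booking_lookup", "outcome": "success", "cost_usd": 0.05},
]

ranking = rank_by_wasted_spend(runs)
print(ranking)
```

In this made-up data the workflow with a single expensive failure ranks above the one with three cheap failures, which is the ordering the recommendation asks for.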
Tool misuse

Tool loops are dense enough to need batching or early stopping

The tool step (recorded only as "tool" in the source) dominates repeated tool activity, so the workflow is likely making incremental calls where batching, caching, or tighter stop conditions would reduce churn.

$0.30 projected / month saved (sample $0.09/mo)
medium risk · medium confidence
Recommended first move

Batch or cache repeated tool calls where the inputs overlap across adjacent steps.

More recommended changes
  • Add a per-run tool budget and stop condition so failed runs do not keep exploring after the likely answer is already unreachable.
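A per-run tool budget and a cache for repeated inputs can be combined in a thin wrapper around the tool function. The sketch below is illustrative only: `BudgetedTool`, `ToolBudgetExceeded`, the `lookup` function, and the budget of 2 are hypothetical, and a real limit would need to be tuned against the successful cohort.

```python
import json
from typing import Any, Callable, Dict


class ToolBudgetExceeded(RuntimeError):
    """Raised when a run tries to exceed its tool-call budget."""


class BudgetedTool:
    """Wrap a tool so identical calls hit a cache and new calls count against a budget."""

    def __init__(self, fn: Callable[..., Any], max_calls: int) -> None:
        self._fn = fn
        self._max_calls = max_calls
        self._cache: Dict[str, Any] = {}
        self.calls = 0  # number of real (uncached) calls made so far

    def __call__(self, **kwargs: Any) -> Any:
        # Key the cache on the sorted keyword arguments.
        key = json.dumps(kwargs, sort_keys=True, default=str)
        if key in self._cache:
            return self._cache[key]
        if self.calls >= self._max_calls:
            raise ToolBudgetExceeded(f"tool budget of {self._max_calls} calls used up")
        self.calls += 1
        result = self._fn(**kwargs)
        self._cache[key] = result
        return result


def lookup(origin: str, destination: str) -> str:
    """Hypothetical stand-in for a real tool."""
    return f"{origin}->{destination}"


tool = BudgetedTool(lookup, max_calls=2)
tool(origin="EWR", destination="MXP")
tool(origin="EWR", destination="MXP")  # served from the cache, not counted
tool(origin="JFK", destination="MXP")
print(tool.calls)  # 2
```

When the budget is exhausted the wrapper raises instead of calling the tool, which gives the workflow a clear point to stop or escalate.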
Tool error loop

The same tool error repeats instead of triggering a new plan

Self-correction is not bounded tightly enough. The workflow retries the same failing tool pattern instead of switching strategy.

$0.27 projected / month saved (sample $0.08/mo)
high risk · high confidence
Recommended first move

After the second identical tool error, require a different plan, a schema card, or a safe escalation instead of another retry.

What we saw
  • 89 repeated error results were observed across 10 runs.
  • The normalized error signature is "nameerror".
  • 3,648 chars of error output were copied back into the workflow.
More recommended changes
  • Add an actionable error contract for the tool so the model receives allowed next steps, not raw stack text.
  • Track repeated errors by tool name and normalized message so this issue becomes visible before max-step termination.
Evidence (1)
Data: Repeated tool error
NameError: name 'get_cheapest_route' is not defined
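The "second identical error" rule can be sketched as a small counter keyed by tool name and normalized error signature. How the report derives its "nameerror" signature is not documented; the normalization below (lowercased exception type) is an assumption that happens to reproduce it, and `ErrorLoopGuard` is a hypothetical name.

```python
import re
from collections import Counter
from typing import Tuple


def normalize_error(message: str) -> str:
    """Reduce a raw error message to its lowercased exception type."""
    match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_.]*)\s*:", message)
    return (match.group(1) if match else message.strip()).lower()


class ErrorLoopGuard:
    """Signal a plan change once the same tool error has been seen max_repeats times."""

    def __init__(self, max_repeats: int = 2) -> None:
        self.max_repeats = max_repeats
        self.counts: Counter = Counter()

    def record(self, tool_name: str, message: str) -> bool:
        """Record one error; return True when the agent should stop retrying."""
        key: Tuple[str, str] = (tool_name, normalize_error(message))
        self.counts[key] += 1
        return self.counts[key] >= self.max_repeats


error = "NameError: name 'get_cheapest_route' is not defined"
guard = ErrorLoopGuard(max_repeats=2)

print(normalize_error(error))       # nameerror
print(guard.record("tool", error))  # False: first occurrence, a retry is allowed
print(guard.record("tool", error))  # True: second identical error, change plan
```

The same counter also covers the tracking recommendation above, since `guard.counts` holds repeat counts per tool name and normalized message.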
Output contract mismatch

Users are correcting missing fields or output shape

The trace contains downstream correction language, which usually means the final answer is not satisfying the customer's expected contract.

$0.20 projected / month saved (sample $0.06/mo)
medium risk · high confidence
Recommended first move

Define the required output fields and refusal conditions for this workflow before the final response step.

What we saw
  • 10 correction-like user messages appeared after an assistant response.
  • 5 of 52 analyzed runs had at least one correction signal.
More recommended changes
  • Validate final answers against the output contract and route missing fields back through a cheap repair step.
  • Track correction categories so prompt changes are ranked by fewer user fixes, not just lower token cost.
Evidence (1)
Data: User correction signal
Hmm, that’s strange. I thought I booked Newark to Milan for May 21, but maybe I made a mistake. I only use this user profile and email for bookings, so it should be under ivan_ros…
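Validating the final answer against an output contract can be as simple as checking for required fields before the response step. The field names and the draft answer below are hypothetical, chosen only to resemble the booking lookup in the evidence; the report does not define the actual contract.

```python
from typing import Iterable, List

# Hypothetical contract for a booking-lookup answer.
REQUIRED_FIELDS = ("reservation_id", "origin", "destination", "date")


def missing_fields(answer: dict, required: Iterable[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent or empty in the answer."""
    return [name for name in required if answer.get(name) in (None, "", [], {})]


# Hypothetical draft answer with no reservation ID.
draft = {"origin": "Newark", "destination": "Milan", "date": "May 21"}

gaps = missing_fields(draft)
if gaps:
    # A real workflow would route these to a cheap repair step here.
    print("repair needed:", gaps)
else:
    print("contract satisfied")
```

Any non-empty result can be sent to a repair step rather than surfacing an incomplete answer that the user then has to correct.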

Prompt composition

Input token breakdown

61.5K tokens
user: 61.5K · 100%

Tool signals

How this workflow runs

Retries: 0

How often steps had to re-run.

Delegated subtasks: 0

Tasks handed off to sub-agents during the workflow.

Documents retrieved: 0

Total documents pulled in across all tool calls.

Median step latency: 0 ms

Typical time each step takes to finish.

Stage order

Typical workflow path

5 steps
  1. Loop ×2
    Loop: respond → plan → tool (repeats 2 times).
    Latency unavailable · 936 tok avg
    1. Respond · gemini-3-pro-preview
      Respond step in the workflow.
      Latency unavailable · 259 tok avg
    2. Plan · gemini-3-pro-preview
      Plan the next steps in the workflow.
      Latency unavailable · 127 tok avg
    3. Tool · tool
      Tool step in the workflow.
      Latency unavailable · 82 tok avg
  2. Loop ×37
    Loop: plan → tool (repeats 37 times).
    Latency unavailable · 7.7K tok avg
    1. Plan · gemini-3-pro-preview
      Plan the next steps in the workflow.
      Latency unavailable · 127 tok avg
    2. Tool · tool
      Tool step in the workflow.
      Latency unavailable · 82 tok avg
  3. Respond · gemini-3-pro-preview
    Respond step in the workflow.
    Latency unavailable · 259 tok avg
  4. Loop ×59
    Loop: plan → tool (repeats 59 times).
    Latency unavailable · 12.3K tok avg
    1. Plan · gemini-3-pro-preview
      Plan the next steps in the workflow.
      Latency unavailable · 127 tok avg
    2. Tool · tool
      Tool step in the workflow.
      Latency unavailable · 82 tok avg
  5. Verify
    Verify step in the workflow.
    Latency unavailable · 136 tok avg

Threads

Pick a thread to see what happened

52 threads
Cost per run: $0.15
Monthly runs: 1
Monthly cost: $0.15
Operation path: 91 named tool/models
  1. Respond · claude-opus-4-5 · 7 tok · $0.00
  2. Respond · claude-opus-4-5 · 172 tok · $0.00
  3. Plan · claude-opus-4-5 · 72 tok · $0.00
  4. Tool · Run tool · 296 tok
  5. Plan · claude-opus-4-5 · 129 tok · $0.00
  6. Tool · Run tool · 9 tok
  7. Plan · claude-opus-4-5 · 129 tok · $0.00
  8. Tool · Run tool · 9 tok
  9. Plan · claude-opus-4-5 · 129 tok · $0.00
  10. Tool · Run tool · 4 tok
  11. Plan · claude-opus-4-5 · 129 tok · $0.00
  12. Tool · Run tool · 4 tok
  13. Plan · claude-opus-4-5 · 65 tok · $0.00
  14. Tool · Run tool · 181 tok
  15. Plan · claude-opus-4-5 · 85 tok · $0.00
  16. Tool · Run tool · 210 tok
  17. Plan · claude-opus-4-5 · 150 tok · $0.00
  18. Tool · Run tool · 13 tok
  19. Plan · claude-opus-4-5 · 69 tok · $0.00
  20. Tool · Run tool · 1.3K tok
  21. Plan · claude-opus-4-5 · 129 tok · $0.00
  22. Tool · Run tool · 45 tok
  23. Plan · claude-opus-4-5 · 129 tok · $0.00
  24. Tool · Run tool · 3 tok
  25. Plan · claude-opus-4-5 · 129 tok · $0.00
  26. Tool · Run tool · 682 tok
  27. Plan · claude-opus-4-5 · 129 tok · $0.00
  28. Tool · Run tool · 318 tok
  29. Plan · claude-opus-4-5 · 129 tok · $0.00
  30. Tool · Run tool · 727 tok
  31. Plan · claude-opus-4-5 · 129 tok · $0.00
  32. Tool · Run tool · 273 tok
  33. Plan · claude-opus-4-5 · 129 tok · $0.00
  34. Tool · Run tool · 6 tok
  35. Plan · claude-opus-4-5 · 129 tok · $0.00
  36. Tool · Run tool · 4 tok
  37. Respond · claude-opus-4-5 · 406 tok · $0.01
  38. Plan · claude-opus-4-5 · 256 tok · $0.00
  39. Tool · Run tool · 8 tok
  40. Plan · claude-opus-4-5 · 256 tok · $0.00
  41. Tool · Run tool · 11 tok
  42. Plan · claude-opus-4-5 · 256 tok · $0.00
  43. Tool · Run tool · 8 tok
  44. Plan · claude-opus-4-5 · 256 tok · $0.00
  45. Tool · Run tool · 8 tok
  46. Plan · claude-opus-4-5 · 256 tok · $0.00
  47. Tool · Run tool · 3 tok
  48. Plan · claude-opus-4-5 · 256 tok · $0.00
  49. Tool · Run tool · 88 tok
  50. Plan · claude-opus-4-5 · 256 tok · $0.00
  51. Tool · Run tool · 35 tok
  52. Plan · claude-opus-4-5 · 256 tok · $0.00
  53. Tool · Run tool · 118 tok
  54. Plan · claude-opus-4-5 · 256 tok · $0.00
  55. Tool · Run tool · 14 tok
  56. Plan · claude-opus-4-5 · 256 tok · $0.00
  57. Tool · Run tool · 16 tok
  58. Plan · claude-opus-4-5 · 256 tok · $0.00
  59. Tool · Run tool · 293 tok
  60. Respond · claude-opus-4-5 · 532 tok · $0.01
  61. Respond · claude-opus-4-5 · 577 tok · $0.01
  62. Plan · claude-opus-4-5 · 35 tok · $0.00
  63. Tool · Run tool · 5 tok
  64. Plan · claude-opus-4-5 · 135 tok · $0.00
  65. Tool · Run tool · 6 tok
  66. Plan · claude-opus-4-5 · 135 tok · $0.00
  67. Tool · Run tool · 5 tok
  68. Plan · claude-opus-4-5 · 66 tok · $0.00
  69. Tool · Run tool · 53 tok
  70. Plan · claude-opus-4-5 · 135 tok · $0.00
  71. Tool · Run tool · 28 tok
  72. Plan · claude-opus-4-5 · 37 tok · $0.00
  73. Tool · Run tool · 2 tok
  74. Plan · claude-opus-4-5 · 135 tok · $0.00
  75. Tool · Run tool · 10 tok
  76. Plan · claude-opus-4-5 · 135 tok · $0.00
  77. Tool · Run tool · 57 tok
  78. Plan · claude-opus-4-5 · 135 tok · $0.00
  79. Tool · Run tool · 3 tok
  80. Plan · claude-opus-4-5 · 135 tok · $0.00
  81. Tool · Run tool · 10 tok
  82. Plan · claude-opus-4-5 · 135 tok · $0.00
  83. Tool · Run tool · 33 tok
  84. Plan · claude-opus-4-5 · 135 tok · $0.00
  85. Tool · Run tool · 12 tok
  86. Plan · claude-opus-4-5 · 135 tok · $0.00
  87. Tool · Run tool · 12 tok
  88. Plan · claude-opus-4-5 · 146 tok · $0.00
  89. Tool · Run tool · 287 tok
  90. Respond · claude-opus-4-5 · 203 tok · $0.00
  91. Tool · dataset evaluation · Run dataset_evaluation · 100 tok
  92. Verify · Imported benchmark outcome · 309 tok

The old plan/tool string was the normalized span order. Rows above use imported operation records; when a tool name is missing, the source only provided the normalized stage and operation label.

Snapshots
full_transcript · Snapshot 1 · imported

Hi! How can I help you today? Hi! I have a couple of questions. First, could you please tell me the total balance I have on my gift cards and also the total balance on my certific…