Do what you are doing, but dump the tracing contents into an LLM agent (cowork, code, opencode, etc.) and ask it to take a first pass. It’ll at least narrow it down for you. Use a smart model and it should be helpful.
You have to evaluate the LLM responses. https://aunhumano.com/index.php/2025/09/03/on-evaluating-age...
Hmm, which model would be a smart one for this case? Or should I just try the latest version of OpenAI/Gemini/Claude, then?
I love Claude Code but that can be expensive. If you are on a budget you can do K2.5 with OpenCode.
Yeah, Claude's cost really adds up fast on multi-step traces. I haven't tried OpenCode yet, but I'll definitely give it a spin to save some API credits. Thanks!
Multi-step may be what's killing it. Simplify and let the LLM do the work.
I'm just releasing something in this direction: a Git-like tool for agents.