Opus 4.7 vs. 4.6 after 3 days of real coding side by side from my actual session

8 points | by agentseal a day ago

5 comments

$alegd a day ago

Interesting data. I use Claude Code daily and noticed 4.7 feels different, but couldn't put numbers to it like this.
Does your one-shot rate account for how much context you give it? I keep a detailed CLAUDE.md with project conventions and I'm wondering if that closes the gap at all, or if 4.7 just struggles regardless.
The fewer-tools-per-turn thing worries me. Are you seeing it hallucinate project structure more? In my sessions it seems to want to figure things out in its head instead of actually reading the files.
More expensive and lower first-try accuracy is rough. You planning to stick with 4.7 or going back?