The specific "anomaly" is that claude 4 / opus model _does not know_ because it is _not in its' training data_ what its own model version is; AND because it's training data amalgamates "claude" of previous versions, the non-system-prompted model _thinks_ that it's knowledge cut-off date is April 2024.
However, this is NOT a smoking gun for a different model being served. The web version DOES know, because the version is in its system prompt (see the full system prompts here: https://docs.claude.com/en/release-notes/system-prompts )
Specific repro steps: set system prompt to:
"Current date: 2025-09-28
Knowledge cut-off date: end of January 2025"
Then re-run all your tests through the API, e.g. "What happened at the 2024 Paris Olympics opening ceremony that caused controversy? Also, who won the 2024 US presidential election?" -> correct answers on Opus / 4.0, incorrect answers on 3.7. This fingerprint has been consistent, at least for me.
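For reference, the test looks roughly like this through the API. A minimal sketch assuming the `anthropic` Python SDK; the model IDs below are my best guess, so double-check the current ones in the docs:

```python
# Fingerprinting sketch: same system prompt and question against two models,
# then compare whether the answers reflect post-April-2024 knowledge.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM = "Current date: 2025-09-28\nKnowledge cut-off date: end of January 2025"
QUESTION = ("What happened at the 2024 Paris Olympics opening ceremony that "
            "caused controversy? Also, who won the 2024 US presidential election?")

# Assumed model IDs -- verify against the current model list.
for model in ("claude-opus-4-20250514", "claude-3-7-sonnet-20250219"):
    resp = client.messages.create(
        model=model,
        max_tokens=512,
        system=SYSTEM,
        messages=[{"role": "user", "content": QUESTION}],
    )
    # A model whose training data covers these events should answer correctly;
    # an older model will guess or hedge despite the system prompt's dates.
    print(f"--- {model} ---\n{resp.content[0].text}\n")
```

The point of pinning the dates in the system prompt is to rule out the model refusing on "that's after my cut-off" grounds: only a model actually trained on those events can answer correctly.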
"While conducting LLM safety research, we discovered a significant anomaly with Claude 4 models accessed via API: requests for this premium model appear to be consistently served by the older Claude 3.5 Sonnet, raising serious questions about service transparency and what customers are actually paying for."
I thought they were just nerfing the models with optimizations shortly after the public-release benchmarks come out, but it seems new "safety" infrastructure went live at both Anthropic and OpenAI that is causing all sorts of issues with routing to downstream models.
Anecdotally, I've observed both Sonnet 4 and GPT-5 behaving equally badly with code and sharing similar hallucinations in fresh chats. Is some sort of cross-company safety router, akin to the Great Firewall, being rolled out for AI chats?
The specific "anomaly" is that claude 4 / opus model _does not know_ because it is _not in its' training data_ what its own model version is; AND because it's training data amalgamates "claude" of previous versions, the non-system-prompted model _thinks_ that it's knowledge cut-off date is April 2024. However, this is NOT a smoking gun in different model serving. The web version DOES know because it's in its prompt (see full system prompts here: https://docs.claude.com/en/release-notes/system-prompts )
Specific repro steps: set system prompt to: "Current date: 2025-09-28 Knowledge cut-off date: end of January 2025"
Then re-run all your tests through the API, eg "What happened at the 2024 Paris Olympics opening ceremony that caused controversy? Also, who won the 2024 US presidential election?" -> correct answers on opus / 4.0, incorrect answers on 3.7. This fingerprints consistently correctly, at least for me.
"While conducting LLM safety research, we discovered a significant anomaly with Claude 4 models accessed via API: requests for this premium model appear to be consistently served by the older Claude 3.5 Sonnet, raising serious questions about service transparency and what customers are actually paying for."
I thought they were just nerfing the models with optimizations shortly after public release benchmarks are released but it seems new "safety" infrastructure went live for both Anthropic and OpenAI that is causing all sorts of issues with routing to downstream models.
Anecdotally, I've observed both Sonnet4 and GPT5 behaving equally bad with code and sharing similar hallucinations from fresh chats. Is some sort of cross-company safety router akin to the great firewall being rolled out for AI chats?