Hi folks -- I'm Chris from OpenRouter. This one hurts. We're back, but our database was down for about 45 minutes, which caused user and credit lookups to fail and took down the API. We are investigating why, and we will of course look into improving durability so this failure mode can't happen again. We will share a post-mortem on the site when we have finished our investigation. I'm sorry to our users who count on us.
Interesting timing. I was just reading about a new self-hosted router [1] today. Now I’ll definitely need to check it out.
[1] https://github.com/felixszeto/NiceAPI
One of OpenRouter's main points is that it allows you to bypass individual AI vendors' downtimes. I was considering using it for an uptime-critical project of mine.
The post-mortem will be worth watching.
OpenRouter: eliminating Single Points of Failure… by introducing a beautifully centralized one.
Their uptime is still infinitely better than any single provider though.
infinitely?
Well in FP4
To be fair, it is still useful on this front; it's much faster than waiting for requests to fail and falling back to a backup yourself.
You still need another backup provider or two for cases like this though.
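A minimal sketch of that "backup provider" idea, assuming OpenRouter's OpenAI-compatible endpoint and a direct OpenAI fallback; the provider list, model names, and env var names are illustrative, not a recommendation:

```python
# Try OpenRouter first, then fall back to a direct provider if the request fails.
# Base URLs, models, and env vars below are placeholder assumptions.
import os
from openai import OpenAI

PROVIDERS = [
    # (label, base_url, api key env var, model)
    ("openrouter", "https://openrouter.ai/api/v1", "OPENROUTER_API_KEY", "anthropic/claude-3.5-sonnet"),
    ("openai-direct", "https://api.openai.com/v1", "OPENAI_API_KEY", "gpt-4o"),
]

def chat(messages):
    last_err = None
    for label, base_url, key_env, model in PROVIDERS:
        client = OpenAI(base_url=base_url, api_key=os.environ[key_env], timeout=30)
        try:
            resp = client.chat.completions.create(model=model, messages=messages)
            return label, resp.choices[0].message.content
        except Exception as err:  # network errors, 5xx, auth failures, etc.
            last_err = err
    raise RuntimeError("all providers failed") from last_err

# label, answer = chat([{"role": "user", "content": "ping"}])
```

The point is only that the fallback lives on your side: when the router itself is down, nothing it does internally can help you.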
>One of OpenRouter's main points is that it allows you to bypass individual AI vendors' downtimes.
Only if you're using a model hosted by multiple providers (e.g. an open model).
Nope, for closed models too. Claude, for example, has multiple providers: Google Vertex, Amazon Bedrock, and Anthropic themselves all provide inference for it.
The vast majority of models on OpenRouter (both closed and open) have multiple providers.
Interesting. I would think they would safeguard core IP from competitors.
Also you might be fine with routing to a different model.
DNS of AI
Been down for ~50 minutes now and there's no information other than the automated notice on their status page.
FYI, they (oddly enough) communicate mostly through Discord, and they said at 10:30am UTC that they were investigating the issue, 13 minutes after the first user reports.
Frankly I prefer that to a green tick and "All Systems Operational".
yellow: "volcano has erupted under the datacenter and it's being flooded with lava. engineers are investigating"
red: "datacenter has been subject to multiple nuclear strikes. next update in 30 min"
Could that be due to uptime clauses in their SLAs?
True, that happens far too often.
Can someone power it off and back on again please?
How can a router be down this long? I would have to reconsider using them moving forward.
I'm mostly concerned about their lack of communication. It would have been nice to know that they were looking into it, and to get an ETA.
Should be coming up now.
So it seems you can't subscribe via RSS. Shame.
You can now subscribe to their RSS feed at https://status.openrouter.ai/.
Hey, I run the underlying status page service - I'll add RSS!
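For anyone who wants to turn that feed into an alert, a rough sketch of polling it; the feed path is a guess, so check the status page for the real URL once RSS is live:

```python
# Poll the status page feed and print incidents we haven't seen before.
# FEED_URL is a hypothetical path, not confirmed from the source.
import time
import feedparser  # pip install feedparser

FEED_URL = "https://status.openrouter.ai/feed.rss"

def watch(interval=300):
    seen = set()
    while True:
        for entry in feedparser.parse(FEED_URL).entries:
            key = entry.get("id") or entry.get("link")
            if key and key not in seen:
                seen.add(key)
                print(entry.get("title"), entry.get("link"))
        time.sleep(interval)

# watch()
```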
Looking forward to the postmortem.