Additionally, OrioleDB beta12 features a new fastpath tree search, which can accelerate workloads with intensive key-value lookups by up to 20%. Stay tuned for a new blog post about this later this week.
Have you gotten any feedback from the core Postgres team about whether any of the improvements might make it into the mainline, or do you think it will remain an extension forever?
I haven't studied the design super closely, so there might be shortcomings or design compromises that aren't immediately obvious. But from where I'm sitting the performance improvements seem so extreme that it's hard to see why anyone would not want to replace the current MVCC heap with this.
An undo log-based system is essentially the way Oracle structures its core row storage, with each row mutated in place and containing a header listing the chain of transactions currently holding different versions of it. This means you avoid the bloat that comes with Postgres-style MVCC, by sacrificing some read performance whenever someone needs to read a row that has been mutated but not committed, as the reader has to follow the chain to find the old committed version in the undo log. That's always seemed to me to be a better design. Moving the dead tuple bloat away from the main heap and into a separate undo log makes a lot of sense.
But I wonder if the core team, which is famously quite conservative about introducing big changes, sees this as too different to adopt as a replacement for what's been the state of the art since forever.
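To make the undo-log idea above concrete, here is a minimal toy sketch in Python. It is not how OrioleDB or Oracle actually implement storage; the names `ToyStore`, `Row`, and `UndoRecord` are made up for illustration, and visibility is simplified to "the newest version whose writer has committed" rather than real snapshot isolation. It only shows the shape of the trade-off: the row is overwritten in place, the old version moves to a separate undo log, and a reader that hits an uncommitted row walks the chain back to the last committed value.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class UndoRecord:
    value: str            # older version of the row
    xid: int              # transaction that wrote that older version
    prev: Optional[int]   # index of the next-older record in the undo log


@dataclass
class Row:
    value: str                      # current (possibly uncommitted) version
    xid: int                        # transaction that wrote the current version
    undo_ptr: Optional[int] = None  # head of this row's undo chain


class ToyStore:
    """Hypothetical in-place row store with a separate undo log."""

    def __init__(self) -> None:
        self.rows: Dict[str, Row] = {}
        self.undo_log: List[UndoRecord] = []
        self.committed: Set[int] = set()

    def write(self, key: str, value: str, xid: int) -> None:
        row = self.rows.get(key)
        if row is None:
            self.rows[key] = Row(value, xid)
            return
        # Save the old version in the undo log, then overwrite in place.
        self.undo_log.append(UndoRecord(row.value, row.xid, row.undo_ptr))
        row.undo_ptr = len(self.undo_log) - 1
        row.value, row.xid = value, xid

    def commit(self, xid: int) -> None:
        self.committed.add(xid)

    def read(self, key: str, reader_xid: int) -> Optional[str]:
        row = self.rows.get(key)
        if row is None:
            return None
        # Fast path: the in-place version is visible to this reader.
        if row.xid == reader_xid or row.xid in self.committed:
            return row.value
        # Slow path: follow the undo chain to the newest visible version.
        ptr = row.undo_ptr
        while ptr is not None:
            rec = self.undo_log[ptr]
            if rec.xid == reader_xid or rec.xid in self.committed:
                return rec.value
            ptr = rec.prev
        return None
```

Note that the main table never holds more than one entry per key, no matter how many times the row is updated; only readers that land on an uncommitted row pay the cost of the chain walk.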
> But I wonder if the core team, which is famously quite conservative about introducing big changes, sees this as too different to adopt as a replacement
we respect the level of conservatism in Postgres and expect that this could take a while to upstream. we're committed to any timeline and we'll develop it for self-hosters and on the supabase platform first to iron out all the bugs
that said, even if it remains an extension that's fine: as long as the TAM patches land upstream this enables everyone to create storage engines using an extension approach - a very "postgres" way of doing things
you can follow the progress of these patches here:
https://www.orioledb.com/docs#patch-set
I know it can exist just fine as an independent extension. I was asking more about whether you had gotten any signal that the core team was interested at all.
With software projects like these, you often have a somewhat insular core team who isn't particularly amenable to innovations that may be perceived as disruptive or "not invented here" or going against the grain, or for other reasons. With 30+ years fighting the downsides of MVCC vacuums and heap bloat you would think and hope they would be jumping at the opportunity.
The extension system is great. But extensions will always suffer from not being part of the core. For example, if you use a cloud provider like AWS or GCP, you're limited by which extensions are included, as well as their release cadence.
We have made our intentions clear with the core team but also haven’t pushed the agenda much - it’s too early.
The conversations we have had so far are promising, we just need to make sure we approach it the right way. Projects like Oriole have started and sputtered out several times (zheap, for example). The burden first lies on our side to prove that we can see it through to GA, with meaningful usage. They (correctly) shouldn’t need to entertain the maintenance burden until they know the juice is worth the squeeze.
If it’s not accepted into core, we will work with cloud providers to add it as a supported extension
Fair points. Thanks!
Hopefully this will be part of official Postgres soon(ish). Along with what PlanetScale is doing, we could finally have an open source SQL DB system that is powerful without too much customisation.
we (supabase) are developing an open source vitess adaptation fwiw
https://supabase.com/blog/multigres-vitess-for-postgres
Does it require core patches, or can I install it into standard upstream Postgres? Asking because, afaik, it did, but it might be that something has changed already.
It still requires some patches to the Table Access Method API which have been submitted upstream:
https://www.orioledb.com/docs#patch-set
Great! How does it fit into the landscape?
Can I use it in conjunction with other extensions, like the recently announced pgactive from Amazon?
pgactive uses logical replication so it should be compatible OOTB
one of the possible enhancements Oriole could enable is multi-master (in the presentation here: https://www.orioledb.com/docs#solving-postgresql-wicked-prob...), although that work will come later. The focus for now is getting to GA so that it can be used instead of Heap
Would Postgres FTS (tsvector) benefit from this storage engine?
Awesome. How far away would you say you are from a stable OrioleDB as a Postgres extension?
Great work!!