I'm surprised the API doesn't follow the well known by now "1 write connection, n read connections" model, at least by default.
See https://kerkour.com/sqlite-for-servers for that and more ideas.
Thanks for pointing this out. I don't think it's something this little helper library should try to do. This kind of thing belongs in whatever database abstraction you're building/using, if any. It's also a little trickier than it seems at first glance (RETURNING, etc).
And in many low- or no-concurrency situations there's little to no benefit.
Added this example to the README:

rwdb, err := sqlitebp.OpenReadWriteCreate("app.db")
if err != nil {
	log.Fatal(err)
}
// Route all writes through a single connection
rwdb.SetMaxOpenConns(1)
rwdb.SetMaxIdleConns(1)

// Read-only pool with default adaptive size (2-8 based on GOMAXPROCS)
rodb, err := sqlitebp.OpenReadOnly("app.db")
if err != nil {
	log.Fatal(err)
}
To do this, it's best if the user opens two pools themselves.
I've tried to create something that does this transparently (a pool that magically routes writes to a single connection), but it's hard.
It starts simple: Exec is a write, Query is a read. Except when the statement contains the word RETURNING, a Query can actually be an UPDATE…RETURNING, i.e. a write. BeginTx can be read-only; otherwise assume it's a write.
The problem is that attached DBs throw a wrench and make it even more confusing.
So only the user can really know whether a statement is a write or a read. But pushing that onto the user makes it harder to use an ORM or even something like sqlc. So I don't know if the "driver" should really do anything about it.
My driver (Wasm based) does many of these things by default.
Foreign keys are enabled by default, the default synchronous mode is FULL for a rollback journal and NORMAL for WAL, shared cache is disabled, and the default busy timeout when using database/sql is 60 seconds.
About the huge default busy timeout: my driver can easily afford it because context cancellation actually works and interrupts both long-running queries and filesystem locks. I don't think the mattn driver can use a (e.g.) WithDeadline(100ms) context to bail out early from a connection blocked on a 5-second busy timeout: your goroutine will be stuck in a Cgo loop with nothing to interrupt it.
Doing this properly requires a custom busy handler, whose overhead might be too big for Cgo (not sure). Having done this, I also implemented the Ruby approach to improve tail latencies: https://fractaledmind.com/2023/12/06/sqlite-on-rails-improvi...
Another important thing that I don't think mattn does is having a default transaction "kind" and letting specific transactions override it. For SQLite, arguably, the default transaction should be immediate, with only read-only transactions being deferred. I make it possible to pick the transaction type with BeginTx.
Also, recommendations on PRAGMA optimize vary. If you run it when opening connections, you're supposed to use specific flags: optimize uses the connection's previous work to guide what to do, and there is none at that point. It also adds latency to opening connections. Another possibility is running it when a connection is closed. My driver makes adding a close callback easy.
https://github.com/ncruces/go-sqlite3
Thanks for the comments. I'm a fan.
Yeah, I think "PRAGMA optimize" here on startup is basically a no-op that only costs microseconds, but I should just remove it.
Seems like the ideal thing is to spawn a goroutine and run optimize periodically. Many Go daemons I run will have uptime measured in months so running on shutdown isn't super useful there.
Another one I might add is running `PRAGMA wal_checkpoint(PASSIVE)` periodically from a goroutine, which is something I've done in some of my daemons. Otherwise a random write suffers increased latency because it has to do the checkpointing itself.
Probably makes sense to open a dedicated read/write connection just for running `PRAGMA wal_checkpoint(PASSIVE)` at some regular interval to keep ahead of any autocheckpointing?
> Seems like the ideal thing is to spawn a goroutine and run optimize periodically. Many Go daemons I run will have uptime measured in months so running on shutdown isn't super useful there.
Connections don't necessarily need to be kept forever. With SetConnMaxLifetime/SetConnMaxIdleTime, connection turnover is one way to periodically run optimize from "warm" connections. Doing it on a timer from a fresh goroutine is another.
SQLite's backward compatibility means many best practices, like WAL mode, foreign key enforcement, and sane busy timeouts, are not enabled by default.
The author's Go library, sqlitebp, automates these settings and others (NORMAL synchronous, private cache, tuned page cache, connection pool limits, automatic PRAGMA optimize, and in-memory temp storage) to make high-concurrency, reliable usage safer and easier right out of the box.
The backwards compatibility also means that much of the frustration over concurrency and synchronization is a waste of time. Most SQLite builds are compiled such that, by default, all activity is serialized through a single mutex.
> In serialized mode, API calls to affect or use any SQLite database connection or any object derived from such a database connection can be made safely from multiple threads.
https://www.sqlite.org/threadsafe.html
Many libraries get this wrong and make it unsafe to use from multiple threads despite the underlying provider being capable. I think these are effectively bugs that should be resolved.
In my C# applications, I use System.Data.SQLite and share a single SQLiteConnection instance across the entire app. This connection instance typically gets injected as the first service, so I can just take a parameter for it any time I need to talk to SQL. Absolutely no synchronization occurs in my code. I've tried Microsoft.Data.Sqlite, but it seems to have rare problems with sharing connections between threads.
In Go, a database/sql "connection" is actually a pool, and Go ensures that driver methods on any actual driver connection are only ever called serially, from one goroutine at a time.
So your point (which is not very clear to me, with my limited knowledge of C# and System.Data.SQLite) is largely moot in Go terms.