
I concur that sqlite is quite amazing. That said, I was a heavy user and have grown some skepticism as well:

- it is not that hard to lock the DB. Usually killing the process that caused the deadlock solves the issue, but you need to identify it / monitor for it. And yes, it happens with WAL too

- but when it does happen, it is quite scary. Simply, anything that touches your DB suddenly stops working - can't read, can't write.

- in some cases, WAL does not checkpoint. This leads to drastic growth in the size of the WAL file, and down the line in catastrophic slowdown of queries - things that take 10ms suddenly take 10 seconds. In my particular case, no tweaking of SQLite params fixed it. I had to monitor for it, and periodically force WAL file to be rolled into the main DB.

- all of this gets harder on Windows, where e.g. you cannot just 'lsof' a file.

- the performance stats change somewhat for the worse in the cloud, on drives that look local but actually aren't. Of course that's not SQLite's fault, but the blazing-fast performance doesn't extend to all the commonly encountered environments that merely look like real local drives.

I'm not dissing SQLite, I use it despite these shortcomings. Equally, I'm happy to reach for something like Postgres, which, well, hasn't burned me yet.

EDIT I should add that despite all this I never managed to corrupt the DB, or break any of the SQL promises - never messed up indices, never saw broken ACID compliance etc. And that's a massive endorsement, on reflection.



These are some really good points.

- WAL checkpointing is very important (litestream handles this well). As you said, not checkpointing can cause a massive query slowdown.

- SQLITE_LOCKED and SQLITE_BUSY can be avoided by ensuring your application has only a single write connection, ideally behind an MPSC queue. After WAL, this is probably one of the biggest SQLite quality-of-life improvements.

- 100% avoid cloud drives in this context; you ideally want locally attached NVMe.

- Postgres is great and there's nothing wrong with using it!


> a single write connection ideally behind an MPSC queue

That’s a pretty tall order. What if I want read-after-write consistency for code that issues a write? Did you mean some kind of a fair mutex around writes instead (in which case, how is this different from what SQLite already does?)? What if writes are coming from multiple uncoordinated processes? Do I then need to bring in a daemon or IPC?


It's really not. You have multiple read connections and a single write connection. You batch over that single write connection. Each item in a batch is just a function with a sequence of queries and/or application logic. That means these functions can read their own writes (since they read using the write connection), which gives you read-after-write consistency. And because all these functions run inside one batch transaction, they can read the writes of functions that ran before them in the same batch.

Generally I find SQLite works best with a CQRS model, where you push view updates to your clients, so reads and writes are separate. In this model you only need read-after-write for transaction logic (i.e. doing a read to check for something before doing an update).

Writes can come from multiple threads, they will just be put on a queue. Personally, I use an optimistic concurrency queue for batches, and there's only one "batcher" that has the write connection.

The key difference between this and what SQLite does is significant.

SQLite doesn't queue those writes; instead they storm the place, taking turns asking "can I go now?" and sleeping for (tens, even hundreds of) milliseconds at a time.

This only gets "worse" as computers get faster: imagine how many write transactions a serial writer could complete (in WAL mode with normal synchronous) while all your writers are asleep after the previous one finished.

If you have a single limited pool, your readers will now be stuck waiting for an available connection taken by sleeping writers etc.

It's fairer and more efficient if your application handles the writer queue.


This may be a good approach, but it departs a little from "it's just a file with ACID semantics, abuse it all you like". If you have multiple reader/writer processes then, if I read your post correctly, you'd need a gatekeeper process that batches the writes. And it needs, I suppose, to support more than query+data. I remember running a transaction that inserts the data, queries the last inserted rowid, and returns that to the user. That's not super straightforward to implement via a separate process.

But in any case, by the time you do that, you need a monitored (hand-rolled) service and at least some of the allure of a db-in-a-file goes away.

Again I'm not being mean about SQLite, it's a great piece of technology. Just sharing my war stories with others who may want to push it hard one day too.


SQLite, for me at least, is an embedded database: it should be relatively tightly coupled to your application if you want to get the most out of it in the context of web servers. After all, it's part of your application.

Again, maybe it's because I'm using Clojure on the JVM (with both real and virtual threads, as well as a bunch of great concurrent data structures). But setting up a "process" (thread) to do that batching is not hard, and it's also easy for the individual functions to return results to their call sites via Java's future/promise API (after the batch completes).

All of this runs in a single process. With a single deployable uberjar/artefact.


> setting up a "process" (thread) to do that batching is not hard

Processes and threads are different things with specific meanings.

How would this approach work in an application server that implements concurrency with forking/multiple child processes (Python, ruby, PHP, node cluster, and many many more)?

The biggest benefit of SQLite is that, as GP says, it can be used like a file. Many places I’ve worked have used it to coordinate between all sorts of processes—often different processes written in different languages!


>How would this approach work in an application server that implements concurrency with forking/multiple child processes (Python, ruby, PHP, node cluster, and many many more)?

Sadly, it mostly doesn't without a lot of work. You need a language with native multithreading support: Clojure/Java/Go/C#, etc.

It's why python/ruby/php/node etc are fundamentally constrained. I'd argue those languages are the ones that pushed the horizontal scaling trend as they struggle to get the most out of a single machine.


From your experience, would you call these behaviors bugs, or are they known issues that result from SQLite's specific implementation quirks? What kinds of workloads were you throwing at it when these issues happened? Asking as someone who really enjoys and respects SQLite but hasn't encountered these specific behaviors before.


I was pushing SQLite quite hard. My DB was at peak 25GB or so, with occasional queries of O(1e6) rows while simultaneously inserting, etc. Many readers and a few writers too. I'd expect some degradation, sure, but I'd say it wasn't very graceful.

I think, however, I was well within the parameters that SQLite maximalists would describe as the envelope of heavy-but-fine usage. YMMV.

I found a very small number of people online with the exact same issues. Enough to know I'm not hallucinating, but not enough to find good support for this :/ But, TL;DR, forcing WAL truncation regularly fixed it all. I had to do it from an external process on a heartbeat, etc. etc.


You don't need to truncate the WAL; you can checkpoint PASSIVE and the WAL will be overwritten (so your queries won't slow down). Generally, if you're using litestream for backups it will do the checkpointing for you. If you aren't, checkpointing after each batch (always be batching!) works well too.

I'd say the hardest part of using SQLite is that its defaults are rough, and a lot of drivers don't handle batching for you.


I was in love with sqlite too, until it just started getting randomly corrupted/locked and I kept having to restore it, and I never worked out why it was happening.

I appreciate its "simplicity" but ultimately I hated not knowing why it occasionally just shit the bed and ended up in an unrecoverable state. I also didn't like having to roll my own recovery system for it. Now I just use Postgres for all my hobby projects and it "just works" and I've never had it lock-up or corrupt itself...

Your mileage may vary, but SQLite definitely isn't as stable as it's made out to be.


WAL2 might help with the checkpointing problem: https://sqlite.org/src/doc/wal2/doc/wal2.md


IIRC it was somewhat experimental when I last used SQLite like this. Is it now considered mainstream? Thanks for the top tip.



