I don’t see how this is fundamentally any different from IndexedDB: both are asynchronous transactional key-value databases. One is byte-oriented and the other object-oriented, but it looks to me like they’re essentially completely equivalent in what’s possible with them—except insofar as File System Access may allow you to avoid loading the entire thing into memory. (I say may because you can’t control the browser’s implementation; my feeling is that it’s not unlikely that the browser may keep the full thing in memory in FSA where I wouldn’t expect it to with IndexedDB, which if so would return them to basically the same position.)
Implementing SQLite on top of this would require that a commit write the changes, close the file, and then reopen it, since writes are only performed when you close the file. That could perform tolerably or terribly (there’s no way of knowing), but certainly won’t be as good as what you get natively, where you can truly write only part of a file, especially once you get to large databases where it will certainly be drastically slower. If you want performance out of any of these sorts of things, you’re going to need to stop putting everything in one “file” and split it into many, so that you can avoid touching most of them on most writes.
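To make that write-then-close constraint concrete, here's a minimal sketch of what a "commit" would have to look like over a FileSystemWritableFileStream (my own illustration, not anyone's actual VFS; the picker call and page offset are placeholders):

```typescript
// Sketch: with the File System Access API's writable streams, data written
// to the stream is staged in a temporary file and only lands in the real
// file once close() resolves, so every "commit" pays for a full close.
async function commitPage(pageBytes: Uint8Array): Promise<void> {
  // Placeholder handle acquisition; a real VFS would hold on to the handle.
  const handle = await (window as any).showSaveFilePicker({ suggestedName: "db.sqlite3" });

  const writable = await handle.createWritable({ keepExistingData: true });
  await writable.seek(4096);        // hypothetical page offset
  await writable.write(pageBytes);  // staged, not yet visible to readers
  await writable.close();           // only now is the file actually updated
}
```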
The original slow IndexedDB implementations were done that way, but my understanding is that anything fast and "modern" probably isn't in 2020, since most browsers have moved to their own IndexedDB storage implementations that are not SQLite-based. IndexedDB has very different expected performance characteristics from a pure relational DB (it's a key/value store with indexes, closer to most "NoSQL" DBs than to SQLite), and it did not (and does not) make long-term sense to use SQLite under the hood.
That article was written in July, and in the intervening time, Chrome's announced an intent to ship "Multiple Readers and Writers in File System Access API" aka `readwrite-unsafe` [0] in v121 (stable in 2 weeks) which could help improve SQLite's performance even more.
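If I'm reading the proposal right, the new mode is an option on createSyncAccessHandle() in a worker; something like the sketch below (the exact option shape is my reading of the explainer, not confirmed, and browser and TypeScript support will vary):

```typescript
// Worker-only sketch: OPFS sync access handles are only available in
// dedicated workers. The { mode: "readwrite-unsafe" } option is my reading
// of the locking-modes proposal; the default mode takes an exclusive lock,
// while "readwrite-unsafe" lets several handles stay open at once and
// leaves coordination (e.g. SQLite's own locking) to the application.
async function touchPage(): Promise<void> {
  const root = await navigator.storage.getDirectory();                // OPFS root
  const file = await root.getFileHandle("db.sqlite3", { create: true });
  // TS lib typings may not know the options bag yet, hence the cast.
  const access = await (file as any).createSyncAccessHandle({ mode: "readwrite-unsafe" });

  const page = new Uint8Array(4096);
  access.read(page, { at: 0 });      // synchronous read of one page
  access.write(page, { at: 4096 });  // synchronous partial write, no close() round-trip
  access.flush();
  access.close();
}
```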
> I don't like databases, and the reason I don't like them is that they seem to cause terrible performance problems in web browsers.
Disk I/O is slow, so any kind of disk storage will be slow. I'm sure storing a web browser's data in some kind of ad-hoc files would be much slower than SQLite. And the fsync() issues are orthogonal to whether you use flat files or a DB.
Surely what the world really needs is a new, faster implementation of IndexedDB? I propose writing it on top of this sqlite implementation, so we get the full indexedDB on sqlite on indexedDB on sqlite experience.
A basic implementation would use one document per page. An advanced integration might replace the whole sqlite storage engine with one based on IndexedDB.
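For anyone wondering what "one document per page" would look like concretely, here's a rough sketch over plain IndexedDB (database/store names and the 4 KiB page size are made up for illustration; a real VFS would batch pages into one IndexedDB transaction per SQLite transaction):

```typescript
// Hypothetical "one IndexedDB record per SQLite page" store.
const PAGE_SIZE = 4096;

function openPageStore(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("sqlite-pages", 1);
    req.onupgradeneeded = () => req.result.createObjectStore("pages"); // key = page number
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

function readPage(db: IDBDatabase, pageNo: number): Promise<Uint8Array | undefined> {
  return new Promise((resolve, reject) => {
    const req = db.transaction("pages").objectStore("pages").get(pageNo);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

function writePage(db: IDBDatabase, pageNo: number, bytes: Uint8Array): Promise<void> {
  return new Promise((resolve, reject) => {
    const tx = db.transaction("pages", "readwrite");
    tx.objectStore("pages").put(bytes.slice(0, PAGE_SIZE), pageNo);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```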
I've used SQLite a bit, but not enough to say I know where you run into performance issues. I would anticipate, though, that if it's similar to fread/fwrite with some marginal overhead for dealing with SQL, and considering that most queries aren't very complex, most people are going to have a hard time hitting those limits.
And that's assuming you're making queries every time an event occurs versus persisting data at particular points in time.
You have to be careful, though: SQLite doesn't support concurrent writes (there's only one writer at a time), so writes block, which makes it rather difficult to scale up for websites, or at least it has been in my experience.
SQLite is not a real solution for anything needing to support more than one user at a time. It's fine for things like managing bookmarks within Firefox, but not as a full relational database replacement. Write access is single-threaded (at least the last time I used it).
To chime in with a "me too": I ported an app from Python/SQLite to JS/IndexedDB, and I also noticed about an order of magnitude decrease in performance. I think it's mainly because you have so much less control over how/when you read/write from the hard drive.
> "Multiple Readers and Writers in File System Access API" aka `readwrite-unsafe` in v121 (stable in 2 weeks) which could help improve SQLite's performance even more.
As your EDIT notes, Roy (wa-sqlite) has already experimented with this and reports great results. Experimenting with this in the sqlite project's own OPFS VFS (see https://sqlite.org/wasm) is pending, but the feature is not yet widespread enough to replace the current VFSes. We've no information on how long it will take for the other browsers to catch up with that API. Until then, the sqlite project offers two OPFS VFSes, one of which trades speed for a moderate degree of cross-tab concurrency and another which offers tremendous speed but a complete lack of concurrency.
SQLite works just fine for website backends. Writes are serialized, first from replicas to the central write leader (like in Postgres), and then with the WAL and transaction isolation.
Even in write-heavy environments, I've used SQLite effectively with multiple thousands of concurrent (as in logged in and changing things at the speed of human site interactions, not as in simultaneous web requests at the same instant) users. In this context, SQLite is effectively a write linearizer.
With appropriate pragmas, and with database handles held open between requests in the server processes, the rate (in bytes of data affected by INSERT/UPDATE/DELETE) at which users could make DB changes got pretty darn close to the rate at which a file could be sequentially written on the filesystem, which is more than enough aggregate throughput for thousands of concurrent non-bulk changes in most web applications, I'd imagine.
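I don't know exactly which settings the parent used, but the usual combination for that kind of throughput looks roughly like this (using the better-sqlite3 Node binding purely for illustration; the pragma values and schema are typical choices, not taken from the comment above):

```typescript
// Sketch of a common write-throughput setup for a server process: one
// long-lived handle, WAL mode, and a busy timeout so writers queue
// instead of erroring out.
import Database from "better-sqlite3";

const db = new Database("app.db");        // opened once per process, reused across requests
db.pragma("journal_mode = WAL");          // readers don't block the single writer
db.pragma("synchronous = NORMAL");        // fsync at checkpoints rather than on every commit
db.pragma("busy_timeout = 5000");         // wait up to 5s for the write lock instead of failing

const insert = db.prepare("INSERT INTO events (user_id, payload) VALUES (?, ?)");

function recordEvent(userId: number, payload: string): void {
  insert.run(userId, payload);            // each call is its own small, serialized write
}
```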
I have but I'd say that SQLite is not intended for use in that scenario. It's an in-process library for persisting data to disk in a generally relational format with a SQL interface. It starts to degrade under highly concurrent read and write workloads that you can experience in client-server applications. At that point, a typical RDBMS with more robust concurrency support starts to be a better choice. I experienced that when using SQLite and we eventually moved the application to a full RDBMS which was more complex but also performed and scaled much better.
> SQLite is not directly comparable to client/server SQL database engines such as MySQL, Oracle, PostgreSQL, or SQL Server since SQLite is trying to solve a different problem.
>
> Client/server SQL database engines strive to implement a shared repository of enterprise data. They emphasize scalability, concurrency, centralization, and control. SQLite strives to provide local data storage for individual applications and devices. SQLite emphasizes economy, efficiency, reliability, independence, and simplicity.
>
> SQLite does not compete with client/server databases. SQLite competes with fopen().
It's worth noting that when considered as competition to fopen(), SQLite is quite good.
I wonder how performance would change if using SQLite instead. You would get those range trees and indexes for free (hence much simpler implementation). Plus persistent storage, some guarantees, etc.
> Situations Where A Client/Server RDBMS May Work Better
> • High-volume Websites
> • High Concurrency
Anecdotal Observation:
I like to use SQLite when running tests etc. because it's easy to make the data make sense, and easy to ship the resulting file, but recently I had to run a high number of concurrent writes and it slowed down my tests and made my laptop unusable. I rewired my scripts to use PostgreSQL and that moved the bottleneck to what I was actually trying to test; even though my disk LED was blinking like crazy, my laptop was still usable. I checked what was going on and saw that Postgres had spawned a whole bunch of processes...
On other workloads, especially low-concurrency ones, I have found SQLite actually performs really well, comparably to Postgres. You do have to tweak a few things depending on your workload, but there are a lot of very informative blogs and SO pages on this subject out there.
I did see a full copy of the SQLite amalgamation file in the FDB codebase, but you're probably right that they might be using internal APIs.
I'm still skeptical of the "tremendous performance penalty" you'd suffer from using SQLite. Just because you do fewer things doesn't necessarily mean you're faster at doing them. I've hit ~120,000 inserts/sec on SQLite without weakening any of its durability guarantees (see the sketch after this comment). If you play fast and loose with fsync and WAL, I'm sure you can squeeze out even more performance.
I can also think of use-cases where you don't want the write amplification that comes with RocksDB or the memory constraints of LMDB.
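For reference, figures in that range usually come from batching rows into a single transaction so that one commit (and one fsync) covers many rows; here's a rough sketch (better-sqlite3 again, purely for illustration, with the schema and numbers invented):

```typescript
// Batch many INSERTs into one transaction: one journal commit covers the
// whole batch, which is where tens-of-thousands-of-rows-per-second
// figures typically come from.
import Database from "better-sqlite3";

const db = new Database("bench.db");
db.pragma("journal_mode = WAL");
db.exec("CREATE TABLE IF NOT EXISTS samples (ts INTEGER, value REAL)");

const insert = db.prepare("INSERT INTO samples (ts, value) VALUES (?, ?)");

// better-sqlite3 wraps the callback in BEGIN/COMMIT for us.
const insertBatch = db.transaction((rows: Array<[number, number]>) => {
  for (const [ts, value] of rows) insert.run(ts, value);
});

insertBatch(Array.from({ length: 10_000 }, (_, i) => [Date.now(), i] as [number, number]));
```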
I specifically cited concurrent writes. SQLite is fantastic for read-only data. Those websites are using SQLite for what it's designed for. But, if they were trying to use SQLite to do even dozens of writes per second, it would crash and burn.