Long before we acquired Akiban, I prototyped a SQL layer using the (now defunct) SQLite4, which used a K/V store abstraction as its storage layer. I would guess that a virtual table implementation would be similar: easy to get working, but the performance is never going to be amazing.
To get great performance for SQL on FoundationDB you really want an asynchronous execution engine that can take full advantage of the ability to hide latency by submitting multiple queries to the K/V store in parallel. For example, if you are doing a nested-loop join of tables A and B, you will be reading rows of B randomly based on the foreign keys in A, but you want to be requesting hundreds or thousands of them simultaneously, not one by one.
Even our SQL Layer (derived from Akiban) only got this mostly right: its execution engine was not originally designed to be asynchronous, and we modified it to do pipelining, which gets a good amount of parallelism but still leaves something on the table, especially in small, fast queries.
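The batched-lookup idea can be sketched with a toy async K/V client. `FakeKV` and its simulated latency are stand-ins, not FoundationDB's actual API; the point is only that all lookups into B are issued concurrently:

```python
import asyncio

# Stand-in for an async K/V client; real clients (e.g. FoundationDB's)
# expose a similar futures-based get(). Latency is simulated with sleep().
class FakeKV:
    def __init__(self, data):
        self.data = data

    async def get(self, key):
        await asyncio.sleep(0.01)  # simulated network round trip
        return self.data.get(key)

async def nested_loop_join(kv, rows_a):
    # Issue every lookup into B at once instead of one by one, so total
    # latency is roughly one round trip rather than len(rows_a) trips.
    lookups = [kv.get(row["b_fk"]) for row in rows_a]
    rows_b = await asyncio.gather(*lookups)
    return [(a, b) for a, b in zip(rows_a, rows_b) if b is not None]

kv = FakeKV({1: "b1", 2: "b2"})
rows_a = [{"id": 10, "b_fk": 1}, {"id": 11, "b_fk": 2}, {"id": 12, "b_fk": 3}]
joined = asyncio.run(nested_loop_join(kv, rows_a))
```

A synchronous engine would pay the round-trip latency once per row of A; here the `gather` overlaps them all.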
SQLite's concurrency support is irrelevant. ActorDB is a sharded-by-default database. The point of ActorDB is to split your dataset into "actors". An actor is an individual unit of storage (a SQLite database). As such, your entire database is scalable across any number of nodes.
A natural fit is something like Dropbox, Evernote, Feedly, etc. Every user has their own actor, which is a full SQL database. That makes the kinds of queries Dropbox would need very easy to do.
It also has a KV store option. This is basically a large table spread across all your servers. This table can have foreign-key sub-tables. From my understanding of FoundationDB, this part is closer to what they were offering.
I don’t see how this is fundamentally any different from IndexedDB: both are asynchronous transactional key-value databases. One is byte-oriented and the other object-oriented, but it looks to me like they’re essentially equivalent in what’s possible with them—except insofar as File System Access may allow you to avoid loading the entire thing into memory. (I say may because you can’t control the browser’s implementation; my feeling is that it’s not unlikely that the browser will keep the full thing in memory with FSA where I wouldn’t expect it to with IndexedDB, which if so would return them to basically the same position.)
Implementing SQLite on top of this would require that a commit write the changes, close the file, and then reopen it, since writes are only performed when you close the file. That could perform tolerably or terribly (there’s no way of knowing), but it certainly won’t be as good as what you get natively, where you can truly write only part of a file—and once you get to large databases it will be drastically slower. If you want performance out of any of these sorts of things, you’re going to need to stop putting everything in one “file” and split it into many, so that you can avoid touching most of them on most writes.
Unlike a regular filesystem, mvsqlite offloads both storage and transaction processing to FoundationDB. Fine-grained optimistic locking is implemented, and massive write concurrency across a cluster is possible.
> This is something missing from the SQLite ecosystem (before mvSQLite)
This is also something missing from self-hosted, open-source SQL RDBMSes.
As far as I'm aware, this is the closest thing to Aurora and AlloyDB that I can run myself.
Also, by leveraging FoundationDB's ability to have multi-writer deployments, it gives us the potential to scale writes far beyond what Aurora and AlloyDB can handle. I'd love to see an extreme benchmark comparing a single 24xlarge Aurora writer vs a few dozen nodes in FoundationDB. The Aurora writer is likely more efficient per CPU but ultimately can't scale past 96 CPUs.
They indeed use an SQL database and SQL queries to implement a key-value store. The data is stored in a regular table (CREATE TABLE example (key BLOB PRIMARY KEY, value BLOB NOT NULL)). It's very unlikely that this really achieves high performance, given that the SQLite backend is itself a key-value store and the SQL interpreter runs on top of it as an intermediate layer. To avoid some of the performance drawbacks caused by this architecture they added caching.
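The architecture being criticized is easy to sketch: a plain SQLite table acting as a key-value store, with every `get`/`put` going through SQL parsing and the bytecode engine even though the B-tree underneath is already a K/V store. Table name and helper functions here are illustrative:

```python
import sqlite3

# A K/V store implemented over a plain SQLite table, as described above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE example (key BLOB PRIMARY KEY, value BLOB NOT NULL)")

def put(key: bytes, value: bytes) -> None:
    # Each call is parsed, planned, and run through SQLite's VDBE,
    # an extra layer on top of the underlying B-tree K/V engine.
    con.execute("INSERT OR REPLACE INTO example (key, value) VALUES (?, ?)",
                (key, value))

def get(key: bytes):
    row = con.execute("SELECT value FROM example WHERE key = ?",
                      (key,)).fetchone()
    return row[0] if row else None

put(b"user:1", b"alice")
```

Prepared-statement caching (which the library reportedly added) reduces the parse/plan cost but not the interpreter layer itself.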
They even considered RocksDB and LMDB, but rejected them for reasons I cannot relate to.
I implemented a transactional key-value store myself fifteen years ago for my project and information management tools (before the other mentioned stores became available), based on the aforementioned SQLite backend (see https://github.com/rochus-keller/Udb and https://github.com/rochus-keller/Sdb); it supports both robust integrity and high performance. Doing the same with the full SQLite just adds complexity and lowers performance with no added value.
You jest, but SQLite run in-memory is a pretty fast database. I'm currently using it for the ECS implementation in a little game I'm writing on the side (albeit in Common Lisp, and not yet very performance-heavy).
The reason I went this way is that when writing my third ECS with various access-pattern optimizations, I realized I was hand-implementing database indices, so I may as well plug in SQLite for now, nail down the interface, and refactor it into an array-of-structs implementation with handles later.
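The "database indices instead of hand-rolled access patterns" idea can be sketched in a few lines (this is an illustrative Python sketch, not the author's Common Lisp code): components become tables keyed by entity id, and a "system" is just an indexed join.

```python
import sqlite3

# Components as tables, entities as integer ids. The PRIMARY KEY
# indices stand in for hand-written access-pattern optimizations.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE position (entity INTEGER PRIMARY KEY, x REAL, y REAL);
    CREATE TABLE velocity (entity INTEGER PRIMARY KEY, dx REAL, dy REAL);
""")
db.execute("INSERT INTO position VALUES (1, 0.0, 0.0), (2, 5.0, 5.0)")
db.execute("INSERT INTO velocity VALUES (1, 1.0, 2.0)")

# A movement "system": iterate over entities that have BOTH components,
# which is just an indexed join -- no manual bookkeeping required.
movers = db.execute("""
    SELECT p.entity, p.x + v.dx, p.y + v.dy
    FROM position p JOIN velocity v ON v.entity = p.entity
""").fetchall()
```

Entity 2 has no velocity component, so the join naturally skips it.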
There has always been a case for server-side embedded databases for app servers. It can be an in-process library like SQLite or Apache Derby, but databases like SQL Anywhere and SQL Server also have local IPC protocols that achieve the same effect.
App-embedded databases are not the main use case for non-SQL transactional engines like Berkeley DB and Microsoft ESE (used for Active Directory and Exchange Server).
SQLite is a better server-side engine than the critics think, but it is ill-suited with the default options. WAL mode and shared-cache mode must be enabled for decent performance. The popular runtime libraries for SQLite expose neither the required C API calls nor the enhanced functionality of the optional `file:` URI connection strings.
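Most wrappers do at least let you issue PRAGMAs directly, so the settings mentioned above can usually be applied by hand. A minimal sketch (file path and timeout value are illustrative; WAL only applies to on-disk databases, not `:memory:`):

```python
import os
import sqlite3
import tempfile

# WAL is a property of the database file, set via PRAGMA on a connection.
path = os.path.join(tempfile.mkdtemp(), "app.db")
con = sqlite3.connect(path)

# Switch from the default rollback journal to the write-ahead log,
# which lets readers proceed concurrently with a writer.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Wait up to 5s on a locked database instead of failing immediately.
con.execute("PRAGMA busy_timeout=5000")
```

The `file:` URI form (e.g. `sqlite3.connect("file:app.db?mode=ro", uri=True)` in Python) is where options like read-only or shared-cache mode live when the wrapper supports it.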
Even when configured properly, concurrency in SQLite is limited for apps that have a hot row or page. Row-level locking or snapshot isolation would sometimes be a huge help, but fast local reads go a long way toward alleviating the need. The main obstacle, however, is that the popular web frameworks assume a wire-protocol SQL engine running on a separate tier, like Heroku’s 12-Factor App architecture.
I'm working on mvsqlite [1], a distributed SQLite based on FoundationDB. While doing the VFS integration I have always wanted to patch SQLite itself, but didn't because of uncertainty around the correctness of the patched version...
A few features on my wishlist:
1. Asynchronous I/O. mvsqlite is currently doing its own prefetch prediction, which is not very accurate. I assume higher layers in SQLite have more information that could help with better prediction.
2. Custom page allocator. SQLite internally uses a linked list (the freelist) to manage database pages; this causes contention between any two transactions that both allocate or free pages.
3. Random ROWID, without the `max(int64)` row trick. Sequentially increasing ROWIDs are a primary source of contention, and cause significant INSERT slowdown in my benchmark [2].
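Absent such a feature, one application-level workaround is to assign random 64-bit ROWIDs yourself, so concurrent inserters touch different B-tree pages instead of all appending to the rightmost leaf. A sketch (table and helper names are illustrative):

```python
import secrets
import sqlite3

con = sqlite3.connect(":memory:")
# An INTEGER PRIMARY KEY column is an alias for the ROWID in SQLite.
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, data TEXT)")

def insert(data: str) -> int:
    # Random positive 63-bit id: inserts scatter across the B-tree
    # instead of contending on the sequentially-last page.
    rid = secrets.randbits(63)
    con.execute("INSERT INTO t (id, data) VALUES (?, ?)", (rid, data))
    return rid

rid = insert("hello")
```

The trade-off is losing locality for ROWID-range scans, which is usually acceptable when insert concurrency is the bottleneck.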
This is interesting and I hope I can find a use case for it. However, the performance compared to vanilla SQLite makes me anxious that there is a trade-off elsewhere, such as crash integrity.
I am evaluating FoundationDB, TiKV, and BedrockDB. BedrockDB really looks nice: it is built on top of SQLite and supports the MySQL protocol, which is a huge plus. But I was wondering what the performance is like and what the gotchas are, for example:
0. What happens to the local commit if the node goes down before that commit is replicated to other nodes?
1. Does the DB support multiple write nodes?
2. What is the average time it takes for a commit to be available at other nodes?
3. Under the hood it works like a blockchain, so what happens if a chain of transactions splits from the main branch? Does the DB client have to retry all of those transactions from the split chain?
What is your experience?
Also, does anyone know how to add a unique constraint to a column of an existing table in SQLite?
I don't quite agree. There's a reason relational won, and continues to win more than half a century after it was invented, despite no lack of wannabe usurpers.
For your model to work, what would be needed is the reverse: a system which allows me to interact with TSV files with SQL.
For 98% of uses of SQLite, performance and storage efficiency don't matter; we're talking about dozens or hundreds of records. It doesn't need to be well-implemented from that perspective. It does need to be robust. For example, databases solve a lot of problems especially around consistency in threaded systems which this would need to handle as well (although a KISS solution, such as handling all access from one thread / process, and simply blocking other requests during transactions, would be fine).
Something like that would likely eat SQLite's lunch.
q seems close. It'd need to support writes, table creation, and all the other stuff; perhaps be more actively maintained; provide a documented API (not just command-line access); and work with things like ORMs.
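The core of the "SQL over TSV" idea is small enough to sketch: load the TSV into an in-memory SQLite table and query it (this is an illustration of the approach tools like q take, not q's actual implementation; column names come from the header row, and values load as text, hence the CAST):

```python
import csv
import io
import sqlite3

tsv = "name\tage\nalice\t30\nbob\t25\n"
rows = list(csv.reader(io.StringIO(tsv), delimiter="\t"))
header, data = rows[0], rows[1:]

con = sqlite3.connect(":memory:")
# Columns are created untyped from the header, so values stay text;
# numeric comparisons need an explicit CAST.
con.execute(f"CREATE TABLE people ({', '.join(header)})")
con.executemany("INSERT INTO people VALUES (?, ?)", data)

adults = con.execute(
    "SELECT name FROM people WHERE CAST(age AS INTEGER) >= 30"
).fetchall()
```

Writes would go the other way: run UPDATE/INSERT against the table, then serialize it back out as TSV, which is exactly the part existing tools tend to leave out.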
Nifty, but not incredibly novel. SQLite's VFSes have been around for some time, albeit with smaller breadth and scope. I think one thing this kind of glosses over is the notion of transactions: what happens if the loaded filesystem contents change between independent parts of your query? Are they memoized, recomputed, etc.?
Having said all that I'm going to install it and try it out because it's new and shiny.
It's an interesting stack, for sure. I'm only surprised by the SQLite choice. With such a concurrent and speedy backend, what do you do about concurrent writes and the lack of row/page-level locking in SQLite?