Hacker Read top | best | new | newcomments | leaders | about | bookmarklet login
New JSON query operators in SQLite 3.38.0 (tirkarthi.github.io) similar stories update story
2 points by xtreak29 | karma 617 | avg karma 7.09 2022-02-27 00:57:37 | hide | past | favorite | 36 comments



view as:

Well that’s exciting, congratulations to the SQLite team. The new type checking and json support makes SQLite a pretty amazing option for even more use cases.

Nice, I love using JSON query with SQLite, greatly simplifies object hydration.

I've been using it successfully for years on Android with sqlite-android [0] and the flexibility it has given me was quite a relief. It's great to see that it is now included by default.

[0] https://github.com/requery/sqlite-android


Brilliant article as always.

Thanks

I would love for sqlite to add import of JSON. They have great CSV import and export. If we could easily import, say, JSON arrays to a single column table, it would make JSON conversion to/from CSV easier. Especially since jq is slow and unmaintained.

Me too.

There are couple of threads related to json import but a default mode will be helpful as there is already .mode json for output and import support will be great.

https://sqlite.org/forum/forumpost/567caa57234b98e46b885bcf8...

https://sqlite.org/forum/forumpost/f4b44e09cea20851014420a8b...

sqlite-utils is also a helpful project for various utilities : https://sqlite.org/forum/forumpost/ff5b6198eb6e794d2f774ce06...

I guess you are trying to do below from your comment.

  $ echo '["Kate", "John", "Jim"]' > names.json
  $ ./sqlite3
  SQLite version 3.38.0 2022-02-22 18:58:40
  Enter ".help" for usage hints.
  Connected to a transient in-memory database.
  Use ".open FILENAME" to reopen on a persistent database.
  sqlite> create table users(id integer primary key, name text);
  sqlite> insert into users(name) select value from json_each(readfile('names.json'));
  sqlite> select * from users;
  1|Kate
  2|John
  3|Jim

Is this:

  select value from json_each(readfile('names.json'))
Built in to (standard) sqlite? It's a little unclear from your comment?

It seems readfile is from an extension but present in cli. json_each is present in standard sqlite core from 3.38.0 as json1 extension is also now part of core.

json_each : https://www.sqlite.org/json1.html#jeach

readfile : https://sqlite.org/cli.html#file_i_o_functions

> Note that the readfile(X) and writefile(X,Y) functions are extension functions and are not built into the core SQLite library. These routines are available as a loadable extension in the ext/misc/fileio.c source file in the SQLite source code repositories.


json1 being part of core is pretty big for me. I won’t have to build one anymore.


Sqlite utils is amazing. It just removes a lot of frictions from one off data tasks.

Shameless self promotion but this is already possible using my shell https://github.com/lmorg/murex

It works with YAML, TOML, JSON, jsonlines, CSV, and regular shell command output. You can import from any data format and convert to any other data format and even in line SQL relational look ups too.

Since SQL inlining literally just imports the data into an in memory sqlite3 database it means you can do your JSON import into sqlite3 using this shell. And in fact I did literally just this last month when using a cloud service restful API which returned two different JSON docs that needed to restructured into two different tables and then a relational query run between them.


The The first first and second second is the same t

How efficient is this? Is JSON stored as a blob and parsed and filtered on each query?

Yes.

If you look at the sample data section, you’ll see an example of the data they’re using.

It’s mainly to parse and work with json data that’s stored.


In case of PostgreSQL there is json and jsonb. For SQLite, hexdump of the database shows text representation and seems to be stored like json than jsonb. I am not aware of the full design and source code but it seems some functions parse and cache the JSON representation.

https://github.com/sqlite/sqlite/blob/a0318fd7b4fbedbce74f13...

https://www.postgresql.org/docs/current/datatype-json.html

> The json and jsonb data types accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.


> In case of PostgreSQL there is json and jsonb. For SQLite, hexdump of the database shows text representation and seems to be stored like json than jsonb. I am not aware of the full design and source code but it seems some functions parse and cache the JSON representation.

Wouldn’t a jsonb-type storage require a new storage mode for sqlite, which would be a major architectural change?

JSONB derives (at least logically and historically) from hstore, so postgres had a history of structured type stores. Not so for sqlite.


jsonb and hstore are both just an encoding format for the data within a single field and do not affect either the architecture nor the storage model of the database.

https://www.sqlite.org/json1.html#interface_overview mentions that a binary encoding of JSON was experimented with, but no significant speedup or space savings were observed. If the functions are parsing and caching JSON that might explain why the performance is so close.

> SQLite does not (currently) support a binary encoding of JSON. Experiments have been unable to find a binary encoding that is significantly smaller or faster than a plain text encoding. (The present implementation parses JSON text at over 1 GB/s.) All JSON functions currently throw an error if any of their arguments are BLOBs because BLOBs are reserved for a future enhancement in which BLOBs will store the binary encoding for JSON.


Still one of the most useful things I’ve ever come across. Glad to see SQLite development continuing and being amazing.

I did some research recently to figure out exactly how the path syntax used by the various SQLite JSON functions and operators works: https://til.simonwillison.net/sqlite/json-extract-path

Thanks, your sqlite-utils and other libraries have been very helpful.

These look like they match what postgres uses? Not sure if postgres invented it, or got it from somewhere else? Time for some specs/standards for JSON operators in SQL?

Yeah, the SQLite team cited PostgreSQL as the inspiration here.

I asked them if they'd considered the most recent syntax added by PostgreSQL - "where jsoncol["prop"] = 'x'" - but sadly that was incompatible with other existing SQLite syntax.



One comment there points out some differences with the operators in pg, which I don't entirely follow honestly.

https://sqlite.org/forum/forumpost/a0d681bb2432eebc


hi I was the one who made that comment. The JSON incompatibilities I was referring to at that time was for an older version of the JSON proposal. The current version of the JSON operator in SQLite mimics Postgres (and MySQL) perfectly and I'm very happy about that.

In the older version of the proposal, -> was identical to ->> except ->> returned NULL on malformed JSON (while -> raised an error). Both -> and ->> would automatically convert a JSON-encoded SQL string '"like this"' into an SQL string 'like this'. This is not how the -> operator behaves in Postgres and MySQL, and my examples were simply trying to point out that incompatibility.


Nice, thanks!

I also didn't realize MySQL supported these JSON query operators now, huh. Since MySQL 2.7.9 in 2015 apparently? Not that new!

https://dev.mysql.com/doc/refman/5.7/en/json-search-function...


I didn't even KNOW about that new syntax in postgres! I'll be using it now that you've pointed it out, thanks.

This is interesting; I didn't real it carefully enough to know for sure, but one of the big challenges in moving from SQL to semistructed data like JSON is being able to do queries into deep fields (combined with repeating fields).

This release of sqlite also has JSON support on by default, so JSON is much more likely in whatever sqlite build happens to be in the software you're using.

Legal | privacy