Postgres large JSON value query performance

about | archive

[ 2022-February-01 09:19 ]

Postgres supports three types for "schemaless" data: JSON (added in 9.2), JSONB (added in 9.4), and HSTORE (added in 8.2 as an extension). Unfortunately, the performance of queries of all three gets substantially slower (2-10×) for values larger than about 2 kiB, due to how Postgres stores long variable-length data (TOAST). The same performance cliff applies to any variable-length types, like TEXT and BYTEA. This article contains some quick-and-dirty benchmark results to explore how Postgres's performance changes for the "schemaless" data types when they become large. My conclusion is that you should expect a 2-10× slower queries once a row gets larger than Postgres's 2 kiB limit.

Most applications should use JSONB for schemaless data. It stores parsed JSON in a binary format, so queries are efficient. Accessing JSONB values is about 2× slower than accessing a BYTEA column. Queries on HSTORE values are slightly faster (~10-20%), so if performance is critical and string key/value pairs are sufficient, it is worth considering. Never use JSON because the performance is terrible. Compressed values makes queries take about 2× more time, and queries for values stored in external TOAST tables take about 5× more time. In cases where you need excellent query performance, you may want to consider trying to split large JSON values across multiple rows. If you are using Postgres 14 or later, you should use LZ4 compression. I didn't test it, but others have found it to use a bit more space but be signficantly faster (1, 2). This should reduce the performance penalty for compressed values.

This is a general database problem and not a Postgres problem: MySQL and others have their own performance cliffs for large values. However, Postgres's row length limit is pretty low. I suspect the 8 kiB page may be the wrong default these days. A rough rule of thumb is that smaller pages are better for workloads that read and write small values, but larger pages are likely better for queries that scan ranges. These benchmarks show that larger pages can be better for large rows, since it would increase the amount of storage before falling off this cliff. If we assume a 200 MiB/sec transfer rate and a 5 ms seek time (approximately a 2022 data center hard drive), going from 8 kiB pages to 64 kiB pages would mean random reads would go from about 5.04 ms to 5.4 ms, or an increase of about 7%, which means this would likely be a performance win for any application with large rows. For an SSD with a 75 µs random read latency and throughput of 3000 MiB/sec (approximate 2022 numbers), it would increase from about 77 µs to 95 µs, which is a 23% increase. This is large enough that it may hurt some workloads. Still: it seems like something that might be worth investigating for the Postgres maintainers.

How Postgres stores large values

Postgres stores rows in 8 kiB pages. It attempts to store a minimum of 4 tuples per page, which means a tuple has a maximum size of a bit less than 2 kiB. The exact number is 2032 bytes: (8196 byte page - 24 byte page header - 4×4 byte item offsets) / 4, rounded down to 8 byte alignment (see heaptoast.h). When a row is larger than this threshold, Postgres first tries to compress variable-length fields. If the tuple is still too large, it stores the variable-length fields in a separate table using a mechanism called TOAST (The Oversized-Attribute Storage Technique). This causes the performance of queries that need to read these values to get substantially worse. For more details, see the Postgres documentation on page layout, or a less formal article, particularly the section "How The Rows are Stored".

Query performance benchmark

To compare the performance of queries on the three "schemaless" types, I created a table with 1M rows, each containing a single column with 10 key/value pairs. The keys are 96 bytes, but the length of the values and the contents were tweaked to trigger the four different storage types: inline uncompressed, inline compressed, toast uncompressed, and toast compressed. The number of key/value pairs is the same, and the total amount of data is very close. For example, to go from inline uncompressed storage to inline compressed storage, I add a single byte to a compressible payload. The toast uncompressed payload is the same length, but stores a random payload that is not compressible. The toast compressed payloads required a fair amount of manual experimentation to try and find the smallest compressed payload that is larger than the 2032 byte threshold.

The tests were run on a GCP Postgres "Cloud SQL" instance (4 vCPUs; 15 GiB RAM; 100 GiB SSD Storage; PostgreSQL 13.4). I disabled parallel queries (SET max_parallel_workers_per_gather = 0), since I wanted to measure the amount of "work", rather than the actual wall clock query time. I measured the performance by running EXPLAIN ANALYZE (query) three times, and taking the smallest time, since I figure that will compare the "best possible" cases. This is not the most precise measurement, but the numbers were typically within about ~5%, so good enough for my purposes. The tables were always fully loaded in cache memory, so disk performance does not matter. The queries count the number of rows that contain a specific key.

Fastest query times (ms)

This table shows the fastest query times for each of the column types and sizes. In general, HSTORE is faster than JSONB which is much faster than JSON. A compressed column is generally about 2× slower than an uncompressed one, and the toast columns are about 5× than the inline ones. I also measured a query for the prefix of a BYTEA column with LIKE 'prefix%', which should be a "best case" for accessing the contents of a variable-length column. The times show that accessing a schemaless key is about 2× than reading the prefix of a BYTEA column. For details, see the benchmark source code in Github.

inline uncompressed inline compressed toast uncompressed toast compressed
HSTORE 553 1080 3437 7454
JSONB 746 1178 3393 7624
JSON 10797 11795 14011 22197
BYTEA starts with 380 606 3332 7054