We compared whether removing keys, originally stored in the json data column from the resources table, and creating columns for each of the keys would be more efficient than keeping them in the data column and applying jsonb indexing.
Our table of interest, resources, has 450327 rows.
We are looking at the following queries which will need to use different operators depending on what index we choose (using the GIN operator class requires specific operators in query)
Query A (with Gin index):
select data->'namespace' as namespace from search.resources where data @> '{"kind" : "Pod"}'
Query B (no index):
select data->'namespace' from search.resources where data ->> 'kind'= 'Pod'
We also created an index on two columns using the btree operator and use the following query:
Query C (BTREE Gin index):
select data->'namespace' as namespace from search.resources where data @> '{"kind" : "Pod"}' AND uid = 'someuniqueid';
which has alternative operator:
Query D (no index):
select data->'namespace' as namespace from search.resources where data ->> 'kind'= 'Pod' AND uid = 'someuniqueid';
index on | index type | OC | cost | query |
---|---|---|---|---|
no index | - | - | 298 ms | B |
no index | - | - | 285 ms | A |
data | GIN | jsonb | 276 ms | A |
data & kind key | GIN | jsonb | 248 ms | A |
key kind | GIN | jsonb | 288 ms | A |
no index | - | - | 98 | D |
key & uid | BTREE GIN | jsonb | 97 ms | C |
The following two queries were used for testing non-index key columns:
Query D:
SELECT namespace from search.resources where kind = 'Pod';
Query E:
SELECT namespace from search.resources where kind = 'Pod' AND uid = 'someuniqueid';
index on | index type | OC | cost | query |
---|---|---|---|---|
no index | - | - | 435 ms | D |
no index | - | - | 204 ms | E |
We can see from the results above, indexing jsonb data is more efficient than adding new columns to the table schema.