Skip to content

Latest commit

 

History

History
79 lines (42 loc) · 2.39 KB

benchmark.md

File metadata and controls

79 lines (42 loc) · 2.39 KB

Table of Contents

  1. Indexing

Indexing

We compared whether removing keys, originally stored in the json data column from the resources table, and creating columns for each of the keys would be more efficient than keeping them in the data column and applying jsonb indexing.

Our table of interest, resources, has 450327 rows.


Method 1: Keeping keys in data column and indexing keys using jsonb

We are looking at the following queries which will need to use different operators depending on what index we choose (using the GIN operator class requires specific operators in query)

Query A (with Gin index):

select data->'namespace' as namespace from search.resources where data @> '{"kind" : "Pod"}'

Query B (no index):

select data->'namespace' from search.resources where data ->> 'kind'= 'Pod'

We also created an index on two columns using the btree operator and use the following query:

Query C (BTREE Gin index):

select data->'namespace' as namespace from search.resources where data @> '{"kind" : "Pod"}' AND uid = 'someuniqueid';

which has alternative operator:

Query D (no index):

select data->'namespace' as namespace from search.resources where data ->> 'kind'= 'Pod' AND uid = 'someuniqueid';

index on index type OC cost query
no index - - 298 ms B
no index - - 285 ms A
data GIN jsonb 276 ms A
data & kind key GIN jsonb 248 ms A
key kind GIN jsonb 288 ms A
no index - - 98 D
key & uid BTREE GIN jsonb 97 ms C
* OC -type of GIN operator class

Method 2: Removing json keys and creating columns

The following two queries were used for testing non-index key columns:

Query D:

SELECT namespace from search.resources where kind = 'Pod';

Query E:

SELECT namespace from search.resources where kind = 'Pod' AND uid = 'someuniqueid';

index on index type OC cost query
no index - - 435 ms D
no index - - 204 ms E

Results

We can see from the results above, indexing jsonb data is more efficient than adding new columns to the table schema.