postgresql - How do I filter jsonb with multiple criteria? -


i have following table structure:

create table mytable (   id   serial primary key,   data jsonb ); 

and following data (partial brevity...notice randomness of years , sales/expense yrs don't align each other):

insert mytable (data) values ('{"employee": "jim romo",   "sales": [{"value": 10, "yr": "2012"}, {"value": 5, "yr": "2013"}, {"value": 40, "yr": "2014"}],  "expenses": [{"value": 2, "yr": "2007"}, {"value": 1, "yr": "2013"}, {"value": 3, "yr": "2014"}],   "product": "tv", "customer": "1", "updated": "20150501" }'), ('{"employee": "jim romo",   "sales": [{"value": 10, "yr": "2012"}, {"value": 5, "yr": "2013"}, {"value": 41, "yr": "2014"}],  "expenses": [{"value": 2, "yr": "2009"}, {"value": 3, "yr": "2013"}, {"value": 3, "yr": "2014"}],   "product": "tv", "customer": "2", "updated": "20150312" }'), ('{"employee": "jim romo",   "sales": [{"value": 20, "yr": "2012"}, {"value": 25, "yr": "2013"}, {"value": 33, "yr": "2014"}],  "expenses": [{"value": 8, "yr": "2012"}, {"value": 12, "yr": "2014"}, {"value": 5, "yr": "2009"}],   "product": "radio", "customer": "2", "updated": "20150311" }'), ('{"employee": "bill baker",   "sales": [{"value": 1, "yr": "2010"}, {"value": 2, "yr": "2009"}, {"value": 3, "yr": "2014"}],  "expenses": [{"value": 3, "yr": "2011"}, {"value": 1, "yr": "2012"}, {"value": 7, "yr": "2013"}],   "product": "tv", "customer": "1", "updated": "20150205" }'), ('{"employee": "bill baker",   "sales": [{"value": 10, "yr": "2010"}, {"value": 12, "yr": "2011"}, {"value": 3, "yr": "2014"}],  "expenses": [{"value": 4, "yr": "2011"}, {"value": 7, "yr": "2009"}, {"value": 4, "yr": "2013"}],   "product": "radio", "customer": "1", "updated": "20150204" }'), ('{"employee": "jim romo",  "sales": [{"value": 22, "yr": "2009"}, {"value": 17, "yr": "2013"}, {"value": 35, "yr": "2014"}],  "expenses": [{"value": 14, "yr": "2011"}, {"value": 13, "yr": "2014"}, {"value": 8, "yr": "2013"}],   "product": "tv", "customer": "3", "updated": "20150118" }') 

for each employee need evaluate most-recently updated row , find employees 2014 tv sales greater 30. there need further filter employees average tv expenses of less 5. average need take of tv expenses , not latest row.

my expected output 1 row:

employee    | customer | 2014 tv sales   |  2013 avg tv expenses ------------+----------+-----------------+---------------------- jim romo    |    1     |   40            |  4 

i can (kindof) 1 or other not both:

a. 2014 sales > 30 (but unable recent "tv" sales ;(

select * mytable (select (a->>'value')::float     (select jsonb_array_elements(data->'sales') a) b      @> json_object(array['yr', '2014'])::jsonb) > 30 

b. avg 2013 expense (this needs avg tv expenses)

select avg((a->>'value')::numeric)     (select jsonb_array_elements(data->'expenses') mytable) b   @> json_object(array['yr', '2013'])::jsonb 

edit: potentially large table commentary on performance , indexing needs appreciated new both postgresql , jsonb.

edit #2: i've tried both answers , neither seem efficient large table ;(

this (rather lengthy) answer problem. comments inside query should explain different pieces. basic ideas followed are: 1) keep simple each operation, try first produce correct result, optimize; 2) trasform as possible (but not much) json structure in more "relational like" structure, since relations have more powerful operators json data in postgres. of corse there space simplify query , produce more efficient version, @ least starting point.

with mytable1   -- transform table in more "relational-like" structure (just clarity)   (select id, data->>'employee' employee, data->>'product' product,        (data->>'updated')::integer updated, (data->>'customer')::integer customer,           data->'sales' sales, data->'expenses' expenses     mytable), avg_exp_for_2013_tv -- find average expenses tv in 2013 each employee    (select employee, avg(expenses.value) avg2013_expenses     mytable1 , jsonb_to_recordset(expenses) expenses(yr text, value float)     product = 'tv' , expenses.yr = '2013'     group employee), most_recent_updates_employees  -- find recent updates each employee     (select employee, max(updated) updated     mytable1 t1     group employee), most_recent_updated_rows   -- find rows recent updates    (select t1.*     mytable1 t1, most_recent_updates_employees m     t1.employee = m.employee , t1.updated = m.updated), employees_with_2014_tv_sales_gt_30    (select employee, customer, sales.value sales_value     most_recent_updated_rows m, jsonb_to_recordset(m.sales) sales(yr text, value float)     yr = '2014' , value > 30) select e1.employee, e1.customer, e1.sales_value "2014 tv sales", e2.avg2013_expenses "2013 avg tv expenses" employees_with_2014_tv_sales_gt_30 e1, avg_exp_for_2013_tv e2 e1.employee = e2.employee , avg2013_expenses < 5 

Comments

Popular posts from this blog

PHP DOM loadHTML() method unusual warning -

python - How to create jsonb index using GIN on SQLAlchemy? -

c# - TransactionScope not rolling back although no complete() is called -