Parse json arrays using HIVE -
i have many json arrays stored in table (jt) looks this:
[{"ts":1403781896,"id":14,"log":"show"},{"ts":1403781896,"id":14,"log":"start"}] [{"ts":1403781911,"id":14,"log":"press"},{"ts":1403781911,"id":14,"log":"press"}] each array record.
i parse table in order new table (logs) 3 fields: ts, id, log. tried use get_json_object method, seems method not compatible json arrays because null values.
this code have tested:
create table logs  select get_json_object(jt.value, '$.ts') ts,  get_json_object(jt.value, '$.id') id, get_json_object(jt.value, '$.log') log jt; i tried use other functions seem complicated. thank you! :)
update! solved issue performing regexp:
create table jt_reg select regexp_replace(regexp_replace(value,'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') valuereg  jt;   create table logs  select get_json_object(jt_reg.valuereg, '$.ts') ts,  get_json_object(jt_reg.valuereg, '$.id') id, get_json_object(jt_reg.valuereg, '$.log') log ams_json_reg; 
i ran problem, json array stored string in hive table.
the solution bit hacky , ugly, works , doesn't require serdes or external udfs
select         get_json_object(single_json_table.single_json, '$.ts') ts,        get_json_object(single_json_table.single_json, '$.id') id,        get_json_object(single_json_table.single_json, '$.log') log ( select explode (   split(regexp_replace(substr(json_array_col, 2, length(json_array_col)-2),             '"}","', '"}",,,,"'), ',,,,')       ) src_table) single_json_table; i broke lines little easier read. i'm using substr() strip first , last characters, removing [ , ] . i'm using regex_replace match separator between records in json array , adding or changing separator unique can used split() turn string hive array of json objects can used explode() described in previous solution.
note, separator regex used here ( "}"," ) wouldn't work original data set...the regex have ( "},\{" ) , replacement need "},,,,{" eg..
  split(regexp_replace(substr(json_array_col, 2, length(json_array_col)-2),             '"},\\{"', '"},,,,{"'), ',,,,') 
Comments
Post a Comment