Hive: explode a list from a JSON string


I have a table of JSON strings:

CREATE TABLE TABLE_JSON (
  json_body string
 );

Each JSON document has this structure:

{ obj1: { fields ... },  obj2: [array] }

I want to select all elements from the array, but I can't.

For example, I can get all fields from the first object:

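SELECT f.fields...
    FROM (
        -- Split the top-level JSON into its two keys; both come back as strings.
        SELECT q1.obj1, q1.obj2
        FROM TABLE_JSON jt
        LATERAL VIEW JSON_TUPLE(jt.json_body, 'obj1', 'obj2') q1 AS obj1, obj2
      ) AS json_table2
    -- Reference the subquery alias here; TABLE_JSON is not in scope at this level.
    LATERAL VIEW JSON_TUPLE(json_table2.obj1, 'fields...') f AS fields...;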

But this method doesn't work with the array.

I've tried to use

...
    LATERAL VIEW explode(json_table2.obj2) adTable AS arr;

(See the Hive documentation for explode.)

But obj2 is a string that contains a JSON array, not a Hive array. How can I transform the JSON string into an array and explode it?

The json_split UDF from Brickhouse ( http://github.com/klout/brickhouse ) can convert a JSON array to a Hive List, and then you can explode that.

See http://mail-archives.apache.org/mod_mbox/hive-user/201406.mbox/%3CCAO78EnLgSrrUY3Ad_ZWS9zWNKLQRwS9jXrqEE869FhUNiWgCXA@mail.gmail.com%3E and https://brickhouseconfessions.wordpress.com/2014/02/07/hive-and-json-made-simple/
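A minimal sketch of that approach follows. It assumes the Brickhouse jar has already been built, that it sits at a hypothetical path, and that json_split is registered from the class brickhouse.udf.json.JsonSplitUDF (check the Brickhouse repository for the class name in your version).

```sql
-- Hypothetical jar path; adjust to wherever your Brickhouse build lives.
ADD JAR /path/to/brickhouse.jar;
-- Assumed class name for the json_split UDF.
CREATE TEMPORARY FUNCTION json_split AS 'brickhouse.udf.json.JsonSplitUDF';

SELECT adTable.arr
FROM TABLE_JSON jt
-- obj2 comes back as a string holding a JSON array.
LATERAL VIEW JSON_TUPLE(jt.json_body, 'obj1', 'obj2') q1 AS obj1, obj2
-- json_split turns that string into a Hive array, which explode can unfold.
LATERAL VIEW explode(json_split(q1.obj2)) adTable AS arr;
```

Each arr value is still a string. If the array elements are themselves JSON objects, another JSON_TUPLE or get_json_object call on arr can pull out their fields.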

Solved: Trying to use Hive EXPLODE function to "unfold" an array within a field into multiple rows. We are using NiFi to bring data in from JSON messages into Hortonworks HDFS. Research led me to believe that my "array" is, in fact, being stored as a string.

Hive get_json_object syntax: get_json_object(jsonString, '$.key'), where jsonString is a valid JSON string and $.key is the key of the value that you are trying to extract.
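As a small illustration of get_json_object, the sketch below extracts a name from a literal JSON string and then applies the same function to the question's table. The literal values are made up for the example.

```sql
-- Extract a single scalar value by key.
SELECT get_json_object('{"name":"Alice","age":30}', '$.name') AS name;

-- On the question's table: '$.obj2' yields the whole array as one string,
-- while '$.obj2[0]' yields only its first element.
SELECT get_json_object(json_body, '$.obj2')    AS obj2_as_string,
       get_json_object(json_body, '$.obj2[0]') AS first_item
FROM TABLE_JSON;
```

The first form shows why explode fails in the question: the array arrives as a string, not as a Hive array.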

You can consider using Hive-JSON SerDe to read the data from JSON.

Refer: https://github.com/rcongiu/Hive-JSON-Serde

Parse JSON in Hive Using Hive JSON SerDe: Since then I have also learnt about and used the Hive-JSON-Serde. ... them using the lateral view along with the explode() UDF provided by Hive. Each JSON record contains the customerId, age and a list of trips taken by the customer.
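A rough sketch of the SerDe route, assuming the Hive-JSON-Serde jar is available at a hypothetical path and that the data is stored as one JSON document per line. The table name and the field inside obj1 are placeholders, since the question does not list the real fields.

```sql
-- Hypothetical jar path; adjust to your build of Hive-JSON-Serde.
ADD JAR /path/to/json-serde-with-dependencies.jar;

CREATE TABLE table_json_serde (
  obj1 struct<field1:string>,  -- placeholder field; the real fields are not given
  obj2 array<string>           -- the SerDe exposes the JSON array as a Hive array
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE;

-- obj2 is now a real array, so explode works on it directly.
SELECT t.obj1.field1, adTable.arr
FROM table_json_serde t
LATERAL VIEW explode(t.obj2) adTable AS arr;
```

With this design the parsing happens at read time, so queries do not need JSON_TUPLE or get_json_object at all.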

This may not be an optimal solution, but it can help unblock you. For a JSON object that looks like this:

'{"obj1":"field1","obj2":["a1","a2","a3"]}'

this query extracts each array item into its own column, provided that the size of the array is constant across all rows.

    SELECT split(results,",")[0] AS arrayItem1,
       split(results,",")[1] AS arrayItem2,
       regexp_replace(split(results,",")[2], "[\\]|}]", "") AS arrayItem3
    FROM
       (SELECT split(translate(get_json_object(TABLE_JSON.json_body,'$.obj2'), '"\\[|]|\""',''), "},") AS r
       FROM TABLE_JSON) t1 LATERAL VIEW explode(r) rr AS results

It produces a result that looks like this:

arrayitem1| arrayitem2| arrayitem3
a1        | a2        | a3

You can scale it to arrays of any size, on the condition that the size is constant across the table.
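When the array size varies between rows, a different technique using only built-in functions may help: strip the brackets and quotes, split on commas, and explode into rows instead of columns. This is a sketch, not part of the answer above, and it assumes the array holds simple scalar values that contain no commas or double quotes.

```sql
SELECT adTable.arr
FROM TABLE_JSON jt
LATERAL VIEW explode(
  split(
    -- Remove the leading '[', the trailing ']' and every double quote,
    -- turning ["a1","a2","a3"] into a1,a2,a3.
    regexp_replace(get_json_object(jt.json_body, '$.obj2'), '^\\[|\\]$|"', ''),
    ','
  )
) adTable AS arr;
```

Arrays of nested objects would break the comma split, so for those a JSON-aware UDF or SerDe is the safer choice.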

Hive and JSON made simple: Often we'll have a string containing a JSON array, or a JSON map, and we simply want to interpret them as a Hive list or map. That's what `json_split` and `json_map` do, for example: lateral view explode(json_split('[ "val1", "val2", "val3" ]')).

Hive can parse array data using LATERAL VIEW. Use LATERAL VIEW with a UDTF to generate zero or more output rows for each input row. explode is one such user-defined table-generating function (UDTF). LATERAL VIEW first applies the UDTF (e.g. explode()) to the input rows and then joins the resulting output rows back to the input rows.

pythian/hive-json-split: a simple UDF to split JSON arrays into Hive arrays. To build it, check out the code and run `mvn package`, which produces an uberjar with everything you need. The split UDF accepts a single JSON string containing only an array.

LanguageManual UDF - Apache Hive: get_json_object extracts a JSON object from a JSON string based on the JSON path specified, and returns the JSON string of the extracted JSON object. The same page shows explode on a literal array: ... from (select 0) t lateral view explode(array('A','B','C')) tf as col;

Hive already has some built-in mechanisms to deal with JSON, but honestly, I think they are somewhat awkward. The `get_json_object` UDF allows you to pull out specific fields from a JSON string, but requires you to specify them with a JSONPath-style expression, which can become hairy, and the output is always a string.

Querying JSON records via Hive: with LATERAL VIEW, a source row can disappear from the results when the UDTF does not generate any rows for it, which happens easily with explode when the column to explode is empty. LATERAL VIEW OUTER prevents that: the source row is kept, with NULL values in the columns coming from the UDTF.
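A short sketch of the OUTER variant on the question's table. The split and regexp_replace step is an assumption used here only to obtain an array from the JSON string with built-in functions; it presumes simple scalar elements without commas or quotes.

```sql
SELECT jt.json_body, adTable.arr
FROM TABLE_JSON jt
-- OUTER keeps rows whose obj2 is missing: get_json_object returns NULL,
-- explode then produces no rows, and arr is reported as NULL for that row.
LATERAL VIEW OUTER explode(
  split(
    regexp_replace(get_json_object(jt.json_body, '$.obj2'), '^\\[|\\]$|"', ''),
    ','
  )
) adTable AS arr;
```

Without OUTER, the same query would silently drop every row that has no obj2 key.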

Comments
  • Is the size of the array fixed across rows?