I want to create a Hive table from some nested JSON data and run queries on it. Is this even possible?

I've gotten as far as uploading the JSON file to S3 and launching an EMR instance, but I don't know what to type in the Hive console to turn the JSON file into a Hive table.

Does anyone have an example command to get me started? I can't find anything useful with Google.

1 Answer

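Yes, you can use a JSON SerDe. Create a table whose columns mirror the structure of the JSON: each top-level key becomes a column of the same name, arrays become ARRAY columns, and nested objects become STRUCT (or MAP) columns.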

Then upload your JSON file to the table's LOCATION path, make sure Hive has read permission on it, and you are good to go.

For example:

Let data.json be:

{"X": 134, "Y": 55, "labels": ["L1", "L2"]}
{"X": 11, "Y": 166, "labels": ["L1", "L3", "L4"]}

Now, create a table:

CREATE TABLE Point
(
    X INT,
    Y INT,
    labels ARRAY<STRING>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION 'path/to/table';
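
Once the table exists, you can query it like any other Hive table. The queries below are a minimal sketch against the Point table above: array elements are read by index, and LATERAL VIEW explode() turns each label into its own row. The aliases first_label, label_count, l and label are just illustrative names. Depending on your installation, you may first need to register the hive-hcatalog-core JAR with ADD JAR before the SerDe class can be found.

```sql
-- Read scalar columns and the first element of the array
SELECT X, Y, labels[0] AS first_label, size(labels) AS label_count
FROM Point
WHERE X > 100;

-- Produce one output row per label
SELECT p.X, p.Y, l.label
FROM Point p
LATERAL VIEW explode(p.labels) l AS label;
```

Since the question mentions nested JSON, here is how a nested object could be declared. This is a hypothetical variation, not part of the example data above: it assumes each line looks like {"id": 1, "coords": {"X": 134, "Y": 55}, "labels": ["L1"]}, and the table name PointNested and the columns id and coords are made up for illustration.

```sql
-- Hypothetical schema: the nested "coords" object is mapped to a STRUCT
CREATE TABLE PointNested
(
    id INT,
    coords STRUCT<X:INT, Y:INT>,
    labels ARRAY<STRING>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION 'path/to/table';

-- Fields of a STRUCT are accessed with dot notation
SELECT id, coords.X, coords.Y
FROM PointNested;
```

Note that the SerDe expects one complete JSON object per line of the file, so pretty-printed (multi-line) JSON needs to be flattened first.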
