in Big Data Hadoop & Spark by (11.9k points)

I'm new to Hadoop. I know that HCatalog is a table and storage management layer for Hadoop, but how exactly does it work, and how do I use it? Please give a simple example.

1 Answer

by (32.1k points)
HCatalog supports reading and writing files in any format for which a Hive SerDe (serializer-deserializer) can be written. By default, HCatalog supports RCFile, CSV, JSON, and SequenceFile formats. To use a custom format, you must provide the InputFormat, OutputFormat, and SerDe.
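
To make the custom-format requirement concrete, here is a minimal sketch in Python that only builds the Hive DDL string naming all three classes. The table name, columns, and the `com.example.*` class names are hypothetical placeholders, not real classes; the `ROW FORMAT SERDE ... STORED AS INPUTFORMAT ... OUTPUTFORMAT ...` clause is standard Hive DDL.

```python
def create_table_ddl(table, columns, serde, input_format, output_format):
    """Build a Hive CREATE TABLE statement that names the SerDe,
    InputFormat, and OutputFormat classes explicitly."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"ROW FORMAT SERDE '{serde}' "
        f"STORED AS INPUTFORMAT '{input_format}' "
        f"OUTPUTFORMAT '{output_format}'"
    )


# All names below are hypothetical and used only for illustration.
ddl = create_table_ddl(
    "web_logs",
    [("ip", "STRING"), ("hits", "INT")],
    serde="com.example.MySerDe",
    input_format="com.example.MyInputFormat",
    output_format="com.example.MyOutputFormat",
)
print(ddl)
```

The resulting statement would then be submitted through Hive or HCatalog; the sketch itself does not talk to a cluster.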

HCatalog is built on top of the Hive metastore and incorporates components from the Hive DDL. HCatalog provides read and write interfaces for Pig and MapReduce and uses Hive’s command line interface for issuing data definition and metadata exploration commands.
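
As a simple example of the command line route, the sketch below assembles an `hcat -e` invocation (the `-e` option passes a DDL statement inline). It assumes the `hcat` executable is on the PATH of a machine with HCatalog installed; the `page_views` table and its columns are made up for illustration. The script only builds and prints the command, it does not run it.

```python
import shlex


def hcat_command(ddl):
    """Return the argv list for running one DDL statement through hcat."""
    return ["hcat", "-e", ddl]


# Hypothetical table: one partition key (ds) and RCFile storage.
ddl = (
    "CREATE TABLE page_views (user_id STRING, url STRING) "
    "PARTITIONED BY (ds STRING) STORED AS RCFILE"
)
cmd = hcat_command(ddl)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))

# On a machine with HCatalog installed you could run it with:
#   import subprocess; subprocess.run(cmd, check=True)
```

Once the table exists in the metastore, Pig and MapReduce jobs can refer to it by name instead of by file path and format.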

It also presents a REST interface (WebHCat, formerly called Templeton) that gives external tools access to Hive DDL (Data Definition Language) operations, such as “create table” and “describe table”.
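
A "describe table" call over REST could look roughly like the sketch below. It assumes a WebHCat server on its default port 50111 and uses the `templeton/v1/ddl` resource path; the host name, database, table, and user are placeholders. The code only constructs the request object and never sends it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def describe_table_request(host, db, table, user, port=50111):
    """Build (but do not send) a GET request that describes a table."""
    path = f"/templeton/v1/ddl/database/{quote(db)}/table/{quote(table)}"
    query = urlencode({"user.name": user})
    return Request(f"http://{host}:{port}{path}?{query}", method="GET")


# Placeholder host, table, and user for illustration only.
req = describe_table_request("hcat.example.com", "default", "page_views", "alice")
print(req.get_method(), req.full_url)

# Against a live server you could send it with:
#   from urllib.request import urlopen; print(urlopen(req).read())
```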

HCatalog presents a relational view of data. Data is stored in tables, and these tables can be placed into databases. Tables can also be partitioned on one or more keys. For a given value of a key (or set of keys), there is exactly one partition, and it contains all rows with that value (or set of values).
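
The partitioning rule can be pictured with a small in-memory model. This is not HCatalog code, just a Python sketch of the idea: rows that share the same values for the partition keys end up in the same partition. The sample rows and the `ds` and `region` keys are invented; the `key=value` naming follows the usual Hive partition directory convention.

```python
from collections import defaultdict


def partition_rows(rows, partition_keys):
    """Group rows so each distinct combination of key values gets one partition."""
    partitions = defaultdict(list)
    for row in rows:
        spec = tuple((key, row[key]) for key in partition_keys)
        partitions[spec].append(row)
    return dict(partitions)


def partition_name(spec):
    """Render a partition spec in Hive's key=value/key=value style."""
    return "/".join(f"{key}={value}" for key, value in spec)


# Invented sample data for illustration.
rows = [
    {"ds": "2024-01-01", "region": "us", "url": "/a"},
    {"ds": "2024-01-01", "region": "us", "url": "/b"},
    {"ds": "2024-01-01", "region": "eu", "url": "/c"},
    {"ds": "2024-01-02", "region": "us", "url": "/d"},
]

partitions = partition_rows(rows, ["ds", "region"])
for spec in sorted(partitions):
    print(partition_name(spec), len(partitions[spec]))
```

Here the two rows with `ds=2024-01-01` and `region=us` share a partition, while every other combination of values gets its own.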
