
Introduction To Hadoop Distributed File System

HDFS and its Architecture


Hadoop stores petabytes of data using HDFS. With HDFS, commodity hardware or personal computers, known as nodes in Hadoop parlance, are connected into a cluster on which data files are stored in a distributed manner. Through HDFS, the whole cluster and its nodes can be easily accessed for data storage and processing. Data access follows a streaming pattern, which suits batch-processing frameworks such as MapReduce.
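The distributed storage described above can be sketched as a toy simulation. Note that the block size, node names, and round-robin placement below are illustrative assumptions, not HDFS's actual placement policy (real HDFS defaults to 128 MB blocks and uses rack-aware replica placement):

```python
# Toy sketch: how HDFS splits a file into fixed-size blocks and
# spreads those blocks across the nodes of a cluster.
# The tiny block size and node names are illustrative assumptions.

BLOCK_SIZE = 4  # bytes, tiny for demonstration (HDFS default: 128 MB)
NODES = ["node1", "node2", "node3"]

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks, as HDFS does to files."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes=NODES):
    """Assign each block to a node round-robin (a simplification of
    HDFS's real rack-aware placement)."""
    return {i: nodes[i % len(nodes)] for i in range(len(blocks))}

blocks = split_into_blocks(b"hello hdfs world")
placement = place_blocks(blocks)
print(blocks)      # the file cut into 4-byte blocks
print(placement)   # block index -> node holding it
```

Because every block carries its own placement, a client can read different parts of one file from different nodes in parallel, which is where the throughput of the cluster comes from.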

Key features of HDFS:

  • HDFS is highly resilient: if a node fails, its workload is immediately taken over by another node
  • It delivers very high throughput even for gigantic data sets
  • Unlike many other distributed file systems, it is based on a write-once-read-many model
  • This model provides high data coherence, removes concurrency-control issues, and speeds up data access
  • HDFS moves computation to where the data resides rather than the other way around; keeping applications close to the data is cheaper, faster, and improves overall throughput
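The resilience point can be illustrated with a minimal replication sketch. Three-way replication matches the HDFS default replication factor, but the node names and failover logic here are simplified assumptions (the real NameNode detects failures via missed heartbeats and re-replicates lost blocks):

```python
# Minimal sketch of replica-based failover: each block lives on several
# nodes, so when one node dies a read is simply served from a replica.
# Replication factor 3 matches the HDFS default; the rest is a toy model.

replicas = {
    "block-0": ["node1", "node2", "node3"],
    "block-1": ["node2", "node3", "node4"],
}
dead_nodes = set()

def fail_node(node):
    """Mark a node as failed, as the NameNode would after missed heartbeats."""
    dead_nodes.add(node)

def read_block(block_id):
    """Return the first live node holding the block; raise if none survive."""
    for node in replicas[block_id]:
        if node not in dead_nodes:
            return node
    raise IOError(f"all replicas of {block_id} lost")

print(read_block("block-0"))  # served by node1
fail_node("node1")
print(read_block("block-0"))  # transparently served by node2
```

The client never needs to know that node1 died; the read succeeds against a surviving replica, which is the essence of the "workload is immediately taken over" behaviour.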


The reasons why HDFS works so well with Big Data:

  • Its streaming access pattern makes batch processing with frameworks such as MapReduce very fast
  • It follows a data coherency model that is simple yet highly robust and scalable
  • It runs on commodity hardware and is portable across operating systems
  • It achieves economy by distributing data and processing across clusters of parallel nodes
  • Data is kept safe by automatically replicating it to multiple locations
  • It provides a Java API, with a C wrapper (libhdfs) on top
  • Files are also accessible from a web browser, making the system highly practical
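The browser/HTTP access mentioned above goes through the WebHDFS REST API. A small helper to build such request URLs might look like the sketch below; the hostname and file paths are placeholders, and the default NameNode web port shown is 9870 (Hadoop 3; Hadoop 2 used 50070):

```python
# Sketch of building WebHDFS REST URLs, the HTTP interface that makes
# HDFS reachable from a web browser or any HTTP client.
# Hostname and paths are placeholder assumptions; 9870 is the Hadoop 3
# default NameNode web port.

def webhdfs_url(host, path, op, port=9870):
    """Build a WebHDFS URL of the form
    http://<host>:<port>/webhdfs/v1/<path>?op=<OP>."""
    return f"http://{host}:{port}/webhdfs/v1{path}?op={op}"

# Read a file (the NameNode redirects to a DataNode holding the data):
print(webhdfs_url("namenode.example.com", "/user/demo/data.csv", "OPEN"))
# List a directory:
print(webhdfs_url("namenode.example.com", "/user/demo", "LISTSTATUS"))
```

`OPEN` and `LISTSTATUS` are standard WebHDFS operations; pasting such a URL into a browser returns the file contents or a JSON directory listing, which is what makes HDFS usable without any Hadoop client installed.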


About the Author

Technical Research Analyst - Big Data Engineering

Abhijit is a Technical Research Analyst specialising in Big Data and Azure Data Engineering. He has over four years of experience in the Big Data domain and provides consultancy services to several Fortune 500 companies. His expertise includes breaking down highly technical concepts into easy-to-understand content.