+6 votes
3 views
in Big Data Hadoop & Spark by (1.5k points)

How can I find the size of a directory, using Hadoop?

3 Answers

+14 votes
by (13.2k points)

You can use the hadoop fs -ls command.

This command lists the files in the given directory along with their details. In its output, the 5th column shows the size of each file in bytes.

For example, the command

hadoop fs -ls input

gives the following output:

Found 1 items

-rw-r--r--   1 hduser supergroup  36789 2012-07-19 20:57 /user/hduser/input/shivangi

The size of the file shivangi is 36789 bytes.
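If you want a single total for a whole directory from the -ls output, you can sum the 5th column yourself. This is only a minimal sketch, assuming a Unix shell with awk available; /path/to/dir is a placeholder path:

# Recursively list the directory and add up the size column (5th field).
# Directory entries are reported with size 0, so they do not affect the total.
hadoop fs -ls -R /path/to/dir | awk '{sum += $5} END {print sum " bytes"}'

In practice, hadoop fs -du -s (shown in the answers below) gives you the same total more directly.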

0 votes
by (33.1k points)

hadoop fs -du -s -h /path/to/dir

The above command displays the total size of the directory in a human-readable form (KB, MB, GB, and so on).
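If you want to compare the sizes of the subdirectories under a path, -du -s also accepts a glob. A minimal sketch, again with /path/to/dir as a placeholder; the glob is quoted so it is expanded against HDFS rather than by the local shell:

# Print one human-readable summary line per child of /path/to/dir
hadoop fs -du -s -h '/path/to/dir/*'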


0 votes
by (1.9k points)

To find the size of a directory in HDFS with Hadoop, try these commands:

Basic Disk Usage with -du:

hadoop fs -du /path/to/directory

This lists the size of each file and folder inside the specified directory.

Get a Total Size Summary with -du -s:

hadoop fs -du -s /path/to/directory

Adding -s gives you the combined size of everything in the directory, instead of individual files.

Check Available Space with -df -h:

hadoop fs -df -h /path/to/directory

This displays the used and available storage in a readable format.

These commands are handy for checking how much HDFS storage you’re using.
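Another option worth knowing, as a sketch along the same lines (again with /path/to/directory as a placeholder): hadoop fs -count reports the number of directories, the number of files, and the total content size for a path in a single line.

# Columns: DIR_COUNT  FILE_COUNT  CONTENT_SIZE(bytes)  PATHNAME
hadoop fs -count /path/to/directory

The CONTENT_SIZE column should match the byte total that hadoop fs -du -s reports for the same path.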
