0 votes
1 view
in Big Data Hadoop & Spark by (11.9k points)

I'm trying to understand if there is something wrong with my Hadoop cluster. When I go to web UI in cluster summary it says:

Cluster Summary

XXXXXXX files and directories, XXXXXX blocks = 7534776 total.

Heap Size is 1.95 GB / 1.95 GB (100%) 

And I'm concerned about why is this Heap size metric at 100%

Could someone please provide some explanation how namenode heap size impact cluster performance. And whether this needs to be fixed.

1 Answer

0 votes
by (31.6k points)

The namenode Web UI shows the values as this:

<h2>Cluster Summary (Heap Size is <%= StringUtils.byteDesc(Runtime.getRuntime().totalMemory()) %>/<%= StringUtils.byteDesc(Runtime.getRuntime().maxMemory()) %>)</h2>

The Runtime documents these as:

  • totalMemory() Returns the total amount of memory in the Java virtual machine.
  • maxMemory() Returns the maximum amount of memory that the Java virtual machine will attempt to use

Max is going to be the -Xmx parameter from the service start command. The total memory main factor is the number of blocks in your HDFS cluster. The namenode requires ~150 bytes for each block, +16 bytes for each replica, and it must be kept in live memory. So a default replication factor of 3 gives you 182 bytes, and you have 7534776 blocks gives about 1.3GB. Plus all other non-file related memory in use in the namenode, 1.95GB sounds about right. I would say that your HDFS cluster size requires a bigger namenode, more RAM. If possible, increase namenode startup -Xmx. If maxed out, you'll need a bigger VM/physical box.

Welcome to Intellipaat Community. Get your technical queries answered by top developers !