in Big Data Hadoop & Spark by (11.9k points)

Can anyone give a detailed analysis of the NameNode's memory consumption? Or is there any reference material? I cannot find anything online.

Thank you!

1 Answer

by (32.1k points)

The NameNode (NN) has several technical limits, and hitting any of them will limit your scalability.

  1. Memory. As a rule of thumb, the NN consumes about 150 bytes of heap per block (files and directories are held in memory as objects of a comparable size). From this you can calculate how much RAM you need for your data.
  2. IO. The NN performs one IO for each change to the filesystem (creating a file, deleting a block, and so on), so its local disk must be able to sustain that rate. It is harder to estimate how much you need. Because memory already limits the number of blocks, you will not hit this limit unless your cluster is very big. If it is, consider SSDs.
  3. CPU. The NN carries a considerable load keeping track of the health of all blocks on all DataNodes. Each DataNode periodically reports the state of all its blocks. Again, unless the cluster is very big, this should not be a problem.
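
To make the memory estimate in point 1 concrete, here is a minimal sketch of the arithmetic. It takes the 150 bytes per block figure from above; the function name, the 128 MiB default block size, and the choice to count only logical (pre-replication) data are assumptions made for illustration. It ignores file and directory objects and JVM overhead, so treat the result as a rough lower bound rather than a sizing recommendation.

```python
BYTES_PER_BLOCK = 150            # rule-of-thumb figure from the answer above
DEFAULT_BLOCK_SIZE = 128 * 1024**2  # assumed block size (128 MiB); check your dfs.blocksize


def estimate_namenode_heap_bytes(data_bytes,
                                 block_size_bytes=DEFAULT_BLOCK_SIZE,
                                 bytes_per_block=BYTES_PER_BLOCK):
    """Rough NameNode heap needed to track the blocks of `data_bytes` of data.

    Hypothetical helper: assumes files fill whole blocks and that
    `data_bytes` is the logical size before replication.
    """
    if data_bytes < 0 or block_size_bytes <= 0:
        raise ValueError("data_bytes must be >= 0 and block_size_bytes > 0")
    # Integer ceiling division: a partially filled block still counts as a block.
    blocks = -(-data_bytes // block_size_bytes)
    return blocks * bytes_per_block


if __name__ == "__main__":
    one_pib = 1024**5
    heap = estimate_namenode_heap_bytes(one_pib)
    print(f"1 PiB of data needs roughly {heap / 1024**2:.0f} MiB of NameNode heap")
```

Under these assumptions, 1 PiB of data in 128 MiB blocks is 8,388,608 blocks, or about 1200 MiB of heap. Many small files raise the number quickly, because every file occupies at least one block regardless of its size.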
