In Hadoop, we use commodity hardware for large clusters. The downsides to using commodity hardware are frequent failures and data node crashes. This is why data is replicated to increase the fault tolerance of the system.
Hadoop provides us with the advantage of scaling up the data nodes for high volume storage. This is why nodes are constantly added and removed in Hadoop clusters. It entirely depends on the volume and the health of nodes.