0 votes
1 view
in Big Data Hadoop & Spark by (3k points)
What does 'throughput' mean in Hadoop?

1 Answer

0 votes
by (6.6k points)

In general terms, 'throughput' is defined as the net amount of work done per unit time. And Hadoop is a higher throughput when it concerns large datasets due to the following reasons:

  • HDFS in Hadoop uses a Write Once Read Many model. The principle implies that once data is written, it can't be modified. This sort of implement simplifies any sort of data coherency issue and in turn, provides high throughput.
  • Secondly, Hadoop follows a Data Locality principle. Normally, we move data to a location where we process it. In Hadoop, the data is processed at its local storage place. This reduces the overhead of data transport/transfer and increases overall throughput. 

Welcome to Intellipaat Community. Get your technical queries answered by top developers !


Categories

...