+10 votes
4 views
in Big Data Hadoop & Spark by (1.4k points)
I tried reading from various sources, but I am still not very clear about the difference. What mainly separates MongoDB and Hadoop?

2 Answers

+13 votes
by (13.2k points)
Best answer

The differences are:

| S.No. | MongoDB | Hadoop |
|---|---|---|
| 1 | A flexible, general-purpose database that is more versatile than Hadoop. It can replace an existing RDBMS. | Its most important strength is that it is engineered to handle massive data. It is well suited to batch processing and long-running ETL jobs. |
| 2 | Stores data in collections, and every data field can be queried directly. Data is held as Binary JSON (BSON) and is available for querying, aggregation, indexing, and replication. | Consists of several pieces of software; the most important components are the Hadoop Distributed File System (HDFS) and MapReduce. |
| 3 | It is a true database and is written in C++. | A collection of software packages that together form a data-processing framework. It is primarily Java-based. |
| 4 | A database, designed primarily for data storage and retrieval. | Designed to process and analyze immense volumes of data. |
| 5 | A major complaint about MongoDB concerns fault tolerance, which may result in data loss. | Depends mainly on the NameNode, which is a single point of failure unless NameNode high availability is configured. |

0 votes
by (33.2k points)

MongoDB is a NoSQL database, whereas Hadoop is a framework for storing & processing Big Data in a distributed environment. 

MongoDB

MongoDB is a document-oriented NoSQL database. It stores data in a flexible, JSON-like document format. Fields can vary from document to document, which gives you the flexibility to change the schema at any time. MongoDB is a distributed database, so it provides high availability and horizontal scalability. You can perform real-time aggregations and ad-hoc queries, and documents map easily to the objects in your application code.

To learn more, go through this tutorial:
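To make the document model a bit more concrete, here is a minimal sketch in plain Python. It does not use MongoDB or its driver; the `users` list, the `find` helper, and the `count_by` helper are hypothetical stand-ins that only illustrate the idea of documents with varying fields, an ad-hoc query, and a simple aggregation.

```python
from collections import Counter

# Hypothetical "collection": a list of documents (dicts) whose fields vary.
users = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ben", "city": "Oslo", "tags": ["admin"]},
    {"_id": 3, "name": "Chen", "email": "chen@example.com"},
    {"_id": 4, "name": "Dee", "city": "Pune", "age": 31},
]


def find(collection, criteria):
    """Return documents whose fields equal every key/value pair in criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def count_by(collection, field):
    """Group documents by a field and count them, skipping documents without it."""
    return Counter(doc[field] for doc in collection if field in doc)


# Ad-hoc query: no schema had to be declared before filtering on "city".
pune_users = find(users, {"city": "Pune"})

# Simple aggregation: number of users per city.
city_counts = count_by(users, "city")
```

Note that the third document has no `city` field at all, yet the query and the aggregation still work. That is roughly what "the fields can vary from document to document" means in practice.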

https://intellipaat.com/tutorial/mongodb-tutorial/

Hadoop

Hadoop is a collection of software used to store and process big data. HDFS (Hadoop Distributed File System) is the storage part of Hadoop. HDFS stores data across a distributed environment so that it can be processed in parallel. YARN (Yet Another Resource Negotiator) is the resource manager in Hadoop; it allocates resources to the various jobs submitted to Hadoop.

I would recommend going through this Hadoop tutorial and Hadoop ecosystem blog:
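The idea of storing data in pieces so it can be processed in parallel can be sketched in plain Python. This is only an illustration, not Hadoop's API: the function names are hypothetical, a list of lines stands in for a file, fixed-size slices stand in for HDFS blocks, and a local thread pool stands in for the cluster nodes that a resource manager would assign work to.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(lines, block_size):
    """Cut the input into fixed-size blocks (a stand-in for HDFS blocks)."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def map_block(block):
    """Map step: count the words within a single block."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-block counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical input data, standing in for a large file.
lines = ["big data big", "data hadoop", "hadoop big", "spark"]

blocks = split_into_blocks(lines, block_size=2)

# Each block is processed independently; threads stand in for worker nodes.
with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(map_block, blocks))

word_counts = reduce_counts(partials)
```

Because each block is counted independently, adding more workers lets more blocks be processed at the same time, which is the property that makes this style of processing suitable for batch jobs over very large data sets.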

https://intellipaat.com/tutorial/hadoop-tutorial/

I hope this answer helps you!

