0 votes
3 views
in Big Data Hadoop & Spark by (11.4k points)

Do you know any large datasets to experiment with Hadoop which is free/low cost? Any pointers/links related is appreciated.

Preferences:

  • At least 1 GB of data.
  • Web server production log data.

A few that I have found so far:

  1. Wikipedia dump

  2. http://wiki.freebase.com/wiki/Data_dumps

  3. http://aws.amazon.com/publicdatasets/

Also, can we run our own crawler to gather data from sites such as Wikipedia? Any pointers on how to do this are appreciated as well.
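
To make the crawler idea concrete, below is a rough sketch of what I have in mind, using only the Python standard library. The seed page, page limit, and output file are placeholders for illustration, and the crawl is deliberately throttled; for Wikipedia specifically the official dumps are probably the better source, so treat this only as a starting point.

# Minimal polite crawler sketch (illustrative only): fetches pages breadth-first,
# respects robots.txt, and appends raw HTML to a local file for later upload to HDFS.
import time
import urllib.request
import urllib.robotparser
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

SEED = "https://en.wikipedia.org/wiki/Apache_Hadoop"  # placeholder seed page
MAX_PAGES = 50                                        # small limit for a test run
USER_AGENT = "hadoop-dataset-test-crawler/0.1"        # identify the crawler politely

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def allowed(url, _cache={}):
    """Check robots.txt for the URL's host, caching one parser per host."""
    host = "{0.scheme}://{0.netloc}".format(urlparse(url))
    rp = _cache.get(host)
    if rp is None:
        rp = urllib.robotparser.RobotFileParser(host + "/robots.txt")
        try:
            rp.read()
        except OSError:
            return False
        _cache[host] = rp
    return rp.can_fetch(USER_AGENT, url)

def crawl(seed, max_pages, out_path="pages.txt"):
    seen, queue, fetched = {seed}, deque([seed]), 0
    with open(out_path, "w", encoding="utf-8") as out:
        while queue and fetched < max_pages:
            url = queue.popleft()
            if not allowed(url):
                continue
            req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
            try:
                html = urllib.request.urlopen(req, timeout=10).read().decode("utf-8", "replace")
            except OSError:
                continue
            out.write(html + "\n")
            fetched += 1
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                link = urljoin(url, href)
                if link.startswith("https://en.wikipedia.org/wiki/") and link not in seen:
                    seen.add(link)
                    queue.append(link)
            time.sleep(1)  # throttle requests so the crawl stays polite

if __name__ == "__main__":
    crawl(SEED, MAX_PAGES)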

1 Answer

0 votes
by (32.3k points)

First of all, regarding running your own crawler to gather data and the Wikipedia dump that you have already found, I would like to add a point:

Since you have already linked to the Wikipedia data dumps, you can use the Bespin project to work with this data in Hadoop.
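
If you end up processing the dump text yourself rather than through Bespin, Hadoop Streaming with two small Python scripts is a common low-effort approach. Below is a minimal word-count sketch, assuming the dump has already been extracted to plain-text files on HDFS; the HDFS paths and the streaming jar location are placeholders for your own installation.

#!/usr/bin/env python3
# mapper.py -- Hadoop Streaming mapper: emits "word<TAB>1" for every token read from stdin.
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(word.lower() + "\t1")

#!/usr/bin/env python3
# reducer.py -- Hadoop Streaming reducer: sums the counts for each word.
# Streaming sorts mapper output by key, so all lines for one word arrive consecutively.
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(current_word + "\t" + str(current_count))
        current_word, current_count = word, int(count)
if current_word is not None:
    print(current_word + "\t" + str(current_count))

A typical invocation looks like the following (the exact path of the streaming jar depends on your Hadoop distribution, and the input/output paths are placeholders):

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -input /data/wikipedia-text \
    -output /data/wikipedia-wordcount \
    -mapper mapper.py -reducer reducer.py \
    -file mapper.py -file reducer.py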

Now, I would suggest this pool of large datasets to experiment with Hadoop; you can choose whichever category of dataset suits your needs:

http://www.open-bigdata.com/category/big-data-datasets-experiment/

If you want to know more about Hadoop, you can also look up a video tutorial on the topic.
