If you are not getting the result in the output/part-r-00000 file, then:
Check the DataNode's configuration file, hdfs-site.xml, and adjust it as needed.
Keep in mind that the DataNode reads its storage directories from this file, so verify both the permissions on those directories and the value of the dfs.datanode.data.dir property.
Go to etc/hadoop (inside your Hadoop directory), where you will find hdfs-site.xml, and set dfs.datanode.data.dir as required.
For my Linux system, the hdfs-site.xml file looks like the following.
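A minimal sketch of such a file, assuming the NameNode and DataNode store their data under /usr/local/hadoop_store/hdfs (these paths are assumptions; substitute directories that exist with the right permissions on your machine):
<configuration>
  <property>
    <!-- Number of block replicas; 1 is typical for a single-node setup -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- Assumed path: where the NameNode keeps its metadata -->
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <!-- Assumed path: where the DataNode stores its blocks -->
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>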
If it still does not work, try running the following commands to start the HDFS daemons; this should bring up your NameNode.
sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-datanode start
sudo service hadoop-hdfs-secondarynamenode start
Find the directory that contains jps, or set an alias for it:
alias jps='/usr/lib/jvm/jdk(version you found)/bin/jps'
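If you are not sure where your JDK lives, one way to find it (the jdk1.7.0_67 path below is only a hypothetical example):
readlink -f "$(which java)"
# prints e.g. /usr/lib/jvm/jdk1.7.0_67/jre/bin/java, so jps would be at /usr/lib/jvm/jdk1.7.0_67/bin/jps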
Now run jps and check that your DataNode is running:
sudo jps
If it still doesn't work, reinstall Hadoop and verify that it works:
Step 0: Reset the Cluster
sudo reset_cluster.sh
Step 1: Format the NameNode
sudo -u hdfs hdfs namenode -format
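Be careful: formatting erases all existing HDFS metadata, so only do this on a cluster whose contents you can afford to lose. If metadata already exists, the command asks for confirmation before re-formatting.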
Step 2: Start HDFS
sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-datanode start
sudo service hadoop-hdfs-secondarynamenode start
# Check that the Hadoop services are running
sudo /usr/java/latest/bin/jps
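If HDFS came up correctly, the listing should include the three HDFS daemons, something like the following (the PIDs are illustrative and will differ on your machine):
4825 NameNode
4917 DataNode
5061 SecondaryNameNode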
Step 3: Create the /tmp Directory
# Create a new /tmp directory and set permissions:
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
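The leading 1 in 1777 sets the sticky bit, which lets every user write to /tmp while only allowing them to delete their own files.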
Step 4: Create Staging and Log Directories
sudo -u hdfs hadoop fs -mkdir -p /tmp/hadoop-yarn/staging
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging
# Create the done_intermediate directory under the staging directory and set permissions:
sudo -u hdfs hadoop fs -mkdir -p /tmp/hadoop-yarn/staging/history/done_intermediate
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging/history/done_intermediate
# Change ownership on the staging directory and subdirectory:
sudo -u hdfs hadoop fs -chown -R mapred:mapred /tmp/hadoop-yarn/staging
# Create the /var/log/hadoop-yarn directory and set ownership:
sudo -u hdfs hadoop fs -mkdir -p /var/log/hadoop-yarn
sudo -u hdfs hadoop fs -chown yarn:mapred /var/log/hadoop-yarn
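YARN uploads aggregated application logs to this directory (assuming it is configured as yarn.nodemanager.remote-app-log-dir), which is why it must be owned by the yarn user with mapred as the group.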
Step 5: Verify the HDFS File Structure:
sudo -u hdfs hadoop fs -ls -R /
You should see the directory structure you just created (mostly the YARN staging and log directories).
Step 6: Start YARN
sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start
sudo service hadoop-mapreduce-historyserver start
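If everything started, the jps listing should now also include ResourceManager, NodeManager, and JobHistoryServer:
sudo /usr/java/latest/bin/jps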
Step 7: Create User Directories
Create a home directory for each MapReduce user; it is best to do this on the NameNode.
sudo -u hdfs hadoop fs -mkdir /user/training
sudo -u hdfs hadoop fs -chown training /user/training
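The hadoop fs commands below use relative paths, which resolve against the current user's home directory in HDFS (/user/training here), so run them as the training user.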
Running an example application with YARN
Make a directory in HDFS called input and copy some XML files into it by running the following commands (in pseudo-distributed mode):
hadoop fs -mkdir input
hadoop fs -put /etc/hadoop/conf/*.xml input
hadoop fs -ls input
Then you will find four items:
input/core-site.xml, input/hdfs-site.xml, input/mapred-site.xml, input/yarn-site.xml
Now set HADOOP_MAPRED_HOME:
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
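To make this setting survive new shell sessions, you can also append it to your shell profile (assuming ~/.bashrc):
echo 'export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce' >> ~/.bashrc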
Now run the Hadoop grep example (it writes to a fresh output directory, here output23, because MapReduce refuses to overwrite an existing one):
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output23 'dfs[a-z.]+'
Now you can read the results in the output file output23/part-r-00000.
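Since the results live in HDFS, read them with hadoop fs, for example:
hadoop fs -cat output23/part-r-00000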