0 votes
2 views
in AWS by (19.1k points)

I've created an Ubuntu single-node Hadoop cluster on EC2.

A simple test file upload to HDFS works from the EC2 machine itself, but fails from a machine outside of EC2.
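The test is nothing fancy, just a put of a small local file, roughly like this (the exact source file doesn't matter; the destination path is the one that appears in the error below):

  hadoop fs -put pies /user/ubuntu/pies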

I can browse the filesystem through the web interface from the remote machine, and it shows one datanode reported as in service. I have opened all TCP ports from 0 to 60000(!) in the security group, so I don't think it's that.

I get the following error:

java.io.IOException: File /user/ubuntu/pies could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1448)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:690)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)
        at org.apache.hadoop.ipc.Client.call(Client.java:905)
        at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
        at $Proxy0.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy0.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:928)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:811)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:427)

The namenode log just shows the same error; the other logs don't seem to have anything interesting.

Any ideas?

2 Answers

0 votes
by (44.4k points)

WARNING: the following can destroy ALL data on HDFS. Do not execute the steps in this answer unless you don't care about destroying existing data!

You should do this (a rough command sketch follows the list):

  1. Stop all Hadoop services.
  2. Delete the dfs/name and dfs/data directories.
  3. Run hdfs namenode -format and answer with a capital Y when prompted.
  4. Start the Hadoop services again.
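For a plain Apache install, the sequence looks roughly like this; the dfs/name and dfs/data locations are whatever dfs.name.dir and dfs.data.dir point to in your configuration, so substitute your own paths:

  stop-all.sh                                  # or stop-dfs.sh
  rm -rf /path/to/dfs/name /path/to/dfs/data   # destroys all HDFS data
  hdfs namenode -format                        # answer with a capital Y
  start-all.sh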

Also, make sure you have enough disk space and that the logs are not warning you about it.

0 votes
by (11.4k points)

This error means the NameNode could not find any DataNode on which to place the block.

The cause might be any of the following:

  • The DataNode disk is full
  • The DataNode is busy with a block report or block scanning
  • dfs.block.size in hdfs-site.xml is set to a negative value (see the config check after this list)
  • The primary DataNode goes down while a write is in progress
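If you suspect the block-size case, verify that dfs.block.size in hdfs-site.xml is a sensible positive value; for example (64 MB, purely as an illustration):

  <property>
    <name>dfs.block.size</name>
    <value>67108864</value>
  </property>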

Follow these steps to resolve the problem:

1. Stop DFS completely.

2. Create a directory under root somewhere (I use Cloudera's distro, and its default configured location for data files is /var/lib/hadoop-0.20/cache/, if you need an idea for a location) and set it as your hadoop.tmp.dir in core-site.xml on all the nodes (see the example below).

3. Reformat your NameNode (hadoop namenode -format, say Y) and restart DFS. Things should be OK now.
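As an illustration, the hadoop.tmp.dir entry in core-site.xml for step 2 would look something like this (the path is only an example; use whichever directory you created):

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/hadoop-0.20/cache</value>
  </property>

and step 3 is then:

  hadoop namenode -format    # answer Y when prompted
  start-dfs.sh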

Also, check the disk space on your system and make sure the logs are not warning you about it.
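A quick way to check both is from the NameNode's point of view and from the OS, for example:

  hadoop dfsadmin -report    # configured capacity and remaining space per DataNode
  df -h                      # local disk usage on each node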


 
