in Big Data Hadoop & Spark by (11.5k points)

I have installed Cloudera CDH 5 using Cloudera Manager.

I can easily run:

hadoop fs -ls /input/war-and-peace.txt
hadoop fs -cat /input/war-and-peace.txt

The second command prints the whole text file to the console.

Now I start the Spark shell and run:

val textFile = sc.textFile("hdfs://input/war-and-peace.txt")
textFile.count


Now I get an error:

Spark context available as sc.

scala> val textFile = sc.textFile("hdfs://input/war-and-peace.txt")


2014-12-14 15:14:57,874 INFO  [main] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(177621) called with curMem=0, maxMem=278302556
2014-12-14 15:14:57,877 INFO  [main] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0 stored as values in memory (estimated size 173.5 KB, free 265.2 MB)
textFile: org.apache.spark.rdd.RDD[String] = hdfs://input/war-and-peace.txt MappedRDD[1] at textFile at <console>:12

scala> textFile.count


2014-12-14 15:15:21,791 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 0 time(s); maxRetries=45
2014-12-14 15:15:41,905 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 1 time(s); maxRetries=45
2014-12-14 15:16:01,925 INFO  [main] ipc.Client 

…………………………..

………………………………………………...

java.net.ConnectException: Call From dn1home/192.168.1.21 to input:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
        at org.apache.hadoop.ipc.Client.call(Client.java:1415)


Why do I get this error when I can read the same file using Hadoop commands?

1 Answer

by (31.4k points)

Here is the solution:

In hdfs://input/war-and-peace.txt, the part right after hdfs:// is parsed as the NameNode host, so Hadoop tries to connect to a host named input on port 8020 — exactly the connection the log shows failing. Instead of:

val textFile = sc.textFile("hdfs://input/war-and-peace.txt")

use the full NameNode address in the Spark shell:

sc.textFile("hdfs://nn1home:8020/input/war-and-peace.txt")
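The host/path split can be verified in plain Scala (no Spark needed) with java.net.URI, which follows the same parsing rules HDFS applies to these paths:

```scala
import java.net.URI

// The failing URI: "input" lands in the host component,
// so HDFS looks for a NameNode named "input" (default port 8020).
val bad = new URI("hdfs://input/war-and-peace.txt")
println(bad.getHost)  // input
println(bad.getPath)  // /war-and-peace.txt

// With the authority spelled out, the full path survives intact.
val good = new URI("hdfs://nn1home:8020/input/war-and-peace.txt")
println(good.getHost)  // nn1home
println(good.getPort)  // 8020
println(good.getPath)  // /input/war-and-peace.txt
```

If the Spark shell was started with HADOOP_CONF_DIR pointing at your cluster configuration, sc.textFile("hdfs:///input/war-and-peace.txt") — note the triple slash, i.e. an empty authority — should also work, since the NameNode address is then taken from fs.defaultFS.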

How did I find out nn1home:8020?

Look for core-site.xml in your Hadoop configuration directory (on a CDH installation this is typically /etc/hadoop/conf) and check the value of the fs.defaultFS property (fs.default.name in older releases); it holds the NameNode host and port.
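For reference, the relevant entry in core-site.xml looks like the sketch below (nn1home is the hostname from this cluster; substitute your own NameNode):

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://nn1home:8020</value>
</property>
```

You can also query the value directly on a cluster node with hdfs getconf -confKey fs.defaultFS.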
