0 votes
in Big Data Hadoop & Spark by (11.4k points)

I have installed Cloudera CDH 5 using Cloudera Manager.

I can easily run:

hadoop fs -ls /input/war-and-peace.txt
hadoop fs -cat /input/war-and-peace.txt

The cat command above prints the whole text file on the console.

Now I start the Spark shell and run:

val textFile = sc.textFile("hdfs://input/war-and-peace.txt")

This gives me an error:

Spark context available as sc.

scala> val textFile = sc.textFile("hdfs://input/war-and-peace.txt")

2014-12-14 15:14:57,874 INFO  [main] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(177621) called with curMem=0, maxMem=278302556
2014-12-14 15:14:57,877 INFO  [main] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0 stored as values in memory (estimated size 173.5 KB, free 265.2 MB)
textFile: org.apache.spark.rdd.RDD[String] = hdfs://input/war-and-peace.txt MappedRDD[1] at textFile at <console>:12

scala> textFile.count

2014-12-14 15:15:21,791 INFO  [main] ipc.Client ( - Retrying connect to server: input/ Already tried 0 time(s); maxRetries=45
2014-12-14 15:15:41,905 INFO  [main] ipc.Client ( - Retrying connect to server: input/ Already tried 1 time(s); maxRetries=45
2014-12-14 15:16:01,925 INFO  [main] ipc.Client 


………………………………………………... Call From dn1home/ to input:8020 failed on connection exception: Connection refused; For more details see:
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
        at java.lang.reflect.Constructor.newInstance(

Why am I getting this error, when I can read the same file with the Hadoop commands above?

1 Answer

0 votes
by (32.3k points)
edited by

Here is the solution.

Instead of:

val textFile = sc.textFile("hdfs://input/war-and-peace.txt")

use the full namenode URI when you fire up your Spark shell:

val textFile = sc.textFile("hdfs://nn1home:8020/input/war-and-peace.txt")

In "hdfs://input/war-and-peace.txt", the part right after hdfs:// is parsed as the namenode host, so Spark tried to connect to a server literally named input — which is why your log shows it retrying input:8020 until the connection is refused.
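You can see why the original call failed with plain URI parsing — no cluster needed; this sketch runs in any Scala REPL:

```scala
import

// The original path fails because URI syntax treats the first component
// after "hdfs://" as the host. Spark therefore looked for a namenode
// named "input" (matching the "Retrying connect to server: input" log).
val bad = new URI("hdfs://input/war-and-peace.txt")
println(bad.getHost)  // input
println(bad.getPath)  // /war-and-peace.txt

// With the authority filled in, the host is the real namenode and the
// path is the full file path. nn1home:8020 is this cluster's namenode;
// substitute your own fs.defaultFS value.
val good = new URI("hdfs://nn1home:8020/input/war-and-peace.txt")
println(good.getHost)  // nn1home
println(good.getPort)  // 8020
println(good.getPath)  // /input/war-and-peace.txt
```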
How did I find out nn1home:8020?

Just search for the file core-site.xml. It lives in the Hadoop configuration directory (for a CDH install, typically /etc/hadoop/conf) on your local machine or cluster nodes. In it, look for the fs.defaultFS property (called in older Hadoop releases).
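For reference, the relevant entry in core-site.xml looks like this — the host and port shown are this cluster's, so yours will differ:

```xml
<!-- core-site.xml: the default filesystem URI the cluster uses.
     nn1home:8020 is this cluster's namenode host:port; check your own file. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://nn1home:8020</value>
</property>
```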

