
I have a Hadoop cluster set up and working under a common default username "user1". I want to put files into Hadoop from a remote machine that is not part of the Hadoop cluster. I configured the Hadoop client files on the remote machine so that when

hadoop dfs -put file1 ...


is called from the remote machine, it puts file1 onto the Hadoop cluster.

The only problem is that I am logged in as "user2" on the remote machine, and that doesn't give me the result I expect. In fact, the above command can only be executed on the remote machine as:

hadoop dfs -put file1 /user/user2/testFolder


However, what I really want is to be able to store the file as:

hadoop dfs -put file1 /user/user1/testFolder


If I try to run the last command, Hadoop throws an error because of access permissions. Is there any way I can specify the username within the hadoop dfs command?

I am looking for something like:

hadoop dfs -username user1 -put file1 /user/user1/testFolder

1 Answer


To specify the username when putting files on HDFS, simply set the HADOOP_USER_NAME environment variable, which tells HDFS which user name to operate as. Keep in mind that this only works if your cluster isn't using any security features, e.g. Kerberos.

HADOOP_USER_NAME=hdfs hadoop dfs -put ...
