0 votes
in Big Data Hadoop & Spark by (160 points)

I have to write a CSV file whose header contains duplicate columns, because several of the header names are empty.

Header : , , , xxx, , , YYY, , ,

dataSet.coalesce(1).write().mode("ignore").format("csv").option("header", true).option("\u0000", false).save(targetPath);

While writing to the CSV file, I get the following error:

Caused by: org.apache.spark.sql.AnalysisException: Found duplicate column(s) when inserting into file:/C:/Users/xxx/summaryReport/20190901: 

at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)

1 Answer

0 votes
by (33.1k points)

Hi Hussain

Spark checks column names before writing and rejects a dataset that contains duplicates, so you should assign a new, unique name to each duplicate column. A unique name for every column also makes it easier to run data preprocessing on a specific column.
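As a rough illustration of the renaming step, here is a minimal sketch in plain Java. The class name UniqueColumnNames, the method makeUnique, and the "_c<position>" and "_2" naming scheme are all assumptions made for this example, not part of Spark. Names are compared without regard to case, on the assumption that Spark's duplicate check is case-insensitive by default.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a Spark API.
final class UniqueColumnNames {

    // Returns new names in the same order: blank names become "_c<position>",
    // and a name already seen (ignoring case) gets a "_2", "_3", ... suffix.
    public static List<String> makeUnique(List<String> names) {
        Set<String> used = new HashSet<>();
        List<String> result = new ArrayList<>(names.size());
        for (int i = 0; i < names.size(); i++) {
            String trimmed = names.get(i) == null ? "" : names.get(i).trim();
            String base = trimmed.isEmpty() ? "_c" + i : trimmed;
            String candidate = base;
            int n = 1;
            // Set.add returns false when the (lower-cased) name is already taken.
            while (!used.add(candidate.toLowerCase(Locale.ROOT))) {
                n++;
                candidate = base + "_" + n;
            }
            result.add(candidate);
        }
        return result;
    }
}
```

The result could then be applied positionally with something like dataSet.toDF(names.toArray(new String[0])), where names is the list returned by makeUnique for dataSet.columns(). Note that this changes the header text, so it only helps if unique header names are acceptable in the output file.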

I hope this answer will help you!

How to write duplicate columns as header in csv file using java and spark
by (160 points)
Hi Anurag,
Thanks for the suggestion!

The actual requirement is:

Columns | A | B | C | D | E | F | G | H |
Columns C and G should have the headers XXX and YYY.
Columns A, B, D, E, F, and H should be empty.

I have fixed this as follows:

Created a dataset with the above structure.
While writing, set the header option to false: ds.write().option("header", false). This resolves the issue.
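One possible reading of this fix is that the header text travels as an ordinary first row of data, so Spark never has to treat XXX, YYY, and the blanks as column names. The sketch below only shows how such a row could be built in plain Java; HeaderRowSketch and headerRow are hypothetical names for this example, and the zero-based positions are an assumption about how columns A to H are laid out.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper, not a Spark API.
final class HeaderRowSketch {

    // Builds one row of `width` cells: empty strings everywhere except at the
    // zero-based positions listed in `labels`.
    public static String[] headerRow(int width, Map<Integer, String> labels) {
        String[] cells = new String[width];
        Arrays.fill(cells, "");
        for (Map.Entry<Integer, String> e : labels.entrySet()) {
            int pos = e.getKey();
            if (pos < 0 || pos >= width) {
                throw new IllegalArgumentException("position out of range: " + pos);
            }
            cells[pos] = e.getValue();
        }
        return cells;
    }
}
```

For columns A to H, a width of 8 with XXX at position 2 and YYY at position 6 gives a row that is blank except under C and G. In Spark, that array could be wrapped with RowFactory.create((Object[]) cells), placed ahead of the data with union, and written with option("header", false). This assumes every column is a string type. It is also worth checking the output on real data, since row order after the union and the way empty strings are written (possibly as quoted "" depending on the Spark version and the emptyValue option) may need adjusting.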
