I can't figure it out, but I guess it's simple. I have a Spark DataFrame df with columns "A", "B", and "C". Now let's say I have an Array containing the names of the columns of this df:

val column_names = Array("A", "B", "C")


I'd like to do a df.select() in such a way that I can specify which columns not to select. Example: let's say I do not want to select column "B". I tried

df.select(column_names.filter(_!="B"))


but this does not work; it gives:

org.apache.spark.sql.DataFrame cannot be applied to (Array[String])

1 Answer


Since Spark 1.4 you can use the drop method:

For Scala:

case class Point(x: Int, y: Int)

// Build a DataFrame with columns "x" and "y"
val df = sqlContext.createDataFrame(Point(0, 0) :: Point(1, 2) :: Nil)

// Returns a new DataFrame without the "y" column
df.drop("y")

For PySpark:

# Build a DataFrame with columns "x" and "y"
df = sc.parallelize([(0, 0), (1, 2)]).toDF(["x", "y"])

# Returns a new DataFrame without the "y" column
df.drop("y")

## DataFrame[x: bigint]
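
If you'd rather stick with select, as in the original attempt, the fix is to turn the filtered names into Columns and expand them as varargs. A minimal Scala sketch, assuming column_names is the array from the question:

import org.apache.spark.sql.functions.col

// select expects Column* (or String*), not an Array[String],
// which is why the original call does not compile.
// Map the remaining names to Columns and expand with `: _*`:
df.select(column_names.filter(_ != "B").map(col): _*)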
