in Big Data Hadoop & Spark by (12.9k points)

Can anyone tell me what SparkSession is in PySpark?

1 Answer

0 votes
by (108k points)

SparkSession is the entry point for programming Spark with the Dataset and DataFrame API, and for Spark SQL functionality.

It is the first object you create when developing a Spark SQL application.

A SparkSession can be used to:

  • Create DataFrames
  • Register DataFrames as tables
  • Execute SQL over tables
  • Cache tables
  • Read Parquet files

To create a SparkSession, you can follow this builder pattern:

 

>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder \
...     .master("local") \
...     .appName("Word Count") \
...     .config("spark.some.config.option", "some-value") \
...     .getOrCreate()
