What is a Spark job?

1 Answer

answered Jul 16, 2019 by Amit Rawat (32.3k points)

Generally, a job can be described as a piece of code that reads some input from HDFS or the local file system, performs some computation on the data, and writes some output data.

Spark has its own definition of a "job".

In Spark, a job is a parallel computation consisting of multiple tasks that gets spawned in response to a Spark action (e.g. save, collect).

Let's say you need to do the following:

1. Load a file with people's names and addresses into RDD1
2. Load a file with people's names and phone numbers into RDD2
3. Join RDD1 and RDD2 by name, to get RDD3
4. Map on RDD3 to get a nice HTML presentation card for each person as RDD4
5. Save RDD4 to a file
6. Map RDD1 to extract zipcodes from the addresses, to get RDD5
7. Aggregate on RDD5 to get a count of how many people live in each zipcode as RDD6
8. Collect RDD6 and print these stats to stdout

So, now the driver program is this entire piece of code, running all 8 steps.

Step 5 is a job, because this is where the entire HTML card set is actually produced (save is an action, not a transformation). Similarly, the collect in step 8 creates a second job.
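To make the point about actions concrete, here is a toy model in plain Python. It is not the Spark API: the `LazyDataset` class, its method names, and the sample records are all made up for illustration. Transformations only record a recipe, while each action runs the recipe and counts as one job.

```python
class LazyDataset:
    """Toy stand-in for an RDD: it holds a recipe, not the data itself."""

    jobs_run = 0  # number of actions triggered so far

    def __init__(self, compute):
        self._compute = compute  # zero-argument function returning a list

    # Transformations: lazy, they only build a new recipe.
    def map(self, fn):
        return LazyDataset(lambda: [fn(x) for x in self._compute()])

    def join(self, other):
        def compute():
            right = {}
            for key, value in other._compute():
                right.setdefault(key, []).append(value)
            return [(k, (v, w))
                    for k, v in self._compute()
                    for w in right.get(k, [])]
        return LazyDataset(compute)

    def count_values(self):
        def compute():
            counts = {}
            for value in self._compute():
                counts[value] = counts.get(value, 0) + 1
            return sorted(counts.items())
        return LazyDataset(compute)

    # Actions: they run the recipe, one job each.
    def collect(self):
        LazyDataset.jobs_run += 1
        return self._compute()

    def save(self, sink):
        LazyDataset.jobs_run += 1
        sink.extend(self._compute())


# The eight steps, with hypothetical in-memory data instead of files.
rdd1 = LazyDataset(lambda: [("Ann", "1 Main St, 10001"),
                            ("Bob", "2 Oak Ave, 10001")])        # step 1
rdd2 = LazyDataset(lambda: [("Ann", "555-0100"),
                            ("Bob", "555-0101")])                # step 2
rdd3 = rdd1.join(rdd2)                                           # step 3
rdd4 = rdd3.map(lambda kv: "<p>%s: %s, %s</p>"
                % (kv[0], kv[1][0], kv[1][1]))                   # step 4
jobs_before_actions = LazyDataset.jobs_run                       # still 0
cards = []
rdd4.save(cards)                                                 # step 5: job 1
rdd5 = rdd1.map(lambda kv: kv[1].split(", ")[-1])                # step 6
rdd6 = rdd5.count_values()                                       # step 7
stats = rdd6.collect()                                           # step 8: job 2
print(stats)                 # [('10001', 2)]
print(LazyDataset.jobs_run)  # 2
```

Nothing is computed in steps 1 to 4; the counter only moves when save and collect are called.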

The other steps are organized into stages, with each job being the result of a sequence of stages. For simple things a job can have a single stage, but the need to repartition data (for example, the join in step 3) or anything else that breaks the locality of the data usually causes more stages to appear. You can think of stages as computations that produce intermediate results, which can, in fact, be persisted.
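As a rough illustration of how stage boundaries arise, the sketch below counts stages for the two jobs under a simplifying assumption: every "wide" step (one that repartitions data) needs one shuffle per parent, and a job has one final stage plus one upstream stage per shuffle. The `Node` class and `count_stages` function are hypothetical helpers, not part of Spark, and real Spark can avoid shuffles when inputs are already partitioned compatibly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One dataset in the lineage graph (illustrative only)."""
    name: str
    parents: tuple = ()
    wide: bool = False  # True if this step must repartition each parent


def count_stages(final):
    """One final stage plus one upstream stage per shuffle in the lineage."""
    seen = set()

    def shuffles(node):
        if node.name in seen:
            return 0
        seen.add(node.name)
        own = len(node.parents) if node.wide else 0
        return own + sum(shuffles(parent) for parent in node.parents)

    return 1 + shuffles(final)


rdd1 = Node("RDD1")
rdd2 = Node("RDD2")
rdd3 = Node("RDD3", (rdd1, rdd2), wide=True)  # join by name
rdd4 = Node("RDD4", (rdd3,))                  # map to HTML cards
rdd5 = Node("RDD5", (rdd1,))                  # map to zipcodes
rdd6 = Node("RDD6", (rdd5,), wide=True)       # aggregate by zipcode

print(count_stages(rdd4))  # 3: the job started by save
print(count_stages(rdd6))  # 2: the job started by collect
```

Under these assumptions, the narrow map steps never add a stage; only the join and the aggregation do.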

Now, since we'll be using RDD1 more than once, we can persist it, avoiding recomputation.
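The effect of persisting can be sketched with another plain-Python toy (again not the Spark API; `LazyDataset`, `count_loads`, and the sample records are assumptions made for illustration). It counts how often the input is loaded when two jobs both depend on RDD1.

```python
class LazyDataset:
    """Toy stand-in for an RDD with an optional in-memory cache."""

    def __init__(self, compute):
        self._compute = compute
        self._persisted = False
        self._cache = None

    def persist(self):
        self._persisted = True
        return self

    def map(self, fn):
        return LazyDataset(lambda: [fn(x) for x in self._materialize()])

    def _materialize(self):
        if not self._persisted:
            return self._compute()          # recompute on every use
        if self._cache is None:
            self._cache = self._compute()   # compute once, then reuse
        return self._cache

    def collect(self):
        return self._materialize()


def count_loads(persist_rdd1):
    """Run two jobs that both depend on RDD1; return how often it was loaded."""
    loads = {"n": 0}

    def load_people():
        loads["n"] += 1
        return [("Ann", "1 Main St, 10001"), ("Bob", "2 Oak Ave, 10002")]

    rdd1 = LazyDataset(load_people)
    if persist_rdd1:
        rdd1.persist()
    rdd1.map(lambda kv: kv[0]).collect()  # first job
    rdd1.map(lambda kv: kv[1]).collect()  # second job
    return loads["n"]


print(count_loads(False))  # 2
print(count_loads(True))   # 1
```

Without persisting, each job walks the lineage back to the source and reloads the file; with it, the second job reuses the stored result.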

To conclude, jobs and stages describe how the logic of a given algorithm gets broken down. A task, in turn, is the unit of work that takes one particular piece of the data through a given stage, on a given executor.

I hope this helps you better understand what a “job” is in Spark.
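The relationship between data pieces and tasks can be pictured with a small plain-Python sketch. The helper names, the thread pool standing in for executors, and the sample zipcodes are assumptions for illustration only; in Spark the scheduler handles this for you.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal the records round-robin into num_partitions pieces."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def run_stage(partitions, task_fn, num_executors=2):
    """Run one task per partition on a small pool of workers."""
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        return list(pool.map(task_fn, partitions))


zipcodes = ["10001", "10002", "10001", "10003", "10001"]
partitions = split_into_partitions(zipcodes, 3)

# Each task counts the zipcodes in its own partition only.
partial_counts = run_stage(partitions, Counter)

# Merge the per-task results (done here in the calling code for simplicity).
total = sum(partial_counts, Counter())

print(len(partial_counts))  # 3 tasks, one per partition
print(sorted(total.items()))  # [('10001', 3), ('10002', 1), ('10003', 1)]
```

Three partitions give three tasks for the stage, regardless of how many workers are available to run them.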
