There are no prerequisites for this course. Basic knowledge of databases, SQL and query languages can help.
Topics: What is Spark, what is in-memory MapReduce, from Hadoop MapReduce to Spark, Spark Hadoop YARN, HDFS Revision, YARN Revision, Spark Overview, how Spark is better than Hadoop, and how Spark without Hadoop is used in industry
Topics: How to install Spark, using the Spark Shell, RDDs (Resilient Distributed Datasets), Functional Programming in Spark, Spark Architecture
Topics: Creating RDDs, Other General RDD Operations
Topics: Key-Value Pair RDDs, Spark MapReduce, Other Pair RDD Operations
Topics: Spark Applications vs. Spark Shell, Creating the SparkContext, Building a Spark Application (Scala and Java), Running a Spark Application, the Spark Application Web UI, Hands-On Exercise: Write and Run a Spark Application, Configuring Spark Properties, Logging
Topics: Review: Spark on a Cluster, RDD Partitions, Partitioning of File-based RDDs, HDFS and Data Locality, Executing Parallel Operations, Stages and Tasks
Topics: RDD Lineage, RDD Persistence Overview, Distributed Persistence
Topics: Spark Streaming Overview, Example: Streaming Request Count, DStreams, Developing Spark Streaming Applications, Spark Stream processing
Topics: Multi-Batch Operations, State Operations, Sliding Window Operations, Advanced Data Sources
Topics: Common Spark Use Cases, Iterative Algorithms in Spark, Spark Graph Processing and Analysis, Machine Learning in Spark: k-means example
Topics: Shared Variables: Broadcast Variables, Shared Variables: Accumulators, Common Performance Issues, Diagnosing Performance Problems
Topics: Spark SQL and the SQL Context, Creating DataFrames, Transforming and Querying DataFrames, Saving DataFrames, DataFrames and RDDs, Comparing Spark SQL, Impala and Hive-on-Spark
Topics: Task Scheduling/Distribution, Scheduling Around Applications, Static Partitioning, Dynamic Sharing, Scheduling Within an Application, Fair Scheduling, High Availability of Spark Master, Standby Masters With ZooKeeper, Single Node Recovery With Local File System, Higher-Order Functions
Topics: Practicals: Creating Maps, Transformations, Capacity planning in Spark, Concurrency in Java, Concurrency in Scala
Topics: Array Buffers, Compact Buffer, Protocol Buffer, Log Analysis With Spark, First Log Analyzers In Spark.
Topics: Scala Overview and Scala for big data and Apache Spark analytics
Topics: Play with Scala, Advantages of Scala, REPL (Read-Evaluate-Print Loop), Language Features, Type Inference, Higher-order functions, Option, Pattern Matching, Collections, Currying, Traits, Application Space and Scala for data analysis
Topics: Uses of Scala interpreter, Example of static object timer in Scala, Testing of String equality in Scala, Implicit classes in Scala with examples, Recursion in Scala for each, Currying in Scala with examples, Classes in Scala
Topics: Constructor, Constructor overloading, Properties, Abstract classes, Type hierarchy in Scala, Object equality, Val and var methods
Topics: Sealed traits, Case classes, Constant pattern in case classes, Wildcard pattern, Variable pattern, Constructor pattern, Tuple pattern
Topics: Java equivalents, Advantages of traits, avoiding boilerplate code, Linearization of traits, modeling a real-world example
Topics: How traits are implemented in Scala and Java, How extending multiple traits is handled
Topics: Classification of Scala collections, Iterable, Iterator and iterable, List sequence example in Scala
Topics: Array in Scala, List in Scala, Difference between list and list buffer, Array buffer, Queue in Scala, Dequeue in Scala, Mutable queue in Scala, Stacks in Scala, Sets and maps in Scala, Tuples
Topics: Different import types, Selective imports, Testing and assertions, ScalaTest test cases, ScalaTest FunSuite, JUnit tests in Scala, Interface for JUnit via JUnit3Suite in ScalaTest, SBT, Directory structure for packaging a Scala application, Scala split and Spark Scala example.
Project 1. Movie Recommendation
Topics – In this project you will gain hands-on experience in deploying Apache Spark for movie recommendation. You will be introduced to MLlib, the Spark machine learning library, with a guide to its algorithms and coding. You will learn how to deploy collaborative filtering, clustering, regression, and dimensionality reduction in MLlib. Upon completion of the project you will have experience in working with streaming data, sampling, testing and statistics.
Project 2. Twitter API Integration for Tweet Analysis
Topics – In this project you will learn to integrate the Twitter API for analyzing tweets. You will write server-side code in a scripting language such as PHP, Ruby or Python to call the Twitter API and get the results in JSON format. You will then read the results and perform operations such as aggregation, filtering and parsing as needed to produce the tweet analysis.
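The parse, filter and aggregate steps described for this project can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual solution: the sample JSON is made up and stands in for an API response, the field names (`text`, `lang`, `entities.hashtags`) are assumed to follow the classic tweet object layout, and the helper function names are hypothetical. No network call is made.

```python
import json
from collections import Counter

# Hypothetical sample standing in for a JSON response body.
# It is made-up data for illustration, not real API output.
SAMPLE_RESPONSE = """
[
  {"text": "Learning #Spark and #Scala today", "lang": "en",
   "user": {"screen_name": "alice"},
   "entities": {"hashtags": [{"text": "Spark"}, {"text": "Scala"}]}},
  {"text": "#Spark streaming rocks", "lang": "en",
   "user": {"screen_name": "bob"},
   "entities": {"hashtags": [{"text": "Spark"}]}},
  {"text": "Bonjour #BigData", "lang": "fr",
   "user": {"screen_name": "carol"},
   "entities": {"hashtags": [{"text": "BigData"}]}}
]
"""


def parse_tweets(raw):
    """Parsing: turn the raw JSON text into a list of dicts."""
    return json.loads(raw)


def filter_by_language(tweets, lang):
    """Filtering: keep only tweets whose 'lang' field matches."""
    return [t for t in tweets if t.get("lang") == lang]


def count_hashtags(tweets):
    """Aggregation: count hashtag occurrences, ignoring case."""
    counts = Counter()
    for t in tweets:
        for tag in t.get("entities", {}).get("hashtags", []):
            counts[tag["text"].lower()] += 1
    return counts


tweets = parse_tweets(SAMPLE_RESPONSE)
english = filter_by_language(tweets, "en")
hashtag_counts = count_hashtags(english)
print(hashtag_counts.most_common())  # [('spark', 2), ('scala', 1)]
```

In a real integration the sample string would be replaced by the body of an authenticated API response, and the same three functions could be applied unchanged as long as the response uses the assumed field names.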
Project 3. Data Exploration Using Spark SQL – Wikipedia dataset
Topics – This project lets you work with Spark SQL. You will gain experience in combining Spark SQL with ETL applications, analyzing data in real time, performing batch analysis, deploying machine learning, creating visualizations and processing graphs.
Speed is crucial for maintaining a competitive edge. Cassandra, with its in-memory database option, can handle very high volumes of data at high velocity. This fits perfectly with Spark’s in-memory data analysis to provide the combination necessary to translate data into information in real time.
The open source Spark integration gives developers an easy platform to run their applications on. Also, both Spark and Cassandra allow developers to quickly write applications in languages such as Python, Java and others, with updated drivers and support.
Cluster managers, Hardware & configuration, Linking with Spark, Monitoring and measuring
In the Intellipaat self-paced training program you will receive recorded sessions, course material, quizzes, related software and assignments. The courses are designed to give you real-world exposure and are focused on clearing the relevant certification exam. After completing the training you can take a quiz to check your knowledge, which helps you clear the relevant certification with higher marks. You will also be able to work on the technology independently.
In self-paced courses a trainer is not available, whereas in online training a trainer is available to answer queries live. For self-paced courses we provide email support for doubts or any query related to the training, and if you face unexpected challenges we will arrange a live class with a trainer.
All courses are highly interactive to provide good exposure. You can learn at your own place and in your leisure time. Self-paced training is 75% cheaper than online training. You will have lifetime access, so you can refer to it at any time during your project work or job.
Yes, you can see sample videos at the top of the course details page.
As soon as you enroll in the course, your LMS (Learning Management System) access will be active. You will immediately get access to our course content in the form of a complete set of previous class recordings, PPTs, PDFs and assignments, along with access to our 24×7 support team. You can start learning right away.
24/7 access to video tutorials and email support, along with online interactive session support with a trainer for issue resolution.
Yes, you can pay the difference between the online training and the self-paced course and be enrolled in the next online training batch.
Yes, we will provide you with download links for open source software, and for proprietary tools we will provide a trial version if available.
Please send us an email. You can also chat with us to get an instant solution.
Intellipaat verified certificates are awarded on successful completion of the course projects. There is a set of quizzes after each course module that you need to go through. After successful submission, an official Intellipaat verified certificate will be given to you.
Towards the end of the course, you will have to work on a training project. This will help you understand how the different components of the course are related to each other.
Classes are conducted via live video streaming, where you can interact with the instructor by speaking, chatting and sharing your screen. You will always have access to the videos and PPTs. This gives you a clear insight into how the classes are conducted, the quality of the instructors and the level of interaction in the class.
Yes, we keep launching multiple offers. Please see the offer page.
We will help you with the issue and doubts regarding the course. You can attempt the quiz again.
This training course is designed to help you clear the Apache Spark component of the Cloudera Spark and Hadoop Developer certification (CCA175) exam. Check our Hadoop training course for gaining proficiency in the Hadoop component of the CCA175 exam. The entire training course content is in line with the certification program and helps you clear the certification exam with ease and get the best jobs in the top MNCs.
As part of this training you will be working on real time projects and assignments that have immense implications in the real world industry scenario thus helping you fast track your career effortlessly.
At the end of this training program there will be quizzes that reflect the type of questions asked in the certification exams and help you score better marks in the certification exam.
The Intellipaat Course Completion Certificate will be awarded on completion of the project work (on expert review) and on scoring at least 60% in the quiz. Intellipaat certification is well recognized in top 80+ MNCs like Ericsson, Cisco, Cognizant, Sony, Mu Sigma, Saint-Gobain, Standard Chartered, TCS, Genpact, Hexaware, etc.
We provide 24×7 support by email for issues or doubts in self-paced training.
In online instructor-led training, the trainer will be available to help you with your queries regarding the course. If required, the support team can also provide live support by accessing your machine remotely. This ensures that all your doubts and problems faced during labs and project work are clarified round the clock.
This course is designed for clearing the Apache Spark certification examination of any reputed company. At the end of the course there will be a quiz and project assignments. Once you complete them, you will be awarded the Intellipaat Course Completion Certificate.