Electronics & ICT Academy IIT Guwahati

Advanced Certification in Data Science & Data Engineering

This Advanced Certification in Data Science & Data Engineering by E&ICT, IIT Guwahati and Intellipaat is designed by domain experts in line with industry requirements. It helps you master skills such as Python, Linux, SQL, Machine Learning, Spark, and Power BI through real-time case studies. Learn from industry experts and become a certified Data Science & Data Engineering professional.

Only a Few Seats Left

No Prior Programming Knowledge Required

Ranked #1 Data Science Program by India TV

Upskill for Your Dream Job

Learning Format

Online Bootcamp

Live Classes + Projects

9 Months

Career Services

by Intellipaat

E&ICT IIT Guwahati



About the Program

In this program, you will gain a deep understanding of core Data Science and Data Engineering skills such as Linux, NumPy, Pandas, SciPy, SQL, Machine Learning, Deep Learning, Azure Data Engineering, NLP, and Data Visualization. You will be trained by industry experts and faculty from top universities, and you will work on real-time projects and case studies that help you become proficient in the fast-growing data science sector.

Data Science & Data Engineering Course Key Highlights

Advanced Certification from E&ICT, IIT Guwahati
9 Months of Live Sessions from Industry Experts
180 Hrs of Live Sessions
218 Hrs of Self-paced Learning
50+ Industry Projects & Case Studies
One-on-one with Industry Mentors
Soft Skills Essential Training
Dedicated Learning Management Team
Free Career Counselling
Career Services by Intellipaat
24/7 Support
3 Guaranteed Interviews by Intellipaat
Designed for Working Professionals & Freshers
No Cost EMI Option

Partnering with E&ICT, IIT Guwahati

This Certification Program in Data Science & Data Engineering is offered in partnership with E&ICT Academy, IIT Guwahati. E&ICT Academy, IIT Guwahati is an initiative of MeitY (Ministry of Electronics and Information Technology, Govt. of India), formed with a team of IIT Guwahati professors to provide high-quality education programs.

Upon completion of this program, you will:

  • Receive a certificate from E&ICT, IIT Guwahati
Note: All certificate images are for illustrative purposes only and may be subject to change at the discretion of E&ICT, IIT Guwahati.

Career Transition

55% Average Salary Hike

$1,20,000 Highest Salary

12000+ Career Transitions

400+ Hiring Partners

Career Transition Handbook

Who Can Apply for the Data Science & Data Engineering Certification Program?

  • Individuals with a bachelor’s degree and a keen interest in Data Science and Data Engineering.
  • IT professionals looking for a career transition as Data Scientists and Data Engineers.
  • Professionals aiming to move ahead in their IT career.
  • Artificial Intelligence and Business Intelligence professionals.
  • Freshers who aspire to build their career in the field of Data Science and Data Engineering.

What roles can a Data Science and Data Engineering professional play?

Senior Data Scientist

Understand the issues and create models based on the data gathered, and also manage a team of Data Scientists.

AI Expert

Build strategies on frameworks and technologies to develop AI solutions and help the organization prosper.

Machine Learning Expert

With the help of several Machine Learning tools and technologies, build statistical models with huge chunks of business data.

Applied Scientist

Design and build Machine Learning models to derive intelligence for the numerous services and products offered by the organization.

Big Data Specialist

Create and manage pluggable service-based frameworks that are customized in order to import, cleanse, transform, and validate data.

Senior Business Analyst

Extract data from the respective sources to perform business analysis, and generate reports, dashboards, and metrics to monitor the company’s performance.

Solution Architect

Create overall technical vision for a solution to a specific business problem, while designing, describing and managing the solution.

Skills to Master


Data Science
Data Analysis
Data Pipelines
Data Processing
Data Wrangling
Storytelling
Machine Learning
Prediction Algorithms
Data Visualization
Azure Data Engineering


Live Course
  1. Python
  • Introduction to Python and IDEs – The basics of the Python programming language and how to use IDEs such as Jupyter and PyCharm for Python development.
  • Python Basics – Variables, Data Types, Loops, Conditional Statements, functions, decorators, lambda functions, file handling, exception handling, etc.
  • Object Oriented Programming – Introduction to OOPs concepts like classes, objects, inheritance, abstraction, polymorphism, encapsulation, etc.
  • Hands-on Sessions and Assignments for Practice – The culmination of all the above concepts with real-world problem statements for better understanding.
  2. Linux
  • Introduction to Linux – Establishing the fundamental knowledge of how Linux works and how you can get started with the Linux OS.
  • Linux Basics – File Handling, data extraction, etc.
  • Hands-on Sessions and Assignments for Practice – Strategically curated problem statements for you to start with Linux.
  • Extract Transform Load
    • Web Scraping, Interacting with APIs
  • Data Handling with NumPy
    • NumPy Arrays, CRUD Operations, etc.
    • Linear Algebra – Matrix multiplication, CRUD operations, Inverse, Transpose, Rank, Determinant of a matrix, Scalars, Vectors, Matrices.
  • Data Manipulation Using Pandas
    • Loading the data, dataframes, series, CRUD operations, splitting the data, etc.
  • Data Preprocessing
    • Exploratory Data Analysis, Feature engineering, Feature scaling, Normalization, standardization, etc.
    • Null Value Imputations, Outliers Analysis and Handling, VIF, Bias-variance trade-off, cross validation techniques, train-test split, etc.
  • Scientific Computing with SciPy
    • Introduction to SciPy, building on top of NumPy
    • What are the characteristics of SciPy?
    • Various SciPy subpackages such as signal, integrate, fftpack, cluster, optimize, stats and more; Bayes' theorem with SciPy.
  • Hands-on Exercise:
    • Importing SciPy
    • Applying Bayes' theorem to a given dataset.
  • Data Visualization
    • Bar charts, scatter plots, count plots, line plots, pie charts, donut charts, etc., with Matplotlib.
    • Regression plots, categorical plots, area plots, etc., with seaborn.
  • SQL Basics
    • Fundamentals of Structured Query Language
    • SQL Tables, Joins, Variables
  • Advanced SQL
    • SQL Functions, Subqueries, Rules, Views
    • Nested Queries, string functions, pattern matching
    • Mathematical functions, Date-time functions, etc.
  • Deep Dive into User Defined Functions
    • Types of UDFs, inline table-valued functions, multi-statement table-valued functions.
    • Stored procedures, rank function, SQL ROLLUP, etc.
  • SQL Optimization and Performance
    • Record grouping, searching, sorting, etc.
    • Clustered indexes, common table expressions.
  • Basic Mathematics – Linear Algebra, Multivariate Calculus
  • Descriptive Statistics
    • Measure of central tendency, measure of spread, five points summary, etc.
  • Probability
    • Definition, Random Variable, Probability Distributions and use cases, Bayes theorem, Mathematical Expectation, Markov and Chebyshev Inequality.
  • Inferential Statistics
    • Correlation, covariance, confidence intervals, hypothesis testing, F-test, Z-test, t-test, ANOVA, chi-square test, etc.
  • Introduction to Machine learning
    • Supervised, Unsupervised learning.
    • Introduction to scikit-learn, Keras, etc.
  • Supervised Learning
    • Regression – Introduction to regression problems, identification of a regression problem, dependent and independent variables. How to train the model in a regression problem. How to evaluate the model for a regression problem. How to optimize the efficiency of the regression model.
    • Classification – Introduction to classification problems, identification of a classification problem, dependent and independent variables. How to train the model in a classification problem. How to evaluate the model for a classification problem. How to optimize the efficiency of the classification model.
    • Linear Regression – Creating linear regression models for linear data using statistical tests, data pre-processing, standardization, normalization, etc.
    • Logistic Regression – Creating logistic regression models for classification problems – such as if a person is diabetic or not, if there will be rain or not, etc.
    • Decision Tree – Creating decision tree models on classification problems in a tree like format with optimal solutions.
    • Random Forest – Creating random forest models for classification problems in a supervised learning approach.
    • Support Vector Machine – SVM or support vector machines for regression and classification problems.
    • Gradient Descent – An iterative optimization algorithm for finding a local minimum of a given function.
    • K-Nearest Neighbors – A simple algorithm that can be used for classification problems.
    • Time Series Forecasting – Making use of time series data, gathering insights and useful forecasting solutions using time series forecasting.
  • Unsupervised Learning
    • Clustering – Introduction to clustering problems, Identification of a clustering problem, dependent and independent variables, How to train the model in a clustering problem, How to evaluate the model for a clustering problem, How to optimize the efficiency of the clustering model.
    • K-means – The k-means algorithm that can be used for clustering problems in an unsupervised learning approach.
    • Dimensionality reduction – Handling multi-dimensional data and standardizing the features for easier computation.
    • Principal Component Analysis – PCA reduces the dimensions of multidimensional data by projecting it onto the directions of greatest variance.
    • Linear Discriminant Analysis – LDA or linear discriminant analysis to reduce or optimize the dimensions in the multidimensional data.
    • Association Rule Mining – Identifying strong rules in the data using machine learning.
    • Apriori Algorithm – For finding frequent itemsets in a dataset.
  • Performance Metrics
    • Classification reports – To evaluate the model on various metrics like precision, recall, F1-score, support, etc.
    • Confusion matrix – To evaluate the true positive/negative, false positive/negative outcomes in the model.
    • Regression metrics – R², adjusted R², mean squared error, etc.
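
The classification metrics listed under Performance Metrics can be worked out by hand for a binary problem. The following is a minimal sketch in plain Python, assuming 0/1 labels with 1 as the positive class. The function names and the sample labels are illustrative only and are not taken from the course material; in practice a library such as scikit-learn would typically be used instead.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            # Predicted positive: correct -> true positive, wrong -> false positive
            counts["tp" if actual == positive else "fp"] += 1
        else:
            # Predicted negative: missed positive -> false negative, else true negative
            counts["fn" if actual == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1-score from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels, made up for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))     # (3, 1, 1, 3)
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

Here three of the four predicted positives are correct (precision 0.75) and three of the four actual positives are found (recall 0.75), so the F1-score, their harmonic mean, is also 0.75.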

1. Non-Relational Data Stores and Azure Data Lake Storage

1.1 Document data stores
1.2 Columnar data stores
1.3 Key/value data stores
1.4 Graph data stores
1.5 Time series data stores
1.6 Object data stores
1.7 External index
1.8 Why NoSQL or Non-Relational DB?
1.9 When to Choose NoSQL or Non-Relational DB?
1.10 Azure Data Lake Storage

Definition, key components of Azure Data Lake, how it stores data, Azure Data Lake Storage Gen2, why use a data lake, data lake architecture

2. Data Lake and Azure Cosmos DB

2.1 Data Lake Key Concepts
2.2 Azure Cosmos DB
2.3 Why Azure Cosmos DB?
2.4 Azure Blob Storage
2.5 Why Azure Blob Storage?
2.6 Data Partitioning: Horizontal partitioning, vertical partitioning, Functional partitioning
2.7 Why Partitioning Data?
2.8 Consistency Levels in Azure Cosmos DB: Semantics of the five consistency levels
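
The horizontal and vertical partitioning strategies named in 2.6 can be illustrated with a small sketch. The example below is plain Python and does not use any Azure SDK. The function names (`shard_for`, `horizontal_partition`, `vertical_partition`), the hash-based shard choice, and the sample rows are all assumptions made for illustration, not how any particular Azure service implements partitioning.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a partition key to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def horizontal_partition(rows, key_field, shard_count):
    """Horizontal partitioning: every shard keeps whole rows, split by key."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(str(row[key_field]), shard_count)].append(row)
    return shards


def vertical_partition(rows, key_field, hot_fields):
    """Vertical partitioning: split columns into frequently and rarely used sets."""
    hot = [{k: row[k] for k in (key_field, *hot_fields)} for row in rows]
    cold = [
        {k: v for k, v in row.items() if k == key_field or k not in hot_fields}
        for row in rows
    ]
    return hot, cold


# Hypothetical sample rows
rows = [
    {"id": 1, "name": "Asha", "bio": "Data engineer"},
    {"id": 2, "name": "Ravi", "bio": "Analyst"},
    {"id": 3, "name": "Meera", "bio": "ML engineer"},
]

shards = horizontal_partition(rows, "id", shard_count=2)
hot, cold = vertical_partition(rows, "id", hot_fields=("name",))
print(hot[0])   # {'id': 1, 'name': 'Asha'}
print(cold[0])  # {'id': 1, 'bio': 'Data engineer'}
```

Functional partitioning, the third strategy in the list, would instead separate data by bounded context (for example, orders versus customer profiles) and is not shown here.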

3. Relational Data Stores

3.1 Introduction to Relational Data Stores
3.2 Azure SQL Database – Deployment Models, Service Tiers
3.3 Why SQL Database Elastic Pool?

4. Why Azure SQL?

4.1 Azure SQL Security Capabilities
4.2 High-Availability and Azure SQL Database: Standard Availability Model, Premium Availability Model
4.3 Azure Database for MySQL
4.4 Azure Database for PostgreSQL
4.5 Azure Database for MariaDB
4.6 What is PolyBase and Why PolyBase?
4.7 What is Azure Synapse Analytics (formerly SQL DW): SQL Analytics and SQL pool in Azure Synapse, Key component of a big data solution, SQL Analytics MPP architecture components

5. Azure Batch

5.1 What is Azure Batch?
5.2 Intrinsically Parallel Workloads
5.3 Tightly Coupled Workloads
5.4 Additional Batch Capabilities
5.5 Working of Azure Batch

6. Azure Data Factory

6.1 Flow Process of Data Factory
6.2 Why Azure Data Factory
6.3 Integration Runtime in Azure Data Factory
6.4 Mapping Data Flows

7. Azure Databricks

7.1 What is Azure Databricks?
7.2 Azure Spark-based Analytics Platform
7.3 Apache Spark in Azure Databricks

8. Azure Stream Analytics

8.1 Working of Stream Analytics
8.2 Key capabilities and benefits
8.3 Stream Analytics Windowing Functions: Tumbling window, Hopping Window, Sliding Window, Session Window
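
To make the difference between two of the window types in 8.3 concrete, here is a minimal Python sketch that counts events per tumbling window (fixed size, non-overlapping) and per hopping window (fixed size, advancing by a smaller hop, so windows overlap). It is a simplified model, not the Azure Stream Analytics query language: the function names are hypothetical, timestamps are assumed to be non-negative integer seconds, and each window is treated as the half-open interval [start, end), which may differ from how the service handles events that fall exactly on a boundary.

```python
from collections import defaultdict


def tumbling_window_counts(timestamps, size):
    """Count events per non-overlapping window of `size` seconds."""
    counts = defaultdict(int)
    for t in timestamps:
        start = (t // size) * size  # each event belongs to exactly one window
        counts[(start, start + size)] += 1
    return dict(sorted(counts.items()))


def hopping_window_counts(timestamps, size, hop):
    """Count events per window of `size` seconds that starts every `hop` seconds."""
    counts = defaultdict(int)
    for t in timestamps:
        start = (t // hop) * hop  # latest window start at or before t
        # Walk back through every earlier window that still covers t
        while start > t - size and start >= 0:
            counts[(start, start + size)] += 1
            start -= hop
    return dict(sorted(counts.items()))


# Hypothetical event timestamps in seconds
events = [1, 2, 6, 11, 12, 13]

print(tumbling_window_counts(events, size=5))
# {(0, 5): 2, (5, 10): 1, (10, 15): 3}
print(hopping_window_counts(events, size=10, hop=5))
# {(0, 10): 3, (5, 15): 4, (10, 20): 3}
```

With a hop equal to the window size, the hopping window reduces to the tumbling window. Sliding and session windows depend on event arrival and gaps between events, and are not covered by this sketch.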

  • Artificial Intelligence Basics
    • Introduction to TensorFlow
    • Keras API
  • Neural Networks
    • Neural networks
    • Multi-layered Neural Networks
    • Artificial Neural Networks
  • Deep Learning
    • Deep Learning Libraries
    • Deep neural networks
    • Convolutional Neural Networks
    • LSTM
    • Recurrent Neural Networks
    • GPU in deep learning
    • Autoencoders, Restricted Boltzmann Machines
    • Deep Learning Applications
    • Chatbots
  • Text Mining, Cleaning, and Pre-processing
    • Various Tokenizers, Tokenization, Frequency Distribution, Stemming, POS Tagging, Lemmatization, Bigrams, Trigrams & N-grams, Entity Recognition.
  • Text classification, NLTK, sentiment analysis, etc.
    • Overview of Machine Learning, Words, Term Frequency, Countvectorizer, Inverse Document Frequency, Text conversion, Confusion Matrix, Naive Bayes Classifier.
  • Sentence Structure, Sequence Tagging, Sequence Tasks, and Language Modeling
    • Language Modeling, Sequence Tagging, Sequence Tasks, Predicting Sequence of Tags, Syntax Trees, Context-Free Grammars, Chunking, Automatic Paraphrasing of Texts, Chinking.
  • AI Chatbots and Recommendations Engine
    • Using the NLP concepts, build a recommendation engine and an AI chatbot assistant using AI.
  • Introduction to MLOps
    • MLOps lifecycle
    • MLOps pipeline
    • MLOps Components, Processes, etc
  • Deploying Machine Learning Models
    • Introduction to Azure Machine Learning
    • Deploying Machine Learning Models using Azure
  • Power BI Basics
    • Introduction to Power BI, use cases and BI tools, data warehousing, Power BI components, Power BI Desktop, workflows and reports, data extraction with Power BI.
    • SaaS connectors, working with Azure SQL Database, Python and R with Power BI
    • Power Query Editor, Advanced Editor, Query Dependency Editor, data transformations, shaping and combining data, M query and hierarchies in Power BI.
  • DAX
    • Data Modeling and DAX, Time Intelligence Functions, DAX Advanced Features
  • Data Visualization with Analytics
    • Slicers, filters, Drill Down Reports
    • Power BI Query, Q & A and Data Insights
    • Power BI Settings, Administration and Direct Connectivity
    • Embedded Power BI API and Power BI Mobile
    • Power BI Advance and Power BI Premium

Data Science Capstone Projects

The Data Science capstone project focuses on building a strong grasp of problem analysis and on deriving solutions from data-driven insights. The capstone project will help you master the following verticals:
  • Extracting, loading and transforming data into usable format to gather insights.
  • Data manipulation and handling to pre-process the data.
  • Feature engineering and scaling the data for various problem statements.
  • Model selection and model building on various classification, regression problems using supervised/unsupervised machine learning algorithms.
  • Assessment and monitoring of the machine learning models you build.

Business Case Studies

  • Recommendation Engine – The case study will guide you through various processes and techniques in machine learning to build a recommendation engine that can be used for movie recommendations, restaurant recommendations, book recommendations, etc.
  • Rating Predictions – This text classification and sentiment analysis case study will guide you towards working with text data and building efficient machine learning models that can predict ratings, sentiments, etc.
  • Census – Using predictive modeling techniques on the census data, you will be able to create actionable insights for a given population and create machine learning models that will predict or classify various features like total population, user income, etc.
  • Housing – This real estate case study will guide you towards real world problems, where a culmination of multiple features will guide you towards creating a predictive model to predict housing prices.
  • Object Detection – A much more advanced yet simple case study that will guide you towards making a machine learning model that can detect objects in real time.
  • Stock Market Analysis – Using historical stock market data, you will learn about how feature engineering and feature selection can provide you some really helpful and actionable insights for specific stocks.
  • Banking Problem – A classification problem that predicts consumer behavior based on various features using machine learning models.
  • AI Chatbot – Using the NLTK python library, you will be able to apply machine learning algorithms and create an AI chatbot.

Interested in This Program? Secure your spot now.

The application is free and takes only 5 minutes to complete.


Projects are a part of your Certification in Data Science & Data Engineering and help consolidate your learning. They ensure that you gain real-world experience in Data Science and Data Engineering.


Career Services By Intellipaat

Career Oriented Sessions

Throughout the course

More than 20 live interactive sessions with an industry expert, where you gain knowledge and experience on how to build the skills that hiring managers expect. These guided sessions will help you stay on track with your upskilling objective.

Resume & LinkedIn Profile Building

After 70% of course completion

Get assistance in creating a world-class resume and LinkedIn profile from our career services team, and learn how to grab the attention of the hiring manager at the profile shortlisting stage.

Mock Interview Preparation

After 80% of the course completion.

Students will go through a number of mock interviews conducted by technical experts who will then offer tips and constructive feedback for reference and improvement.

1 on 1 Career Mentoring Sessions

After 90% of the course completion

Attend one-on-one sessions with career mentors on how to develop the skills and attitude required to secure a dream job, based on the learner’s educational background, past experience, and future career aspirations.

3 Guaranteed Interviews

After 80% of the course completion

Guaranteed 3 job interviews upon submission of projects and assignments. Get interviewed by our 400+ hiring partners.

Exclusive access to Intellipaat Job portal

After 80% of the course completion

Get exclusive access to our dedicated job portal and apply for jobs. More than 400 hiring partners, including top start-ups and product companies, hire our learners. You also receive mentored support on your job search and on finding relevant jobs for your career growth.

Peer Learning

Via Intellipaat PeerChat, you can interact with your peers across all classes and batches and even our alumni. Collaborate on projects, share job referrals & interview experiences, compete with the best, make new friends – the possibilities are endless and our community has something for everyone!


Admission Details

The application process consists of three simple steps. An offer of admission will be made to selected candidates based on the feedback from the interview panel. The selected candidates will be notified over email and phone, and they can block their seats through the payment of the admission fee.

Submit Application

Tell us a bit about yourself and why you want to join this program

Application Review

An admission panel will shortlist candidates based on their application


Admission

Selected candidates will be notified within 1–2 weeks

Program Fee

Total Admission Fee

$1,492

Upcoming Application Deadline 26th March 2023

Admissions are closed once the requisite number of participants enroll for the upcoming cohort. Apply early to secure your seat.

Program Cohorts

Next Cohorts

                    Date               Time           Batch Type
Program Induction   26th March 2023    08:00 PM IST   Weekend (Sat-Sun)
Regular Classes     26th March 2023    08:00 PM IST   Weekend (Sat-Sun)

Data Science & Data Engineering FAQs

What can I expect from the Advanced Certification in Data Science & Data Engineering that Intellipaat Offers?

This is one of the best data science and data engineering certification courses, as it is designed with industry requirements in mind to give you the expertise needed for various data science and data engineering roles and responsibilities. Completing the course opens up numerous, highly lucrative career opportunities.

The Advanced Certification in Data Science & Data Engineering is offered by E&ICT, IIT Guwahati and Intellipaat. The instructors aim to make you proficient in the field of Data Science & Data Engineering and have designed a curated curriculum delivered through online video lectures and projects. The course is designed to help you gain in-depth knowledge of Data Science & Data Engineering concepts, apart from providing hands-on experience in these domains through real-time projects.

If you fail to attend any of the live lectures, you will get a copy of the recorded session in the next 12 hours. Moreover, if you have any other queries, you can get in touch with our course advisors or post the questions on our community page.

On the successful completion of the training program and the fulfillment of all the requirements, including successfully passing the certification exam by Intellipaat, you will be awarded an Advanced Certification in Data Science & Data Engineering by E&ICT, IIT Guwahati.

Intellipaat is known for its quality training and industry mentorship. Our alumni are placed in reputed organizations globally such as Amazon, Microsoft, Genpact, Sony, Gartner, etc. Our learners also get lifetime access to free upgrades and learning material, which will help them at any point of time in their career.

By enrolling with Intellipaat’s data science & engineering courses online, you will be able to take advantage of exclusive career guidance benefits, interview preparation, etc.

On average, the starting salary of a data scientist is 10 LPA and that of a data engineer is 9 LPA. You can also check our dedicated blog on data science salaries in India based on various job roles.

Learners need to devote at least 8–10 hours per week for effective learning. Our live classes are flexible, and hence, working professionals can easily manage their learning and job together.

The duration of this program is 9 months, which includes 8 months of live sessions and 1 month of project hours and real-life assignments.

What is included in this course?

  • Non-biased career guidance
  • Counselling based on your skills and preference
  • No repetitive calls, only as per convenience
  • Rigorous curriculum designed by industry experts
  • Complete this program while you work
