Electronics & ICT Academy IIT Guwahati

Advanced Certification in Data Science & Data Engineering

4,535 Ratings

This Advanced Certification in Data Science and Data Engineering course is specially designed to help you master the skills necessary to land a job in this field. This online program covers essential concepts such as Data Analysis using Python, Data Wrangling, Machine Learning, Azure Data Engineering, Deep Learning, and NLP, equipping you with the relevant skills required to land your dream job.

Apply Now

Learning Format

Online Bootcamp

Live Classes

9 months

E&ICT IIT Guwahati

Certification

3 Guaranteed Interviews

by Intellipaat

EMI Starts

at ₹8,000/month*


About the Program

The program offers complete advanced certification training for those wishing to pursue a career in Data Science & Data Engineering. The course curriculum is led by the leading faculty of IIT Guwahati and designed for aspiring Data Scientists and Data Engineers who want to land roles in top organizations.

Key Highlights

400 Hrs of Applied Learning
70+ Live sessions across 9 months
218 Hrs of Self-Paced Learning
Certification from E&ICT, IIT Guwahati
Designed for working Professionals & Freshers
Learn from IIT Guwahati Faculty & Industry Experts
3 Guaranteed Interviews by Intellipaat
24*7 Support
Lifetime Free Upgrade
Learn from and Be Mentored by Top Industry Practitioners
Soft Skills Essential Training
No Cost EMI Option
Dedicated Learning Management Team
Career Services by Intellipaat

About E&ICT, IIT Guwahati

This advanced certification program is in partnership with E&ICT Academy, IIT Guwahati. E&ICT, IIT Guwahati is an initiative of MeitY (Ministry of Electronics and Information Technology, Govt. of India) and was formed with a team of IIT Guwahati professors to provide high-quality education programs.

Achievements of IIT Guwahati

  • Ranked in the top 7 by NIRF India Rankings 2021 (NIRF)
  • Holds a global rank of 41 in the Economic Times 2021 Rankings (Economic Times)

Upon the completion of this program, you will receive:

  • Advanced certificate in Data Science & Data Engineering by E&ICT, IIT Guwahati

Career Transition

55% Average Salary Hike

45 LPA Highest Salary

12000+ Career Transitions

400+ Hiring Partners

*Past record is no guarantee of future job prospects

Who Can Apply For This Course?

  • Individuals with a bachelor’s degree and a keen interest to learn Data Science & Data Engineering.
  • IT professionals looking for a career transition as Data Scientists and Data Engineers.
  • Professionals aiming to move ahead in their IT career.
  • Developers and Project Managers.
  • Freshers who aspire to build their career in the field of Data Science.

What roles can a Data Scientist & Data Engineer play?

Senior Data Scientist

Understand the issues and create models based on the data gathered, and also manage a team of Data Scientists.

Big Data Specialist

Create and manage pluggable service-based frameworks that are customized in order to import, cleanse, transform, and validate data.

Analytics & Modeling Specialist

Responsible for conceptualizing, developing, testing, and implementing various advanced Statistical models on business data.

Big Data Engineer

Responsible for assessing the feasibility of migrating customer solutions and/or integrating with third-party systems on both Microsoft and non-Microsoft platforms.

Senior Business Analyst

Extract data from the respective sources to perform business analysis, and generate reports, dashboards, and metrics to monitor the company’s performance.

Sr. Data Engineer

Responsible for gathering and translating business requirements into technical specifications and designing and developing data pipelines.


Skills to Master

Python

Linux

Data Science

SQL

Data Analytics

Machine Learning

Data Wrangling

NLP

Azure Data Engineering

Deep Learning

Data Visualization


Tools to master

Python, Linux, Jupyter, PySpark, SQL, Power BI, TensorFlow, Excel, Git, Pandas

Curriculum

Live Course Self Paced

1. Python

  • Introduction to Python and IDEs – The basics of the Python programming language, how you can use various IDEs for Python development like Jupyter, Pycharm, etc.
  • Python Basics – Variables, Data Types, Loops, Conditional Statements, Functions, Decorators, Lambda Functions, File Handling, Exception Handling, etc.
  • Object Oriented Programming – Introduction to OOPs concepts like Classes, Objects, Inheritance, Abstraction, Polymorphism, Encapsulation, etc.
  • Hands-on Sessions and Assignments for Practice – The culmination of all the above concepts with real-world problem statements for better understanding.
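For illustration, the OOPs concepts listed above (classes, objects, inheritance, polymorphism, encapsulation) can be sketched in a short, self-contained example — the `Account` class and its values here are hypothetical:

```python
class Account:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self._balance = balance  # encapsulation: underscore marks internal state

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    @property
    def balance(self):
        return self._balance

    def describe(self):
        return f"{self.owner}: {self._balance}"


class SavingsAccount(Account):  # inheritance: reuses Account's behavior
    def __init__(self, owner, balance=0, rate=0.05):
        super().__init__(owner, balance)
        self.rate = rate

    def describe(self):  # polymorphism: overrides the parent method
        return f"{self.owner} (savings @ {self.rate:.0%}): {self._balance}"


acct = SavingsAccount("Asha", 1000)
acct.deposit(500)
print(acct.describe())  # Asha (savings @ 5%): 1500
```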

2. Linux

  • Introduction to Linux – Establishing the fundamental knowledge of how Linux works and how you can begin with Linux OS.
  • Linux Basics – File Handling, Data extraction, etc.
  • Hands-on Sessions and Assignments for Practice – Strategically curated problem statements for you to start with Linux.

1. Extract Transform Load

  • Web Scraping, Interacting with APIs

 

2. Data Handling with NumPy

  • NumPy Arrays, CRUD Operations, etc.
  • Linear Algebra – Matrix multiplication, CRUD operations, Inverse, Transpose, Rank, Determinant of a matrix, Scalars, Vectors, Matrices.
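A minimal sketch of these NumPy linear algebra operations on a small hypothetical matrix:

```python
import numpy as np

A = np.array([[2, 1], [1, 3]])
b = np.array([5, 10])

product = A @ A                      # matrix multiplication
transpose = A.T                      # transpose
rank = np.linalg.matrix_rank(A)      # rank
det = np.linalg.det(A)               # determinant: 2*3 - 1*1 = 5
inverse = np.linalg.inv(A)           # inverse
x = np.linalg.solve(A, b)            # solve the linear system Ax = b

print(det, x)  # ~5.0, [1. 3.]
```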

 

3. Data Manipulation Using Pandas

  • Loading the data, Data frames, Series, CRUD operations, Splitting the data, etc.
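The CRUD and splitting operations above can be sketched in a few lines of pandas — the DataFrame here is a hypothetical example:

```python
import pandas as pd

# Create: build a small DataFrame
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera"],
    "score": [82, 91, 78],
})

# Read: filter rows by a condition
high = df[df["score"] > 80]

# Update: add a derived column
df["passed"] = df["score"] >= 80

# Delete: drop a column
df = df.drop(columns=["passed"])

# Split the data by a condition
top, rest = df[df["score"] >= 85], df[df["score"] < 85]
print(len(high), len(top))  # 2 1
```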

 

4. Data Preprocessing

  • Exploratory Data Analysis, Feature engineering, Feature scaling, Normalization, standardization, etc.
  • Null Value Imputations, Outliers Analysis and Handling, VIF, Bias-variance Trade-off, Cross-validation techniques, Train-test split, etc.
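Standardization, normalization, and the train-test split above can be sketched from first principles with NumPy (the data is randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(loc=50, scale=10, size=(100, 3))  # hypothetical feature matrix

# Standardization: zero mean, unit variance per feature
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Normalization (min-max scaling) to the [0, 1] range
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Train-test split (80/20) by shuffling row indices
idx = rng.permutation(len(X))
train_idx, test_idx = idx[:80], idx[80:]
X_train, X_test = X[train_idx], X[test_idx]
```

In practice the course would use library helpers (e.g. scikit-learn's scalers and `train_test_split`), which implement the same arithmetic.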

 

5. Scientific Computing with Scipy

  • Introduction to Scipy, Building on top of NumPy
  • What are the characteristics of Scipy?
  • Various subpackages for scipy like Signal, Integrate, Fftpack, Cluster, Optimize, Stats, and more, Bayes Theorem with Scipy.

 

Hands-on Exercise:

  • Importing of scipy
  • Applying the Bayes theorem on the given dataset.
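While the hands-on exercise applies Bayes' theorem with SciPy on a dataset, the underlying arithmetic can be sketched in plain Python with hypothetical medical-test numbers:

```python
# P(Disease | Positive) from P(Positive | Disease), P(Disease), P(Positive)
p_disease = 0.01             # prior probability of having the disease
p_pos_given_disease = 0.95   # test sensitivity
p_pos_given_healthy = 0.05   # false positive rate

# Law of total probability: overall chance of a positive test
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.161
```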

 

6. Data Visualization

  • Bar charts, Scatter plots, Count plots, Line plots, Pie charts, Donut charts, etc, with Python matplotlib.
  • Regression plots, Categorical plots, Area plots, etc, with Python seaborn.
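A minimal matplotlib sketch of two of the chart types above (the categories and values are hypothetical; the `Agg` backend lets the script run headless):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt

categories = ["A", "B", "C"]
values = [23, 45, 12]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.bar(categories, values)         # bar chart
ax1.set_title("Bar chart")
ax2.pie(values, labels=categories)  # pie chart
ax2.set_title("Pie chart")
fig.savefig("charts.png")
```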

1. SQL Basics –

  • Fundamentals of Structured Query Language
  • SQL Tables, Joins, Variables

2. Advanced SQL –

  • SQL Functions, Subqueries, Rules, Views
  • Nested Queries, String functions, Pattern matching
  • Mathematical functions, Date-time functions, etc.

3. Deep Dive into User Defined Functions

  • Types of UDFs, Inline table value, Multi-statement table.
  • Stored procedures, Rank function, SQL ROLLUP, etc.

4. SQL Optimization and Performance

  • Record grouping, Searching, Sorting, etc.
  • Clustered indexes, Common table expressions.
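Several of the SQL concepts above (subqueries, grouping, sorting, aggregate functions) can be tried out with Python's built-in `sqlite3` module on a hypothetical employee table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT, salary REAL)")
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Data", 90000), (2, "Ravi", "Data", 75000), (3, "Meera", "HR", 60000)],
)

# Correlated subquery: employees earning above their department's average
cur.execute("""
    SELECT name FROM employees e
    WHERE salary > (SELECT AVG(salary) FROM employees WHERE dept = e.dept)
""")
above_avg = [row[0] for row in cur.fetchall()]

# Grouping and sorting with aggregate functions
cur.execute("SELECT dept, COUNT(*), AVG(salary) FROM employees GROUP BY dept ORDER BY dept")
summary = cur.fetchall()
print(above_avg, summary)
```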

1. Basic Mathematics – Linear Algebra, MultiVariate Calculus

2. Descriptive Statistics –

  • Measure of central tendency, Measure of spread, Five points summary, etc.

3. Probability

  • Definition, Random Variable, Probability Distributions and use cases, Bayes theorem, Mathematical Expectation, Markov and Chebyshev Inequality.

4. Inferential Statistics –

  • Correlation, covariance, Confidence intervals, Hypothesis testing, F-test, Z-test, t-test, ANOVA, Chi-square test, etc.
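As one concrete instance of hypothesis testing, a two-sample t-statistic (equal variances assumed) can be computed from first principles on two hypothetical samples; `scipy.stats.ttest_ind` would give the same statistic:

```python
import math

a = [12.1, 11.8, 12.4, 12.0, 11.9]  # hypothetical sample A
b = [11.2, 11.5, 11.0, 11.4, 11.3]  # hypothetical sample B

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):  # sample variance (n - 1 in the denominator)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Pooled standard deviation across both samples
sp = math.sqrt(((len(a) - 1) * var(a) + (len(b) - 1) * var(b))
               / (len(a) + len(b) - 2))

# t-statistic: difference in means scaled by its standard error
t = (mean(a) - mean(b)) / (sp * math.sqrt(1 / len(a) + 1 / len(b)))
print(round(t, 2))  # 5.66
```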

1. Introduction to Machine learning

  • Supervised, Unsupervised learning.
  • Introduction to Scikit-learn, Keras, etc.

 

2. Supervised Learning

  • Regression – Introduction to regression problems, identification of a regression problem, dependent and independent variables; how to train the model in a regression problem, how to evaluate it, and how to optimize its efficiency.
  • Classification – Introduction to classification problems, identification of a classification problem, dependent and independent variables; how to train the model in a classification problem, how to evaluate it, and how to optimize its efficiency.
  • Linear Regression – Creating linear regression models for linear data using Statistical tests, Data Pre-processing, Standardization, Normalization, etc.
  • Logistic Regression – Creating logistic regression models for classification problems – such as if a person is diabetic or not, if there will be rain or not, etc.
  • Decision Tree – Creating decision tree models on classification problems in a tree-like format with optimal solutions.
  • Random Forest – Creating random forest models for classification problems in a supervised learning approach.
  • Support Vector Machine – SVM or support vector machines for regression and classification problems.
  • Gradient Descent – Gradient descent algorithm that is an iterative optimization approach to finding local minimum and maximum of a given function.
  • K-Nearest Neighbors – A simple algorithm that can be used for classification problems.
  • Time Series Forecasting – Making use of time series data, gathering insights and useful forecasting solutions using time series forecasting.
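To make the regression workflow concrete: the course trains such models with scikit-learn, but the mathematics of ordinary least squares can be sketched directly in NumPy on synthetic data (true slope 3, intercept 2, plus noise):

```python
import numpy as np

# Hypothetical data: y depends linearly on x with some noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=x.size)

# Closed-form ordinary least squares: add an intercept column and solve
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Evaluate the fit with the R^2 score
y_pred = X @ coef
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
```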

 

3. Unsupervised Learning

  • Clustering – Introduction to clustering problems, Identification of a clustering problem, dependent and independent variables, How to train the model in a clustering problem, How to evaluate the model for a clustering problem, How to optimize the efficiency of the clustering model.
  • K-means – The k-means algorithm that can be used for clustering problems in an unsupervised learning approach.
  • Dimensionality reduction – Handling multi-dimensional data and standardizing the features for easier computation.
  • Principal Component Analysis – PCA follows the same approach in handling multi-dimensional data.
  • Linear Discriminant Analysis – LDA or linear discriminant analysis to reduce or optimize the dimensions in the multidimensional data.
  • Association Rule Mining – Identifying strong rules in the data using machine learning.
  • Apriori Algorithm – For finding frequent itemsets in a dataset.
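The k-means idea above — alternate between assigning points to their nearest center and moving each center to the mean of its points — can be sketched in a few lines of NumPy on two well-separated synthetic blobs (in practice scikit-learn's `KMeans` would be used):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # init from data points
    for _ in range(iters):
        # Assignment step: nearest center for every point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

# Two well-separated hypothetical clusters
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0, 0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[5, 5], scale=0.3, size=(50, 2))
X = np.vstack([blob_a, blob_b])

labels, centers = kmeans(X, k=2)
```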

4. Performance Metrics

  • Classification reports – To evaluate the model on various metrics like recall, precision, f-support, etc.
  • Confusion matrix – To evaluate the true positive/negative, false positive/negative outcomes in the model.
  • r2, adjusted r2, mean squared error, etc.
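The confusion-matrix counts and the derived metrics can be computed from first principles on hypothetical binary predictions (in practice `sklearn.metrics` provides these directly):

```python
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Confusion-matrix cells
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

# Metrics derived from the confusion matrix
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / len(y_true)
print(tp, tn, fp, fn, precision, recall)  # 4 4 1 1 0.8 0.8
```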

1 – Non-Relational Data Stores and Azure Data Lake Storage

1.1  Document data stores
1.2  Columnar data stores
1.3  Key/value data stores
1.4  Graph data stores
1.5  Time series data stores
1.6  Object data stores
1.7  External index
1.8  Why NoSQL or Non-Relational DB?
1.9  When to Choose NoSQL or Non-Relational DB?
1.10  Azure Data Lake Storage
Definition, Azure Data Lake key components, How does it store data?, Azure Data Lake Storage Gen2, Why Data Lake?, Data Lake Architecture

2 – Data Lake and Azure Cosmos DB
2.1  Data Lake Key Concepts
2.2  Azure Cosmos DB
2.3  Why Azure Cosmos DB?
2.4  Azure Blob Storage
2.5  Why Azure Blob Storage?
2.6  Data Partitioning: Horizontal partitioning, Vertical partitioning, Functional partitioning
2.7  Why Partitioning Data?
2.8  Consistency Levels in Azure Cosmos DB: Semantics of the five consistency levels
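Horizontal (hash) partitioning, one of the strategies listed in 2.6, can be illustrated independently of Azure: a hash of each record's key deterministically maps it to one partition, spreading records roughly evenly (the keys and partition count here are hypothetical):

```python
import hashlib

def partition_key(key: str, n_partitions: int) -> int:
    """Horizontal (hash) partitioning: map a record key to a partition."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n_partitions

records = [{"id": f"user-{i}", "city": "Guwahati"} for i in range(1000)]
shards = {p: [] for p in range(4)}
for rec in records:
    shards[partition_key(rec["id"], 4)].append(rec)

# Every record lands in exactly one partition; sizes are roughly even
sizes = [len(shards[p]) for p in range(4)]
```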

3 – Relational Data Stores
3.1  Introduction to Relational Data Stores
3.2  Azure SQL Database – Deployment Models, Service Tiers
3.3  Why SQL Database Elastic Pool?

4 – Why Azure SQL?
4.1  Azure SQL Security Capabilities
4.2  High-Availability and Azure SQL Database: Standard Availability Model, Premium Availability Model
4.3  Azure Database for MySQL
4.4  Azure Database for PostgreSQL
4.5  Azure Database for MariaDB
4.6  What is PolyBase and Why PolyBase?
4.7  What is Azure Synapse Analytics (formerly SQL DW): SQL Analytics and SQL pool in Azure Synapse, Key component of a big data solution, SQL Analytics MPP architecture components

5 – Azure Batch
5.1  What is Azure Batch?
5.2  Intrinsically Parallel Workloads
5.3  Tightly Coupled Workloads
5.4  Additional Batch Capabilities
5.5  Working of Azure Batch

6 – Azure Data Factory
6.1  Flow Process of Data Factory
6.2  Why Azure Data Factory
6.3  Integration Runtime in Azure Data Factory
6.4  Mapping Data Flows

7 – Azure Databricks
7.1  What is Azure Databricks?
7.2  Azure Spark-based Analytics Platform
7.3  Apache Spark in Azure Databricks

8 – Azure Stream Analytics
8.1  Working of Stream Analytics
8.2  Key capabilities and benefits
8.3  Stream Analytics Windowing Functions: Tumbling window, Hopping Window, Sliding Window, Session Window
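Of the windowing functions above, the tumbling window is the simplest: fixed-size, non-overlapping intervals where each event belongs to exactly one window. A plain-Python sketch of that idea (the events and window size are hypothetical; Stream Analytics expresses this declaratively in its query language):

```python
events = [
    (1, 10), (3, 20), (7, 5),   # (timestamp in seconds, value)
    (11, 7), (14, 3),
    (22, 9),
]

WINDOW = 10  # window length in seconds

windows = {}
for ts, value in events:
    bucket = ts // WINDOW        # each event maps to exactly one window
    windows[bucket] = windows.get(bucket, 0) + value

# window 0 covers [0, 10):  10 + 20 + 5 = 35
# window 1 covers [10, 20): 7 + 3 = 10
# window 2 covers [20, 30): 9
print(windows)  # {0: 35, 1: 10, 2: 9}
```

A hopping window would differ only in that windows overlap, so one event can contribute to several buckets.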

1. Artificial Intelligence Basics

  • Introduction to TensorFlow
  • Keras API

2. Neural Networks

  • Neural networks
  • Multi-layered Neural Networks
  • Artificial Neural Networks

3. Deep Learning

  • Deep Learning Libraries
  • Deep neural networks
  • Convolutional Neural Networks
  • LSTM
  • Recurrent Neural Networks
  • GPU in deep learning
  • Autoencoders, Restricted Boltzmann machine
  • Deep Learning Applications
  • Chatbots
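To ground the neural-network topics above: the forward pass of a tiny two-layer network can be written in plain NumPy — a sketch of what Keras/TensorFlow automates, with randomly initialized (hypothetical) weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=1, keepdims=True)

X = rng.normal(size=(4, 3))                       # batch of 4 samples, 3 features
W1 = rng.normal(size=(3, 5)); b1 = np.zeros(5)    # hidden layer parameters
W2 = rng.normal(size=(5, 2)); b2 = np.zeros(2)    # output layer parameters

hidden = relu(X @ W1 + b1)        # hidden activations
probs = softmax(hidden @ W2 + b2) # class probabilities per sample
```

Training would add a loss function and backpropagation, which frameworks like TensorFlow compute automatically.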

1. Text Mining, Cleaning, and Pre-processing

  • Various Tokenizers, Tokenization, Frequency Distribution, Stemming, POS Tagging, Lemmatization, Bigrams, Trigrams & Ngrams, Entity Recognition.

2. Text classification, NLTK, sentiment analysis, etc

  • Overview of Machine Learning, Words, Term Frequency, Countvectorizer, Inverse Document Frequency, Text conversion, Confusion Matrix, Naive Bayes Classifier.
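The term frequency and inverse document frequency concepts above are exactly what `CountVectorizer`/TF-IDF compute; the arithmetic can be sketched from first principles on a toy corpus:

```python
from collections import Counter
import math

docs = [
    "the movie was great",
    "the movie was terrible",
    "great acting great story",
]

tokenized = [doc.split() for doc in docs]

# Term frequency: raw word counts per document
tf = [Counter(tokens) for tokens in tokenized]

# Inverse document frequency: terms appearing in fewer documents score higher
n_docs = len(docs)
vocab = set(w for tokens in tokenized for w in tokens)
idf = {w: math.log(n_docs / sum(1 for t in tokenized if w in t)) for w in vocab}

print(tf[2]["great"], round(idf["the"], 3), round(idf["terrible"], 3))
```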

3. Sentence Structure, Sequence Tagging, Sequence Tasks, and Language Modeling

  • Language Modeling, Sequence Tagging, Sequence Tasks, Predicting Sequence of Tags, Syntax Trees, Context-Free Grammars, Chunking, Automatic Paraphrasing of Texts, Chinking.

4. AI Chatbots and Recommendations Engine

  • Using the NLP concepts, build a recommendation engine and an AI chatbot assistant using AI.

1. Introduction to MLOps

  • MLOps lifecycle
  • MLOps pipeline
  • MLOps Components, Processes, etc

2. Deploying Machine Learning Models

  • Introduction to Azure Machine Learning
  • Deploying Machine Learning Models using Azure

1. Power BI Basics

  • Introduction to PowerBI, Use cases and BI Tools, Data Warehousing, Power BI components, Power BI Desktop, workflows and reports, Data Extraction with Power BI.
  • SaaS Connectors, Working with Azure SQL database, Python and R with Power BI
  • Power Query Editor, Advance Editor, Query Dependency Editor, Data Transformations, Shaping and Combining Data, M Query and Hierarchies in Power BI.

2. DAX

  • Data Modeling and DAX, Time Intelligence Functions, DAX Advanced Features

3. Data Visualization with Analytics

  • Slicers, Filters, Drill Down Reports
  • Power BI Query, Q & A and Data Insights
  • Power BI Settings, Administration and Direct Connectivity
  • Embedded Power BI API and Power BI Mobile
  • Power BI Advance and Power BI Premium
Capstone Project

The Data Science capstone project focuses on establishing a stronghold of analyzing a problem and coming up with solutions based on insights from the data analysis perspective. The capstone project will help you master the following verticals:

  • Extracting, Loading, and Transforming data into a usable format to gather insights.
  • Data manipulation and handling to pre-process the data.
  • Feature engineering and scaling the data for various problem statements.
  • Model selection and model building on various classification and regression problems using supervised/unsupervised machine learning algorithms.
  • Assessment and monitoring of the machine learning models created.

1. Recommendation Engine – The case study will guide you through various processes and techniques in machine learning to build a recommendation engine that can be used for movie recommendations, Restaurant recommendations, book recommendations, etc.
2. Rating Predictions – This text classification and sentiment analysis case study will guide you towards working with text data and building efficient machine learning models that can predict ratings, Sentiments, etc.
3. Census – Using predictive modeling techniques on the census data, you will be able to create actionable insights for a given population and create machine learning models that will predict or classify various features like total population, user income, etc.
4. Housing – This real estate case study will guide you towards real-world problems, where a culmination of multiple features will guide you towards creating a predictive model to predict housing prices.
5. Object Detection – A much more advanced yet simple case study that will guide you toward making a machine-learning model that can detect objects in real-time.
6. Stock Market Analysis – Using historical stock market data, you will learn about how feature engineering and feature selection can provide you with some really helpful and actionable insights for specific stocks.
7. Banking Problem – A classification problem that predicts consumer behavior based on various features using machine learning models.
8. AI Chatbot – Using the NLTK python library, you will be able to apply machine learning algorithms and create an AI chatbot.


Program Highlights

400 Hrs of Applied Learning
70+ Live sessions across 9 months
218 Hrs of Self-Paced Learning
24*7 Support

Project Work

All the projects included in this program are aligned with industry demands and standards. These industry-oriented projects will test your level of knowledge in the Data Science & Data Engineering domain and also give you exposure to real-life scenarios.

Career Services By Intellipaat

  • 3 Guaranteed Interviews
  • Exclusive access to Intellipaat Job Portal
  • Mock Interview Preparation
  • 1-on-1 Career Mentoring Sessions
  • Career Oriented Sessions
  • Resume & LinkedIn Profile Building

Our Alumni Works At

Hiring Partners

Peer Learning

Via Intellipaat PeerChat, you can interact with your peers across all classes and batches and even our alumni. Collaborate on projects, share job referrals & interview experiences, compete with the best, make new friends – the possibilities are endless and our community has something for everyone!

Community channels include class-notifications, hackathons, career-services, major-announcements, and collaborative-learning.

Admission Details

The application process consists of three simple steps. An offer of admission will be made to selected candidates based on the feedback from the interview panel. The selected candidates will be notified over email and phone, and they can block their seats through the payment of the admission fee.


Submit Application

Tell us a bit about yourself and why you want to join this program


Application Review

An admission panel will shortlist candidates based on their application


Admission

Selected candidates will be notified within 1–2 weeks

Program Fee

Total Admission Fee

₹ 99,009

Apply Now

EMI Starts at

₹ 5,500

We have partnered with financing companies to provide competitive financing options at a 0% interest rate with no hidden costs.

Financing Partners


The credit facility is provided by a third-party credit facility provider and any arrangement with such third party is outside Intellipaat’s purview.

Upcoming Application Deadline 11th December 2024

Admissions are closed once the requisite number of participants enroll for the upcoming cohort. Apply early to secure your seat.

Program Cohorts

Next Cohorts


                     Date             Time            Batch Type
Program Induction    26th Nov 2022    08:00 PM IST    Weekend (Sat-Sun)
Regular Classes      26th Nov 2022    08:00 PM IST    Weekend (Sat-Sun)
Apply Now

Frequently Asked Questions

Why should I sign up for this Data Science and Data Engineering course?

The Advanced Certification in Data Science and Data Engineering course is conducted by leading experts from E&ICT, IIT Guwahati and Intellipaat, who will make you proficient in these fields through online video lectures and projects. They will help you gain in-depth knowledge in Data Science, apart from providing hands-on experience in these domains through real-time projects.

After completing the course and successfully executing the assignments and projects, you will receive an Advanced Certification in Data Science and Data Engineering from Intellipaat and E&ICT, IIT Guwahati, which will be recognized by top organizations around the world. Also, our job assistance team will prepare you for your job interviews by conducting several mock interviews, preparing your resume, and more.

Intellipaat provides career services that include guaranteed interviews for all learners who successfully complete this course. E&ICT, IIT Guwahati is not responsible for the career services.

Completing this program requires 9 months of attending live classes and completing the assignments and projects along the way.

If, under any circumstance, you miss a live class, you will be given a recording of that class within the next 12 hours. Also, if you need any help, you will have access to 24*7 technical support for any sort of query resolution.

To complete this program, you will have to spare around 6 hours a week to learn. Classes will be held over weekends (Sat/Sun), and each session will be 3 hours long.

Upon completion of all of the requirements of the program, you will be awarded a certificate from E&ICT, IIT Guwahati.


What is included in this course?

  • Unbiased career guidance
  • Counselling based on your skills and preference
  • No repetitive calls, only as per convenience
  • Rigorous curriculum designed by industry experts
  • Complete this program while you work