Data Science Course for Undergraduates

The Data Science course for undergraduates provided by Intellipaat gives an in-depth understanding of R programming for statistical analysis, data manipulation and visualization techniques, Machine Learning algorithms, fundamentals of Big Data, business analytics, and more. You will also learn association rule mining and the implementation of recommendation engines. This certification course in Data Science gives you experience of working on real-life industry projects in fields such as retail, banking, finance, and entertainment, helping you build the skills of a Data Scientist.

Key Features

  • Instructor-led Training: 24 weeks
  • Real-World Industry Projects
  • Get Certified & Job Assistance
  • Min. 3 interviews guaranteed with top IT companies
  • Industry Level Hackathons to prove Expertise

About Course

This comprehensive Data Science course provides detailed training on Data Analytics, data manipulation and visualization, statistical methods for data analysis, and Machine Learning. At the end of each module, you solve assignments on that module's topics to gain an in-depth grasp of the concepts. Finally, you work on several real-life projects that strengthen your learning and implementation skills.

What will you learn in this course?

This course covers the following topics:

  1. Introduction to Data Science
  2. Data Science life cycle
  3. Data retrieval processes
  4. Data mining and data manipulation
  5. Machine Learning algorithms
  6. Basics of Big Data
  7. Integration of Hadoop with R
  8. Creating recommender systems

What are the requirements for enrolling in this Data Science course?

There are no specific prerequisites for undergraduate students who want to enroll in this Data Science certification course.

Why should you join this course?

According to Indeed, the average salary of a Data Scientist in the United States is US$121,650 and, in India, it is ₹856,000. These figures suggest that Data Science jobs are among the highest-paying jobs in the industry. Data Scientists are hired to add value to organizations and increase revenue by extracting business insights.

The subject matter experts at Intellipaat ensure that students gain the skills required to become a Data Scientist by working on real-life projects. Students can also add these projects to their résumé or use one of them as their final-year project.

Intellipaat provides all its learners with job assistance. Learners get the opportunity to connect with 200+ employers and a chance at three face-to-face interviews. For these reasons, Intellipaat’s Data Science certification course is worth enrolling in.

Data Science Course Content

Introduction to Data Science with R

What is Data Science, significance of Data Science in today’s digitally-driven world, applications of Data Science, lifecycle of Data Science, components of the Data Science lifecycle, introduction to big data and Hadoop, introduction to Machine Learning and Deep Learning, introduction to R programming and R Studio.

Hands-on Exercise – Installation of R Studio, implementing simple mathematical operations and logic using R operators, loops, if statements and switch cases.

Data Exploration

Introduction to data exploration, what is exploratory data analysis, importing and exporting data to/from external sources, dataframes, working with dataframes, accessing individual elements, vectors and factors, operators, in-built functions, conditional and looping statements, user-defined functions, matrix, list and array.

Hands-on Exercise – Accessing individual elements of customer churn data, modifying and extracting the results from the dataset using user-defined functions in R.

Data Manipulation

Need for data manipulation, introduction to the dplyr package, selecting one or more columns with the select() function, filtering out records on the basis of a condition with the filter() function, adding new columns with the mutate() function, sampling & counting with the sample_n(), sample_frac() & count() functions, getting summarized results with the summarise() function, combining different functions with the pipe operator, implementing SQL-like operations with sqldf.

Hands-on Exercise – Implementing dplyr to perform various operations for abstracting over how data is manipulated and stored.

Data Visualization

Introduction to visualization, different types of graphs, introduction to the grammar of graphics & the ggplot2 package, understanding categorical distribution with the geom_bar() function, understanding numerical distribution with the geom_histogram() function, building frequency polygons with geom_freqpoly(), making a scatter plot with the geom_point() function, multivariate analysis with geom_boxplot(), univariate analysis with bar plot, histogram and density plot, multivariate distribution with scatter plots and smooth lines, continuous vs categorical with box plots, subgrouping the plots, adding themes with the theme() layer, working with coordinates and themes to make the graphs more presentable, introduction to plotly & various plots, visualization with the ggvis package, geographic visualization with ggmap(), building web applications with Shiny.

Hands-on Exercise – Creating data visualizations with ggplot2 and Plotly to understand the customer churn ratio. You will visualize tenure, monthly charges, total charges and other individual columns by using scatter plots.

Introduction to Statistics

Why do we need Statistics?, categories of Statistics, statistical terminologies, types of data, measures of central tendency, measures of spread, correlation & covariance, standardization & normalization, probability & types of probability, hypothesis testing, Chi-Square testing, ANOVA, normal distribution, binomial distribution.

Hands-on Exercise – Building a statistical analysis model that uses quantifications, representations, experimental data for gathering, reviewing, analyzing and drawing conclusions from data.
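The difference between standardization and normalization listed in this module can be shown with a short sketch. The course itself works in R; the Python below is only an illustration, and the sample values and function names are assumptions made up for this example.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

def normalize(values):
    """Min-max normalization: rescale values into the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical monthly charges, not taken from any course dataset.
monthly_charges = [20.0, 35.0, 50.0, 65.0, 80.0]

z_scores = standardize(monthly_charges)  # centered on 0 with unit spread
scaled = normalize(monthly_charges)      # smallest value maps to 0, largest to 1
```

Standardized values keep their relative spacing but are expressed in standard deviations from the mean, while normalized values are bounded between 0 and 1.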

Machine Learning

Introduction to Machine Learning, introduction to Linear Regression, predictive modeling with Linear Regression, simple and multiple Linear Regression, concepts and detailed formulas, assumptions and residual diagnostics in Linear Regression (residuals, qqnorm(), qqline()), understanding the fit of the model, building a simple linear model, predicting results and finding the p-value, understanding the summary results with Null Hypothesis, p-value & F-statistic, building linear models with multiple independent variables, introduction to logistic regression, comparing linear regression and logistic regression, bivariate & multivariate logistic regression, confusion matrix & accuracy of model, threshold evaluation with ROCR.

Hands-on Exercise – Modeling the relationship within the data using linear predictor functions. Implementing Linear & Logistic Regression in R by building a model with ‘tenure’ as the dependent variable and multiple independent variables.

Logistic Regression

Introduction to Logistic Regression, Logistic Regression concepts, Linear vs Logistic Regression, math behind Logistic Regression, detailed formulas, logit function and odds, bivariate Logistic Regression, Poisson Regression, building a simple “binomial” model and predicting results, evaluating the built model with the confusion matrix, accuracy, true positive rate and false positive rate, threshold evaluation with ROCR, finding the right threshold by building the ROC plot, cross validation & multivariate Logistic Regression, building logistic models with multiple independent variables, real-life applications of Logistic Regression.

Hands-on Exercise – Implementing predictive analytics by describing the data and explaining the relationship between one binary dependent variable and one or more independent variables. You will use glm() to build a model and use ‘Churn’ as the dependent variable.
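The logit function, the confusion matrix and threshold evaluation mentioned in this module can be sketched as follows. The course uses R with glm() and ROCR; this Python version is only an illustration, and the predicted probabilities, labels and helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Map a linear predictor (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of sigmoid: the log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def confusion_counts(probs, labels, threshold=0.5):
    """Return (tp, fp, tn, fn) when predicting 1 for probabilities >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def rates(tp, fp, tn, fn):
    """Accuracy, true positive rate and false positive rate from the counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }

# Hypothetical model outputs and true churn labels (1 = churned).
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

at_half = rates(*confusion_counts(probs, labels, threshold=0.5))
at_lower = rates(*confusion_counts(probs, labels, threshold=0.35))
```

Lowering the threshold in this toy data catches every churner (a higher true positive rate) without adding false positives; on real data the trade-off is usually less favorable, which is what an ROC plot helps to examine.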

Decision Trees & Random Forest

What is classification and different classification techniques, introduction to Decision Tree, algorithm for decision tree induction, building a decision tree in R, creating a perfect Decision Tree, Confusion Matrix, Regression trees vs Classification trees, introduction to ensemble of trees and bagging, Random Forest concept, implementing Random Forest in R, what is Naive Bayes, computing probabilities, impurity functions (Entropy, Information gain and Gini index), understanding how entropy, information gain and the Gini index are used to find the right split of a node, overfitting & pruning, pre-pruning, post-pruning, cost-complexity pruning, pruning a decision tree and predicting values, finding the right number of trees and evaluating performance metrics.

Hands-on Exercise – Implementing Random Forest for both regression and classification problems. You will build a tree, prune it by using ‘churn’ as the dependent variable and build a Random Forest with the right number of trees, using ROCR for performance metrics.
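The impurity functions named in this module (entropy, Gini index and information gain) can be sketched in a few lines. The course implements decision trees in R; this Python version is only illustrative, and the labels and the candidate split are made up.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class labels in a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical node with 4 churners and 4 non-churners, and one candidate split.
parent = ["yes"] * 4 + ["no"] * 4
split = [["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]]

gain = information_gain(parent, split)
```

A split with higher information gain (or lower weighted Gini index) separates the classes better, which is the criterion a tree-induction algorithm uses when choosing where to split a node.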

Unsupervised learning

Introduction to Unsupervised Learning, what is Clustering & its use cases, what is K-means Clustering, what is Canopy Clustering, what is Hierarchical Clustering, feature extraction & clustering algorithms, theoretical aspects of k-means and the k-means process flow, implementing K-means in R on the dataset and finding the right number of clusters using a scree plot, understanding Hierarchical Clustering, implementing it in R and examining Dendrograms, Principal Component Analysis explained in detail, implementing PCA in R.

Hands-on Exercise – Deploying unsupervised learning with R to achieve clustering and dimensionality reduction, K-means clustering for visualizing and interpreting results for the customer churn data.
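The k-means process flow covered in this module (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched as below. The course implements k-means in R; this plain-Python version is an illustration only, with made-up points and fixed starting centroids so the result is deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Basic k-means: alternate the assignment and centroid-update steps."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the number of clusters is not known in advance, which is why the module pairs k-means with a scree plot for choosing it.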

Association Rule Mining & Recommendation Engine

Introduction to Association Rule Mining & Market Basket Analysis, measures of Association Rule Mining: Support, Confidence, Lift, Apriori algorithm & implementing it in R, introduction to Recommendation Engine, user-based collaborative filtering & item-based collaborative filtering, implementing user-based and item-based Recommendation Engines in R, recommendation use cases.

Hands-on Exercise – Deploying association analysis as a rule-based machine learning method, identifying strong rules discovered in databases using measures of interestingness such as support, confidence and lift.
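The three measures of association rule mining listed above (support, confidence and lift) can be computed by hand on a small example. The course uses the Apriori algorithm in R; the Python below is only a sketch of the measures themselves, and the transactions are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's support; 1.0 means independence."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)

# Hypothetical stationery-store baskets.
transactions = [
    {"pen", "notebook"},
    {"pen", "notebook", "eraser"},
    {"pen", "eraser"},
    {"notebook"},
    {"pen", "notebook", "stapler"},
]

rule_support = support(transactions, {"pen", "notebook"})
rule_confidence = confidence(transactions, {"pen"}, {"notebook"})
rule_lift = lift(transactions, {"pen"}, {"notebook"})
```

A lift below 1, as in this toy data, means buying a pen makes a notebook slightly less likely than its overall frequency would suggest, so the rule would not count as a strong one.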

Data Science Projects

What projects will I be working on in this Data Science certification course?

Project 1 : Augmenting retail sales with Data Science

Industry :  Retail

Problem Statement : How to deploy the various rules and algorithms of Data Science for analyzing stationery store purchase data.

Topics : In this project, you will apply Data Science tools such as association rules, the Apriori algorithm in R, and the support, confidence and lift measures of association rules. You will analyze three days of purchase data from the stationery outlet and understand customer buying patterns across products.

Highlights :

  • Association rules for transaction data
  • Association mining with Apriori algorithm
  • Generating rules and identifying patterns.

Project 2 : Analyzing pre-paid model of stock broking

Industry : Finance

Problem Statement : Finding out the deciding factor for people to opt for the pre-paid model of stock broking.

Topics : In this Data Science project, you will learn about the variables that are highly correlated in the pre-paid brokerage model, analyze various market opportunities, and develop targeted promotion plans for products sold under various categories. You will also carry out competitor analysis and weigh the advantages and disadvantages of the pre-paid model.

Highlights :

  • Deploying the rules of statistical analysis
  • Implementing data visualization
  • Linear regression for predictive modeling.

Project 3 : Cold Start Problem in Data Science

Industry : Ecommerce

Problem Statement : How to build a recommender system without historical data available.

Topics : This project involves understanding of the cold start problem associated with the recommender systems. You will gain hands-on experience in information filtering, working on systems with zero historical data to refer to, as in the case of launching a new product. You will gain proficiency in working with personalized applications like movies, books, songs, news and such other recommendations. This project includes the various ways of working with algorithms and deploying other data science techniques.

Highlights :

  • Algorithms for Recommender
  • Ways of Recommendation
  • Types of Recommendation: Collaborative Filtering Based Recommendation, Content-Based Recommendation
  • Complete mastery in working with the Cold Start Problem.

Project 4 : Movie Recommendation

Topics : This is a real-world project that gives you hands-on experience in working with a movie recommender system. Depending on which movies a particular user likes, you will be in a position to provide data-driven recommendations. This project involves understanding recommender systems, information filtering, predicting ‘rating’, learning about user ‘preference’ and so on. You will work exclusively on data related to user details, movie details and others. The main components of the project include the following:

  • Two Types of Predictions – Rating Prediction, Item Prediction
  • Important Approaches: Memory Based and Model-Based
  • Knowing User Based Methods in K-Nearest Neighbor
  • Understanding Item Based Method
  • Matrix Factorization
  • Singular Value Decomposition
  • Data Science Project discussion
  • Collaborative Filtering
  • Business Variables Overview

Project 5 : Prediction on Pokemon dataset

Industry : Gaming

Problem Statement : For the purpose of this case study, you are a Pokemon trainer who is on the way to catching all 800 Pokemon.

Topics : This real-world project will give you hands-on experience with the data science life cycle. You’ll understand the structure of the ‘Pokemon’ dataset and use machine learning algorithms to make some predictions. You will use the dplyr package to filter out specific Pokemon and use decision trees to find out whether a Pokemon is legendary or not.

Highlights :

  • dplyr package to filter Pokemons
  • Decision Tree algorithm
  • Linear regression algorithm.

Project 6 : Book Recommender System

Industry : E-commerce

Problem Statement : Building a book recommender system for readers with similar interests

Topics : This real-world project will give you hands-on experience in working with a book recommender system. Depending on which books a particular user reads, you will be in a position to provide data-driven recommendations. You will understand the structure of the data and visualize it to find interesting patterns.

Highlights :

  • Data analysis & visualization
  • Recommender Lab
  • User Based Collaborative Filtering Model.

Project 7: Census Income

Domain: Social

Problem Statement: In this project, you will process the data and then develop an understanding of its different features by performing exploratory analysis and creating visualizations. Once you know the attributes well enough, you will perform a classification task to predict whether an individual makes more than 50K a year, using different Machine Learning algorithms.

Topics: An end-to-end exhaustive project comprising topics in:

  • Data Processing
  • Data Manipulation
  • Data Visualization
  • Linear Regression
  • Logistic Regression
  • Decision Tree
  • Random Forest

Project 8: Loan Prediction

Domain: Banking

Problem Statement: You are the Senior Data Scientist at a major private bank. Over the last six months, the number of customers who are unable to repay their loans has increased. Keeping this in mind, you have to look at your customer data and analyze which customers should be approved for a loan and which should be denied.

Topics: An exhaustive project on Customer_loan Dataset comprising topics in:

  • Data Processing
  • Model Building

Project 9: Capstone

Industry: Analytics

Problem Statement: Predicting if the customer will churn or not.

Topics: An end-to-end capstone project comprising:

  • Manipulating and visualizing the data for insights.
  • Implementing the linear regression model to predict continuous values.
  • Implementing classification models – decision tree, logistic regression, and random forest on “customer churn”.


This is an end-to-end capstone project covering all the modules. You’ll start off by manipulating and visualizing the data to get interesting insights. Then you’ll implement the linear regression model to predict continuous values. After that, you’ll implement the classification models (logistic regression, decision tree and random forest) on the “customer churn” data frame to find out whether the customer will churn or not.

Data Science Certification

As part of this training, you will be working on real-time projects and assignments that have immense implications in the real-world industry scenarios, thus helping you fast-track your career effortlessly.

The Intellipaat Course Completion Certificate is awarded upon completion of the project work (after expert review) and upon scoring at least 60% marks in the assignments that are made available as part of the training program. Intellipaat certification is well recognized in 80+ top MNCs like Ericsson, Cisco, Cognizant, Sony, Mu Sigma, Saint-Gobain, Standard Chartered, TCS, Genpact, Hexaware, etc. for its structured learning, industry-oriented projects and professional teaching methodologies, among other advantages.

Data Science Reviews

  1. Nitesh Kumar Dash

    Learner-friendly training

    The Intellipaat Data Science certification training videos really made me excited about studying Data Science. They were so elaborate and so professionally created that I could learn Data Science from the comfort of my home, thanks to those learner-friendly videos. I am grateful to Intellipaat.

  2. Giri Karnal

    Excellent training

    I had taken the Data Science masters’ program which is a combo of SAS, R and Apache Mahout. Since there are so many technologies involved in the Data Science course, getting your query resolved at the right time becomes the most important aspect. But with Intellipaat, there was no such problem as all my queries were resolved in less than 24 hours.

  3. Swetha Pandit

    Valuable material for learning. Worth spending!

    Their Data Science courses are well structured and taught by recognized professionals which helps one to learn Data Science fast. I have found the videos to be of excellent quality. Thanks.

Frequently Asked Questions on Data Science

How can I make full use of the projects in this program?

Since the projects are industry-level and drawn from various domains, you can repurpose them as your final-year project alongside learning and mastering the concepts. They show that you have learnt the concepts thoroughly, which is a big advantage with potential employers.

What are the different modes of training that Intellipaat provides?

At Intellipaat, you can enroll in either instructor-led online training or self-paced training. Apart from this, Intellipaat also offers corporate training for organizations to upskill their workforce. All trainers at Intellipaat have 12+ years of relevant industry experience and have been actively working as consultants in the same domain, which makes them subject matter experts. Go through the sample videos to check the quality of the trainers.

Can I request a support session if I need to better understand the topics?

Intellipaat offers 24/7 query resolution, and you can raise a ticket with the dedicated support team at any time. Email support is available for all your queries. If your query is not resolved through email, we can also arrange one-to-one sessions with the trainers. You can contact Intellipaat support even after completing the training. We also do not limit the number of tickets you can raise for query resolution and doubt clearance.

Can you explain the benefits of the Intellipaat self-paced training?

Intellipaat offers self-paced training to those who want to learn at their own pace. This training also gives you query resolution through email, one-on-one sessions with trainers, round-the-clock support and lifetime access to the learning modules (LMS). You also get the latest version of the course material at no added cost. The Intellipaat self-paced training is priced 75% lower than the online instructor-led training. If you face any problems while learning, we can arrange a virtual live class with the trainers as well.

What kind of projects are included as part of the training?

Intellipaat offers up-to-date, relevant and high-value real-world projects as part of the training program. This way, you can apply what you have learnt in a real-world industry setup. All training comes with multiple projects that thoroughly test your skills, learning and practical knowledge, making you industry-ready. You will work on projects in the domains of high technology, ecommerce, marketing, sales, networking, banking, insurance, etc. Upon successful completion of the projects, your skills will be considered equal to six months of rigorous industry experience.

Does Intellipaat offer job assistance?

Intellipaat actively provides placement assistance to all learners who have successfully completed the training. For this, we have tied up with over 80 top MNCs from around the world. This way, you can be placed in organizations like Sony, Ericsson, TCS, Mu Sigma, Standard Chartered, Cognizant and Cisco, among other enterprises. We also help you with job interview and résumé preparation.

Is it possible to switch from self-paced training to instructor-led training?

You can switch from self-paced to online instructor-led training by paying the extra amount and joining the next batch of the training, about which you will be notified.

How are Intellipaat verified certificates awarded?

Once you complete the Intellipaat training program along with all the real-world projects, quizzes and assignments, and score at least 60% marks in the qualifying exam, you will be awarded the Intellipaat verified certification. This certificate is well recognized in Intellipaat affiliate organizations, which include over 80 top MNCs from around the world that are also part of the Fortune 500 list of companies.

Will the job assistance program guarantee me a job?

In our job assistance program, we help you land your dream job by sharing your résumé with potential recruiters, assisting you with résumé building and preparing you for interview questions. Intellipaat training should not be regarded either as a job placement service or as a guarantee of employment, as the entire employment process takes place directly between the learner and the recruiting companies, and the final selection always depends on the recruiter.