Case Study 1: Decision Tree
Topics: Understand the structure of the Pima Indians Diabetes Database and build a decision tree model on it using Scikit-learn.
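A minimal sketch of this workflow is shown below. Because the dataset file is not part of this outline, the example uses a synthetic stand-in generated with make_classification (768 rows, 8 numeric features, binary label); the split ratio, max_depth=4, and all variable names are illustrative assumptions rather than prescribed settings.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the diabetes data (768 rows, 8 features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)

# Hold out a quarter of the rows so the tree is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A shallow tree is easier to inspect and less prone to overfitting.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X_train, y_train)

tree_accuracy = accuracy_score(y_test, tree.predict(X_test))
print(f"test accuracy: {tree_accuracy:.3f}")
```

With the real data, X and y would instead come from the dataset's feature columns and its outcome column.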
Case Study 2: Insurance Cost Prediction (Linear Regression)
Topics: Understand the structure of the Medical Insurance Dataset, implement simple and multiple linear regression, and predict insurance costs.
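The difference between simple and multiple linear regression can be sketched as follows. The columns age, bmi, and charges, and the made-up relationship between them, are assumptions for illustration only; the real dataset's columns and coefficients will differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200

# ASSUMPTION: hypothetical predictors and a made-up cost relationship.
age = rng.uniform(18, 64, n)
bmi = rng.uniform(18, 40, n)
charges = 250.0 * age + 300.0 * bmi + rng.normal(0, 500, n)

# Simple linear regression: a single predictor (age).
X_simple = age.reshape(-1, 1)
simple = LinearRegression().fit(X_simple, charges)

# Multiple linear regression: several predictors (age and bmi).
X_multiple = np.column_stack([age, bmi])
multiple = LinearRegression().fit(X_multiple, charges)

# In-sample R^2 for both models.
r2_simple = simple.score(X_simple, charges)
r2_multiple = multiple.score(X_multiple, charges)

# Predict the cost for one hypothetical person (age 40, BMI 27.5).
predicted = multiple.predict([[40.0, 27.5]])
print(r2_simple, r2_multiple, predicted[0])
```

Adding a predictor never lowers the in-sample R^2 of an ordinary least squares fit, so comparing models on held-out data is the fairer check.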
Case Study 3: Diabetes Classification (Logistic Regression)
Topics: Understand the structure of the Pima Indians Diabetes dataset and implement multiple logistic regression to classify patients. Fit the model on the training data, predict on the test data, then evaluate the model with a confusion matrix and visualize it.
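A minimal sketch of the fit, predict, and evaluate steps follows. The data is again a synthetic stand-in (an assumption), and the scaling step and max_iter value are illustrative choices, not requirements of the case study.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the diabetes data.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Fit on the training split only; the test split stays unseen until evaluation.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, clf.predict(X_test))
print(cm)
```

For the visualization step, one option is to pass the matrix to a heatmap or to scikit-learn's ConfusionMatrixDisplay, which needs matplotlib installed.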
Case Study 4: Random Forest
Topics: Create a model on the “Cardiotocography” dataset that classifies whether a patient is ‘Normal’, ‘Suspected to have disease’, or actually has the ‘disease’.
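A three-class random forest can be sketched as below. The feature matrix is synthetic, and the short class names, the number of trees, and the split ratio are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# ASSUMPTION: short labels standing in for the three patient states.
class_names = ["Normal", "Suspect", "Disease"]

# ASSUMPTION: synthetic three-class data instead of the real recordings.
X, y = make_classification(
    n_samples=900, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Map integer predictions back to readable class names.
y_pred = forest.predict(X_test)
predicted_names = [class_names[i] for i in y_pred]

# Per-class precision and recall matter when classes are imbalanced.
print(classification_report(y_test, y_pred, target_names=class_names))
```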
Case Study 5: Principal Component Analysis (PCA)
Topics: Read the sample iris dataset given to you, use PCA to determine how many principal components carry most of the variance, and then reduce the number of features accordingly. Train and test a RandomForestClassifier to check whether reducing the number of dimensions makes the model perform worse. Find the number of components that still produces good results and report the resulting accuracy.
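The experiment described above can be sketched as follows, using the copy of the iris data bundled with scikit-learn in place of the provided file. The standardization step, 5-fold cross-validation, and the forest settings are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Share of total variance captured by each principal component.
explained = PCA().fit(StandardScaler().fit_transform(X)).explained_variance_ratio_
print("explained variance ratio:", explained.round(3))

# Cross-validated accuracy when keeping k = 1..4 components.
scores = {}
for k in range(1, X.shape[1] + 1):
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=k),
        RandomForestClassifier(n_estimators=100, random_state=0),
    )
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

for k, score in scores.items():
    print(f"{k} component(s): mean accuracy {score:.3f}")
```

Putting PCA inside the pipeline means the components are re-estimated on each training fold, so the reported accuracy is not inflated by information from the held-out fold.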
Case Study 6: K-Means Clustering
- Analyze the data
- Extract useful columns from the dataset
- Visualize the data
- Find an appropriate number of clusters for segmenting the data (use the elbow method)
- Use K-Means clustering to segment the data into K groups, with K taken from the previous step
- Visualize the resulting clusters in a scatter plot
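The elbow method and the final segmentation can be sketched as below. The data comes from make_blobs with four generated groups (an assumption standing in for the real dataset), and the range of K values tried is illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# ASSUMPTION: synthetic 2-D points with four underlying groups.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

# Elbow method: record the within-cluster sum of squares (inertia) for each K.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f"K={k}: inertia={inertia:.1f}")

# ASSUMPTION: K is hard-coded here; in practice it is read off the point
# where the inertia curve stops dropping sharply.
chosen_k = 4
labels = KMeans(n_clusters=chosen_k, n_init=10, random_state=0).fit_predict(X)
```

Plotting inertias against K gives the elbow curve, and a scatter plot of X colored by labels shows the resulting segments.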
Project 1: Customer Churn Classification
Topics: This is a real-world project that gives you hands-on experience with several machine learning algorithms. The main components of the project include the following:
- Manipulating data to extract meaningful insights
- Visualizing data to find patterns among different factors
- Implementing these algorithms – linear regression, decision tree, naïve Bayes
Project 2: Movie Recommendation
Topics: This is a real-world project that gives you hands-on experience with a movie recommender system. Based on the movies a particular user likes, you will be able to provide data-driven recommendations. The project involves understanding recommender systems and information filtering, predicting ratings, and learning user preferences. You will work exclusively with data on users, movies, and related details. The main components of the project include the following:
- Movie recommendation
- Two Types of Predictions – Rating Prediction, Item Prediction
- Important Approaches: Memory-Based and Model-Based
- User-Based Methods with K-Nearest Neighbors
- Item-Based Methods
- Matrix Factorization
- Singular Value Decomposition (SVD)
- Data Science Project discussion
- Collaborative Filtering
- Business Variables Overview
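Two of the ideas listed above, a memory-based user-based prediction and a model-based low-rank factorization via SVD, can be sketched with NumPy alone. The rating matrix, the function names, and the choice to treat unrated entries as 0 when computing similarity are all simplifying assumptions made for this example.

```python
import numpy as np

# ASSUMPTION: hypothetical ratings; rows are users, columns are movies,
# and 0 marks a movie the user has not rated.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def predict_user_based(R, user, item, k=2):
    """Similarity-weighted average over the k most similar users who rated the item."""
    candidates = [
        (cosine_similarity(R[user], R[other]), float(R[other, item]))
        for other in range(R.shape[0])
        if other != user and R[other, item] > 0
    ]
    top = sorted(candidates, reverse=True)[:k]
    weight = sum(sim for sim, _ in top)
    return sum(sim * rating for sim, rating in top) / weight if weight else 0.0

def low_rank_approximation(R, rank):
    """Rebuild R from its top singular values and vectors (truncated SVD)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]

# Rating prediction for user 0 on movie 2, which user 0 has not rated.
prediction = predict_user_based(R, user=0, item=2)
approx = low_rank_approximation(R, rank=2)
print(prediction)
print(approx.round(2))
```

User 0's tastes are closest to user 1's, who rated movie 2 low, so the user-based estimate comes out low as well. An item-based method would apply the same similarity idea to the columns of R instead of the rows.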