If you find it hard to remember all the commands for the different operations in Scikit-learn, don’t worry. You are not alone; it happens more often than you would think.
At Intellipaat, we want our learners to get the best out of our e-learning services, which is why we have put together this Sklearn cheat sheet as a handy reference for getting started with Scikit-learn in Python.
This cheat sheet assumes that you have a basic knowledge of Python and machine learning and need a quick reference for looking up Scikit-learn commands.
Scikit-learn, or “sklearn”, is a free, open-source machine learning library for the Python programming language. It is a simple yet efficient tool for data mining, data analysis and machine learning. It features various machine learning algorithms and is built to work with Python’s numerical and scientific libraries, NumPy and SciPy.
Before you can start using Scikit-learn, remember that it is a Python library and must be imported. To do that, type the following command:
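The import itself is a single line. The sketch below also shows the more common pattern of importing specific submodules; the particular submodules chosen here are only illustrative.

```python
# Import the top-level package and check which version is installed.
import sklearn

print(sklearn.__version__)

# In practice, helpers and estimators are usually imported from submodules.
from sklearn import datasets, preprocessing
from sklearn.model_selection import train_test_split
```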
The process of converting a raw data set into a clean, meaningful data set is referred to as preprocessing. It is a must-follow step before you can feed your data set to a machine learning algorithm. Preprocessing involves three main steps, which are listed below:
1. Data Loading:
Your data needs to be in numeric form and stored in numeric arrays. The following are two ways you can load the data; you can also use any other numeric array type to hold it.
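As a minimal sketch of data loading, the example below builds a small NumPy array by hand and then loads a dataset bundled with Scikit-learn. The choice of these two options and the toy values are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

# Option 1: build the feature matrix X and the target vector y by hand.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
y = np.array([0, 0, 1, 1])
print(X.shape, y.shape)  # (4, 2) (4,)

# Option 2: load a dataset that ships with scikit-learn.
X_iris, y_iris = load_iris(return_X_y=True)
print(X_iris.shape, y_iris.shape)  # (150, 4) (150,)
```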
2. Train-Test data:
The next step is to split your data into a training data set and a testing data set.
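A typical split can be sketched with train_test_split. The iris dataset, the 25% test size and the random_state value are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for testing.
# random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X_train.shape, X_test.shape)  # (112, 4) (38, 4)
```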
3. Data Preparation:
Standardization: It makes the training process better behaved by improving the numerical conditioning of the optimization problem.
Normalization: It makes training less sensitive to the scale of the features and makes the data better conditioned for convergence.
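Both preparation steps can be sketched with the transformers in sklearn.preprocessing. The toy arrays are assumptions. Note that Scikit-learn's Normalizer rescales each sample (row) to unit norm, while StandardScaler works per feature (column).

```python
import numpy as np
from sklearn.preprocessing import Normalizer, StandardScaler

X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

# Standardization: learn mean and standard deviation from the training
# data only, then apply the same transformation to both sets.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
print(X_train_std.mean(axis=0))  # close to 0 for every column

# Normalization: rescale every row to unit L2 norm.
normalizer = Normalizer().fit(X_train)
X_train_norm = normalizer.transform(X_train)
print(np.linalg.norm(X_train_norm, axis=1))  # close to 1 for every row
```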
After making all the necessary transformations to get our dataset algorithm-ready, we need to work on our model. That means choosing a model or algorithm that represents our dataset and helps us make the kind of predictions we want, and then fitting that model.
Supervised learning, as the name suggests, is the kind of machine learning where we supervise the outcome by training the model with well-labeled data, which means that some of the data in the dataset will already be tagged with the correct answers.
a. Linear Regression:
b. Support Vector Machine:
c. Naive Bayes:
d. KNN:
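The four supervised estimators above can be created as sketched below. The hyperparameters, the iris dataset and the regression target (petal width predicted from the other three features) are assumptions chosen only to keep the example runnable.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# a. Linear regression
lr = LinearRegression()
# b. Support vector machine (classifier with a linear kernel)
svc = SVC(kernel="linear")
# c. Naive Bayes (Gaussian variant)
gnb = GaussianNB()
# d. K-nearest neighbors
knn = KNeighborsClassifier(n_neighbors=5)

X, y = load_iris(return_X_y=True)

# The three classifiers learn the species labels.
for clf in (svc, gnb, knn):
    clf.fit(X, y)
    print(type(clf).__name__, clf.score(X, y))

# The regressor predicts the last feature from the first three.
lr.fit(X[:, :3], X[:, 3])
print("LinearRegression", lr.score(X[:, :3], X[:, 3]))
```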
Unlike supervised learning, unsupervised learning trains the model with unlabeled or unclassified data and lets the algorithm do all the work on that dataset without any assistance.
a. Principal Component Analysis (PCA):
b. K Means:
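Both unsupervised estimators follow the same pattern. In this sketch the number of components, the number of clusters and the iris dataset are illustrative assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # labels are ignored on purpose

# a. PCA: project the four features onto two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape)  # (150, 2)

# b. K-means: group the samples into three clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels[:10])
```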
Model fitting trains the model on the training data. The goal is a model that generalizes well to data similar to the data it was trained on; a better-fitted model will produce more accurate outcomes.
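Fitting looks the same for every estimator: call fit on the training data. The sketch below assumes the iris dataset and three example estimators. Supervised models take features and labels, unsupervised models take features only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Supervised learning: fit on features and labels.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Unsupervised learning: fit on features only.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X_train)

# Transformers such as PCA can fit and transform in one call.
pca = PCA(n_components=2)
X_train_2d = pca.fit_transform(X_train)
print(X_train_2d.shape)
```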
After getting comfortable with our dataset and model, the next step is to pursue the main goal of machine learning algorithms: forecasting outcomes and making predictions.
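Once a model is fitted, predictions come from predict, and from predict_proba for classifiers that support probability estimates. The dataset and estimators in this sketch are assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Supervised estimator: predict class labels and class probabilities.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = knn.predict(X_test)
y_proba = knn.predict_proba(X_test)
print(y_pred[:5])
print(y_proba[:2])

# Unsupervised estimator: assign unseen samples to the nearest cluster.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train)
cluster_pred = kmeans.predict(X_test)
print(cluster_pred[:5])
```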
Classification metrics:
a. Confusion Matrix:
b. Accuracy Score:
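Both classification metrics live in sklearn.metrics and compare true labels with predicted labels. The label vectors below are made up for illustration.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# a. Confusion matrix: rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[2 0]
#  [1 3]]

# b. Accuracy score: fraction of predictions that match the true labels.
acc = accuracy_score(y_true, y_pred)
print(acc)  # 5 correct out of 6
```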
Regression metrics:
a. Mean Absolute Error:
b. Mean Squared Error:
c. R² Score:
>>> from sklearn.metrics import r2_score
>>> r2_score(y_true, y_predict)
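The mean absolute error and mean squared error follow the same calling pattern as r2_score. The sketch below computes all three regression metrics on a small made-up example.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, -0.5, 2.0, 7.0]
y_predict = [2.5, 0.0, 2.0, 8.0]

# a. Mean absolute error: average of |y_true - y_predict|.
mae = mean_absolute_error(y_true, y_predict)
print(mae)  # 0.5

# b. Mean squared error: average of (y_true - y_predict) ** 2.
mse = mean_squared_error(y_true, y_predict)
print(mse)  # 0.375

# c. R² score: 1.0 is a perfect fit.
r2 = r2_score(y_true, y_predict)
print(round(r2, 4))  # 0.9486
```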
Clustering metrics and cross-validation:
a. Homogeneity:
b. V-measure:
c. Cross-validation:
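The two clustering metrics compare true labels with cluster assignments, while cross-validation scores an estimator on several train/test folds. The label vectors, the KNN classifier and cv=4 in this sketch are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import homogeneity_score, v_measure_score
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

labels_true = [0, 0, 1, 1]
labels_pred = [0, 0, 1, 2]

# a. Homogeneity: 1.0 when every cluster contains members of one class only.
hom = homogeneity_score(labels_true, labels_pred)
print(hom)

# b. V-measure: harmonic mean of homogeneity and completeness.
vm = v_measure_score(labels_true, labels_pred)
print(vm)

# c. Cross-validation: one accuracy score per fold.
X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=4)
print(scores, scores.mean())
```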
In grid search, parameter tuning is done methodically: the model is evaluated for every combination of parameter values specified in a grid.
In randomized search, a fixed number of parameter settings is sampled at random from the specified values or distributions. The number of settings that are tried is given by n_iter.
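Both search strategies can be sketched with GridSearchCV and RandomizedSearchCV. The estimator, the parameter ranges, cv=4 and n_iter=5 are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Grid search: tries every combination (4 x 2 = 8 candidates).
param_grid = {"n_neighbors": [1, 3, 5, 7], "weights": ["uniform", "distance"]}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=4)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)

# Randomized search: samples only n_iter candidates from the same kind of space.
param_dist = {
    "n_neighbors": list(range(1, 16)),
    "weights": ["uniform", "distance"],
}
rand = RandomizedSearchCV(
    KNeighborsClassifier(), param_dist, n_iter=5, cv=4, random_state=0
)
rand.fit(X, y)
print(rand.best_params_, rand.best_score_)
```

Grid search gets expensive quickly as the grid grows, so randomized search is often a reasonable choice when there are many parameters.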
With this, we come to the end of this Sklearn cheat sheet. You can enroll in the Python Certification Training provided by Intellipaat for detailed, in-depth knowledge. This training program will guide you step by step and provide you with the right set of skills to master Python, one of the most popular and widely used languages. You will also gain knowledge of important Python libraries and concepts such as SciPy, NumPy, Matplotlib, Scikit-learn, Pandas, lambda functions and more. You will have 24*7 technical support and assistance from experts in the respective technologies here at Intellipaat throughout the certification period.