Understanding Machine Learning
‘Machine Learning’ is a hot topic these days. So, what exactly is it?
Simply put, Machine Learning is a sub-field of Artificial Intelligence in which we teach a machine to learn from input data.
Now that we know what machine learning is, let’s have a look at the types of Machine Learning algorithms.
Types of Machine Learning Algorithms
Machine Learning algorithms can be broadly grouped into two types:
- Supervised Learning
- Unsupervised Learning
In supervised machine learning algorithms, we have input variables and output variables. The input variables are denoted by ‘x’ and the output variables are denoted by ‘y’.
Here, the aim of supervised learning is to understand how ‘y’ varies with ‘x’, i.e. the goal is to approximate the mapping function from ‘x’ to ‘y’ so well that when we get new input data (x), we can predict the output variable (y) for that data.
In other words, we have dependent variables and independent variables, and our aim is to understand how a dependent variable changes with respect to an independent variable.
Let’s understand supervised learning through this example:
Here, our independent variable is the “Gender” of the student and the dependent variable is the “Output” of the student, and we are trying to determine whether the student would pass the exam or not based on the student’s gender.
Now, supervised learning can again be divided into regression and classification, so let’s start with regression.
Regression in Machine Learning
In regression, the output variable is a continuous numeric value. So, let’s take this example to understand regression better:
Here, the output variable is the “cost” of an apple, which is a continuous value, i.e. we are trying to predict the “cost” of an apple with respect to other factors.
Now, it’s time to look at one of the most popular regression algorithms: Linear Regression.
As the name states, linear regression is used to determine the linear relationship between an independent and a dependent variable. In other words, it is used to estimate how much ‘y’ changes, on average, when ‘x’ changes by a certain amount, assuming the relationship is linear.
As we see in the image, a car’s mpg (Miles per Gallon) is mapped onto the x-axis and its hp (Horse Power) is mapped onto the y-axis, and we are determining whether there is a linear relationship between “hp” and “mpg”.
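To make the idea concrete, here is a minimal sketch of fitting a straight line with ordinary least squares, using only the Python standard library. The (mpg, hp) values are invented for illustration and are not the data behind the article's image; `fit_line` is a hypothetical helper name.

```python
# Ordinary least squares for a single input variable.
# The (mpg, hp) pairs below are made up for illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

mpg = [15.0, 20.0, 25.0, 30.0, 35.0]      # independent variable (x)
hp = [200.0, 160.0, 120.0, 90.0, 70.0]    # dependent variable (y)

slope, intercept = fit_line(mpg, hp)
predicted_hp = slope * 22.0 + intercept   # prediction for a new input x = 22 mpg

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted hp at 22 mpg: {predicted_hp:.1f}")
```

The slope is the estimated change in hp for each additional mile per gallon; a negative slope would mean that hp tends to fall as mpg rises.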
So, this was the linear regression algorithm. Now, let’s head on to classification in machine learning.
Classification in Machine Learning
In classification, the output variable is categorical in nature. So, let’s take this example to understand classification better:
Here, the output variable is the “gender” of the person, which is a categorical value, and we are trying to classify the person into a specific gender based on other factors.
Now, we’ll look at these classification algorithms in brief:
- Decision Tree
- Random Forest
Decision tree is one of the most widely used machine learning algorithms. As the name suggests, a decision tree is a tree-like structure of decisions and their possible consequences.
At each node there is a test condition, and the node splits into child nodes (left and right, in the case of a binary tree) based on the outcome of that test.
Now, let’s look at some decision tree terminology:
- Root Node: It represents the entire population or sample, and this further gets divided into two or more homogeneous sets.
- Splitting: Dividing a node into two or more sub-nodes.
- Decision Node/Branch Node: A sub-node that splits into further sub-nodes is called a decision node.
- Leaf/Terminal Node: Nodes which do not split further are called leaf or terminal nodes.
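The terminology above can be sketched in a few lines of Python. This is only an illustration of the structure, not a tree-learning algorithm: the tree is built by hand, and the feature names (`hours_studied`, `attendance`) and thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # A decision node carries a test condition (feature <= threshold);
    # a leaf node carries only a prediction.
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None     # followed when the test is true
    right: Optional["Node"] = None    # followed when the test is false
    prediction: Optional[str] = None

    def is_leaf(self) -> bool:
        return self.prediction is not None

def predict(node: Node, sample: dict) -> str:
    """Walk from the root to a leaf, applying one test condition per node."""
    while not node.is_leaf():
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction

# Hand-built tree; features and thresholds are hypothetical.
tree = Node(                                   # root node
    feature="hours_studied", threshold=2.0,
    left=Node(prediction="fail"),              # leaf node
    right=Node(                                # decision node
        feature="attendance", threshold=0.6,
        left=Node(prediction="fail"),          # leaf node
        right=Node(prediction="pass"),         # leaf node
    ),
)

print(predict(tree, {"hours_studied": 5.0, "attendance": 0.9}))
print(predict(tree, {"hours_studied": 1.0, "attendance": 0.9}))
```

In practice the test conditions are learned from labelled data rather than written by hand; the traversal logic stays the same.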
As the name states, random forest is an ensemble of multiple decision tree models. In this algorithm, random subsets are generated from the original dataset. Say ‘x’ datasets are created from the original dataset; then ‘x’ decision trees are built, one on top of each of these datasets. Each of these decision trees generates a result, and the final result is found by aggregating all the individual results, for example by taking a majority vote for classification.
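Here is a rough sketch of that idea in plain Python. It makes several simplifying assumptions: each "tree" is a one-split decision stump on a single numeric feature, the random subsets are bootstrap samples (rows drawn with replacement), and the results are aggregated by majority vote. Full random forest implementations typically also pick a random subset of features at each split, which is omitted here. The data and all function names are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Return the most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows):
    """Fit a one-split tree on (x, label) rows: pick the threshold with the fewest errors."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [label for x, label in rows if x <= threshold]
        right = [label for x, label in rows if x > threshold]
        left_label = majority(left)
        right_label = majority(right) if right else left_label
        errors = (sum(label != left_label for label in left)
                  + sum(label != right_label for label in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return threshold, left_label, right_label

def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label

def fit_forest(rows, n_trees, rng):
    """Build one stump per bootstrap sample of the original dataset."""
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # draw with replacement
        forest.append(fit_stump(sample))
    return forest

def predict_forest(forest, x):
    """Aggregate the individual results by majority vote."""
    return majority([predict_stump(stump, x) for stump in forest])

# Invented data: (hours studied, exam result).
data = [(1.0, "fail"), (2.0, "fail"), (3.0, "fail"), (4.0, "fail"),
        (6.0, "pass"), (7.0, "pass"), (8.0, "pass"), (9.0, "pass")]

forest = fit_forest(data, n_trees=25, rng=random.Random(0))
print(predict_forest(forest, 1.5), predict_forest(forest, 8.5))
```

Because every stump sees a slightly different sample, individual stumps can disagree; the vote smooths out those differences.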
So, these were some of the classification algorithms. Now, let’s head on to unsupervised learning:
In unsupervised machine learning algorithms, we have input data with no class labels and we build a model to understand the underlying structure of the data. Let’s understand this with an example:
Here, we have input data with no class labels, and this input data consists of fish and birds. Now, let’s build an unsupervised model on top of this input data. The model produces two clusters: the first cluster contains all the fish and the second cluster contains all the birds.
Keep in mind that even though there were no class labels, this unsupervised learning model was able to divide the data into two clusters, and this clustering was done on the basis of similarity of characteristics.
Now, out of all the unsupervised machine learning algorithms, k-means clustering is one of the most popular, so let’s understand that.
K-means clustering is an unsupervised machine learning algorithm where the aim is to group similar data points into a single cluster. So, there must be high intra-cluster similarity and low inter-cluster similarity, i.e. the data points within a cluster should be as similar as possible, and data points in different clusters should be as dissimilar as possible.
In k-means clustering, ‘k’ denotes the number of clusters to be formed. So, in the above picture, k = 3 and hence 3 clusters are formed.
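A minimal sketch of k-means with k = 3, written with the standard iterative procedure (often called Lloyd's algorithm) in plain Python, may help here. The points are invented, and the initial centroids are picked by hand so the run is repeatable; practical implementations usually choose them randomly or with a scheme such as k-means++. Similarity is measured by Euclidean distance, which is an assumption of this sketch.

```python
import math

def k_means(points, centroids, max_iters=100):
    """Alternate between assigning points to centroids and moving the centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, assignments

# Invented 2-D data forming three loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0),
          (1.0, 9.0), (1.5, 8.5), (2.0, 9.5)]

# k = 3: one hand-picked starting centroid per expected cluster.
initial = [points[0], points[3], points[6]]
centroids, assignments = k_means(points, initial)

print(assignments)
print(centroids)
```

Each entry of `assignments` is the index of the cluster that the corresponding point ended up in, so points sharing an index are the ones the algorithm considers most similar.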
So, these were some of the most popular machine learning algorithms.