Machine Learning Tutorial
You have probably come across the term machine learning and wondered what exactly it is. Well, this machine learning tutorial will clear up that confusion!
Machine learning is a field of artificial intelligence that can feel like magic! Yes, you read that right. Let’s take some real-life examples to understand this. You have probably heard of Google’s self-driving car, a car that drives itself without any human support. That is just amazing, isn’t it?
Now, how about virtual personal assistants such as Apple’s Siri or Microsoft’s Cortana? If you ask Siri what the distance between the Earth and the Moon is, it will immediately reply that the distance is 384,400 km.
You must also have used Google Maps. If you want to drive from New Jersey to New York, Google Maps will show you the distance between the two places, the shortest route, and how much traffic there is along the way.
Now, you would agree with me that all of these are some magical applications, and the magic behind these applications is machine learning. So, simply put, machine learning is a sub-domain of artificial intelligence, where a machine is provided data to learn and make insightful decisions.
Now that we have understood what machine learning is, let’s go ahead in this machine learning tutorial and look at the types of machine learning algorithms:
- Supervised Learning
- Unsupervised Learning
- Semi-supervised Learning
- Reinforcement Learning
Now, let’s go ahead and understand each of these machine learning algorithms comprehensively.
In supervised learning, the machine learns from labeled data, i.e., we already know the correct result for each input. In other words, we have input and output variables, and we need to learn a function that maps one to the other. The term “supervised learning” comes from the idea that the labeled training dataset acts like a teacher that supervises the learning process. Here, the input is the independent variable, and the output is the dependent variable. The goal is to learn a mapping function that is accurate enough for the algorithm to predict the output when we feed it new input. This is an iterative process: each time the algorithm makes a prediction, we check its performance, and if it is not good enough, we adjust the model and repeat.
Let’s take this example to understand supervised learning in a better way.
So, this is an apple, isn’t it? Now, how do you know it’s an apple? Well, as a kid, you came across an apple and were told that it’s an apple, and your brain learned that anything that looks like that is an apple.
Now, let’s apply the same analogy to a machine. Let’s say we feed in different images of apples to the machine and all of these images have the label “apple” associated with them.
Similarly, we will feed in different images of oranges to the machine and all of these images would have the label “orange” associated with them. So, here we are feeding in input data to the machine which is labeled.
So, this part of supervised learning, where the machine learns the features of the input data along with its labels, is known as ‘training’.
Once the training is done, the machine is fed new data, or test data, to determine how well the training has worked.
So, here, if we feed a new image of an orange to the machine without its label, the machine should be able to predict the correct label based on its training.
This is the concept of supervised learning, where we train the machine using labeled data and then use this training to find new insights.
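The train-then-test flow described above can be sketched in a few lines of Python. This is only a minimal illustration, not how an image classifier is actually built: instead of images, each fruit is described by two made-up features (weight in grams and a redness score), and the model is a simple nearest-centroid classifier. The feature values and the function names `train` and `predict` are assumptions chosen for this example.

```python
# Toy labeled training data: (weight in grams, redness from 0 to 1) -> label.
# All numbers are invented for illustration.
training_data = [
    ((150.0, 0.90), "apple"),
    ((160.0, 0.80), "apple"),
    ((170.0, 0.85), "apple"),
    ((130.0, 0.30), "orange"),
    ((140.0, 0.20), "orange"),
    ((135.0, 0.25), "orange"),
]


def train(examples):
    """Training: compute the average feature vector (centroid) of each label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(model, features):
    """Prediction: return the label whose centroid is closest to the new input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training_data)

# Test data the model has not seen, with known labels to check the predictions.
test_data = [((155.0, 0.88), "apple"), ((138.0, 0.22), "orange")]
correct = sum(predict(model, features) == label for features, label in test_data)
accuracy = correct / len(test_data)
print(accuracy)
```

The labels are used only during training and when scoring the test predictions; `predict` itself sees nothing but the features, just like the machine that is shown an unlabeled image.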
Now, supervised learning can again be divided into two categories:
- Regression
- Classification
Moving on in this machine learning tutorial, we will understand these two comprehensively.
Since regression is a supervised learning technique, there is an input variable as well as an output variable. The point to keep in mind is that the output variable, i.e., the dependent variable, is a continuous numerical value.
Let’s take this example to understand regression:
Let’s say you have two variables, “Number of hours studied” and “Number of marks scored”. Here, we want to understand how the number of marks scored by a student changes with the number of hours the student studied, i.e., “Marks scored” is the dependent variable and “Hours studied” is the independent variable.
Based on this data, I now want to know: “How many hours should a student study to score 60 marks?” This is where regression techniques come in. The regression model would learn that there is an increment of 10 marks for every extra hour studied, so to score 60 marks the student has to study for 6 hours.
Note that “Marks scored” is the dependent variable and that it is a continuous numerical value.
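To make the marks example concrete, here is a small sketch that fits a straight line with ordinary least squares. The five data points are assumed for illustration (chosen so that each extra hour adds 10 marks, as in the example), and `fit_line` is a hypothetical helper name, not part of any library.

```python
# Assumed toy data matching the example: 10 marks per hour studied.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
marks = [10.0, 20.0, 30.0, 40.0, 50.0]


def fit_line(x, y):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, marks)

# Invert the fitted line to answer: how many hours are needed for 60 marks?
hours_needed = (60.0 - intercept) / slope
print(slope, intercept, hours_needed)
```

With this data the fitted slope is 10 marks per hour and the intercept is 0, so the model answers 6 hours for a target of 60 marks.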
So, this is how regression algorithms work. Now, let’s move on to the next type of supervised learning algorithm: classification.
Classification algorithms also need both the input data as well as the output data. Here, the output variable or the dependent variable should be categorical in nature.
Let’s take this example to understand classification.
Consider these three variables: “Person has lung cancer or not”, “Weight of the person”, and “Number of cigarettes smoked in a day”. Here, we want to determine whether a person has lung cancer based on the person’s weight and the number of cigarettes he/she smokes in a day, i.e., “Having lung cancer” is the dependent variable, and “Weight” and “Number of cigarettes smoked” are the independent variables.
Again, you need to note here that “Having lung cancer” is a categorical variable, which has two categories, “yes” and “No”. Based on the independent variables, we classify whether the person has lung cancer or not.
Now, there are a variety of classification algorithms available such as:
- Decision Tree
- Random Forest
- Naïve Bayes
- Support Vector Machine
Let’s go ahead and understand one of these algorithms: the decision tree.
Decision Tree Classifier
A decision tree is a popular machine learning classifier. As the name suggests, it has an inverted tree-like structure. The topmost node in the tree is known as the root node, and the nodes at the bottom of the tree are known as the leaf nodes. Every node other than a leaf has a test condition, and based on the result of that test, you move to either its left child or its right child. A leaf node holds the final prediction.
Let’s go through this example on a decision tree. Here, we are trying to determine whether a person would watch the movie “Avengers” based on a series of test conditions.
Here, the test condition on the root node is “likes action films”. If the result is true, you go to the left child; otherwise, you go to the right child. If you like action films, the left child has another test condition, “Movie length greater than 2 hours”. If this evaluates to true, i.e., you are fine watching a movie that is longer than 2 hours, you again go to the left child. There you find another test condition, “Likes Robert Downey Jr”. If this is also true, it means you are looking forward to watching “Avengers”. So, this is how a decision tree classifier works.
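The walk down the tree can be sketched as follows. The tree is written by hand as nested dictionaries rather than learned from data, and the example above only describes the left-hand path, so every right child here is assumed to be a leaf that predicts “will not watch”. The key names and the `classify` helper are hypothetical.

```python
# Each internal node holds a test condition and two children.
# A leaf is simply the prediction (True = will watch "Avengers").
# Assumption: every right child is a leaf predicting False.
tree = {
    "test": "likes_action_films",
    "left": {
        "test": "fine_with_over_2_hours",
        "left": {
            "test": "likes_robert_downey_jr",
            "left": True,
            "right": False,
        },
        "right": False,
    },
    "right": False,
}


def classify(node, person):
    """Start at the root and follow left (True) or right (False) until a leaf."""
    while isinstance(node, dict):
        node = node["left"] if person[node["test"]] else node["right"]
    return node


fan = {
    "likes_action_films": True,
    "fine_with_over_2_hours": True,
    "likes_robert_downey_jr": True,
}
not_a_fan = {
    "likes_action_films": False,
    "fine_with_over_2_hours": True,
    "likes_robert_downey_jr": True,
}
print(classify(tree, fan), classify(tree, not_a_fan))
```

In practice, a decision tree algorithm chooses the test conditions and their order from labeled training data; this sketch only shows how a finished tree is used to classify a new person.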
Now that you understand what supervised learning is, let’s move ahead in this machine learning tutorial and understand unsupervised learning.
In unsupervised learning, the machine learns from unlabeled data, i.e., the result for the input data is not known beforehand. Here, the algorithm tries to determine the underlying structure of the data.
Now, let’s go through this example to see how unsupervised learning works.
Here, we have a bunch of fruits, and none of them have labels associated with them. Now, let’s take these fruits and feed them to an unsupervised learning model. The model determines the features associated with the data, recognizes that all the apples are similar in nature, and thus groups them together. Similarly, it recognizes that all the bananas have the same features and groups them together, and the same is the case with all the mangoes.
Thus, even though the data had no class labels associated with it, the model still grouped the data into different clusters. The reason is data similarity.
These are some unsupervised learning algorithms:
- K-means clustering
- Hierarchical Clustering
- Principal Component Analysis
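As an illustration of the first algorithm in this list, here is a minimal sketch of K-means clustering in plain Python. The nine two-dimensional points are invented stand-ins for fruit features, and the starting centroids are simply the first `k` points, which is a simplifying assumption (real implementations choose starting points more carefully).

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(points[:k])  # naive initialization: the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: recompute each centroid from its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Unlabeled toy data: three loose groups, with no labels given to the algorithm.
points = [
    (1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
    (1.2, 0.8), (5.1, 4.9), (9.2, 1.1),
    (0.9, 1.1), (4.8, 5.2), (8.9, 0.9),
]
assignments, centroids = k_means(points, k=3)
print(assignments)
```

The output is a cluster number for each point. The algorithm never learns names such as “apple” or “banana”; it only discovers which points belong together.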
Further in this machine learning tutorial, we go through the next type of machine learning algorithm – Semi-supervised learning.
In semi-supervised learning, the machine learns from a combination of labeled and unlabeled data. In other words, you can consider semi-supervised learning as a fusion of supervised learning and unsupervised learning.
Let’s go through this example. Here, we have a bunch of different items: phones, apples, books, and chairs. As you can see, only a small proportion of the items are labeled and the rest are unlabeled. The basic idea is to start by grouping similar data together. So, all the phones would be put into one group, the apples into another, and likewise for the books and chairs.
Now we have four clusters containing similar data in them. Here, the algorithm assumes that all the data points which are in proximity tend to have the same label associated with them. Now, the semi-supervised algorithm uses the existing labeled data to assign labels to the rest of the unlabeled data.
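The labeling step can be sketched like this. The sketch assumes the grouping has already been done, so each item arrives with a cluster id, and it then gives every unlabeled item the most common known label in its cluster. The data and the `propagate_labels` helper are made up for illustration; actual semi-supervised algorithms are more involved.

```python
from collections import Counter

# (cluster id, label) pairs; None marks an unlabeled item.
# Cluster ids are assumed to come from an earlier grouping step.
items = [
    (0, "phone"), (0, None), (0, None),
    (1, "apple"), (1, None),
    (2, None), (2, "book"), (2, None),
    (3, None), (3, "chair"),
]


def propagate_labels(items):
    """Give every unlabeled item the most common known label in its cluster."""
    votes = {}
    for cluster, label in items:
        if label is not None:
            votes.setdefault(cluster, Counter())[label] += 1
    result = []
    for cluster, label in items:
        if label is None and cluster in votes:
            label = votes[cluster].most_common(1)[0][0]
        result.append((cluster, label))
    return result


labeled = propagate_labels(items)
print(labeled)
```

Only four of the ten items start with a label, yet all ten end up labeled, because each cluster contains at least one labeled example to borrow from.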
So, this is the underlying concept of semi-supervised learning. Now, in this machine learning tutorial, let’s move on to the final type of machine learning algorithm, which is reinforcement learning.
In reinforcement learning, the algorithm learns through a system of rewards and punishment and the goal here is to maximize the total reward. So, let’s go through this example to understand reinforcement learning.
So, here we have a self-driving car that has to reach its destination without hitting a roadblock. The self-driving car is the agent and the road is the environment.
Now, the car takes an action and goes straight, but when it goes straight, it hits the roadblock. Because of this wrong action, it receives a punishment.
The car learns that going straight is wrong and that it has to go right. When it goes right, it gets a reward. This process continues, and the car learns how to drive by itself without hitting any barricades.
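The reward-and-punishment loop can be sketched with a heavily simplified example. It assumes a single situation with only two actions, a punishment of -1 for going straight and a reward of +1 for going right, and a simple running-average update of the estimated value of each action. The reward values, learning rate, and exploration rate are all assumptions; a real self-driving agent deals with many states and far richer feedback.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed rewards: a roadblock straight ahead, a clear road to the right.
REWARDS = {"straight": -1.0, "right": 1.0}
ACTIONS = list(REWARDS)

q_values = {action: 0.0 for action in ACTIONS}  # estimated reward per action
learning_rate = 0.5
exploration_rate = 0.2

for episode in range(50):
    if random.random() < exploration_rate:
        action = random.choice(ACTIONS)          # explore: try a random action
    else:
        action = max(ACTIONS, key=q_values.get)  # exploit: use the best estimate
    reward = REWARDS[action]
    # Move the estimate for this action a step toward the observed reward.
    q_values[action] += learning_rate * (reward - q_values[action])

best_action = max(ACTIONS, key=q_values.get)
print(best_action, q_values)
```

After a few punishments the estimated value of going straight turns negative, so the agent settles on going right, which mirrors the car in the example.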
And this brings us to the end of this “Machine Learning Tutorial”. We learned what machine learning is and then looked at the types of machine learning.
Now, if you are interested in doing an end-to-end certification course in Machine Learning, you can check out Intellipaat’s Machine Learning Course with Python.