Introduction to Machine Learning

Machine learning is an application of Artificial Intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. It focuses on developing programs that can access data and use it to learn on their own.

17th Sep, 2019

Machine Learning is a term used in almost every field, from the simple optimization of advertisements all the way to plotting the quickest route and navigation system to Mars.

In this Introduction to Machine Learning blog, I will be discussing the following topics:

  • What is Machine Learning?
  • Netflix and Machine Learning
  • How does a Machine learn?
  • Machine Learning and Mars
  • Machine Learning libraries
  • Training the Machine Learning model
  • Packages required for Machine Learning concepts
  • KNN classification using IRIS dataset

 

Let us begin this Introduction to Machine Learning blog by checking out what Machine Learning actually is.


What is Machine Learning?

When you are first introduced to Machine Learning, the key idea is that it involves training a computer using a given dataset. This dataset might be structured or unstructured. Consider, for example, training a machine by feeding it, say, 3,000 images of dogs and 3,000 images of things other than dogs, each labeled accordingly. From these examples, the computer learns to recognize when it sees an image of a dog. In this way, the machine works out the nuances and differences between the images. Easy, right?

The actual training and prediction are done by specialized algorithms built for Machine Learning. To satisfy your curiosity, we can take a quick look at one of them: the K-Nearest Neighbors (KNN) classification algorithm. So, what does it do? It takes a set of labeled data and, for each new (test) data point, finds the k closest points in that set. What happens next is really simple: the algorithm picks the class that occurs most frequently among those neighbors and outputs it as the prediction.
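To make the neighbor-voting idea concrete, here is a minimal sketch of KNN written from scratch. The function name knn_predict, the toy points, and the labels are all made up for illustration; this is not how a production library implements it, just the bare idea of "find the k closest points and take a majority vote".

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, paired with that point's label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most frequent label among those neighbors is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two numeric features per sample.
train_points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_labels = ["dog", "dog", "dog", "not_dog", "not_dog", "not_dog"]

prediction = knn_predict(train_points, train_labels, query=(1.1, 1.0), k=3)
print(prediction)
```

The query point sits in the middle of the first group, so all three of its nearest neighbors carry the same label and the vote is unanimous.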


Next up on this Introduction to Machine Learning blog, let us look at how the popular streaming service makes use of Machine Learning.


Netflix and Machine Learning

There are a couple of ways of explaining what Machine Learning actually is. Let us begin with a simple example:


Let’s say you’re watching your favorite TV show on Netflix. If you’ve been a little observant of the Netflix landing page, you will have noticed that the thumbnail for a title varies depending on a couple of factors. The people at Netflix have built algorithms that pick out shots of the cast members who are popular for that TV show or for others, and use an automated sequence of face recognition algorithms to help choose the thumbnails users see. Netflix calls this AVA (Aesthetic Visual Analysis). This is one among the many use cases of Machine Learning subtly incorporated into our daily lives.

Next up on this Introduction to Machine Learning blog, let us check out the concepts required for Machine Learning.


How does a machine learn?

So, what makes Machine Learning what it is? What does it take to achieve cognition? The simplest way to get a grip on the concepts is to see how the field is classified. Here’s a simple chart for you to check out.

Artificial Intelligence contains two nested subsets:

  1. Machine Learning
  2. Neural Networks (and Deep Learning), which sit inside Machine Learning

 


The Artificial Intelligence set involving Machine Learning and Deep Learning as subsets

 

To put it in simple terms, Machine Learning is a subset of Artificial Intelligence. Neural Networks are one way of implementing Machine Learning: they learn from examples how to map an input to an expected output, and Deep Learning refers to neural networks with many layers.


Manual Feature Extraction in Machine Learning

 

The figure above shows how Machine Learning differs from Deep Learning: in classical Machine Learning, feature extraction is done manually, whereas in Deep Learning the features are learned automatically.
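To give a feel for what "manual feature extraction" means, here is a small sketch in which a person decides which numbers describe an image before any learning happens. The function extract_features, the three features it computes, and the tiny 4x4 image are assumptions chosen purely for illustration; a Deep Learning model would instead learn its own features from the raw pixels.

```python
import numpy as np


def extract_features(image):
    """Turn a 2-D grayscale image into a small hand-designed feature vector."""
    mean_brightness = image.mean()
    contrast = image.std()
    # Average absolute difference between horizontally adjacent pixels: a crude edge measure.
    edge_strength = np.abs(np.diff(image, axis=1)).mean()
    return np.array([mean_brightness, contrast, edge_strength])


# A made-up 4x4 image: dark left half, bright right half.
image = np.array([
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])

features = extract_features(image)
print(features)
```

A classical algorithm such as KNN would then be trained on these three numbers per image rather than on the pixels themselves.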

If all of this seems interesting to you, you might want to know how to go about becoming a certified Machine Learning engineer, right?

  • Well, for starters, you need to understand a programming language like Python or R. Both hold their own ground in terms of ease of use, versatility, and range of applications. They’re fun to work with and easy to grasp.
  • Next, you will look at how data works and how it is organized. Once you understand that, you can start looking at how data can be handled and analyzed in such a way that the machine makes sense of it.
  • At this stage, the machine (not very intelligent yet!) can work with the data it sees. Now, you will pretty much be at a crossroads. There are two paths that an ML programmer can take.

They are:

  • Supervised Learning
  • Unsupervised Learning


An image classifier using Machine Learning to differentiate between cats and dogs

What is supervised learning? Well, suppose you feed a picture of your dog Suzi into an algorithm that has been trained on pictures labeled as dogs or cats. Supervised learning lets the machine tell you whether Suzi is a dog or a cat. Given labeled examples of breeds, it can go one step further and find out what breed she is! Interesting, right?

Now, we come to unsupervised learning. Consider the same machine and the same pet, but this time nobody has labeled the pictures. Unsupervised learning looks for structure in the data on its own: given thousands of unlabeled pet photos, it can group similar-looking animals together, so Suzi lands in a cluster with other pets like her even though the machine was never told what a dog is. How cool is this?
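As a rough illustration of the two paths, the sketch below runs both on the same data. It assumes scikit-learn is installed and uses its bundled Iris flower measurements in place of pet photos; the choice of KNeighborsClassifier and KMeans, and the parameter values, are just one reasonable pairing, not the only one.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Supervised: the model sees both the measurements (X) and the correct labels (y).
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X, y)
supervised_output = classifier.predict(X[:5])

# Unsupervised: the model sees only the measurements and groups similar samples itself.
clusterer = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(X)

print("Predicted labels:", iris.target_names[supervised_output])
print("Cluster ids for the same samples:", cluster_ids[:5])
```

Note that the cluster ids are arbitrary group numbers: the unsupervised model can say which samples belong together, but not what the groups are called.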

Next up on this Introduction to Machine Learning blog, let us check out how Machine Learning is playing out its role in space!


Machine Learning and Mars

Next up is another interesting use case of this technology. Imagine that, a couple of decades from now, planet Earth runs out of its natural resources. Being the stubborn species that we Homo sapiens are, we’ll be looking frantically to move base to another planet, and Mars is our friendly neighbor! Space travel is not as simple as it seems but, since we’re talking about the future, let’s say it is easy. The next challenge is to map out the planet and survey it. Here’s where Machine Learning comes into the picture: initial survey, data blotting, structure analysis, soil compounding, vertical cross-mapping, and analysis of terrain are taken care of by our computers for us (those are certainly some big fancy terms!). Done manually, this would take about 10 years of constant daily work even for the most accomplished space scientist. The machines? Three months flat. This is a Mars program that is in full-swing development.


Machine Learning is already implemented on satellites and interplanetary vehicles


Machine Learning libraries in Python

Machine Learning can be done using any number of tools and libraries from Python and other languages. Here are a few libraries that are popular among Python learners:

  • TensorFlow
  • PyTorch
  • CNTK (Microsoft Cognitive Toolkit)
  • Theano
  • Caffe

TensorFlow is the brainchild of developers at Google, just as CNTK was developed at Microsoft. PyTorch is a Python library that builds on the older Torch framework; it was developed primarily by Facebook’s AI research group and first released in 2016.

Next up on this Introduction to Machine Learning blog, let us look at how the process of training works.


Training the Machine Learning model

All in all, are these concepts making your head spin thinking about how complex it could be? Worry not, it isn’t! We spoke about ‘training’ the model. How many thousands of lines of code do you need to do this? Well, once the data and the model are set up, the training step itself is often just a single line! Building blocks such as PyTorch’s torch.nn module and the model.fit() method in TensorFlow (Keras) do the heavy lifting of teaching the model what to do.

Here’s a step-by-step indication of how Machine Learning models work:

Step 1: Finding a problem and its corresponding data.

Step 2: Exploring the data and understanding the ways of data optimization and in-house analysis.

Step 3: Finding the probability of occurrence of certain events and plotting those as the raw data inputs.

Step 4: Making use of techniques such as regression to analyze how the data deviates from the standard.

Understanding the nuances in Step 4 lets the machine figure out what goes on in the process of learning about the dataset.

The learning procedure runs in a loop and uses an error value (the loss) to track progress. With every iteration, the error comes down slightly. The lowest it can reach is zero, which would mean the model reproduces the training data perfectly; in practice, an error of exactly zero often signals that the model has memorized the data rather than learned to generalize from it. The outputs are plotted using tools such as Matplotlib (through its pyplot interface) or other BI (Business Intelligence) tools.
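The loop-and-error idea can be sketched in a few lines. The example below is a minimal illustration, assuming a straight-line model trained with plain gradient descent on made-up data; the learning rate, the number of steps, and the variable names are arbitrary choices, and real libraries hide this loop behind a single training call.

```python
import numpy as np

# Made-up data that follows y = 2x + 1 exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0        # model parameters, starting from an arbitrary guess
learning_rate = 0.05
errors = []            # mean squared error recorded at every iteration

for step in range(200):
    predictions = w * x + b
    residuals = predictions - y
    errors.append(float(np.mean(residuals ** 2)))
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(residuals * x)
    grad_b = 2.0 * np.mean(residuals)
    # Nudge the parameters in the direction that reduces the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print("first error:", errors[0])
print("last error:", errors[-1])
print("learned w, b:", w, b)
```

The list errors is exactly the kind of series one would hand to a plotting tool to watch the error come down over the iterations.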

Rejoice, Machine Learning is achieved!

Next up on this Introduction to Machine Learning blog, let us look at a small use case (and your first piece of code!)


Machine Learning Use Case: KNN classification using IRIS dataset

Well, to use Python for Machine Learning, you will have to install a couple of Python modules, such as NumPy, SciPy, and scikit-learn. You can install them with pip, or use a distribution such as Miniconda or Anaconda. The best part is that Anaconda comes with all of these modules pre-bundled.
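Before running any Machine Learning code, it can help to check which of these modules are already present. The snippet below is a small sketch using only the Python standard library; the package names are assumed to be the ones published on PyPI (numpy, scipy, scikit-learn), and the helper installed_version is a made-up name.

```python
from importlib import metadata

# Distribution names as published on PyPI (an assumption about how they were installed).
required = ["numpy", "scipy", "scikit-learn"]


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


status = {package: installed_version(package) for package in required}

for package, version in status.items():
    if version is None:
        print(f"{package}: not installed (try: pip install {package})")
    else:
        print(f"{package}: version {version}")
```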

Remember the KNN algorithm we checked out at the start? Let’s dive a little further into it with a concrete example.

We will make use of the IRIS dataset and use KNN to classify the data. The dataset is available on Kaggle, so do check it out.

Below is the first code you will be looking at on your journey into Machine Learning: a Python code snippet which uses KNN classification in all of its glory! For those who are not aware, the IRIS dataset is very famous among learners. It covers three species of Iris (Iris setosa, Iris versicolor, and Iris virginica), and for each flower it records four features: sepal length, sepal width, petal length, and petal width. The four features are the data input, and the species is what we want to predict.
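Before training anything, it is worth taking a quick look at what the dataset contains. The short sketch below assumes scikit-learn is installed and uses the copy of the dataset that ships with it, so nothing has to be downloaded.

```python
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# The data is a table with one row per flower and one column per measured feature.
print("Shape of data:", iris_dataset["data"].shape)
print("Feature names:", iris_dataset["feature_names"])
print("Target names:", list(iris_dataset["target_names"]))
# Each row of measurements is paired with an integer target that indexes into target_names.
print("First sample:", iris_dataset["data"][0], "->", iris_dataset["target"][0])
```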

To go ahead and run this code, you could use any standard Python Interpreter—either local or a Jupyter Notebook online. Here is the code:

# Python program to demonstrate
# KNN classification algorithm
# on IRIS dataset

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
from sklearn.model_selection import train_test_split

# Load the IRIS dataset bundled with sklearn
iris_dataset = load_iris()

# Split into training data (75%) and test data (25%)
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset["data"], iris_dataset["target"], random_state=0)

# Train a KNN classifier with k = 1
kn = KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train, y_train)

# Classify a new flower from its four measurements
x_new = np.array([[5, 2.9, 1, 0.2]])
prediction = kn.predict(x_new)

print("Predicted target value: {}".format(prediction))
print("Predicted target name: {}".format(
    iris_dataset["target_names"][prediction]))
print("Test score: {:.2f}".format(kn.score(X_test, y_test)))

Output:

Predicted target value: [0]
Predicted target name: ['setosa']
Test score: 0.97

Go ahead and try it out on your own! But, since this is an Introduction to Machine Learning blog, we need to check out how it works. So, how does it work?

There are two primary parts to understanding how a Machine Learning program works. They are:

  1. Training dataset
  2. Testing dataset

Let us begin by checking out how we can work with two datasets. We use one dataset to train the model and the other as testing data to figure out the level of accuracy achieved by the machine.

Training dataset

  • Let’s go through what every line or piece of code does. Since we are making use of the sklearn module, the first line just imports the load_iris function, which loads the IRIS dataset that comes bundled with sklearn.
  • Next, we import the KNeighborsClassifier class, the NumPy module, and a function called train_test_split. KNeighborsClassifier and train_test_split come from sklearn; NumPy is used later to build the new sample.
  • After this, we store the result of load_iris() in the variable iris_dataset. Now, if you remember, we need to divide the dataset into training data and test data. How do we do this? Well, we just make use of the train_test_split function. Following convention, ‘X’ denotes the feature values and ‘y’ denotes the target values.
  • What the function actually does is shuffle the dataset and divide it into training data and test data in the ratio of 75:25 (its default); random_state=0 makes the shuffle reproducible. Next, we create a KNeighborsClassifier object and store it in the variable kn, setting the value of ‘k’ (n_neighbors) to one.
  • Then, we fit the training data to the algorithm with kn.fit so that the machine can train on the data we just prepared for it.
  • At this point, the part which involves training the machine is over. Now, we can go on to test it and see if it works as expected.
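To see the split described above in numbers, here is a small sketch that prints how many rows end up in each part. It also tries a few values of k, which goes beyond the original program and is only meant to show that k is a choice you can experiment with; the values 1, 3, and 5 are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset["data"], iris_dataset["target"], random_state=0
)

# With the default settings, roughly 75% of the rows go to training and 25% to testing.
print("Training rows:", X_train.shape[0])
print("Testing rows:", X_test.shape[0])

# Try a few neighbor counts to see how the choice of k affects test accuracy.
scores = {}
for k in (1, 3, 5):
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    scores[k] = model.score(X_test, y_test)
    print(f"k={k}: test accuracy {scores[k]:.2f}")
```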

Next up on this Introduction to Machine Learning blog, let us learn about the testing dataset.

Testing dataset

In the previous section, we saw how we can actually train the model. Now, we can go about testing the same.

  • If we look at the program, we have x_new, which is a NumPy array holding the four measurements of a new flower. We pass this array to the predict method, which finds the nearest neighbor in the training data and returns the predicted target value (which is exactly what we require).
  • So, how can we interpret the output? Well, if the predicted target value turns out to be ‘0’, then we know that the model has classified the flower as type setosa, because index 0 of target_names is setosa.
  • Lastly, we can make use of the test score to gauge the predictions. The score is simply the ratio of the number of correct predictions to the total number of predictions made on the test set, obtained by comparing each predicted value with the corresponding actual value. Here, a score of 0.97 means that about 97 percent of the test flowers were classified correctly.
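The test score can be reproduced by hand, which makes it clear that it is nothing more than a ratio. The sketch below assumes the same split and the same k = 1 classifier as the program above; the names y_pred, correct, total, and manual_accuracy are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset["data"], iris_dataset["target"], random_state=0
)

kn = KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train, y_train)

# Predict a species for every flower in the test set.
y_pred = kn.predict(X_test)

# Accuracy = number of correct predictions / total number of predictions.
correct = int(np.sum(y_pred == y_test))
total = len(y_test)
manual_accuracy = correct / total

print(f"{correct} of {total} test flowers classified correctly")
print(f"Manual accuracy: {manual_accuracy:.2f}")
print(f"kn.score:        {kn.score(X_test, y_test):.2f}")
```

Both numbers agree because, for a classifier, the score method reports this same ratio.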

Conclusion

It is not always about having fun by making your machine differentiate between a dog and a cat! One of the highest accolades received by the technology (to date) is in healthcare: Machine Learning has made it possible to detect breast cancer more reliably while it is still in its nascent stages. This is a revolution in putting computers to good use for the easier detection of various anomalies that affect our species.

Machine Learning is no longer the upcoming concept it once was; it is already woven subtly into our daily lives, and it has been described as the best use of computers to solve our problems since the internet boom of the early 21st century.

If you have any questions regarding this post, please put them down in the comment section. I’d be happy to discuss them and help you out.

 
