Azure ML Tutorial
Azure Machine Learning saves both cost and time and makes development easier. Who would have thought that one could build Machine Learning models by dragging and dropping? Azure Machine Learning Studio makes this possible, and it offers almost all major algorithms built in. We can also write all the code ourselves in the Azure Machine Learning Service workspace. Azure ML allows users to connect directly to sources such as Hive Query, Azure SQL, on-premises data sources, and many more. If we are working on video analysis, Azure Machine Learning can translate our videos into nine different languages, whereas AWS and GCP do not even support video translation. To sum it up, Azure Machine Learning stands out with its unique features.
Businesses want to use Machine Learning for predictive analysis in order to grow. However, cost, hardware, and specialized coding skills can all stand in the way. Azure Machine Learning services give small and medium-sized businesses this opportunity by saving money and making development easier for developers.
Before moving forward, here are the topics that will be covered in this blog:
- Why Azure Machine Learning?
- What is Azure ML?
- Working of Azure Machine Learning Studio
- Hands-on with Azure ML Studio
When Machine Learning and Cloud Computing collaborate with each other, the possibilities are endless. But why ML and Cloud? Below are some advantages of ML in Cloud Computing:
- Traditional Machine Learning libraries do not handle very large datasets well. The cloud can store Big Data at scale, so it is easier to run ML on datasets that already live in the cloud.
- Billions of people use the cloud to store their data. As a result, cloud-based applications can make more accurate predictions, because the models have a large amount of data to learn from.
- With the integration of ML in the cloud, the outcome is an ‘intelligent cloud’. With more advancements in technology, this intelligent cloud will play a crucial role in healthcare, finance, and many other sectors, where it can help find better solutions to a wide range of problems.
- Integrating ML in the cloud helps Business Intelligence companies by handling real-time data and analyzing it to make future predictions.
These are a few benefits of incorporating Cloud Computing with Machine Learning, and the scope to grow is limitless in this area.
Coming back to our blog, let us see why enterprises are turning toward Azure Machine Learning.
Why Azure Machine Learning?
As mentioned earlier, the large amount of data available in the cloud makes it easier for a system to learn without our having to supply all of the data ourselves. With Azure being the second largest Cloud Computing service provider, it has plenty of datasets from which models can learn and predict. The service runs on the Azure public cloud, which means that we do not have to buy hardware or software, and all deployment and maintenance are taken care of by Azure.
Here are some of the benefits of Azure ML:
- We can easily use our model in a web service, IoT device, or Power BI.
- It provides us with predictive analytics at low cost.
- Microsoft gives us full support in terms of documentation on how to work with it.
- Azure Machine Learning Studio provides us with a drag-and-drop workspace which does not require coding.
- We do not need to replicate our data for other computing environments. Once we have created our data store, we can mount or download our data in any Azure ML computing environment.
- Azure Machine Learning Service provides framework-independent hyper-parameter tuning.
Now that we know some of the perks of Azure ML, let us see its definition in detail.
What is Azure ML?
Azure Machine Learning comes with different tools:
- Azure Machine Learning Studio
- Azure Machine Learning Service
Azure Machine Learning Service allows us to prepare our data and to train and test our models. We can deploy, manage, and track Machine Learning models, starting on our local machines and then moving to the cloud without any hassle. It supports open-source technologies such as TensorFlow, PyTorch, and Scikit-Learn.
Azure ML Service and Azure ML Studio differ in the following ways:
| Azure Machine Learning Studio | Azure Machine Learning Service |
|---|---|
| No coding is required. | It is a coding environment. |
| It has a drag-and-drop environment. | It is an environment for Python coding. |
| There are some built-in algorithms and data transformation tools. | We have full freedom to use our own ML algorithms or any open-source library. |
| We can use it when the predefined algorithms provide a solution. | This is preferred if the predefined algorithms in ML Studio do not meet our requirements. |
This blog focuses on Azure ML Studio since it is more convenient to use than Azure ML Service. Let’s now talk about how Azure ML Studio works.
Working of Azure ML Studio
As mentioned earlier, Azure ML Studio uses a drag-and-drop feature which does not need any coding. There are predefined algorithms and sample datasets that we can work with.
But how do we choose the right algorithm? Microsoft provides a downloadable cheat sheet for deciding on the right algorithm. In short, the choice depends on what we want to predict:
- If we want to predict a value, we can use regression. We can forecast the future by estimating the relationship between variables. Examples are product demand estimation, sales figures prediction, equipment servicing priorities determination, etc. The algorithm options are:
1. Ordinal regression
2. Poisson regression
3. Fast Forest Quantile regression
4. Linear regression
5. Bayesian regression
6. Neural network regression
7. Decision forest regression
8. Boosted decision tree regression
- If we want to identify and predict rare data points, our options are:
1. One-class SVM
2. PCA-based anomaly detection
Some examples of anomaly detection are fraud detection, abnormal equipment readings, etc.
- If we want to group similar data into one set, K-means clustering is the algorithm we should use. Examples are customer taste prediction, customer segmentation, etc.
- If we want to predict between two categories (for example, whether a tweet is positive, where the answer is either ‘yes’ or ‘no’), the algorithms to use are:
1. Two-class SVM
2. Two-class averaged perceptron
3. Two-class Bayes point machine
4. Two-class decision forest
5. Two-class logistic regression
6. Two-class boosted decision tree
7. Two-class decision jungle
8. Two-class locally deep SVM
9. Two-class neural network
- If we want to predict between multiple categories (such as finding the mood of a tweet), then here are the options:
1. Multiclass logistic regression
2. Multiclass neural network
3. Multiclass decision forest
4. Multiclass decision jungle
5. One-vs-all multiclass
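The choices listed above can be summarized as a small lookup table. The following is only an illustrative sketch in plain Python: the task names, the `ALGORITHMS_BY_TASK` dictionary, and the `suggest_algorithms` helper are hypothetical and are not part of Azure ML Studio or any Azure API.

```python
# Hypothetical lookup table built from the lists above; not an Azure API.
ALGORITHMS_BY_TASK = {
    "predict a value": [
        "Ordinal regression", "Poisson regression",
        "Fast Forest Quantile regression", "Linear regression",
        "Bayesian regression", "Neural network regression",
        "Decision forest regression", "Boosted decision tree regression",
    ],
    "find rare data points": ["One-class SVM", "PCA-based anomaly detection"],
    "group similar data": ["K-means clustering"],
    "predict between two categories": [
        "Two-class SVM", "Two-class averaged perceptron",
        "Two-class Bayes point machine", "Two-class decision forest",
        "Two-class logistic regression", "Two-class boosted decision tree",
        "Two-class decision jungle", "Two-class locally deep SVM",
        "Two-class neural network",
    ],
    "predict between multiple categories": [
        "Multiclass logistic regression", "Multiclass neural network",
        "Multiclass decision forest", "Multiclass decision jungle",
        "One-vs-all multiclass",
    ],
}


def suggest_algorithms(task):
    """Return the candidate algorithms for a task, or an empty list if unknown."""
    return ALGORITHMS_BY_TASK.get(task.strip().lower(), [])


print(suggest_algorithms("Group similar data"))
```

A table like this only narrows down the candidates. Which algorithm works best still has to be checked by training and evaluating on the actual data.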
With the basics covered, let us now take up an example.
Hands-on with Azure ML Studio
It is recommended to create an Azure account beforehand. The first 12 months are free with 13,300 credits so that we can practice. The example here is predicting diabetes from several attributes of a person: we have to predict whether a person has diabetes or not. We can also check the accuracy, precision, and F1 score of the model. In this hands-on, we are going to select a dataset that is already available and use the two-class logistic regression algorithm to train the model.
Here is the step-by-step process of building the prediction model:
Step 1: Search for Azure Machine Learning Studio on Google and click on the first link. Log in with your credentials to open the studio. To create a new experiment, click on NEW on the bar at the bottom of the studio.
Step 2: Clicking on NEW brings up a set of options. Click on Blank Experiment, and we will be redirected to our workspace, where we can start the experiment.
Step 3: Before moving on, rename the experiment. As the example is about predicting whether a person has diabetes, we will rename the experiment to Diabetes prediction.
Step 4: Now, select a dataset. There are many sample datasets available for experiments. We will use the sample dataset for diabetes binary classification:
1. From the menu on the left, select Saved Datasets
2. For more insights on sample datasets, click on Samples and go through the list of sample datasets available
3. Select Pima Indian Diabetes Binary Classification Dataset, drag it to the center of the screen and drop it
Step 5: Now that we have our dataset, let us see what it contains.
1. Click on the output port of the dataset (①)
2. From the options, click on Visualize
Once we click on Visualize, we will be able to see what the data we are working with looks like.
Step 6: Now, we have to select columns that are relevant to train our model. Here is how we do it:
1. On the left-hand side, there is a search bar, where we will search for Select Columns in Dataset
2. Drag and drop the item below the dataset and connect the two
3. On the right side, there is a box that says Launch column selector. Click on it to select columns
Step 7: After clicking on Launch column selector, a window pops up. In this example, all columns are relevant and, hence, we will select all of them
1. Select all items
2. Click on the arrow that points right. This indicates that columns are selected
3. Navigate below and click on the Tick mark
4. Close the window
Step 8: Now that we have selected the columns we want to train our model on, we need to split the data into training and test datasets
1. For that, search for Split Data and drop it on the workspace
2. Connect Select Columns in Dataset to Split Data
3. On the right, we can change the fraction of rows that goes to the training dataset. It is initially 0.5, but we want 70 percent training data and 30 percent test data. Hence, we will change it to 0.7. We can also use an 80-20 ratio, or any other ratio that suits our requirement
Here, output ① of Split Data is our training dataset and output ② is our test dataset.
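For readers who prefer code, the same 70-30 split can be sketched in plain Python. This is a minimal illustration under stated assumptions, not what Azure ML Studio runs internally: the `split_rows` function and its `train_fraction` and `seed` parameters are hypothetical names, and the rows are just the numbers 0 to 9 standing in for records.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy, so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-in for ten records
train, test = split_rows(rows, train_fraction=0.7)
print(len(train), len(test))  # 7 3
```

Shuffling before cutting matters when the rows are stored in some order, since otherwise the test set could contain only one kind of record.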
Step 9: We now have our training and test datasets. Next, we need an algorithm to train our model. The algorithm we are choosing is two-class logistic regression. Logistic regression is used to predict the probability of an outcome: it estimates the probability that an event occurs by fitting the data to a logistic function.
Since there are two possible outcomes, we use two-class logistic regression. To predict a continuous numeric value instead, we could use linear regression.
Also, the aim of this prediction model is to find out whether a person is diabetic or not. Hence, this falls under classification. Search for Classification, and under that category we will be able to find this algorithm
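To make the idea of a logistic function concrete, here is a minimal sketch in plain Python. It is only an illustration: the `logistic` and `predict_class` helpers, the weights, the bias, and the 0.5 threshold are assumptions chosen for this example and do not come from Azure ML Studio.

```python
import math


def logistic(z):
    """Map any real number z to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for very negative z
    return e / (1.0 + e)


def predict_class(features, weights, bias, threshold=0.5):
    """Return (probability, label) for one row of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = logistic(z)
    return probability, int(probability >= threshold)


# Made-up weights for two hypothetical features, for illustration only.
probability, label = predict_class([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(probability, label)  # 0.5 1
```

Training a logistic regression model amounts to finding the weights and bias that make these probabilities match the known labels as closely as possible.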
Step 10: It is now time for us to train our model
1. Search for Train Model and drop it on the workspace
2. Connect the algorithm (Two-Class Logistic Regression) to Train Model
3. Connect the training dataset (①) from Split Data to Train Model
4. Then, in Train Model, select the label column, that is, the column we want to predict
Step 11: Now, we need to score our trained model and then evaluate it
1. Drag and drop Score Model onto the center
2. Connect the output of Train Model to the first input of Score Model
3. Connect the test dataset (②) from Split Data to the second input of Score Model
4. Drag and drop Evaluate Model
5. Connect Score Model to Evaluate Model
6. Save the experiment
7. Run the experiment
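The train, score, and evaluate stages above can be mirrored in a few lines of plain Python. This is a rough sketch under stated assumptions, not the implementation Azure ML Studio uses: the function names are borrowed from the module names for readability, the model is a simple logistic regression fitted by batch gradient descent, and the tiny one-feature dataset is made up.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_model(rows, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += error
            for j, v in enumerate(x):
                grad_w[j] += error * v
        bias -= learning_rate * grad_b / len(rows)
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def score_model(model, rows):
    """Return the predicted probability of class 1 for each row."""
    weights, bias = model
    return [sigmoid(bias + sum(w * v for w, v in zip(weights, x))) for x in rows]


def evaluate_model(scores, labels, threshold=0.5):
    """Return the accuracy of the thresholded scores against the true labels."""
    predictions = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up toy data with a single feature: negative values are class 0.
train_rows, train_labels = [[-2.0], [-1.0], [1.0], [2.0]], [0, 0, 1, 1]
test_rows, test_labels = [[-1.5], [1.5]], [0, 1]

model = train_model(train_rows, train_labels)
scores = score_model(model, test_rows)
print(evaluate_model(scores, test_labels))  # 1.0
```

Note that the model is trained only on the training rows and scored only on the test rows, which is exactly why Split Data feeds its two outputs into different modules.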
Step 12: After running the model, we need to visualize the result and find precision, accuracy, etc.
1. Click on Evaluate Model and then on its output port (①)
2. Select Visualize, and a pop-up window will show the results
In case we want to change the threshold, we can do so by moving the threshold slider.
Note: Ideally, both False Positives and False Negatives should be as few as possible. In this example, the most important entry in the confusion matrix is the number of False Negatives. A False Negative occurs when a person has diabetes but the model predicts that they do not, which is the more dangerous error because the condition may go untreated. So, set the threshold in a way that gives the minimum number of False Negatives.
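The effect of moving the threshold can be sketched in plain Python. The scores and labels below are made up for illustration, and `count_errors` is a hypothetical helper, not something provided by Azure ML Studio.

```python
def count_errors(scores, labels, threshold):
    """Return (false_negatives, false_positives) at a given threshold."""
    false_negatives = sum(1 for s, y in zip(scores, labels)
                          if y == 1 and s < threshold)
    false_positives = sum(1 for s, y in zip(scores, labels)
                          if y == 0 and s >= threshold)
    return false_negatives, false_positives


# Made-up scored probabilities and true labels (1 = has diabetes).
scores = [0.9, 0.6, 0.4, 0.3, 0.7, 0.2]
labels = [1, 1, 1, 0, 0, 0]

for threshold in (0.5, 0.25):
    fn, fp = count_errors(scores, labels, threshold)
    print(f"threshold={threshold}: FN={fn}, FP={fp}")
# threshold=0.5: FN=1, FP=1
# threshold=0.25: FN=0, FP=2
```

Lowering the threshold removes the False Negative here, but at the price of one more False Positive, so the threshold is always a trade-off between the two kinds of error.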
To give a clear picture of the confusion matrix, here is what you need to know:
Let us say you have taken a test to detect diabetes. The result is either positive (the test says you have diabetes) or negative (the test says you do not), and it may or may not match reality. With that in mind, here are the four cases:
TP (True Positive): When someone has diabetes and the predicted result is positive.
FN (False Negative): When someone has diabetes and the predicted result is negative.
FP (False Positive): When someone does not have diabetes and the predicted result is positive.
TN (True Negative): When someone does not have diabetes and the predicted result is negative.
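These four counts are all that is needed to compute the accuracy, precision, and F1 score mentioned earlier. The sketch below uses the standard definitions in plain Python. The labels are made up, the function names are hypothetical, and recall is included only because the F1 score is defined in terms of it.

```python
def confusion_counts(actual, predicted):
    """Count TP, FN, FP, TN, where 1 means 'has diabetes'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fn, fp, tn


def metrics(tp, fn, fp, tn):
    """Return (accuracy, precision, recall, f1) from the four counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Made-up true labels and predictions for eight people.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(actual, predicted)
print(tp, fn, fp, tn)           # 3 1 1 3
print(metrics(tp, fn, fp, tn))  # (0.75, 0.75, 0.75, 0.75)
```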
We have now come to the end of this blog. Simple, isn’t it? It is far easier to drag and drop than to code every bit. We can also upload our own datasets from local files and run different Machine Learning algorithms on them.