Keras Tutorial

Keras handles most of today’s Deep Learning requirements well, which is why it has gained traction among companies ranging from start-ups to big corporations. Its strengths show best when it is used with TensorFlow. Having Keras in your arsenal of Deep Learning libraries can benefit you in terms of skills, knowledge, and career prospects.

In this Keras tutorial, we will discuss the following topics:

What is Keras?
Who uses Keras?
Foundational Concepts of Keras
The Keras Workflow Model
Deep Learning with Keras
Regression Deep Learning Model Using Keras
Building a Classification Model in Keras


What is Keras?

Keras does not perform low-level computation itself. Instead, it is designed as a high-level API wrapper that delegates the work to lower-level APIs. With the Keras high-level API, we can create models, define layers, and set up models with multiple inputs and outputs easily.

Because Keras behaves as a high-level wrapper, it can run on top of Theano, CNTK, and TensorFlow. This makes it convenient to train many kinds of Deep Learning models without much effort.

Following are some of the noteworthy features of Keras:

Keras gives users an easy-to-use framework, alongside faster prototyping methods and tools.
It works efficiently on both CPU and GPU, without any hiccups.
Keras supports working with both Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for a variety of applications such as computer vision and time series analysis, respectively.
It can also combine CNNs and RNNs in the same model if need be.
It supports arbitrary network architectures, including models that share layers or reuse other models.

Next, let us look at who uses Keras.

Who uses Keras?

Keras has more than 250,000 users, and the number keeps growing. Researchers, engineers, and graduate students alike have made it a favorite. Organizations ranging from startups to Google, Netflix, Microsoft, and others use it on a day-to-day basis for their Machine Learning needs.

How to install Keras in Anaconda is a common question. The Keras website lists the single command that you can use for this purpose.

TensorFlow still receives the highest number of searches and has the most users today, but Keras is the runner-up and is catching up quickly. Next in this Keras tutorial, let us check out the basic concepts.

Foundational Concepts of Keras

Compared with other top frameworks such as Caffe, Theano, and Torch, Keras offers four main characteristics that make it easier for a developer to work with. They are:

User-friendly syntax
Modular approach
Extensibility methods
Native support for Python

It is worth installing Keras and trying it out as you read. A backend such as TensorFlow provides full support for low-level operations such as tensor creation, tensor manipulation, and differentiation.

The advantage of Keras lies in how it interfaces with the backend, which serves as the low-level tensor library.

Another notable point is that Keras lets us choose the backend engine: TensorFlow, Theano, or Microsoft’s Cognitive Toolkit (CNTK).

To learn Keras in detail, you have to learn about the Keras Workflow Model. Read on.

The Keras Workflow Model

Installing Keras is straightforward: a simple pip command will get you started. To get a quick overview of what Keras can do, let’s first look at the typical workflow, which consists of the following four steps, and then at some code.

Define the training data—the input tensor and the target tensor
Build a model or a set of Keras layers, which leads to the target tensor
Structure a learning process by adding metrics, choosing a loss function, and defining the optimizer
Use the fit() method to work through the training data and teach the model

The first concept in the Keras tutorial that you should look out for is how to build models in Keras.

Model Definition in Keras

Models in the Keras library can be defined in two ways. The following code snippets cover both.

Sequential Class: This is a linear stack of layers arranged one after the other.

from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(784,)))
model.add(layers.Dense(10, activation='softmax'))

Functional API: With the Functional API, we can define a model as a directed acyclic graph (DAG) of layers, connecting each layer explicitly to the tensor it takes as input.

input_tensor = layers.Input(shape=(784,))
x = layers.Dense(32, activation='relu')(input_tensor)
output_tensor = layers.Dense(10, activation='softmax')(x)
model = models.Model(inputs=input_tensor, outputs=output_tensor)

Implementation of Loss Function, Optimizer, and Metrics

Implementing the above-mentioned concepts in Keras is very simple and has a very straightforward syntax as shown below:

from keras import optimizers
#learning_rate is the current name of this argument (older releases called it lr)
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.001),
              loss='mse',
              metrics=['accuracy'])

Passing Input and Target Tensors

model.fit(input_tensor, target_tensor, batch_size=128, epochs=10)

With this, we can check out how easy it is to build our own Deep Learning model with Keras.

Deep Learning with Keras

Deep Learning is one of the most widely used concepts today. It is a subfield of Machine Learning and contributes to the broader goal of Artificial Intelligence. As you work through this section, it is worth getting familiar with the metrics Keras offers and using them actively.

With a neural network, inputs can easily be supplied to it and processed to obtain insights. The processing is done by making use of hidden layers with weights, which are continuously monitored and tweaked when training the model. These weights are used to find patterns in data to make a prediction. With neural networks, users need not specify what pattern to hunt for because neural networks learn this aspect on their own and work with it!

Keras can be used for both regression and classification. Let’s check out both in the following sections.

Regression Deep Learning Model Using Keras

Before beginning with the code, to keep it simple, the dataset is already preprocessed, and it is pretty much clean to begin working with. Do note that datasets will require some amount of preprocessing in a majority of the cases before we begin working on it.

Reading Data

When it comes to working with any model, the first step is to read the data, which will form the input to the network. For this particular use case, we will consider the hourly wages dataset.

import pandas as pd
#read in data using pandas
train_df = pd.read_csv('data/hourly_wages_data.csv')
#check if data has been read in properly
train_df.head()

As seen above, Pandas is used to read in the data, and it sure is an amazing library to work with when considering Data Science or Machine Learning.

The ‘df’ here stands for DataFrame: Pandas reads the data from the CSV file into a DataFrame. The head() function then shows the first 5 rows of the DataFrame, so we can verify that the data has been read correctly and see how it is structured.

If you have installed Keras correctly (for example, through Anaconda), all of the code should execute without errors.

Splitting up the Dataset

The dataset has to be split up into the input and the target, which form train_X and train_y, respectively. The input will consist of every column in the dataset, except for the ‘wage_per_hour’ column. This is done because we are trying to predict the wage per hour using the model, and hence, that column forms the target.

#create a dataframe with all the training data except the target column

train_X = train_df.drop(columns=['wage_per_hour'])

#check if target variable has been removed

train_X.head()

As seen from the above code snippet, the drop function from Pandas returns a copy of the DataFrame without the given column. The result is stored in the variable train_X, which will form the input.

With that done, we can store the wage_per_hour column in the target variable, train_y.

#create a dataframe with only the target column

train_y = train_df[['wage_per_hour']]

#view dataframe

train_y.head()

Building the Neural Network Model

Building the model is a simple and straightforward process, as shown in the code segment below. We will be using the Sequential model, as it is one of the easiest ways to build a model in Keras. It lets us build the model layer by layer, which keeps it structured and easy to comprehend, and each layer has weights that connect it to the layer that follows it.

from keras.models import Sequential
from keras.layers import Dense
#create model
model = Sequential()

#get number of columns in training data
n_cols = train_X.shape[1]

#add model layers
model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
model.add(Dense(10, activation='relu'))
model.add(Dense(1))

As the name suggests, the add function is used here to add layers to the model. In this particular case, we are adding two hidden layers and an output layer as shown.

Dense is the type of layer that we use. It is a standard layer type that works in most cases. With Dense, every node in a layer is connected to every node in the next layer.
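To make the connectivity of a Dense layer concrete, the following sketch counts the trainable parameters in a stack of Dense layers. The helper names dense_param_count and total_param_count are hypothetical and not part of Keras, and the calculation assumes each layer has one bias term per node, which is the Dense default.

```python
def dense_param_count(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    # Every input connects to every unit, and each unit has one bias term.
    return n_inputs * n_units + n_units


def total_param_count(layer_sizes):
    """Sum the parameters of consecutive dense layers.

    layer_sizes lists the input width followed by the width of each layer.
    """
    return sum(
        dense_param_count(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Layer widths of the earlier Sequential example: 784 inputs -> 32 -> 10
print(total_param_count([784, 32, 10]))
```

Because every input is wired to every node, the parameter count grows with the product of consecutive layer widths, which is why wide layers get expensive quickly.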

The number ‘10’ indicates that there are 10 nodes in each of the two hidden layers. This number can be chosen as needed. The larger the number, the greater the model capacity.

The activation function used is ReLU (Rectified Linear Unit), which allows the model to capture non-linear relationships. For example, when predicting diabetes in patients, the effect of age is not the same for patients aged 9 to 12 as it is for patients aged 50 and above. This is where the activation function helps.
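As a minimal illustration of what this activation computes, here is a plain-Python sketch of the function. It is not the Keras implementation, only the underlying formula max(0, x).

```python
def relu(x):
    """Rectified Linear Unit: pass positive values through, clip negatives to zero."""
    return max(0.0, x)


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
outputs = [relu(x) for x in inputs]
print(outputs)
```

The kink at zero is what makes the function non-linear: a stack of layers with this activation can represent relationships that a purely linear model cannot.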

One important thing here is that the first layer needs an input shape, i.e., we need to specify the number of columns in the data. The number of columns present in the input is stored in the variable n_cols. The number of rows is left unspecified (hence the empty slot after the comma), so there is no limit on the number of rows in the input.

The output layer will be the last layer with only one single node, which is used for the prediction.

Model Compilation

For us to compile the model, we need two things (parameters): the optimizer and the loss function.

#compile model using mse as a measure of model performance

model.compile(optimizer='adam', loss='mean_squared_error')

The optimizer controls the learning rate. A commonly used optimizer is Adam. Just like Dense, it works well in most cases, and it adjusts the learning rate throughout the training process.

The learning rate determines how fast the weights are moved toward their optimal values. A smaller learning rate may lead to more accurate weights, up to a certain point. The downside is that it takes more time to compute the weights.
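The time cost of a small learning rate can be seen on a toy problem. The sketch below uses plain gradient descent, which is much simpler than Adam, to minimize the made-up function f(w) = (w - 3)^2. The function, the starting point, and the step counts are all assumptions chosen for illustration.

```python
def gradient_descent(learning_rate, steps, start=0.0):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= learning_rate * gradient
    return w


# Same number of steps, two different learning rates
w_fast = gradient_descent(learning_rate=0.1, steps=20)
w_slow = gradient_descent(learning_rate=0.01, steps=20)
print(w_fast, w_slow)
```

After the same 20 steps, the run with the larger learning rate is much closer to the minimum at w = 3. The smaller learning rate gets there too, but needs many more steps.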

When it comes to the loss function, MSE is very widely used for regression. MSE stands for Mean Squared Error, and it is calculated by taking the difference between each predicted value and the actual value, squaring it, and averaging the results. The closer the loss is to zero, the better the model fits the data.
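To make the MSE calculation concrete, here is a small plain-Python sketch. The wage values are made up for illustration and do not come from the dataset.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Made-up hourly wages, for illustration only
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
mse = mean_squared_error(actual, predicted)
print(mse)
```

The squared errors here are 1, 0, and 4, so the loss is their average, 5/3. Squaring means that one large error is penalized more than several small ones.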

Model Training

Model training will use the fit() function and take in five parameters for the process. The parameters include the training data, the target data, validation split, the number of epochs, and callbacks.

from keras.callbacks import EarlyStopping
#set early stopping monitor, so the model stops training when it won't improve anymore
early_stopping_monitor = EarlyStopping(patience=3)
#train model
model.fit(train_X, train_y, validation_split=0.2, epochs=30, callbacks=[early_stopping_monitor])

The validation split sets aside a fraction of the training data for validation. Keras takes this fraction from the end of the data provided, before any shuffling, rather than sampling it at random. During training, we see the validation loss, which is the MSE on this validation set. Here, the validation split is 0.2, which means that 20 percent of the training data fed to the model is kept aside to test the model performance, and the model is not trained on this data.

The number of epochs denotes how many times the model will cycle through the training data. Up to a certain point, more epochs improve the model. Beyond that point, it will not improve anymore.

To detect this and stop the training, we make use of early stopping. It stops the training process if the model stops improving before the number of epochs is reached. By default, the monitored quantity is the validation loss. Setting patience=3 means that if there is no improvement for 3 epochs straight, the model will stop training.
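The Keras documentation describes the validation data as being taken from the last samples of the data provided, before shuffling. The following plain-Python sketch mimics that behavior on a list of row indices. The helper split_validation is hypothetical and only approximates how Keras computes the split point.

```python
def split_validation(samples, validation_split):
    """Hold out the last fraction of the samples for validation."""
    n_validation = int(len(samples) * validation_split)
    split_at = len(samples) - n_validation
    return samples[:split_at], samples[split_at:]


rows = list(range(10))  # stand-in for 10 rows of training data
train_rows, val_rows = split_validation(rows, validation_split=0.2)
print(train_rows, val_rows)
```

One practical consequence is that if the rows of the dataset are sorted in some meaningful order, the held-out rows may not be representative, so it can be worth shuffling the data before calling fit().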
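The patience logic can be sketched in plain Python as follows. This is a simplified illustration rather than the actual EarlyStopping implementation, and the helper name stopping_epoch and the validation losses are made up.

```python
def stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs. If that never happens,
    the index of the final epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


# Made-up validation losses, one per epoch
losses = [40.0, 35.0, 33.0, 33.5, 34.0, 33.2, 32.0]
stopped_at = stopping_epoch(losses, patience=3)
print(stopped_at)
```

In this example, the best loss of 33.0 is reached at epoch index 2 and the next three epochs fail to beat it, so training stops before the final value of 32.0 is ever seen. A larger patience trades extra training time for a lower chance of stopping too early.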

Predictions on Data

Performing predictions on data is easily done by making use of the predict() function as shown below:

#example on how to use our newly trained model to make predictions on the unseen data (we will pretend that our new data is saved in a dataframe called 'test_X')
test_y_predictions = model.predict(test_X)

With this, the model is actually built successfully! But, with Keras, we can make it a lot more accurate than this. Let’s talk about model capacity.

As mentioned previously, adding more nodes and layers increases the model capacity. More capacity can make the model more accurate, up to a certain limit. Generally, the more training data we have, the larger the model can usefully be. However, the larger the model, the more computation power it needs, and more computation means more time to train.

#training a new model on the same data to show the effect of increasing model capacity

#create model
model_mc = Sequential()

#add model layers
model_mc.add(Dense(200, activation='relu', input_shape=(n_cols,)))
model_mc.add(Dense(200, activation='relu'))
model_mc.add(Dense(200, activation='relu'))
model_mc.add(Dense(1))

#compile model using mse as a measure of model performance
model_mc.compile(optimizer='adam', loss='mean_squared_error')
#train model
model_mc.fit(train_X, train_y, validation_split=0.2, epochs=30, callbacks=[early_stopping_monitor])

Here is another model trained on the same data, now with 200 nodes in each hidden layer. After training, we can see that the validation loss went from 32.63 to 28.06.

With Keras, this is the advantage we get! Now, let us work on building the classification model.

Building a Classification Model in Keras

The advantage with Keras and its syntax in Python is that most of the steps we just did above will apply here as well. So, to keep the readability high, let’s discuss only the new concepts that we will need to predict if patients are diagnosed with diabetes or not.

Reading in the dataset and viewing it is straightforward:

#read in training data
train_df_2 = pd.read_csv('documents/data/diabetes_data.csv')

#view data structure
train_df_2.head()


Next, we remove the target column from the input so that we can keep it as the output to train for:

#create a dataframe with all training data except the target column
train_X_2 = train_df_2.drop(columns=['diabetes'])

#check that the target variable has been removed
train_X_2.head()

A patient with no diabetes is represented by 0, while a patient who has diabetes is represented by 1. The to_categorical() function performs one-hot encoding: it replaces each integer label with a binary vector that has one entry per category. Here, there are 2 categories: no diabetes and diabetes. So, a patient with no diabetes is represented as [1 0], while a patient with diabetes is represented as [0 1].

from keras.utils import to_categorical
#one-hot encode target column
train_y_2 = to_categorical(train_df_2.diabetes)

#check that target column has been converted
train_y_2[0:5]
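To show what one-hot encoding does to the labels, here is a plain-Python sketch of the idea. The one_hot helper is hypothetical and stands in for to_categorical() for illustration only, and the four labels are made up.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot vectors."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the class
        encoded.append(row)
    return encoded


# 0 = no diabetes, 1 = diabetes
encoded_labels = one_hot([0, 1, 1, 0], num_classes=2)
print(encoded_labels)
```

Each row has exactly one entry set to 1, at the index of the class, which matches the two output nodes the classification model uses.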

In this neural network, the last layer has two nodes, one for each of the two outcomes: the patient has diabetes or does not.

Check out the following code snippet:

#create model
model_2 = Sequential()

#get number of columns in training data
n_cols_2 = train_X_2.shape[1]

#add layers to model
model_2.add(Dense(250, activation='relu', input_shape=(n_cols_2,)))
model_2.add(Dense(250, activation='relu'))
model_2.add(Dense(250, activation='relu'))
model_2.add(Dense(2, activation='softmax'))

As we can see above, the activation function used in the last layer is softmax. With softmax, the outputs sum to 1 and each lies in the range of 0 to 1, which makes it convenient to interpret them as probabilities.
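The following plain-Python sketch shows how softmax turns two raw scores into probabilities. The input scores are made up, and subtracting the maximum before exponentiating is a common numerical-stability trick rather than something taken from this tutorial.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([2.0, 1.0])
print(probabilities)
```

The larger score receives the larger probability, and the two values always add up to 1, whatever the raw scores are.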

The model compilation is pretty straightforward as well. Categorical cross-entropy is used as the loss function as it works really well and is probably the most common choice to perform classification. The lower the score, the better the model performance (the same as before!)

#compile model using accuracy to measure model performance
model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
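To see why a lower categorical cross-entropy means a better model, consider this plain-Python sketch of the loss for a single sample. The predicted probabilities are made up for illustration, and Keras additionally averages the loss over all samples in a batch.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Only the probability assigned to the true class contributes,
    # because every other entry of y_true is zero.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


label = [0.0, 1.0]  # patient with diabetes
confident_loss = categorical_crossentropy(label, [0.2, 0.8])
unsure_loss = categorical_crossentropy(label, [0.6, 0.4])
print(confident_loss, unsure_loss)
```

The prediction that puts 0.8 on the correct class gets a lower loss than the one that puts only 0.4 on it, so minimizing this loss pushes the model to assign high probability to the true class.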

The accuracy metric is used to check the accuracy score at the end of every single epoch to help in interpreting the results easily and quickly.
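For one-hot labels, accuracy comes down to comparing the most probable predicted class with the true class. The sketch below illustrates that idea in plain Python. The accuracy helper, the labels, and the predictions are all made up, and this is not the Keras implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    def argmax(values):
        return max(range(len(values)), key=lambda i: values[i])

    correct = sum(1 for t, p in zip(y_true, y_pred) if argmax(t) == argmax(p))
    return correct / len(y_true)


labels = [[1, 0], [0, 1], [0, 1]]
predictions = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]]
score = accuracy(labels, predictions)
print(score)
```

Here the first two predictions pick the right class and the third does not, so two out of three samples are classified correctly.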

#train model
model_2.fit(train_X_2, train_y_2, epochs=30, validation_split=0.2, callbacks=[early_stopping_monitor])

Here, we have built both kinds of model, regression and classification, very easily in Keras and seen how powerful they can be.

Conclusion

As this Keras tutorial has shown throughout, working with Keras is simple. You can now go on to build your own neural network models for various use cases. The workflow is straightforward and can help you solve a variety of problems. The number of Keras applications is growing every day.

A good point to add here is that Keras developers are in demand today. Companies are looking for professionals who can provide solutions to the variety of problems they face, so this is a good time to put Keras to use in your career. Keep in mind that the Keras documentation can be slightly overwhelming if you are a beginner.