Model Subclassing with Keras

How to build flexible deep learning models with the Keras Subclassing API

Tirendaz AI
Level Up Coding


You can easily build deep learning models with the Keras Sequential API or Functional API, but complex models, such as those whose forward pass needs loops or conditional branching, are harder to express with them. For that kind of flexibility, the Subclassing API is a better fit. This blog walks you through solving a regression problem with the Keras Subclassing API.

Let’s start with loading the dataset.

Data Loading

To demonstrate model subclassing, let’s tackle a regression problem using the California Housing dataset. Each row describes a district in California, with features such as median income and house age, and the target is the district’s median house value.

California Housing Prices Dataset

To download this dataset, we can use the fetch_california_housing function in Scikit-Learn. Let’s import it and assign the returned dataset to the housing variable.

from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()

Cool, we have downloaded our dataset. Let’s move on to data preprocessing.

Data Preprocessing

Data preprocessing is one of the most important steps in data analysis, because a model can only be as good as the data it learns from. What we’re going to do now is split the dataset into training and test sets.

To do this, we’re going to use the train_test_split function in Scikit-Learn. Let’s import this function first.

from sklearn.model_selection import train_test_split

Let’s create the training and test sets using this function. Keep in mind that the model is trained on the training set and evaluated on the test set.

X_train, X_test, y_train, y_test = train_test_split(
    housing.data, housing.target, random_state=42)
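
As a quick sanity check on the split sizes: when test_size is not given, train_test_split holds out 25% of the rows. The sketch below uses zero-filled placeholder arrays with the same shape as the California Housing data (20,640 rows, 8 features), so it runs without downloading anything. The names X_demo and y_demo are made up for this illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays shaped like housing.data and housing.target (assumption:
# only the shapes matter here, so the values are all zeros)
X_demo = np.zeros((20640, 8))
y_demo = np.zeros(20640)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=42)

print(X_tr.shape, X_te.shape)  # (15480, 8) (5160, 8)
```

So the real X_train should hold 15,480 rows and X_test 5,160 rows.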

Nice, we’ve created our datasets. Let’s move on to the model-building step.

Model Building

To build the model, we’re going to use the Keras Subclassing API. This approach allows you to build flexible deep-learning models. First, let’s import TensorFlow and Keras.

import tensorflow as tf
from tensorflow import keras

Note that with model subclassing, we define the layers in the constructor and then wire them together in the call method.

The model we’re going to build has two inputs and one output: a wide input that goes almost directly to the output layer, and a deep input that first passes through two hidden layers.

Figure source: Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow

To do this, we’re going to create a class using the Keras Subclassing API. When creating a model with the subclassing approach, you need to inherit from the keras.Model class.

class My_Model(tf.keras.Model):                                              #(1)
    def __init__(self, units=30, activation="relu"):                         #(2)
        super().__init__()                                                   #(3)
        self.norm_layer_wide = tf.keras.layers.Normalization()               #(4)
        self.norm_layer_deep = tf.keras.layers.Normalization()               #(4)
        self.hidden1 = tf.keras.layers.Dense(units, activation=activation)   #(5)
        self.hidden2 = tf.keras.layers.Dense(units, activation=activation)   #(5)
        self.main_output = tf.keras.layers.Dense(1)                          #(6)

    def call(self, inputs):                                                  #(7)
        input_wide, input_deep = inputs                                      #(7)
        norm_wide = self.norm_layer_wide(input_wide)                         #(8)
        norm_deep = self.norm_layer_deep(input_deep)                         #(8)
        hidden1 = self.hidden1(norm_deep)                                    #(9)
        hidden2 = self.hidden2(hidden1)                                      #(9)
        concat = tf.keras.layers.concatenate([norm_wide, hidden2])           #(10)
        output = self.main_output(concat)                                    #(11)
        return output                                                        #(12)

Let’s go through this code step by step:

(1) We defined a class called My_Model. This class inherits from tf.keras.Model, which is the base class for all Keras models.

(2) We defined the __init__ method, the constructor that runs when an instance of this class is created. As in any Python class, its first parameter is self. The constructor also takes two optional arguments: units and activation.

(3) We called super().__init__(), which runs the constructor of the parent class, tf.keras.Model. This sets up what every Keras model needs, such as keeping track of the layers we assign as attributes, before we add our own attributes.

(4) We created two instances of tf.keras.layers.Normalization(), which are used to normalize the two input branches separately.

(5) We defined two instances of tf.keras.layers.Dense(), which are fully connected (dense) layers with units number of neurons and activation as the activation function.

(6) We created the output layer, which is another dense layer with one output neuron. This layer will output a single scalar value.

(7) We defined the call() method, which is called during the forward pass of the model. This method takes the input as an argument, which is a tuple of two tensors representing the wide and deep input branches.

(8) We applied the normalization layers to the two input branches separately.

(9) We passed the normalized deep input through the two hidden layers, one after the other.

(10) We concatenated the normalized wide input branch with the output of the second hidden layer.

(11) We applied the output layer to the concatenated tensor to produce the final output.

(12) We returned the output of the model.
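
To make the data flow in steps (7) to (12) concrete, here is a NumPy-only shape trace of the same forward pass. It is a rough sketch, not the Keras implementation: the weights are random, the toy dense helper is made up for this illustration, and the normalization step is skipped because it does not change any shapes. The input widths of 5 and 6 match the slicing used later in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_wide, n_deep, units = 4, 5, 6, 30

input_wide = rng.normal(size=(batch, n_wide))
input_deep = rng.normal(size=(batch, n_deep))

def dense(x, n_out, relu=False):
    """Toy dense layer with random weights: x @ W + b, optional ReLU."""
    w = rng.normal(size=(x.shape[1], n_out))
    b = np.zeros(n_out)
    out = x @ w + b
    return np.maximum(out, 0.0) if relu else out

hidden1 = dense(input_deep, units, relu=True)            # deep path, layer 1
hidden2 = dense(hidden1, units, relu=True)               # deep path, layer 2
concat = np.concatenate([input_wide, hidden2], axis=1)   # 5 wide + 30 deep
output = dense(concat, 1)                                # one value per row

print(hidden2.shape, concat.shape, output.shape)  # (4, 30) (4, 35) (4, 1)
```

The output layer therefore sees 35 values per sample: the 5 wide features plus the 30 activations from the deep path.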

Nice, so we’ve built the architecture of our model. Now, let’s create an object from the class. To do this, let’s first use the set_seed function to fix the randomness and then instantiate the class, passing 30 for the number of units and relu for the activation function.

tf.random.set_seed(42)
model = My_Model(30, activation="relu")

Nice, we created a model. Let’s go ahead and compile the model. In this step, we’ll specify the loss, optimizer, and metrics. First, let’s create an Adam optimizer with a learning rate of 0.001 and assign it to the optimizer variable.

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

Now, let’s call the compile method and specify the parameters.

model.compile(loss="mse",
              optimizer=optimizer,
              metrics=["RootMeanSquaredError"])

Cool, we compiled our model. Our model has two inputs, so let’s create them now for both the training and the test sets. Our dataset consists of 8 features. Let me take the first 5 features (columns 0 to 4) for the wide input, and the last 6 features (columns 2 to 7) for the deep input.

X_train_wide, X_train_deep = X_train[:, :5], X_train[:, 2:]
X_test_wide, X_test_deep = X_test[:, :5], X_test[:, 2:]
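
To see which features each slice selects, the same slicing can be applied to the list of feature names. The list below is typed out by hand so the snippet runs on its own; it should match housing.feature_names.

```python
# Feature names of the California Housing dataset, in column order
feature_names = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
                 "Population", "AveOccup", "Latitude", "Longitude"]

wide_features = feature_names[:5]   # same slice as X_train[:, :5]
deep_features = feature_names[2:]   # same slice as X_train[:, 2:]
shared = [name for name in wide_features if name in deep_features]

print(wide_features)  # ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population']
print(deep_features)  # ['AveRooms', 'AveBedrms', 'Population', 'AveOccup', 'Latitude', 'Longitude']
print(shared)         # ['AveRooms', 'AveBedrms', 'Population']
```

Note that the two inputs overlap: three features go through both the wide and the deep path.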

Nice, we’ve created our input data for the model. Now, let’s move on to training the model.

Model Training

Before we train the model, we need to call the adapt method on each of the model’s normalization layers. What does the adapt method do? It learns statistics such as the mean and variance of each feature from the training data. The same statistics are then used to normalize the validation and test data. Now, let’s use the adapt method to learn the statistics of the inputs.

model.norm_layer_wide.adapt(X_train_wide)
model.norm_layer_deep.adapt(X_train_deep)
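
Conceptually, adapting and then applying a normalization layer boils down to the NumPy sketch below. This is a simplified illustration on random placeholder data, not the actual Keras code; as far as I know, the real layer also guards against dividing by a zero variance. The names train, test, and normalize are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(42)
train = rng.normal(loc=10.0, scale=3.0, size=(1000, 5))  # placeholder training data
test = rng.normal(loc=10.0, scale=3.0, size=(200, 5))    # placeholder test data

# "adapt": learn per-feature statistics from the training data only
mean = train.mean(axis=0)
var = train.var(axis=0)

def normalize(x):
    """Apply the training statistics to any batch of data."""
    return (x - mean) / np.sqrt(var)

train_norm = normalize(train)
test_norm = normalize(test)  # reuses the training mean and variance

print(train_norm.mean(axis=0).round(3))  # approximately 0 for every feature
print(train_norm.std(axis=0).round(3))   # approximately 1 for every feature
```

The test data ends up close to, but not exactly at, zero mean and unit variance, because it is scaled with statistics it did not contribute to.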

Next, let’s call the fit method to train the model.

history = model.fit((X_train_wide, X_train_deep),
                    y_train,
                    validation_split=0.1,
                    epochs=10)
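
A note on validation_split=0.1: Keras sets aside the last 10% of the samples passed to fit, before any shuffling, and uses them only for validation. The sketch below mimics that bookkeeping with plain indices, assuming the 15,480 training rows produced by the default 75/25 split of this dataset.

```python
import math
import numpy as np

n_train = 15480          # rows in X_train (assumption: default 75/25 split)
validation_split = 0.1

indices = np.arange(n_train)
split_at = math.floor(n_train * (1.0 - validation_split))

fit_idx = indices[:split_at]   # used for the gradient updates
val_idx = indices[split_at:]   # held out and evaluated after each epoch

print(len(fit_idx), len(val_idx))  # 13932 1548
```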

During training, Keras prints the loss and metric values for the training and validation data after each epoch. Cool, we trained our model. Note that we could improve the model’s performance by tuning the hyperparameters, but in this post, I just want to cover the Keras Subclassing API.

Model Evaluation

Now, let’s test the model using the test data.

eval_results = model.evaluate((X_test_wide, X_test_deep), y_test)
eval_results

#Output:
[0.34964966773986816, 0.5913118124008179]

The first value is the loss (MSE) and the second is the metric (RMSE) of our model on the test data.
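
The two numbers are related: the RMSE is simply the square root of the MSE. Since the target in this dataset is expressed in units of $100,000, the RMSE can also be read as a typical error in dollars. A quick check with the values printed above:

```python
import math

# Values copied from the evaluate output above
loss_mse, rmse = 0.34964966773986816, 0.5913118124008179

print(math.sqrt(loss_mse))    # matches the reported RMSE up to float32 precision
print(round(rmse * 100_000))  # 59131, i.e. roughly $59,000
```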

Prediction

Now let’s take a look at how our model predicts the new data. First, let’s take a few examples of test data.

X_new_wide, X_new_deep = X_test_wide[:3], X_test_deep[:3]

Nice, we have sample data. Now let’s predict the target values for this data using our model. For this, we will call the predict method.

y_pred = model.predict((X_new_wide, X_new_deep))

Cool, we have our predictions. Let’s take a look at them.

y_pred

# Output:
array([[0.36844897],
       [1.4869882 ],
       [3.3377788 ]], dtype=float32)

These are the values predicted by the model. Now let’s look at the actual values.

y_test[:3]

# Output:
array([0.477 , 0.458 , 5.00001])

These are the actual values. As you can see, the first prediction is close to the actual value, while the other two are further off. You can improve this model by tuning the hyperparameters.
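
To put numbers on the comparison, here is a small sketch that computes the absolute error for each of the three samples, using the predicted and actual values printed above.

```python
# Values copied from the outputs above (targets are in units of $100,000)
y_pred_demo = [0.36844897, 1.4869882, 3.3377788]
y_true_demo = [0.477, 0.458, 5.00001]

abs_errors = [abs(p - t) for p, t in zip(y_pred_demo, y_true_demo)]

for p, t, e in zip(y_pred_demo, y_true_demo, abs_errors):
    print(f"predicted={p:.3f}  actual={t:.3f}  abs_error={e:.3f}")
# predicted=0.368  actual=0.477  abs_error=0.109
# predicted=1.487  actual=0.458  abs_error=1.029
# predicted=3.338  actual=5.000  abs_error=1.662
```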

You can find the notebook I used in this blog here.

Final Words

This blog walked you through solving a regression problem with Keras model subclassing. In short, subclassing is the most flexible way to build complex models, but it requires more code. As a rule of thumb, unless you need that flexibility, I recommend using the Sequential API or Functional API, which can handle most problems.
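
For comparison, here is roughly what the same two-input architecture could look like with the Functional API. This is a sketch I have not taken from the notebook: the variable names are illustrative, and the random batch at the end is placeholder data used only to confirm the output shape.

```python
import numpy as np
import tensorflow as tf

# Two inputs: 5 wide features and 6 deep features, as in this post
input_wide = tf.keras.layers.Input(shape=(5,))
input_deep = tf.keras.layers.Input(shape=(6,))

norm_layer_wide = tf.keras.layers.Normalization()
norm_layer_deep = tf.keras.layers.Normalization()

norm_wide = norm_layer_wide(input_wide)
norm_deep = norm_layer_deep(input_deep)
hidden1 = tf.keras.layers.Dense(30, activation="relu")(norm_deep)
hidden2 = tf.keras.layers.Dense(30, activation="relu")(hidden1)
concat = tf.keras.layers.concatenate([norm_wide, hidden2])
output = tf.keras.layers.Dense(1)(concat)

functional_model = tf.keras.Model(inputs=[input_wide, input_deep],
                                  outputs=output)

# Random placeholder batch, only to check the output shape
rng = np.random.default_rng(0)
demo_wide = rng.normal(size=(4, 5)).astype("float32")
demo_deep = rng.normal(size=(4, 6)).astype("float32")
preds = functional_model.predict([demo_wide, demo_deep], verbose=0)

print(preds.shape)  # (4, 1)
```

The layers are the same, but here the graph is declared once up front, whereas the subclassed version runs ordinary Python code in call on every forward pass.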

That’s it. Thanks for reading.
