In this video, I describe the elements of Keras' abstraction of neural networks using simple graphical representations.
Here is a video in which we discuss the specific API calls provided by the Keras library. We work through a complete example of using the Keras API to perform line fitting.
import numpy as np
import matplotlib.pyplot as pl
import tensorflow as tf
The Keras API is designed around the modular composition of layers.
Using layers, we can construct increasingly complex neural networks.
Each layer is a function: $ f(x | \theta) $
$x$ is the input which is a tensor of some shape.
$\theta$ is a collection of model parameters.
The shapes of the input $x$ and of the layer output are important.
Each layer is callable.
layer(input_tensor_batch)
The input must be a batch of input tensors.
Why batch processing?
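As a rough illustration in plain NumPy (not Keras, purely for intuition): stacking inputs as the rows of a matrix lets a single matrix multiply process the whole batch at once, which is far more efficient than looping over examples one at a time. Keras layers follow the same convention, with the first axis of the input tensor being the batch axis.

```python
import numpy as np

# Illustrative sketch only: a toy "dense" computation on a batch.
w = np.ones((10, 5))       # weights mapping 10 inputs to 5 outputs
b = np.zeros(5)            # bias

batch = np.ones((4, 10))   # a batch of 4 input vectors, one per row
out = batch @ w + b        # a single matmul processes all 4 at once
print(out.shape)           # (4, 5): the batch axis is preserved
```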
Dense Layer
$$f(x | w, b) = w\cdot x + b$$

The input $x$ is a batch of vectors. The model parameters are $(w, b)$, where $w$ is the weight matrix and $b$ is the bias vector; they are usually randomly initialized.
#
# The layers module contains all the built-in layer types
#
import tensorflow.keras.layers as layers
#
# We can construct a dense layer and specify the
# output dimensionality.
#
dense = layers.Dense(5)
#
# Layers are callable.
#
# It's important to note that the input to a layer is
# a **batch** of input vectors.
#
dense(np.ones((4,10)))
#
# Layer objects provide many methods and properties.
# Layer.weights gives us the list of tf.Variables which
# are the model parameters associated with the layer.
#
dense.weights
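To connect this back to the formula $f(x | w, b) = w\cdot x + b$, we can recompute the layer's output by hand from the variables in `dense.weights`. This is a quick sanity check, not part of the usual workflow:

```python
import numpy as np
import tensorflow.keras.layers as layers

# Sanity check: a Dense layer's output equals x @ w + b,
# using the w and b stored in dense.weights.
dense = layers.Dense(5)
x = np.random.randn(4, 10).astype(np.float32)
y = dense(x).numpy()

w, b = (v.numpy() for v in dense.weights)
assert np.allclose(y, x @ w + b, atol=1e-5)
```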
Activation Layer
An activation function is an element-wise function applied to its input:

$$ f: \mathbb{R}^n\to\mathbb{R}^n $$

Here are some popular activation functions:
relu:

$$f(x) = \left\{\begin{array}{ll}
0 & \mathrm{if}\ x \leq 0 \\
x & \mathrm{else}
\end{array}\right.$$

Other frequently used activation functions are softmax and tanh.
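For intuition, here are minimal NumPy versions of these functions. Keras provides them all as built-ins, so this is purely illustrative:

```python
import numpy as np

def relu(x):
    # max(0, x) applied element-wise
    return np.maximum(x, 0.0)

def softmax(x):
    # subtract the max before exponentiating, for numerical stability
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))            # [0. 0. 3.]
print(softmax(x).sum())   # softmax outputs sum to 1
print(np.tanh(x))         # squashes values into (-1, 1)
```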
#
# We can build a sigmoid activation layer.
#
sigmoid = layers.Activation('sigmoid')
sigmoid(tf.zeros((3,2)))
#
# Activation layers do not have any model parameters
#
sigmoid.weights
We will explore many other types of layers throughout the course.
Keras bundles layers, the loss function, and the optimizer into a single Model object.
Models are extremely easy to build using the sequential model API.
Using the sequential API, we specify a sequence of layers to add to the sequential model.
We also need an input layer, which fixes the input shape.
import tensorflow.keras.models as models
#
# This is all we need for linear regression.
#
model = models.Sequential([
layers.Input(shape=(1,)),
layers.Dense(1),
])
Before training, we need to pick the loss function and the optimizer.
We can also specify many other useful training-related properties.
import tensorflow.keras.losses as losses
import tensorflow.keras.optimizers as optimizers
import tensorflow.keras.metrics as metrics
model.compile(
loss=losses.MeanSquaredError(),
optimizer=optimizers.SGD(learning_rate=1e-5),
metrics=[metrics.MeanAbsoluteError()],
)
A Keras model can be inspected with its summary method.
model.summary()
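As a sanity check on the summary, the parameter count here should be exactly two: one weight and one bias in the single Dense layer. A self-contained sketch (the model is rebuilt below only so the snippet stands alone):

```python
import tensorflow.keras.models as models
import tensorflow.keras.layers as layers

# The linear-regression model y = w*x + b has two parameters: w and b.
m = models.Sequential([
    layers.Input(shape=(1,)),
    layers.Dense(1),
])
assert m.count_params() == 2
```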
Let's generate some data.
x_data = np.linspace(0, 10, 1000)
y_data = 3*x_data + np.random.randn(1000)*4
pl.plot(x_data, y_data, '.');
The model's fit method tunes the model parameters using a gradient-based training loop.
model.fit(x_data, y_data, epochs=10, batch_size=32, verbose=2)
Okay, now the model is reasonably trained. Let's see how well it works.
model.predict performs prediction on a given batch of input tensors. Note that model.predict returns a NumPy array.
y_pred = model.predict(x_data)
pl.plot(x_data, y_data, '.', color='#ccc');
pl.plot(x_data, np.squeeze(y_pred), linewidth=2);
Keras can save the model to a file.
model.save('./model.h5')
Let's try to load the model.
restored_model = models.load_model('./model.h5')
restored_model.summary()
pl.plot(x_data, y_data, '.', color='#ccc');
pl.plot(x_data, restored_model.predict(x_data).squeeze());
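A quick way to convince ourselves that saving preserved everything: a save/load round trip must reproduce the weights exactly, so predictions before and after should agree. A self-contained sketch (the scratch file path is just an assumption):

```python
import numpy as np
import tensorflow.keras.models as models
import tensorflow.keras.layers as layers

# Build a small model, save it, reload it, and compare predictions.
m = models.Sequential([layers.Input(shape=(1,)), layers.Dense(1)])
m.save('/tmp/roundtrip.h5')                  # assumed scratch path
m2 = models.load_model('/tmp/roundtrip.h5')

x = np.linspace(0, 1, 5).reshape(-1, 1).astype(np.float32)
assert np.allclose(m.predict(x, verbose=0), m2.predict(x, verbose=0))
```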