
Index

Basics of Keras

Elements of Neural Networks


In this video, I describe the elements of Keras' abstraction of neural networks using simple graphical representations.

Building Neural Networks with Keras


Here is a video in which we discuss the specific API calls provided by the Keras library. We work through a complete example of using the Keras API to perform line fitting.

The complete Jupyter notebook


Keras API

In [29]:
import numpy as np
import matplotlib.pyplot as pl
import tensorflow as tf

Layers

The Keras API is designed around the modular composition of layers.

Using layers, we can construct ever more complex neural networks.

Each layer is a function: $ f(x | \theta) $

  • $x$ is the input which is a tensor of some shape.

  • $\theta$ is a collection of model parameters.

  • The shapes of the input $x$ and of the layer output are important.

Each layer is callable.

layer(input_tensor_batch)

The input must be a batch of input tensors.

Why batch processing?

  • Training on a single observation at a time is too inefficient.
  • When we have billions of observations, batching becomes necessary.
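
For instance, a single observation still has to be wrapped into a batch of one before it is passed to a layer. Here is a minimal sketch (the layer and the input below are our own illustration, not part of the notebook):

import numpy as np
import tensorflow.keras.layers as layers

layer = layers.Dense(3)        # any layer will do
x = np.ones(10)                # a single observation of shape (10,)
x_batch = x[np.newaxis, :]     # wrap it into a batch of one: shape (1, 10)
layer(x_batch)                 # the output has shape (1, 3)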

Dense Layer

$$f(x | w, b) = w\cdot x + b$$

Input:

  • $x$ needs to have the shape $(n,)$
  • The model parameters are $(w, b)$ where

    • $w : (m, n)$ and
    • $b: (m,)$
  • They are usually randomly initialized.

  • The output shape is $(m,)$. We call $m$ the output dimensionality of the layer.
In [22]:
#
# The layers module contains all the built-in layer types.
#
import tensorflow.keras.layers as layers

#
# We can construct a dense layer and specify the
# output dimensionality.
#
dense = layers.Dense(5)

#
# Layers are callable.
#
# It's important to note that the input to a layer is
# a **batch** of input vectors.
#
dense(np.ones((4,10)))
Out[22]:
<tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[ 0.9466287 , -1.0576787 ,  0.6624387 ,  0.55489326, -0.07569325],
       [ 0.9466287 , -1.0576787 ,  0.6624387 ,  0.55489326, -0.07569325],
       [ 0.9466287 , -1.0576787 ,  0.6624387 ,  0.55489326, -0.07569325],
       [ 0.9466287 , -1.0576787 ,  0.6624387 ,  0.55489326, -0.07569325]],
      dtype=float32)>
In [23]:
#
# Layer objects provide many methods and properties.
# Layer.weights gives us the list of tf.Variables which
# are the model parameters associated with the layer.
#
dense.weights
Out[23]:
[<tf.Variable 'dense_9/kernel:0' shape=(10, 5) dtype=float32, numpy=
 array([[ 0.6025072 , -0.43672186,  0.01523787, -0.05744964, -0.02783871],
        [ 0.3051511 ,  0.5362379 , -0.00778657,  0.1384775 ,  0.59329134],
        [-0.30592373, -0.12877333, -0.15314567,  0.36254793, -0.59227914],
        [ 0.26332307, -0.01757264,  0.48372084,  0.10010713,  0.06260192],
        [-0.590446  , -0.37000114,  0.40983242,  0.49618083,  0.53072447],
        [ 0.5794999 , -0.08634806, -0.47732228, -0.27232084, -0.05219626],
        [-0.19411871, -0.22270411,  0.4445173 , -0.45263886, -0.5300445 ],
        [-0.5467794 , -0.42953312,  0.08185697, -0.4732017 ,  0.21752769],
        [ 0.31189018, -0.3273468 ,  0.32280928,  0.42046827, -0.28268683],
        [ 0.5215251 ,  0.4250844 , -0.45728153,  0.29272258,  0.00520676]],
       dtype=float32)>,
 <tf.Variable 'dense_9/bias:0' shape=(5,) dtype=float32, numpy=array([0., 0., 0., 0., 0.], dtype=float32)>]
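
As a quick sanity check (our own addition, not part of the original notebook), we can reproduce the dense layer's output by hand from these weights. Keras stores the kernel with shape (input dimension, output dimension), so the computation is the input batch times the kernel, plus the bias:

x = np.ones((4, 10))
kernel, bias = dense.weights
manual = x @ kernel.numpy() + bias.numpy()
np.allclose(manual, dense(x).numpy())    # True: the layer computes a matrix product plus a bias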

Activation Layer

An activation function is just another name for an element-wise rescaling function.

$$ f: \mathbb{R}^n\to\mathbb{R}^n $$

Here are some popular activation functions:

  • Rectified linear unit (ReLU): $$f(x) = \left\{\begin{array}{ll} 0 & \mathrm{if}\ x \leq 0 \\ x & \mathrm{else} \end{array}\right.$$
  • Sigmoid: $$ f(x) = \frac{1}{1+e^{-x}}$$

Other frequently used activation functions include softmax and tanh.
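
As an illustration (our own example, not from the notebook), a ReLU activation layer zeroes out the negative entries of its input and passes the positive ones through unchanged, exactly as in the formula above:

relu = layers.Activation('relu')
relu(tf.constant([[-2.0, -0.5, 0.0, 0.5, 2.0]]))
# -> [[0. , 0. , 0. , 0.5, 2. ]]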

In [34]:
#
# We can build a sigmoid activation layer.
#
sigmoid = layers.Activation('sigmoid')

sigmoid(tf.zeros((3,2)))
Out[34]:
<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[0.5, 0.5],
       [0.5, 0.5],
       [0.5, 0.5]], dtype=float32)>
In [35]:
#
# Activation layers do not have any model parameters
#
sigmoid.weights
Out[35]:
[]

We will explore many other types of layers throughout the course.

Model

Keras bundles the layers, the loss function, and the optimizer into a single Model object.

Models are extremely easy to build using the sequential model API.

Using the sequential API, we specify a sequence of layers and add them to the Sequential model.

We need an input layer.

  • The input layer is just a placeholder that tells Keras the shape of the tensors to expect during training and prediction.
In [67]:
import tensorflow.keras.models as models

#
# This is all we need for linear regression.
#
model = models.Sequential([
    layers.Input(shape=(1,)),
    layers.Dense(1),
])

Before training, we need to pick the loss function and the optimizer.

We can also specify many other useful training-related properties.

In [68]:
import tensorflow.keras.losses as losses
import tensorflow.keras.optimizers as optimizers
import tensorflow.keras.metrics as metrics

model.compile(
    loss=losses.MeanSquaredError(),
    optimizer=optimizers.SGD(learning_rate=1e-5),
    metrics=[metrics.MeanAbsoluteError()],
)

Inspecting the model

A Keras model can be inspected with its summary method.

In [101]:
model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_13 (Dense)             (None, 1)                 2         
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________

Training

Let's generate some data.

In [69]:
x_data = np.linspace(0, 10, 1000)
y_data = 3*x_data + np.random.randn(1000)*4
pl.plot(x_data, y_data, '.');

The model has a fit method that tunes the model parameters using a gradient-based training loop.

In [99]:
model.fit(x_data, y_data, epochs=10, batch_size=32, verbose=2)
Epoch 1/10
32/32 - 0s - loss: 17.1070 - mean_absolute_error: 3.3410
Epoch 2/10
32/32 - 0s - loss: 17.0865 - mean_absolute_error: 3.3386
Epoch 3/10
32/32 - 0s - loss: 17.0677 - mean_absolute_error: 3.3364
Epoch 4/10
32/32 - 0s - loss: 17.0491 - mean_absolute_error: 3.3341
Epoch 5/10
32/32 - 0s - loss: 17.0305 - mean_absolute_error: 3.3318
Epoch 6/10
32/32 - 0s - loss: 17.0142 - mean_absolute_error: 3.3299
Epoch 7/10
32/32 - 0s - loss: 16.9978 - mean_absolute_error: 3.3279
Epoch 8/10
32/32 - 0s - loss: 16.9812 - mean_absolute_error: 3.3261
Epoch 9/10
32/32 - 0s - loss: 16.9658 - mean_absolute_error: 3.3240
Epoch 10/10
32/32 - 0s - loss: 16.9514 - mean_absolute_error: 3.3222
Out[99]:
<tensorflow.python.keras.callbacks.History at 0x7fcdf46cc8e0>

Prediction

Okay, now the model is reasonably trained. Let's see how well it works.

model.predict performs prediction on a given batch of input tensors.

Note that model.predict returns a NumPy array.

In [100]:
y_pred = model.predict(x_data)

pl.plot(x_data, y_data, '.', color='#ccc');
pl.plot(x_data, np.squeeze(y_pred), linewidth=2);

Saving and loading the model

Keras can save the model to a file.

  1. Model architecture is serialized.
  2. Model parameters are serialized.
In [102]:
model.save('./model.h5')

Let's try to load the model.

In [103]:
restored_model = models.load_model('./model.h5')
restored_model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_13 (Dense)             (None, 1)                 2         
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________
In [106]:
pl.plot(x_data, y_data, '.', color='#ccc');
pl.plot(x_data, restored_model.predict(x_data).squeeze());
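
As a final sanity check (our own addition), the restored model should make exactly the same predictions as the original, since both the architecture and the parameters were saved:

np.allclose(model.predict(x_data), restored_model.predict(x_data))   # True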