TensorFlow API¶

import numpy as np
import matplotlib.pyplot as pl

We will be using TensorFlow throughout the course. In this worksheet, we demonstrate some basic elements of the TensorFlow API.

import tensorflow as tf
print(tf.__version__)

2.4.0

TensorFlow reimplements a large portion of the NumPy API and augments it in two important ways.

GPU acceleration to support very large concurrency
Automatic computation of gradients of arbitrary potential fields (scalar-valued functions) over tensors

There are two types of tensor containers:

Constants
Variables

# Constructing a constant tensor.

c_tf = tf.constant(np.array([1,2,3]), dtype=tf.int16)
c_tf

<tf.Tensor: shape=(3,), dtype=int16, numpy=array([1, 2, 3], dtype=int16)>

# Constructing a variable tensor

x_tf = tf.Variable(np.random.randn(2,3), dtype=tf.float64)
x_tf

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])>

We can obtain the NumPy array held by a TensorFlow object with its .numpy() method.

c_tf.numpy()

array([1, 2, 3], dtype=int16)

x_tf.numpy()

array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])

TensorFlow reimplements NumPy¶

TensorFlow reimplements a large portion of the NumPy API. Most NumPy functions have a TensorFlow equivalent.

However,

TensorFlow renamed a few functions
- np.concatenate $\rightarrow$ tf.concat.
- np.sum $\rightarrow$ tf.reduce_sum
- np.mean $\rightarrow$ tf.reduce_mean

TensorFlow favors function calls over methods.
- NumPy: x.reshape(3,1)
- TensorFlow: tf.reshape(x, (3,1))

TensorFlow is strict about datatypes. Tensors of different dtypes cannot be mixed implicitly; one of them must be cast explicitly.
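As a quick illustration of the three points above, the following sketch runs the same operations in both libraries and compares the results. The array values and variable names are made up for this example.

```python
import numpy as np
import tensorflow as tf

a_np = np.array([[1.0, 2.0], [3.0, 4.0]])
a_tf = tf.constant(a_np)  # dtype float64, inherited from the NumPy array

# Renamed functions: np.sum -> tf.reduce_sum, np.mean -> tf.reduce_mean
total_np = np.sum(a_np, axis=0)
total_tf = tf.reduce_sum(a_tf, axis=0)
mean_tf = tf.reduce_mean(a_tf)

# np.concatenate -> tf.concat
stacked_tf = tf.concat([a_tf, a_tf], axis=0)

# Method call in NumPy, function call in TensorFlow
flat_np = a_np.reshape(4, 1)
flat_tf = tf.reshape(a_tf, (4, 1))

# Mixed dtypes need an explicit cast before they can be combined
ints_tf = tf.constant([10, 20], dtype=tf.int32)
mixed_tf = a_tf + tf.cast(ints_tf, a_tf.dtype)

print(total_tf.numpy(), mean_tf.numpy(), stacked_tf.shape)
print(mixed_tf.numpy())
```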

tf.repeat(tf.reshape(c_tf, (3,1)), 3, axis=1)

<tf.Tensor: shape=(3, 3), dtype=int16, numpy=
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]], dtype=int16)>

try:
    x_tf + c_tf
except Exception as e:
    print(e)

cannot compute AddV2 as input #1(zero-based) was expected to be a double tensor but is a int16 tensor [Op:AddV2]

x_tf + tf.cast(c_tf, tf.float64)

<tf.Tensor: shape=(2, 3), dtype=float64, numpy=
array([[3.12193846, 1.03735555, 4.43014798],
       [1.16723455, 1.98257317, 3.35427835]])>

Variables¶

TensorFlow variables are used as model parameters. Only variables can be updated.

Variable.assign_add(...)
Variable.assign_sub(...)

Both methods modify the variable in place.
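To make "in place" concrete, here is a minimal sketch that contrasts ordinary arithmetic, which returns a new tensor, with assign_add, which changes the variable itself. The names v, alias and w are illustrative only.

```python
import numpy as np
import tensorflow as tf

v = tf.Variable([1.0, 2.0, 3.0], dtype=tf.float64)
alias = v                      # a second name for the same variable object

w = v + 1.0                    # arithmetic builds a new tensor; v is untouched
before = v.numpy().copy()

v.assign_add([1.0, 1.0, 1.0])  # updates the storage of v itself
after = alias.numpy()          # the alias sees the update

print(before, after, w.numpy())
```

Because the update happens in place, any code that holds a reference to the variable (a model function, for instance) sees the new value without being passed anything new.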

x_tf

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])>

x_tf.assign_add(np.array([[1,1,1],
                          [2,2,2]]))

x_tf

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[3.12193846, 0.03735555, 2.43014798],
       [2.16723455, 1.98257317, 2.35427835]])>

x_tf.assign_sub(np.array([[1,1,1],
                          [2,2,2]]))
x_tf

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])>

Auto-differentiation¶

One of the most significant features of modern deep learning libraries is the ability to compute gradients automatically.

We can define an (almost) arbitrary computation:

$$ (x_1, x_2, \dots, x_n) \mapsto y $$

# We have a function that returns the squared vector length
# (squared L2 norm) of each row in the input.
def f(y):
    return tf.reduce_sum(y ** 2, axis=1)

# Here are the two squared vector lengths in x_tf
f(x_tf).numpy()

array([7.47463042, 0.15378424])

# Now, we can determine the absolute difference between the two
# squared lengths.
y = tf.abs(f(x_tf)[0] - f(x_tf)[1])
y

<tf.Tensor: shape=(), dtype=float64, numpy=7.320846174108029>

The gradients $\frac{\partial y}{\partial x_i}$ can be evaluated automatically as long as each $x_i$ is a tensor variable.

This is done using the GradientTape() context manager.

with tf.GradientTape() as tape:
    ...computation...
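A minimal sketch of this pattern, using a function whose derivative is known by hand ($\frac{d}{dx}x^2 = 2x$). It also shows why the input should be a variable: a constant is not tracked by the tape unless it is registered with tape.watch. The names x and c are illustrative.

```python
import tensorflow as tf

x = tf.Variable(3.0)   # variables are watched automatically
c = tf.constant(3.0)   # constants are not

# persistent=True lets us ask this tape for more than one gradient
with tf.GradientTape(persistent=True) as tape:
    y_var = x ** 2
    y_const = c ** 2

dy_dx = tape.gradient(y_var, x)    # 2 * x
dy_dc = tape.gradient(y_const, c)  # None, because c was never watched
del tape                           # release the persistent tape

with tf.GradientTape() as tape:
    tape.watch(c)                  # explicitly track the constant
    y_watched = c ** 2
dy_dc_watched = tape.gradient(y_watched, c)

print(dy_dx, dy_dc, dy_dc_watched)
```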

with tf.GradientTape() as tape:
    z = f(x_tf)
    y = tf.abs(z[0] - z[1])

# The tape recorded the operations executed while computing y,
# so it can now compute the gradient of y with respect to x_tf

grad = tape.gradient(y, x_tf)
grad

<tf.Tensor: shape=(2, 3), dtype=float64, numpy=
array([[ 4.24387692, -1.9252889 ,  2.86029596],
       [-0.33446911,  0.03485366, -0.70855671]])>
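As a sanity check, this gradient can be derived by hand. With $z_0$ and $z_1$ the squared lengths of the two rows and $y = |z_0 - z_1|$, the gradient is $2s$ times the first row and $-2s$ times the second row, where $s$ is the sign of $z_0 - z_1$. The sketch below compares that formula with the tape's result on a freshly generated variable; the names x, squared_lengths and grad_manual are illustrative, not part of the worksheet.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = tf.Variable(rng.standard_normal((2, 3)), dtype=tf.float64)

def squared_lengths(rows):
    # squared L2 norm of each row
    return tf.reduce_sum(rows ** 2, axis=1)

with tf.GradientTape() as tape:
    z = squared_lengths(x)
    y = tf.abs(z[0] - z[1])

grad_auto = tape.gradient(y, x).numpy()

# Hand-derived gradient: s * [2 * row0, -2 * row1]
x_np = x.numpy()
s = np.sign(np.sum(x_np[0] ** 2) - np.sum(x_np[1] ** 2))
grad_manual = s * np.stack([2 * x_np[0], -2 * x_np[1]])

print(np.allclose(grad_auto, grad_manual))
```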

Line Fitting With TensorFlow¶

Let's revisit line fitting as a case study of how TensorFlow makes it much easier to tackle complex learning tasks.

Training Data

We will generate some data. This time, we will generate a non-linear curve.

x_data = np.linspace(0, 1, 10)
y_data = 3 * x_data + np.sin(6*x_data) + 1.

pl.plot(x_data, y_data, '--o');

Model and Model Parameter

Let's define a model function that predicts y from the given x values and the model parameters.

The model parameters are $\theta = [w, b]$.

theta = [
    tf.Variable(-1.0, dtype=tf.float64),
    tf.Variable(0.0, dtype=tf.float64)
]

def model(x):
    w,b = theta
    return w*x + b

Loss Function

We define the loss function:

$$L = \frac{1}{n}\sum_i (y_\mathrm{data}[i] - y_\mathrm{pred}[i])^2$$

def loss(y_data, y_pred):
    return tf.reduce_mean((y_data - y_pred) ** 2)
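
For reference, the gradients that the optimizer needs can also be written out by hand. This is a sketch of what automatic differentiation computes for us, assuming the mean-squared-error loss implemented in `loss` above and the linear model $y_\mathrm{pred}[i] = w\,x_\mathrm{data}[i] + b$:

$$\frac{\partial L}{\partial w} = -\frac{2}{n}\sum_i x_\mathrm{data}[i]\,(y_\mathrm{data}[i] - y_\mathrm{pred}[i]), \qquad \frac{\partial L}{\partial b} = -\frac{2}{n}\sum_i (y_\mathrm{data}[i] - y_\mathrm{pred}[i])$$

With TensorFlow we never have to code these expressions; changing the model or the loss changes the gradients automatically.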

Optimizer

We can reimplement the gradient descent algorithm with TensorFlow.

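def optimize(alpha):
    # record the forward pass so the tape can differentiate the loss
    with tf.GradientTape() as tape:
        L = loss(y_data, model(x_data))

    # one gradient-descent step: v <- v - alpha * dL/dv, applied in place
    for (grad, v) in zip(tape.gradient(L, theta), theta):
        v.assign_sub(alpha * grad)
    # return the loss evaluated after the update
    return loss(y_data, model(x_data))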

def train(theta0, alpha, epochs):
    # initialize the variables
    for var, val0 in zip(theta, theta0):
        var.assign(val0)
    # report the loss about ten times; max() avoids a zero modulus when epochs < 10
    report_every = max(1, epochs // 10)
    for i in range(epochs):
        L = optimize(alpha)
        if i % report_every == 0:
            print("[%.2d] %.2f" % (i, L.numpy()))

Training¶

We will initialize the model parameters as:

$w = -1.0$
$b = 0$

Note that $w$ and $b$ are adjusted in place by the training loop.

train([-1.0, 0], 0.01, 1000)

[00] 9.33
[100] 0.39
[200] 0.31
[300] 0.29
[400] 0.27
[500] 0.26
[600] 0.25
[700] 0.25
[800] 0.24
[900] 0.24

We can plot the fitted line against the training data.

pl.plot(x_data, y_data, 'o')
pl.plot(x_data, model(x_data), color='red');
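
Because the model is a straight line, the best parameters that gradient descent can reach are also available in closed form. The sketch below uses np.polyfit, which is not part of the worksheet and is brought in here only for comparison, to compute the least-squares line and its mean squared error for the same training data.

```python
import numpy as np

x_data = np.linspace(0, 1, 10)
y_data = 3 * x_data + np.sin(6 * x_data) + 1.

def mse(w, b):
    # same loss as in the worksheet, written with NumPy
    return np.mean((y_data - (w * x_data + b)) ** 2)

# Degree-1 least-squares fit; returns [slope, intercept]
w_best, b_best = np.polyfit(x_data, y_data, 1)

print("w = %.3f, b = %.3f, MSE = %.3f" % (w_best, b_best, mse(w_best, b_best)))
print("MSE at the initial parameters: %.3f" % mse(-1.0, 0.0))
```

The loss printed by the training loop should approach this value from above as the number of epochs grows. The remaining error comes from the sine term, which no straight line can capture.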
