TensorFlow API

In [1]:
import numpy as np
import matplotlib.pyplot as pl

We will be using TensorFlow throughout the course. In this worksheet, we will demonstrate some basic elements of the TensorFlow API.

In [2]:
import tensorflow as tf
print(tf.__version__)
2.4.0

TensorFlow reimplements much of the NumPy API. It augments NumPy in two important ways.

  1. GPU acceleration to support massively parallel computation

  2. Automatic gradient computation for arbitrary potential fields (scalar-valued functions) on tensors
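
As a minimal sketch of the first point, the snippet below asks TensorFlow which GPUs it can see and runs a matrix product on whichever device it selects. The matrix size is an arbitrary choice for illustration, and on a machine without a GPU the list is simply empty.

```python
import tensorflow as tf

# List the GPUs visible to TensorFlow; an empty list means CPU-only execution.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", len(gpus))

# The same code runs unchanged on CPU or GPU; TensorFlow places the op.
a = tf.random.normal((256, 256))
b = tf.matmul(a, a)
print("Result computed on:", b.device)
```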


There are two types of tensor containers:

  • Constants
  • Variables
In [3]:
# Constructing a constant tensor.

c_tf = tf.constant(np.array([1,2,3]), dtype=tf.int16)
c_tf
Out[3]:
<tf.Tensor: shape=(3,), dtype=int16, numpy=array([1, 2, 3], dtype=int16)>
In [4]:
# Constructing a variable tensor

x_tf = tf.Variable(np.random.randn(2,3), dtype=tf.float64)
x_tf
Out[4]:
<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])>

We can obtain the NumPy version from TensorFlow objects.

In [5]:
c_tf.numpy()
Out[5]:
array([1, 2, 3], dtype=int16)
In [6]:
x_tf.numpy()
Out[6]:
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])

TensorFlow reimplements NumPy

TensorFlow reimplements a large portion of the NumPy API. Most NumPy functions have a TensorFlow equivalent.

However,

  1. TensorFlow renamed a few functions.

    • np.concatenate $\rightarrow$ tf.concat
    • np.sum $\rightarrow$ tf.reduce_sum
    • np.mean $\rightarrow$ tf.reduce_mean
  2. TensorFlow favors function calls over methods.

    • NumPy: x.reshape(3,1)
    • TensorFlow: tf.reshape(x, (3,1))
  3. TensorFlow is strict about datatypes. They cannot be implicitly mixed.
In [7]:
tf.repeat(tf.reshape(c_tf, (3,1)), 3, axis=1)
Out[7]:
<tf.Tensor: shape=(3, 3), dtype=int16, numpy=
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]], dtype=int16)>
In [8]:
try:
    x_tf + c_tf
except Exception as e:
    print(e)
cannot compute AddV2 as input #1(zero-based) was expected to be a double tensor but is a int16 tensor [Op:AddV2]
In [9]:
x_tf + tf.cast(c_tf, tf.float64)
Out[9]:
<tf.Tensor: shape=(2, 3), dtype=float64, numpy=
array([[3.12193846, 1.03735555, 4.43014798],
       [1.16723455, 1.98257317, 3.35427835]])>
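
The renamed functions listed in this section can be compared side by side. The following is a small sketch; the array `a_np` is an arbitrary example that does not appear elsewhere in this worksheet.

```python
import numpy as np
import tensorflow as tf

a_np = np.array([[1.0, 2.0], [3.0, 4.0]])
a_tf = tf.constant(a_np)  # inherits dtype float64 from the NumPy array

# np.concatenate -> tf.concat
cat_np = np.concatenate([a_np, a_np], axis=0)
cat_tf = tf.concat([a_tf, a_tf], axis=0)

# np.sum -> tf.reduce_sum, np.mean -> tf.reduce_mean
sum_np, sum_tf = np.sum(a_np, axis=1), tf.reduce_sum(a_tf, axis=1)
mean_np, mean_tf = np.mean(a_np), tf.reduce_mean(a_tf)

# Each pair should agree once the tensor is converted back to NumPy.
print(np.array_equal(cat_np, cat_tf.numpy()))
print(np.allclose(sum_np, sum_tf.numpy()))
print(np.isclose(mean_np, mean_tf.numpy()))
```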

Variables

TensorFlow variables are used as model parameters. Only variables can be updated.

  • Variable.assign_add(...)

  • Variable.assign_sub(...)

The variable is modified in place.

In [10]:
x_tf
Out[10]:
<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])>
In [11]:
x_tf.assign_add(np.array([[1,1,1],
                          [2,2,2]]))

x_tf
Out[11]:
<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[3.12193846, 0.03735555, 2.43014798],
       [2.16723455, 1.98257317, 2.35427835]])>
In [12]:
x_tf.assign_sub(np.array([[1,1,1],
                          [2,2,2]]))
x_tf
Out[12]:
<tf.Variable 'Variable:0' shape=(2, 3) dtype=float64, numpy=
array([[ 2.12193846, -0.96264445,  1.43014798],
       [ 0.16723455, -0.01742683,  0.35427835]])>
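
To make the in-place behaviour concrete, here is a small sketch contrasting `assign_add` with ordinary arithmetic. The names `v`, `alias`, and `w` are illustrative only.

```python
import tensorflow as tf

v = tf.Variable([1.0, 2.0])
alias = v  # a second name for the same variable object

# assign_add updates the variable itself, so the change is visible via alias.
v.assign_add([10.0, 10.0])
print(alias.numpy())  # [11. 12.]

# Arithmetic does not update v; it produces a new, immutable tensor.
w = v + 1.0
print(isinstance(w, tf.Variable))  # False
print(v.numpy())                   # still [11. 12.]
```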

Auto-differentiation

One of the most significant features of modern deep learning libraries is the ability to perform gradient computation automatically.

We can define an (almost) arbitrary computation:

$$ (x_1, x_2, \dots, x_n) \mapsto y $$
In [13]:
# We have a function that returns the squared vector length
# (sum of squared entries) of each row in the input.
def f(y):
    return tf.reduce_sum(y ** 2, axis=1)
In [14]:
# Here are the two squared vector lengths in x_tf
f(x_tf).numpy()
Out[14]:
array([7.47463042, 0.15378424])
In [15]:
# Now, we can determine the absolute difference between the two
# squared lengths.
y = tf.abs(f(x_tf)[0] - f(x_tf)[1])
y
Out[15]:
<tf.Tensor: shape=(), dtype=float64, numpy=7.320846174108029>

Auto-differentiation

The gradients $\frac{\partial y}{\partial x_i}$ can be evaluated automatically as long as each $x_i$ is a tensor variable.

This is done using the GradientTape() context manager.

with tf.GradientTape() as tape:
    ...computation...
In [16]:
with tf.GradientTape() as tape:
    z = f(x_tf)
    y = tf.abs(z[0] - z[1])
In [17]:
# The tape recorded the operations executed while computing y,
# so it can now evaluate the gradient of y with respect to x_tf

grad = tape.gradient(y, x_tf)
grad
Out[17]:
<tf.Tensor: shape=(2, 3), dtype=float64, numpy=
array([[ 4.24387692, -1.9252889 ,  2.86029596],
       [-0.33446911,  0.03485366, -0.70855671]])>
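
The tape's result can be checked by hand. For $y = |s_0 - s_1|$ with $s_k = \sum_j x_{kj}^2$, the chain rule gives $\partial y / \partial x_{0j} = 2\,\mathrm{sign}(s_0 - s_1)\,x_{0j}$ and $\partial y / \partial x_{1j} = -2\,\mathrm{sign}(s_0 - s_1)\,x_{1j}$. The sketch below verifies this on a small fixed input, chosen here only so the numbers are easy to follow.

```python
import numpy as np
import tensorflow as tf

x = tf.Variable([[1.0, 2.0, 3.0],
                 [0.5, 0.0, -1.0]], dtype=tf.float64)

with tf.GradientTape() as tape:
    z = tf.reduce_sum(x ** 2, axis=1)  # squared length of each row
    y = tf.abs(z[0] - z[1])

grad = tape.gradient(y, x)

# Hand-derived gradient: +2*x for row 0 and -2*x for row 1, times the sign.
sign = np.sign(z[0].numpy() - z[1].numpy())
expected = sign * np.array([[2.0], [-2.0]]) * x.numpy()
print(np.allclose(grad.numpy(), expected))
```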

Line Fitting With TensorFlow

Let's revisit line fitting as a case study of how TensorFlow makes it much easier to tackle complex learning tasks.

Training Data

We will generate some data. This time, we will generate a non-linear curve.

In [19]:
x_data = np.linspace(0, 1, 10)
y_data = 3 * x_data + np.sin(6*x_data) + 1.

pl.plot(x_data, y_data, '--o');

Model and Model Parameter

Let's define a model function that predicts y from the given x values and the model parameters.

The model parameters are $\theta = [w, b]$.

In [20]:
theta = [
    tf.Variable(-1.0, dtype=tf.float64),
    tf.Variable(0.0, dtype=tf.float64)
]

def model(x):
    w,b = theta
    return w*x + b

Loss Function

We define the loss function:

$$L = \frac{1}{n}\sum_i (y_\mathrm{data}[i] - y_\mathrm{pred}[i])^2$$
In [21]:
def loss(y_data, y_pred):
    return tf.reduce_mean((y_data - y_pred) ** 2)
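
As a quick worked example of the mean squared error, take three targets and three predictions where only the last one is off by 2. The squared errors are 0, 0, and 4, so the loss is 4/3. The arrays `y_true` and `y_hat` are made up for this illustration.

```python
import numpy as np
import tensorflow as tf

def loss(y_data, y_pred):
    return tf.reduce_mean((y_data - y_pred) ** 2)

y_true = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.0, 2.0, 5.0])

# Errors are 0, 0, -2; squared errors are 0, 0, 4; their mean is 4/3.
print(loss(y_true, y_hat).numpy())
```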

Optimizer

We can reimplement the gradient descent algorithm with TensorFlow.

In [28]:
def optimize(alpha):
    with tf.GradientTape() as tape:
        L = loss(y_data, model(x_data))
        
    for (grad, v) in zip(tape.gradient(L, theta), theta):
        v.assign_sub(alpha * grad)
    return loss(y_data, model(x_data))
In [29]:
def train(theta0, alpha, epochs):
    # initialize the variables
    for var, val0 in zip(theta, theta0):
        var.assign(val0)
    for i in range(epochs):
        L = optimize(alpha)
        # Report about ten times per run; max() avoids a zero divisor when epochs < 10.
        if (i % max(1, epochs // 10)) == 0:
            print("[%.2d] %.2f" % (i, L.numpy()))
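
For this linear model, the gradients that the tape supplies to the update step can be cross-checked against the closed-form derivatives of the mean squared error, $\partial L/\partial w = -\frac{2}{n}\sum_i r_i x_i$ and $\partial L/\partial b = -\frac{2}{n}\sum_i r_i$, where $r_i = y_\mathrm{data}[i] - (w x_i + b)$. The sketch below rebuilds the data and the initial parameters so it runs on its own. The names `grad_w_ref` and `grad_b_ref` are introduced here for the comparison.

```python
import numpy as np
import tensorflow as tf

x_data = np.linspace(0, 1, 10)
y_data = 3 * x_data + np.sin(6 * x_data) + 1.0

w = tf.Variable(-1.0, dtype=tf.float64)
b = tf.Variable(0.0, dtype=tf.float64)

# Gradients from automatic differentiation.
with tf.GradientTape() as tape:
    L = tf.reduce_mean((tf.constant(y_data) - (w * x_data + b)) ** 2)
grad_w, grad_b = tape.gradient(L, [w, b])

# Gradients from the hand-derived formulas.
residual = y_data - (w.numpy() * x_data + b.numpy())
grad_w_ref = -2.0 * np.mean(residual * x_data)
grad_b_ref = -2.0 * np.mean(residual)

print(np.isclose(grad_w.numpy(), grad_w_ref))
print(np.isclose(grad_b.numpy(), grad_b_ref))
```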

Training

We will initialize the model parameters as:

  • $w = -1.0$
  • $b = 0$

Note that $w$ and $b$ are adjusted in place by the training loop.

In [30]:
train([-1.0, 0], 0.01, 1000)
[00] 9.33
[100] 0.39
[200] 0.31
[300] 0.29
[400] 0.27
[500] 0.26
[600] 0.25
[700] 0.25
[800] 0.24
[900] 0.24

We can plot the fitted line against the training data.

In [32]:
pl.plot(x_data, y_data, 'o')
pl.plot(x_data, model(x_data), color='red');
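
As an optional sanity check, the best straight line for this data can also be obtained in closed form. The sketch below uses `np.polyfit`, which is not part of the worksheet's method and is swapped in purely for comparison. Gradient descent should approach these values as the number of epochs grows.

```python
import numpy as np

x_data = np.linspace(0, 1, 10)
y_data = 3 * x_data + np.sin(6 * x_data) + 1.0

# Degree-1 least-squares fit; polyfit returns [slope, intercept].
w_ls, b_ls = np.polyfit(x_data, y_data, 1)
mse_ls = np.mean((y_data - (w_ls * x_data + b_ls)) ** 2)

# Loss at the initial parameters w = -1, b = 0, for reference.
mse_init = np.mean((y_data - (-1.0 * x_data + 0.0)) ** 2)

print("least-squares w, b:", w_ls, b_ls)
print("least-squares loss:", mse_ls, "initial loss:", mse_init)
```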