Basic programming

!

We use Python for this course.

Basic Python will do for the most part of this course.

!

If you are not used to Python (anymore), you are only one day away of catching up.

Programming & This Course

!

We don’t need much programming for this course because we absolutely will be focused on the algorithms.

!

But …

The reality is that programming is absolutely essential.

Basic programming for this course

!

x = [1, 2, 3]

print "The middle number is %d" % x[1]

!

We need arrays as a fundamental data structure.

! =================================================

!

class Student:
    name = "Unknown"
    age = 18

    def __init__(self, name):
        self.name = name

jack = Student("Jack")

print "%s: %s" % (jack.name, jack.age)

!

We need objects.

! =================================================

!

i = 0
while i < 10:
  print i
  i += 1
for i in range(10):
  print i

!

Various loops are needed.

! ==================================================================

!

def reverse_string(s):
    reversed_s = ""
    for c in s:
      reversed_s += c
    return reversed_s

!

We need functions, mostly to better organize the implementation of an algorithm.

! ============

We may need to rely on external Python libraries for data generation and visualization of algorithmic actions.

!

!

import networkx as nx
import matplotlib.pyplot as plot

G = nx.erdos_renyi_graph(100, 0.015)
nx.draw(G)
plot.save("random-graph.png")

The sorting problem

!

!

The sorting problem

Insertion sort

It’s a highly inefficient sorting algorithm.

It’s simple enough that:

  1. It’s intuitive.
  2. It can be fully analyzed relatively easily.

Insertion sort: the intuition

Suppose you have an array such that the initial subarray is already sorted. But the last element may be out of place.

+---+---+---+----+---+
| 2 | 4 | 5 | 10 | x |
+---+---+---+----+---+

Q:

What is a procedure to rearrange such array? Imagine that x=7.

Insertion sort

!

Chapter 2, Figure 2.1 of textbook.

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

Try it out

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key
import random
A = [random.randint(0, 100) for i in range(10)]
print A
insertion_sort(A)
print A

!

Analysis of correctness

Loop invariance

!

Consider a loop:

total = 0
count = 0

for v in array:
  total += v
  count += 1

avg = total / count

!


Correctness of INSERTION SORT by loop-invariance

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

!

There are two loops, with while-loop nested in the for-loop.

Correctness of INSERTION SORT by loop-invariance

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

!

Inner-loop invariance

If we assume that $A[0 .. j-1]$ is sorted, then the following loop invariance holds for the for loop:

At the end of the iteration, we have $A[i\dots j]$ is sorted.


Can we say anything about A[j+1] to A[len(A)-1]?

Correctness of INSERTION SORT by loop-invariance

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

!

Outer-loop invariance

The subarray A[0 .. j] is sorted after the $j$-th iteration of the for-loop.

Proof: We prove by induction on $j$.

Base case

$j=0$, trivially true

Induction

If A[0 .. j-1] is sorted, by the inner-loop invariance, moving key to A[i+1] makes A[0 .. j] sorted.

Correctness of INSERTION SORT by loop-invariance

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

!

Theorem:

The insertion_sort algorithm always correctly sorts the input array.

Proof:

By the outer-loop invariance, by the end of the last iteration, with $j=$len(A)-1, the entire array A[0 .. len(A)-1] is sorted.

! =================================================================

Performance analysis

!

! =================================================================

Every statement takes the same amount of time.

This is not exactly correct, but it is accurate enough to gauge the performance of algorithms.


Corollary

We measure the number of lines executed by the program before an array is completely sorted by insertion_sort.

$T_\mathrm{LOC}(\mathrm{length}(A)) \Rightarrow T(n)$

! =================================================================

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

!

! =================================================================

!

def insertion_sort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j-1
    while i >= 0 and A[i] > key:
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key

!

Worst case analysis

Let $n = \mathrm{length}(A)$

  • The outer-loop always iterates $n$ times.

  • The inner loop iterates in the worst case $j$ times.


$$ T(n) \leq \sum_{j=0}^n j = \frac{n(n-1)}{2}$$

! =================================================================

!

insert_sort is a pretty inefficient algorithm.

n T time (1M LOC/s)
10 45 45 $\mu s$
100 4950 5 $ms$
1000 499500 0.5 $s$
1000000 499999500000 5.9 days

!

A more efficient algorithm (to be discussed later):

n T time (1M LOC/s)
10 23 23 $\mu s$
100 460 0.46 $ms$
1000 6907 6.9 $s$
1000000 13815510 13.8 $s$

Summary

!


Text: 2.1, 2.2,

Challenge

Is INSERTION sort a stable sorting algorithm?

Recall:

Stable sorting requires an additional condition on the permutation:

$$ \forall i,j\in [0, \mathrm{length}(x)],\ i < j\ \mathrm{and}\ f(x[i], x[j]) = 0 \implies \pi(i) < \pi(j) $$

back
© KEN PU, 2016