Mergesort

Divide and conquer

It’s a way to instruct “simple” solver to solve a complex problem.

!

!

Divide and conquer (and combine)

Divide:

Given a problem $P$, the division of a problem is a procedure that produces a collection of smaller problems $\mathbf{Q} = \{Q_1, Q_2, \dots\}$.

Conquer:

Each sub-problem $Q_i\in\mathbf{Q}$ (hopefully) is easier to solve. Let $S_i = \mathrm{solve}(Q_i)$

Combine:

Hopefully, the solutions of subproblems, ${S_i}$, are not useless toward the solution of $P$.

We need a procedure to combine:

$$\mathrm{solve}(P) = \mathrm{combine}(S_1, S_2, S_3, \dots)$$

Example: cleaning a mess

Consider how you can instruct a minion to clean a messy office. The minion is overwhelmed by the mess of the entire office. Divide and conquer is a great way to solve the cleaning problem.

!

Here are some assumptions:

  1. The minion can clean up a very small section of an office - like a drawer, or a book shelf.

  2. If every section of an office is organized, we deem the whole office organized.

Example: cleaning a mess

Sorting with Divide and Conquer

!

Divide & Conquer:

A list of numbers can be divided into smaller lists. If a list is so small, we can sort it trivially.

!

How small does a list need to be for us to know how to sort it?

Sorting with Divide and Conquer

Combine:

If we have two lists that are already sorted, can we figure out how to combine them in a sorted way?

!

+---+---+---+---+
| 1 | 5 | 6 | 8 |          \
+---+---+---+---+          |
                           |
                           +------>  combined sorted list
                           |
+---+---+---+---+----+     |
| 3 | 4 | 5 | 8 | 10 |     /
+---+---+---+---+----+

Sorting with Divide and Conquer

Combine:

!

def merge(list1, list2):
    i, j = 0, 0
    list3 = []

    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            list3.append(list1[i])
            i += 1
        else:
            list3.append(list2[j])
            j += 1

    if i < len(list1): list3.extend(list1[i:])

    if j < len(list2): list3.extend(list2[j:])

    return list3

!

We want to be able to merge to sorted list into a larger list.

!

Bottom-up Mergesort

!

Bottom-up Mergesort

We start with a list of arbitrary lenght $n$.

Bottom-up Mergesort

def mergesort(list):
    sortedness = 1

    while sortedness < len(list):
        for i in range(0, len(list), 2*sortedness):
            list1 = list[i:i+sortedness]
            list2 = list[i+sortedness:i+2*sortedness]
            list3 = merge(list1, list2)
            list[i:i+len(list3)] = list3
        sortedness = 2 * sortedness

    return list

Bottom-up Mergesort

Bottom-up Mergesort

!

!

!

The conquer stage always starts with the bottom of the hierarchy.

Analysis: correctness

!

def mergesort(list):
    sortedness = 1

    while sortedness < len(list):
        for i in range(0, len(list), 2*sortedness):
            list1 = list[i:i+sortedness]
            list2 = list[i+sortedness:i+2*sortedness]
            list3 = merge(list1, list2)
            list[i:i+len(list3)] = list3
        sortedness = 2 * sortedness

    return list

!

Loop-invariance:

At the end of each loop,

for all $L[i\Delta \dots (i+1)\Delta]$ is sorted for all $i$, where $\Delta$ is the sortedness at the end of the iteration.

Overall correctness

The loop terminates when $\Delta$ exceeds the length of the list, so the entire list is sorted by the loop invariance.

Guaranteed termination

Each iteration strictly increases the sortedness $\Delta$ (by doubling it).

Analysis: performance

!

def mergesort(list):
    sortedness = 1

    while sortedness < len(list):
        for i in range(0, len(list), 2*sortedness):
            list1 = list[i:i+sortedness]
            list2 = list[i+sortedness:i+2*sortedness]
            list3 = merge(list1, list2)
            list[i:i+len(list3)] = list3
        sortedness = 2 * sortedness

    return list

!

How many iterations if there are $n$ elements in the list?

Each iteration grows the sortedness by a factor of 2 up to $n$. So, we need $\log_2(n)$ iterations.

How many instructions does each iteration take?

Each iteration has four steps. Each step requires no more than a linear scan of the list. We can guarantee that number of instructions do not exceed: $k\cdot n$ for some constant $k$.


Putting it together:

mergesort() requires no more than $k\cdot n\log(n)$ number of instructions.

Top-down mergesort

!

Top-down mergesort

def mergesort(L):
    n = len(L)
    if n <= 1:
        return L
    else:
        L1 = mergesort(L[:n/2])
        L2 = mergesort(L[n/2:])
        return merge(L1, L2)

Top-down mergesort: Analysis

!

def mergesort(L):
    n = len(L)
    if n <= 1:
        return L
    else:
        L1 = mergesort(L[:n/2])
        L2 = mergesort(L[n/2:])
        return merge(L1, L2)

!

Correctness:

Induction on depth of recursion.

Base case: depth = 0

If there is no recursion, than the list $L$ is such that $|L|\leq 1$.

Inductive case: depth = $k$, and assume that all mergesort at level $< k$ are correct.

Let $L_1$ and $L_2$ be the two sublists. The invocations mergesort($L_i$) use $< k$ recursions, and thus, are correct.

Since $L_1$ and $L_2$ are properly sorted, and that merge() is correct, we conclude that mergesort(L) is correct.

Top-down mergesort: Analysis

!

def mergesort(L):
    n = len(L)
    if n <= 1:
        return L
    else:
        L1 = mergesort(L[:n/2])
        L2 = mergesort(L[n/2:])
        return merge(L1, L2)

!

Performance

Let $n = |L|$. How much time does it take to complete top-down mergesort($L$)?

Let $T(n)$ = time of mergesort($L$).

\begin{eqnarray*} T(n) &\simeq& T(n/2) + T(n/2) + n \\ &=& 2 T(n/2) + n \end{eqnarray*}

!

We will develop an array of techniques to solve recurrence equations to analyze recursive algorithms.

!

Spoiler Alert:

$$ T(n) = n\log(n) + n $$

Conclusion

!

Challenge

Is MERGESORT a stable sorting algorithm?

Recall:

Stable sorting requires an additional condition on the permutation:

$$ \forall i,j\in [0, \mathrm{length}(x)],\ i < j\ \mathrm{and}\ f(x[i], x[j]) = 0 \implies \pi(i) < \pi(j) $$

back
© KEN PU, 2016