This is the second segment of the multi-part series covering various algorithm design paradigms. As the title suggests, today our job is to present, discuss, and learn as much as we can, as briefly and clearly possible, about the divide-and-conquer algorithm technique. It is definitely an important concept in computer science and should be ready to be pulled out of every coder’s toolbox.

Divide and Conquer Algorithm Technique - Merge Sort Problem and Solution (Page 4 of 5 )

Now let’s see another problem, a sorting algorithm: Merge Sort. As its name already suggests, it is a sorting algorithm that is based on “merging.” Simply put, we have a series of unsorted numbers located in a one-dimensional array/vector. Our task is to sort these numbers in increasing order. We can do this using the famous Merge Sort algorithm, which is based on the divide-and-conquer algorithm paradigm.

In the previously presented problem we saw that by dividing a vector at its middle element, we end up with two halves (first element –> middle element; middle element + 1 –> last element). Assuming our vector has n elements, we can write this as: [1 … n/2] and [n/2 … n]. Our approach will be the same; we are going to divide the vector into sub-arrays until we end up with a trivial sub-array. When is this met?

Well, obviously, if a vector contains only one element (a number in our case) then it is already sorted. So this is what we call a trivial solution. There’s nothing left to sort in this case. As a result, our target is to divide the array into sub-arrays until each one of them becomes trivial (containing only one element) and then, using a merging algorithm, we can merge these sub-solutions (each element one by one).

In short, the merging algorithm merges two halves of a specific sub-array which is part of the original array. We already know that those two halves are sorted; merging them is easy because, by using only one if statement, we can pick the lower one and place it into the destination vector. Check out the following pseudo-code.

procedure merge_sort (x[], i, j)

if i < j then

k := (i + j) / 2

merge_sort (x, i, k)

merge_sort (x, k+1, j)

do_merge(x, i, k, j)

end if

end procedure

As you can see, without the actual merging part, this is pretty darn short. We call this procedure like this: merge_sort (x, 1, n) knowing that 1 is the first element of the vector while n stands for the last one. The first condition (if statement) checks whether there’s still something left to divide; if yes, then it divides it into two halves, recursively calling the procedure itself; after all of this is finished, the merging is required.

procedure do_merge (x[], left, middle, right)

for each i := left to middle do

a{i} := x{i}

end for

for each i := middle + 1 to right do

b{i} := x{i}

end for

a{middle+1} = infinite

b{right+1} = infinite

i = left

j = middle + 1

for each k := left to right do

if a{i} < b{j} then

x{k} := a{i}

i := i + 1

else

x{k} := b{i}

j := j + 1

end if

end for

end procedure

This is where things get a little bit interesting; nonetheless, it isn’t as hard as it seems, it's just that being written in pseudo-code makes it look longer (and complex) than it should be. As you can see, we call the do_merge() procedure with 4 parameters. The key parameters are the left, middle, and right—these stand for the actual positions, and with them we can manipulate and go through the vector.

We first build two different vectors (a[] and b[]) containing the first half of the x[] vector (which is specified in the parameter) and the second half, respectively. After this step, we can move on and actually merge these two together. We use a single if statement where the condition is that the lower number goes first; it makes sense. The rest is just variable initialization (infinite is a very large number).

Now let’s see this actually coded in ANSI C. Keep in mind that the following snippet uses compact style (such as a compact if statement) and both of the procedures (mergesort and do_merge) are combined into only one procedure (mergesort). We use a temporary tmp[] vector in which we sort the vector. At the end, we copy its content back to the original a[] and then we free up the temporary tmp[].

void mergesort(int *a, int n)

{

int i, j, k, *tmp, f = n / 2; // initializing the variables

if (n == 1) // if trivial

return;

mergesort(a, f); // first half

mergesort(a + f, n - f); // second half

tmp = (int*)malloc(n*sizeof(int)); // performing the merge

for (i = 0, j = f, k = 0; i < f && j < n;)

tmp[k++] = (a[i] < a[j]) ? a[i++] : a[j++];

// merging the leftovers

while (i < f)

tmp[k++] = a[i++];

while (j < n)

tmp[k++] = a[j++];

for (i = 0; i < n; i++)

a[i] = tmp[i];

free(tmp); // freeing up the temp array

}

As a final note, please don’t forget that in C/C++, vectors/arrays begin from 0 and not 1. This is the reason all of the loops go from 0to n.

Once again, if you want to download the entire source code of the block of code implemented above (Merge Sort) then just click on the button below. Enjoy!