Data Science

Binary Stride - A Variant On Binary Search

charles.loboJuly 27, 2017

In this blog, let’s take a look at an iconic staple of computer science algorithms – the trusty binary search and see if it holds any surprises. Everyone knows, of course, that binary search is one of the most effective ways to search a sorted array – splitting it in half at every iteration to give us an O(logn) solution.

The way it works is also simple and elegant – find the midpoint and go left or right. Think about how you naturally look for words in a dictionary or numbers in a phone book and you’ve got the idea.

I was first intrigued by binary search when I read about it in the excellent book "Beautiful Code". Here I learned that although the basic idea of binary search is simple, its implementation can be surprisingly tricky!

When Jon Bentley assigned binary search as a problem in a course for professional programmers, he found that ninety percent failed to provide a correct solution after several hours of working on it! Another study published in 1988 shows that accurate code for it is was only found in five out of twenty textbooks. Fascinatingly, Bentley's own implementation of binary search, published in his 1986 book Programming Pearls, contained an overflow error that remained undetected for over twenty years. The Java programming language library implementation of binary search had the same overflow bug for more than nine years.

What fascinated me more was there is another variant of the binary search algorithm that I call the “binary stride” version. This version helps us look at this old problem in a different and interesting way. I will describe both below.

Standard Binary Search

This is the correct version of the commonly known algorithm:

[objc] int binary_search(int a[], int sz, int needle) { int l = 0, r = sz - 1; while(l &amp;lt;= r) { int mid = (l + r) &amp;gt;&amp;gt; 1; if(a[mid] == needle) return mid; if(a[mid] &amp;lt; needle) l = mid+1; else r = mid-1; } return -1; } [/objc]

Binary Stride Version

Now let’s look at the stride variant of binary search. Here the idea is slightly different and very interesting.

First let’s be clear that binary search is only efficient when we have random access. If we have to walk step by step (like a linked list) then a linear search would make more sense. So we know we can make jumps to random points in our search array efficiently. The stride version uses this in it’s idea - let’s make large jumps and only slow the speed as we get closer to our target.

The search goes through the array from left to right, and the initial jump length is n/2. At each step, the jump length will be halved: first n/4, then n/8, n/16, etc., until finally the length is 1. After the jumps, either the target element has been found or we know that it does not appear in the array.

The following code implements the binary stride version:

[objc] int binary_stride(int a[], int sz, int needle) { int pos = 0; for(int stride = sz/2;stride &amp;gt;= 1;stride /= 2) { while(pos+stride &amp;lt; sz &amp;amp;&amp;amp; a[pos+stride] &amp;lt;= needle) pos += stride; } if(a[pos] == needle) return pos; return -1; } [/objc]

This formulation can sometimes be pretty useful. For example, when we need to find the point in an array where a graph becomes positive (or some function changes). This is useful because we often would like to minimize some cost or find some other optimum point.

In that case, the stride version is a very natural fit:

[objc] int find_crossover_point(int a[], int sz) { int pos = 0; for(int stride = sz/2;stride &amp;gt;= 1;stride /= 2) { while(fn(a[pos+stride]) &amp;lt;= 0) pos += stride; } return pos; } [/objc]

I hope you found the binary stride as interesting as I did and have a new tool for your thinking toolbox!