We all know that binary search is a great algorithm for searching a sorted array, with an average running time of O(log N). It checks the value at the middle index, discards the half that cannot contain the target, and repeats on the remaining half. Follow this link for more on Binary Search.
Interpolation creates new data points within the range of a set of given data points. In many use cases, one is provided with a number of data points that represent the values of a function at certain values of an independent variable, and interpolation is used to estimate the value of the function at an intermediate value of that variable.
Interpolation search may probe a number of different locations depending on the value being searched. For it to work properly, all elements in the array must be sorted and should follow a uniform distribution.
Interpolation lets us estimate the function at intermediate points, for example at x = 2.5.
Suppose f(2) = 20 and f(3) = 30. Since 2.5 lies midway between 2 and 3, we can estimate f(2.5) from f(2) and f(3), i.e.,
(20 + 30)/2 = 25
For our article, we are going to use one famous method for interpolation: **linear interpolation**.
Linear interpolation works by taking two points, say (x1, y1) and (x2, y2), and estimating, or interpolating, the value at a point (x, y) between them using the formula:
y = y1 + (y2 - y1)(x - x1)/(x2 - x1)
When we estimated f(2.5), we used the constant ½; but we can choose a better constant, one that may lead us to the searched element faster. We call this constant K:
K = (data - low)/(high - low)
Here low and high are the values at the endpoints of the search range, and data is the element being searched for.
Estimating with this technique works the same way humans search for a word in a dictionary. We know in advance that if the word starts with "n", we should open the dictionary near the middle, because we know how words are ordered in it. Binary search cannot exploit this knowledge, so it is not the best choice here. Interpolation search therefore picks its next probe using K = (data - low)/(high - low), whereas binary search always uses the midpoint (low + high)/2.
Pseudo Code:
Elements must be sorted and uniformly distributed.
Initialize low = 0 and high = n - 1.
While low <= high, calculate the position mid to probe:
mid = low + ((data - A[low]) * (high - low)) / (A[high] - A[low])
If A[mid] == data, the element is found at index mid.
Otherwise, if data > A[mid], we make low = mid + 1.
Or else we make high = mid - 1.
Code Implementation:
#include <bits/stdc++.h>
using namespace std;

int interpolationSearch(vector<int> A, int data) {
    int low = 0, high = A.size() - 1, mid;
    // The probe formula assumes data lies inside [A[low], A[high]];
    // checking this also avoids division by zero when A[low] == A[high].
    while (low <= high && data >= A[low] && data <= A[high]) {
        if (A[low] == A[high]) return (A[low] == data) ? low : -1;
        mid = low + (((data - A[low]) * (high - low)) / (A[high] - A[low]));
        if (data == A[mid]) return mid;
        if (data > A[mid]) low = mid + 1;
        else high = mid - 1;
    }
    return -1;
}

int main() {
    int n, data;
    cin >> n;
    vector<int> a(n);
    for (int i = 0; i < n; i++) cin >> a[i];
    cin >> data;
    cout << interpolationSearch(a, data);
}
Input:
5
2 6 10 14 17
14
Output:
3
Complexity:
Runtime: The average running time of interpolation search is O(log log N), with a worst case of O(N), which happens when the keys increase exponentially (or are otherwise far from uniformly distributed).
Space: O(1), for the variables low, high, and mid.