Binary Search

Wiki

In computer science, binary search, also known as half-interval search or logarithmic search, is a search algorithm that finds the position of a target value within a sorted array. It compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until it is successful.
Binary search runs in at worst logarithmic time, making O(logn) comparisons, where n is the number of elements in the array and log is the binary logarithm; and using only constant O(1) space. Although specialized data structures designed for fast searching—such as hash tables—can be searched more efficiently, binary search applies to a wider range of search problems.

Although the idea is simple, implementing binary search correctly requires attention to some subtleties about its exit conditions and midpoint calculation.

Implementation issues

  • If the midpoint of the span is calculated as (L + R) / 2, then the value of L + R may exceed the range of integers of the data type used to store the midpoint, even if L and R are within the range. This can be avoided by calculating the midpoint as L + (R − L) / 2.

  • An infinite loop may occur if the exit conditions for the loop—or equivalently, recursive step—are not defined correctly. Once L exceeds R, the search has failed and must convey the failure of the search. In addition, the loop must be exited when the target element is found, or in the case of an implementation where this check is moved to the end, checks for whether the search was successful or failed at the end must be in place.

  • If the search space consists only of integers, test your algorithm on a two-element set to be sure it doesn’t lock up

  • Verify that the lower and upper bounds are not overly constrained: it’s usually better to relax them as long as it doesn’t break the predicate

  • Clarify what exactly you are looking for and then make sure each round your searching space will not lose it

Examples

Valid Perfect Square

ven a positive integer num, write a function which returns True if num is a perfect square else False.
Note: Do not use any built-in library function such as sqrt.
Example 1:
Input: 16
Returns: True

Example 2:
Input: 14
Returns: False

1
2
3
4
5
6
7
8
9
10
bool isPerfectSquare(int num) {
long l = 0, r = num, m = 0;
while(l < r) {
int m = l+((r-l)>>1), t = m*m;
if(t == num) return true; //quite essential
else if(t > num) r = m;
else if(t < num) l = m+1;
}
return l*l == num;
}

Longest Increasing Subsequence

Given an unsorted array of integers, find the length of longest increasing subsequence.
For example,
Given [10, 9, 2, 5, 3, 7, 101, 18],
The longest increasing subsequence is [2, 3, 7, 101], therefore the length is 4. Note that there may be more than one LIS combination, it is only necessary for you to return the length.
Your algorithm should run in O(n2) complexity.
Follow up: Could you improve it to O(n log n) time complexity?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
class Solution {
public:
int lowerBound(vector<int>& nums, int a) {
if(nums.empty()) return 0;
int l = 0, r = nums.size()-1;
while(l < r) {
int m = l+((r-l)>>1);
if(nums[m] < a) l = m+1;
else r = m;
}
return nums[l]<a? l+1 : l;
}
int lengthOfLIS(vector<int>& nums) {
vector<int> v;
for(int i = 0; i < nums.size(); ++i) {
int pos = lowerBound(v, nums[i]);
if(pos == v.size()) v.push_back(nums[i]);
else v[pos] = nums[i];
}
return v.size();
}
};

More C++ way using lower_bound, which means the previous binary search is doing the lower_bound thing.

1
2
3
4
5
6
7
8
9
int lengthOfLIS(vector<int>& nums) {
vector<int> collector;
for(int i = 0; i < nums.size(); ++i) {
auto iter = std::lower_bound(collector.begin(), collector.end(), nums[i]);
if(iter == collector.end()) collector.push_back(nums[i]);
else *iter = nums[i];
}
return collector.size();
}

Find the Duplicate Number

Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.
Note:
You must not modify the array (assume the array is read only).
You must use only constant, O(1) extra space.
Your runtime complexity should be less than O(n2).
There is only one duplicate number in the array, but it could be repeated more than once.

1
2
3
4
5
6
7
8
9
10
11
12
int findDuplicate(vector<int>& nums) {
int l = 1, r = nums.size()-1;
while(l < r) {
int m = l+(r-l)/2;
int count = 0;
for(int i = 0; i < nums.size(); ++i)
if(l<=nums[i] && m>=nums[i]) count++;
if(count <= m-l+1) l = m+1;
else r = m;
}
return l;
}

Another neat solution using loop searching technique.

1
2
3
4
5
6
7
8
9
10
11
12
13
int findDuplicate(vector<int>& nums) {
int slow = nums[0], fast = nums[nums[0]];
while(slow != fast) {
slow = nums[slow];
fast = nums[nums[fast]];
}
fast = 0;
while(fast != slow) {
slow = nums[slow];
fast = nums[fast];
}
return slow;
}

First Bad Version

You are a product manager and currently leading a team to develop a new product. Unfortunately, the latest version of your product fails the quality check. Since each version is developed based on the previous version, all the versions after a bad version are also bad. Suppose you have n versions [1, 2, …, n] and you want to find out the first bad one, which causes all the following ones to be bad. You are given an API bool isBadVersion(version) which will return whether version is bad. Implement a function to find the first bad version. You should minimize the number of calls to the API.

1
2
3
4
5
6
7
8
9
int firstBadVersion(int n) {
int l = 1, r = n, m = 0;
while(l < r) {
m = l+((r-l)>>1);
if(isBadVersion(m)) r = m;
else l = m+1;
}
return l;
}

H-Index II

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher’s h-index.

According to the definition of h-index on Wikipedia: “A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each.”

For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.
Note: If there are several possible values for h, the maximum one is taken as the h-index.
What if the citations array is sorted in ascending order? Could you optimize your algorithm?

1
2
3
4
5
6
7
8
9
10
11
int hIndex(vector<int>& citations) {
if(citations.empty()) return 0;
int size = citations.size();
int l = 0, r = size, m = 0;
while(l < r) {
m = l+(r-l)/2;
if(citations[m] >= size-m) r = m;
else l = m+1;
}
return size-l;
}

if we change the termination to l <= r then correspondingly r = m-1

Search a 2D Matrix II

Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:

Integers in each row are sorted in ascending from left to right.
Integers in each column are sorted in ascending from top to bottom.
For example,
Consider the following matrix:
[
[1, 4, 7, 11, 15],
[2, 5, 8, 12, 19],
[3, 6, 9, 16, 22],
[10, 13, 14, 17, 24],
[18, 21, 23, 26, 30]
]
Given target = 5, return true.
Given target = 20, return false.

1
2
3
4
5
6
7
8
9
10
bool searchMatrix(vector<vector<int>>& matrix, int target) {
if(matrix.empty()) return false;
int r=matrix.size()-1, c=0;
while(r>-1 && c<matrix[0].size()) {
if(matrix[r][c] < target) c++;
else if(matrix[r][c] > target) r--;
else return true;
}
return false;
}

Find Minimum in Rotated Sorted Array II

Suppose a sorted array is rotated at some pivot unknown to you beforehand. (i.e., 0 1 2 4 5 6 7 might become 4 5 6 7 0 1 2). Find the minimum element. The array may contain duplicates.

1
2
3
4
5
6
7
8
9
10
11
12
int findMin(vector<int>& nums) {
if(nums.empty()) return 0;
int size = nums.size();
int l = 0, r = size-1;
while(l < r) {
int m = (l+r)/2;
if(nums[m] > nums[r]) l = m+1;
else if(nums[m] < nums[r]) r = m;
else r--; //removing duplicates only
}
return nums[r];
}

Search in Rotated Sorted Array II

Follow up for “Search in Rotated Sorted Array”: What if duplicates are allowed? Write a function to determine if a given target is in the array.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
bool search(vector<int>& nums, int target) {
if(nums.empty()) return false;
int l = 0, r = nums.size()-1;
while(l < r)
{
int m = l+(r-l)/2; //quite essential
if(nums[m] == target) return true;
if(nums[m] > nums[r]) {
if(nums[l]<=target && nums[m]>target) r = m;
else l = m+1;
}
else if(nums[m] < nums[r]) {
if(nums[m]<target && nums[r]>=target) l = m+1;
else r = m;
}
else r--; //removing duplicates only
}
return nums[r] == target;
}

Conclusion

  • termination l < r - r = m and l = m+1 and after searching l == r
  • termination l <= r - r = m-1 and l = m+1 and after searching l > r
  • always clearly know what you are searching for and keep that valid in the searching space