Algorithm: the core idea of binary search and problem solving (novel)

Overview of binary search:

Binary search, also known as half search, is a relatively simple algorithm that can quickly find satisfactory answers in an ordered array with O(logn) time complexity. When n is very large, the efficiency of binary search is very high compared with traversing the array. Although the idea of binary search is very simple to understand, it may be full of loopholes if you can't thoroughly understand the algorithm and analyze specific problems when it's time to do the problem. As the online information says:

"Binary search is easy to write, but it is difficult to write right. According to statistics, only 10% of programmers can write bug free binary search code. The causes of errors mainly focus on the selection of judgment conditions and boundary values, which can easily lead to boundary crossing or dead cycle."

Today, I'd like to share my core ideas and experience of binary search.

Core idea:

It must be difficult for you to understand if you only explain orally without combining specific examples. Here I have found several different boundary situations of binary search questions. Take you to analyze the corresponding ideas.

Common binary search problem solution (pseudo code)

  int l = 0; //L < = minimum value of answer range
  int r = n; //R > = maximum value of answer range
  while(l < r){ //The while loop condition l < R can make any problem, so there is no need to change it
    long long mid = (l + r) >> 1;  //Sometimes the mid should be (l + r + 1) > > 1, which is affected by the values of l and r below
    if(a[mid] < target){ //The boundary conditions here vary according to the specific topic, and will affect the values of l and r below
      l = mid + 1; //It should be combined with specific conditions
    }else{
      r = mid; //It should be combined with specific circumstances
    }
  }
  return l; //When the while loop condition is l < R, the final answer is always L

The above only lists the basic structure of binary search problem solution, not the template for which problem

Browsing my notes, you will find that different binary search questions are only fine tuned in the while loop, and the code outside the while loop is almost the same

Let me briefly analyze several types of questions. I hope you can understand my core idea and make it all accessible

Note: my classification of binary search questions may be different from that of other leaders. I classify binary search questions according to the value of median mid,

Because there are only two kinds of mid values: Mid = (L + R) > > 1 or mid = (L + R + 1) > > 1

If you explain it according to different subject situations and different boundary judgment conditions, Xiaobai will be easily confused.

In fact, in a topic situation, the boundary conditions can be nums [mid] < target, nums [mid] < = target, etc., but the values of l and r will be different for different boundary conditions.

Therefore, it will be complicated and chaotic to divide the situation according to the boundary conditions, but there are only two cases according to the value of mid.

1. Median mid = (L + R + 1) > > 1

Given an ordered (ascending) integer array of # n # elements # nums and a target value # target, write a function to search for the target in # nums # and return the subscript if the target value exists, otherwise return - 1

class Solution {
public:
    int search(vector<int>& nums, int target) {
        int l = 0;
        int r = nums.size() - 1;
        //The answer is in the interval with the subscript [l,r] of the array element
        while(l < r){
            long long mid = (l + r + 1) >> 1; //Because l=mid, when taking half value, mid should take ceil((l+r)/2) first to avoid dead cycle
            if(nums[mid] <= target){
                l = mid; //nums[mid] may be the answer
            }else{
                r = mid - 1;//If num [mid] is definitely not the answer, kick the mid out of the [l,r] range where the answer is located
            }
        }
        if(nums[l] == target) return l;
        return -1;
    }
};
This question target May not exist, Array in l and r All elements between may be the answer.

When judging, when the element <= target May be the answer if the element >target, Then it must not be the answer.
So,
When nums[mid] <= target Time, mid The value pointed to may be the answer,therefore l=mid, if l=mid+1,Then the element is not where the answer may be[l,r]Interval
else When nums[mid] > target Time, mid The value pointed to cannot be the answer,All r = mid - 1,Kick the element out[l,r]section

Here's the analysis,l and r The conditional value of has been written, only mid The expression for has not been determined yet.
Why(l + r + 1) >> 1, instead of(l + r) >> 1 And?

According to the above l and r The value of can be judged.
When l+1=r In case of, 
l The element referred to may be the answer,therefore l Will always be after judgment=mid,
if mid = (l + r) >> 1, that mid Will always=l
 Then it's in a dead circle
 and mid = (l + r + 1) >> 1, that mid After calculation, it will be equal to r,You can get the answer normally at this time.

Source: LeetCode

Link: https://leetcode-cn.com/problems/binary-search

2. Median mid = (L + R) > > 1

Give you a sorted character list letters, which contains only lowercase English letters. Another target letter, target, is given. Please find the smallest letter larger than the target letter in this sequence table.

In comparison, letters appear in a circular sequence. for instance:

If the target letter target = 'z' and the character list is' letters = ['a', 'b'], the answer returns' a '

class Solution {
public:
    char nextGreatestLetter(vector<char>& letters, char target) {
        int n = letters.size();
        int l = 0;
        int r = n;
        //The answer is in the interval with the subscript [l,r] of the array element
        while(l < r){
            long long mid = (l + r) >> 1;
            if(letters[mid] <= target){ //The answer must be > target. If letters [mid] < = target, it must not be the answer
                l = mid + 1; //letters[mid] can't be the answer, so kick the mid out of the [l,r] range where the answer is located
            }else{
                r = mid; //letters[mid] may be the answer
            }
        }
        return letters[l % n];
    }
};
This question, when the element > target Time, May be the answer,  If element <= target, Then it must not be the answer
 When letters[mid] <= target Time, mid The value pointed to cannot be the answer,All l = mid + 1,Kick the element out[l,r]section
else When letters[mid] > target Time, mid The value pointed to may be the answer,therefore r=mid, if r=mid-1,Then the element is not where the answer may be[l,r]Interval

Here's the analysis,l and r The conditional value of has been written, only mid The expression for has not been determined yet.
Why(l + r) >> 1, instead of(l + r + 1) >> 1 And?

According to the above l and r The value of can be judged.
When l+1=r In case of, 
r The element referred to may be the answer,therefore r Will always be after judgment=mid,
if mid = (l + r + 1) >> 1, that mid Will always=r
 Then it's in a dead circle
 and mid = (l + r) >> 1, that mid After calculation, it will be equal to l,You can get the answer normally at this time.

Source: LeetCode

Link: https://leetcode-cn.com/problems/find-smallest-letter-greater-than-target

Summary:

After careful understanding, do you think it will be much clearer to classify binary search according to mid value than to classify according to boundary conditions? In fact, when doing questions,

All codes except while (l < R) {} are similar. Just change them slightly according to the problem, such as the initial values of l and r

When writing the code in the while loop, you can first make mid = (L + R) > > 1, and then write down. After you select the boundary conditions and write the value equations of L and R according to the boundary conditions, analyze the special situation. When l + 1 =r, according to the value equations of L and r you write, whether it will cause an endless loop. If so, change the mid to (L + R + 1) > > 1

Whether the value of l = mid+1 or l = mid, and whether the value of r = mid - 1 or r = mid should be judged according to the boundary conditions you choose. The values of L and R are different under different boundary conditions, but the idea is the same. If the element pointed to by mid may be the answer, you can't kick mid out of the [l,r] range where the answer is located. Specifically, combined with my above two examples.

If the above explanation is wrong, please correct it in the comment area!

Added by 2705ap on Thu, 03 Feb 2022 03:54:58 +0200