Skip to main content

Binary Search the Answer (BSTA)

 First and foremost, this post serves as my first ever competitive programming-related post and also as a testing of the equations and code snippets in this blog. Therefore, tips and advice are highly appreciated.

In this post, I would like to write about a commonly used Divide and Conquer algorithm called Binary Search The Answer (BSTA). Many programmers might already know about binary search but some might not be familiar with BSTA. 

What is BSTA?

To recall, binary search is an algorithm for searching a value in a static sorted array by considering the median value and reducing the problem's scope by half, as shown here. Binary search the answer uses a similar principle to a normal binary search of halving the range of the problem. However, we do not binary search a position with a certain value. Instead, we binary search the maximum/minimum value that satisfies the given constraints. In one iteration, we consider the average $mid$ of the current possible range and check whether the condition still satisfy for $mid$, using a boolean $check()$ function that tells us whether we can use the current value or not. Then, we would halve the range accordingly. 

When and why BSTA?

Unfortunately, not all problems can be solved using BSTA. So when can we use BSTA?
We can use BSTA only if the function is monotonic. This means the function can be non-decreasing or non-increasing. For example, the function $f(x) = \sqrt{x}$ is a monotonic function. As you can see from the graph below,
 
the function is a non-decreasing function.
If we compel to use it in a non-monotonic function, the algorithm might terminate in the local maxima or minima.

Some of you might be wondering why we should bother doing this. The reason is that this algorithm might prove useful, if checking or simulating a solution to the problem is relatively easier compared to finding the value directly.

Let us see a simple demonstration of BSTA algorithm in action.

Easy problem - Largest Cube 

Let us say you want to find the largest cube number that is lesser than or equal to $70$.
In this case, the function is $f(x)=x^3$  is an increasing function for positive values of $x$. Therefore, we can use BSTA. We would check if f(x) is lesser than or equal to $70$. Say we want to check for x in the interval $[1,5]$. (In other problems, they usually will not give the interval so you must deduce it yourself.)
In the first iteration, $mid$ is $3$. $f(3)=27 \leq 70$. Therefore, we should shift the range interval into $[4,5]$ and keep $3$ as a possible answer.
In the second iteration $mid$ is $4$. $f(4)=64 \leq 70$. Therefore, we should shift the range interval into $[5,5]$ and keep $4$ as a possible answer.
In the last iteration $mid$ is $5$. $f(5)=125 > 70$. Therefore, we cannot use $5$ as an answer. We can stop because the next interval $[6,5]$ is not valid anymore.
Thus, we found the largest cube, which is $4^3=64$, with only $3$ times of checking, as supposed to the $5$ times of brute checking.

To understand the algorithm even better, let us look at an harder problem.

Multiplication Table (Codeforces - 448D)

The abridged problem statement is as follows:
Given an $n$ by $m$ multiplication table, find the $k^{th}$ smallest value in that table.
Constraints: $1 \leq n,m \leq 5 \cdot 10^5$
For example, look at the picture below.

In this table, $3$ is the $4^{th}$ smallest value and $2$ is both the $2^{nd}$ and $3^{rd}$ smallest value.

To help us solve this problem, we can rephrase the problem to a more approachable form.
Given an $n$ by $m$ multiplication table, let $f(x)$ be the amount of numbers that is smaller than or equal to $x$. Find the smallest integer value $a$ such that $f(a) \geq k$.

Notice that by defining the function in that way, the function $f(x)$ is a monotonic non-decreasing function. This can be easily seen since the counted elements in $f(a)$ is a subset of $f(b)$ for $a<b$.  Therefore, BSTA is a reasonable next step.
By using BSTA, we can further reduce the problem to calculate $f(x)$ in a short amount of time. Let us visualize the chosen numbers in an example test case with $n=m=5,x=7$.
Here, I have highlighted the numbers that are $\leq 7$. 
Notice that for any row $i$, the highlighted cells $(i,j)$ must satisfy the equation $i \cdot j \leq k$. From that, we can deduce that the number of cells highlighted in row $i$ is equal to $min({\lfloor}\frac{x}{i}{\rfloor},m)$. Therefore, $f(x) = \sum_{i=1}^n (min({\lfloor}\frac{x}{i}{\rfloor}, m))$.
Therefore, we can calculate $f(x)$ in $O(N)$ time. The overall complexity would be $O(NlogMAX)$ with $MAX = (5 \cdot 10^5)^2 =  25 \cdot 10^{10}$.

BSTA Implementation

Now, let us implement the problem above.
First, we need to set the range of possible answers. For this problem, the answer lies between $1$ to $n \cdot m$. Then check with function $check(mid)$ whether $f(mid) \leq k$.  If $check(mid)$ returns $True$, update the $r$ to $mid-1$ and make $mid$ as the current best answer. There is no use checking $[mid,r]$ anymore since $mid$ is already the current best value and we want the minimum value. If instead the function returns $False$, then update $l$ to $mid+1$, since $[1,mid]$ must also be $False$ due to the monotonic property. Therefore, BSTA can be implemented as the following.
Note that (with adjusted range) this code can be used for any BSTA in general. The difference would be the function $check(mid)$. For the BSTA to work, there must exist a value $v$ such that the $check()$ function returns $False$ for values $l$ to $v-1$ and $True$ for values $v$ to $r$.

Now we are going to implement for the check function. Recall that function $check(x)$ should return whether or not $f(x) \leq k$. Also, $f(x) =  \sum_{i=1}^n (min({\lfloor}\frac{x}{i}{\rfloor}, m))$. Therefore, we can convert the equation into code as shown below.
After the BSTA algorithm is done, the variable $ans$ would be the answer to the problem.

You can look at the full solution here.

Comments

Post a Comment

Popular posts from this blog

Proof for fun!

This post will be about several proofs of random theorems or formulas that I enjoy proving and I would like to share them with everyone. I hope that everyone could see the beauty of mathematical proofs from this post. Formula 1 For a positive integer $n$, it always holds true that: [\lim_{x \to 0} \frac{1-\prod_{i=1}^n cos(a_i x)}{x^2} = \frac{1}{2} \cdot \sum_{i=1}^n a_i^2] Note for some who might not be used to the notation, $\prod$ here is just like $\sum$ but instead of summation of terms, it is the product of all the terms. For example $\prod_{i=1}^3 i = 1 \cdot 2 \cdot 3 = 6$ Simple Example: [\lim_{x \to 0} \frac{1-cos3x \cdot cos4x \cdot cos5x}{x^2} = \frac{3^2+4^2+5^2}{2} = 25] Proof: In dealing with problem where there exist $n$ term products, we should try to build it from the most simple case of $n=1$, and then try to prove for $n=k+1$ using the equality from $n=k$ (similar to a domino effect ). In mathematics, we call this method as  induction . First step i...

Integral Problems Discussion

In this post, I would like to write a solution to an integral problem (possibly more problems in the future) as well as give insights that are usually not taught, in spite of its importance. Without any longer, let us delve into the problem. Problem 1 Find the value of the integral: [ \int \frac{\sqrt{4x-x^2}}{x} dx ] Insight 1 First of all, in dealing with integrals, we should usually try to avoid forms with square roots . Moreover, it is usually harder to manipulate the equation if we have multiple $x$ terms in a square root. Thus, in this problem, the most troublesome part is $\sqrt{4x-x^2}$. After knowing this, several ideas to simplify this form might come up.  The first most common ways to deal with $x$ and $x^2$ terms is by completing the square . In this case, $4x-x^2$ can be written as $-(x-2)^2+4$. However, the problem with this approach is that the denominator is $x$, so if we want to do a substitution of $u=x-2$, the denominator would be in the form of $u+2$, which...