First and foremost, this post serves as my first ever competitive programming-related post and also as a testing of the equations and code snippets in this blog. Therefore, tips and advice are highly appreciated.
In this post, I would like to write about a commonly used Divide and Conquer algorithm called Binary Search The Answer (BSTA). Many programmers might already know about binary search but some might not be familiar with BSTA.
What is BSTA?
To recall, binary search is an algorithm for searching a value in a static sorted array by considering the median value and reducing the problem's scope by half, as shown here. Binary search the answer uses a similar principle to a normal binary search of halving the range of the problem. However, we do not binary search a position with a certain value. Instead, we binary search the maximum/minimum value that satisfies the given constraints. In one iteration, we consider the average $mid$ of the current possible range and check whether the condition still satisfy for $mid$, using a boolean $check()$ function that tells us whether we can use the current value or not. Then, we would halve the range accordingly.
When and why BSTA?
Unfortunately, not all problems can be solved using BSTA. So when can we use BSTA?We can use BSTA only if the function is monotonic. This means the function can be non-decreasing or non-increasing. For example, the function $f(x) = \sqrt{x}$ is a monotonic function. As you can see from the graph below,
the function is a non-decreasing function.
If we compel to use it in a non-monotonic function, the algorithm might terminate in the local maxima or minima.
Some of you might be wondering why we should bother doing this. The reason is that this algorithm might prove useful, if checking or simulating a solution to the problem is relatively easier compared to finding the value directly.
Let us see a simple demonstration of BSTA algorithm in action.
Easy problem - Largest Cube
Let us say you want to find the largest cube number that is lesser than or equal to $70$.
In this case, the function is $f(x)=x^3$ is an increasing function for positive values of $x$. Therefore, we can use BSTA. We would check if f(x) is lesser than or equal to $70$. Say we want to check for x in the interval $[1,5]$. (In other problems, they usually will not give the interval so you must deduce it yourself.)
In the first iteration, $mid$ is $3$. $f(3)=27 \leq 70$. Therefore, we should shift the range interval into $[4,5]$ and keep $3$ as a possible answer.
In the second iteration $mid$ is $4$. $f(4)=64 \leq 70$. Therefore, we should shift the range interval into $[5,5]$ and keep $4$ as a possible answer.
In the last iteration $mid$ is $5$. $f(5)=125 > 70$. Therefore, we cannot use $5$ as an answer. We can stop because the next interval $[6,5]$ is not valid anymore.
Thus, we found the largest cube, which is $4^3=64$, with only $3$ times of checking, as supposed to the $5$ times of brute checking.
To understand the algorithm even better, let us look at an harder problem.
Multiplication Table (Codeforces - 448D)
The abridged problem statement is as follows:
Given an $n$ by $m$ multiplication table, find the $k^{th}$ smallest value in that table.
Constraints: $1 \leq n,m \leq 5 \cdot 10^5$
For example, look at the picture below.
In this table, $3$ is the $4^{th}$ smallest value and $2$ is both the $2^{nd}$ and $3^{rd}$ smallest value.
To help us solve this problem, we can rephrase the problem to a more approachable form.
Given an $n$ by $m$ multiplication table, let $f(x)$ be the amount of numbers that is smaller than or equal to $x$. Find the smallest integer value $a$ such that $f(a) \geq k$.
Notice that by defining the function in that way, the function $f(x)$ is a monotonic non-decreasing function. This can be easily seen since the counted elements in $f(a)$ is a subset of $f(b)$ for $a<b$. Therefore, BSTA is a reasonable next step.
By using BSTA, we can further reduce the problem to calculate $f(x)$ in a short amount of time. Let us visualize the chosen numbers in an example test case with $n=m=5,x=7$.
Here, I have highlighted the numbers that are $\leq 7$.
Notice that for any row $i$, the highlighted cells $(i,j)$ must satisfy the equation $i \cdot j \leq k$. From that, we can deduce that the number of cells highlighted in row $i$ is equal to $min({\lfloor}\frac{x}{i}{\rfloor},m)$. Therefore, $f(x) = \sum_{i=1}^n (min({\lfloor}\frac{x}{i}{\rfloor}, m))$.
Therefore, we can calculate $f(x)$ in $O(N)$ time. The overall complexity would be $O(NlogMAX)$ with $MAX = (5 \cdot 10^5)^2 = 25 \cdot 10^{10}$.
BSTA Implementation
Now, let us implement the problem above.
First, we need to set the range of possible answers. For this problem, the answer lies between $1$ to $n \cdot m$. Then check with function $check(mid)$ whether $f(mid) \leq k$. If $check(mid)$ returns $True$, update the $r$ to $mid-1$ and make $mid$ as the current best answer. There is no use checking $[mid,r]$ anymore since $mid$ is already the current best value and we want the minimum value. If instead the function returns $False$, then update $l$ to $mid+1$, since $[1,mid]$ must also be $False$ due to the monotonic property. Therefore, BSTA can be implemented as the following.
Note that (with adjusted range) this code can be used for any BSTA in general. The difference would be the function $check(mid)$. For the BSTA to work, there must exist a value $v$ such that the $check()$ function returns $False$ for values $l$ to $v-1$ and $True$ for values $v$ to $r$.
Now we are going to implement for the check function. Recall that function $check(x)$ should return whether or not $f(x) \leq k$. Also, $f(x) = \sum_{i=1}^n (min({\lfloor}\frac{x}{i}{\rfloor}, m))$. Therefore, we can convert the equation into code as shown below.
After the BSTA algorithm is done, the variable $ans$ would be the answer to the problem.
You can look at the full solution here.
jv orzzzzz
ReplyDelete