Closest Pair

Consider a set $S$

points in $\Re^k$ (set of real numbers in $k$

dimensional space). The closest pair problem is to find in $S$

a pair whose distance is the minimum, i.e., find $p_i$

and

, such that $d(p_i, p_j) = min_{k \neq l} \{d(p_k, p_l),$ for all point $p_k, p_l \in S\}$ , where $d(a, b)$

denotes the Euclidean distance between $a$

and

. The brute-force method takes $O(k.n^2)$

time by computing all $O(n^2)$

inter-point distances and taking the minimum; the pair that gives the minimum distance is the closest pair. Here, the $k$

multiplier is due to dimension $d \geq 2$ .

As is well-known, in one dimension, one can solve the problem much more efficiently: since the closest pair of points must occur consecutively on the real line, one can sort these points and then scan them in order to solve the closest pair problem in $O(n\log n)$ time. The time complexity turns out to be the best possible, since the problem has a lower bound of $\Omega(n \log n)$ , following from a linear time transformation from the element uniqueness problem.

But unfortunately there is no total ordering for points in $\Re^k$ for $k \geq 2$ , hence, sorting is not applicable. We will show that by using divide-and-conquer approach, one can solve this problem in $O(n\log n)$ optimal time. Let us consider the case when $k=2$

. In the following, we only compute the minimum distance between the closest pair; the actual identity of the closest pair that realizes the minimum distance can be found easily by some straightforward book-keeping operations. Consider a vertical separating line $\mathcal{V}$ that divides $S$

into

and

such that $\vert S_1\vert = \vert S_2\vert = n/2$ . Let $\delta_i$ denote the minimum distance defined by the closest pair of points in $S_i$

. Observe that the minimum distance defined by the closest pair of points in $S$

is either $\delta_1, \delta_2$ , or $d(p, q)$

for some $p \in S_1$ and $q \in S_2$ . In the former cases, we are done. In the latter, points $p$

and

must lie in the vertical strip of width $\delta = min\{\delta_1, \delta_2\}$ on each side of the separating line $\mathcal{V}$ (see figure

The problem now reduces to that of finding the closest pair between points in $S_1$

and

that lie inside the strip $\mathcal{L}$ of width $2\delta$ . This subset of points $\mathcal{L}$ possesses a special property, known as sparsity, i.e., for each square box (a hypercube of higher dimensions) of length $2\delta$ the number of points in $\mathcal{L}$ is bounded by a constant $c = 4\times3^{k_1}$ , since in each set $S_i$

, there exists no point that lies in the interior of the $\delta$ -ball centered at each point in $S_i$

**Figure:** Divide-and-conquer scheme for closest pair problem. Solutions to subproblems and (a) and candidates must lie in the vertical strip of width $\delta$ on each side of $\mathcal{V}$ (b).
$\includegraphics[scale=0.8]{qconq}$