ABSTRACT
We show tight upper and lower bounds for time-space trade-offs for the c-approximate Near Neighbor Search problem. For the d-dimensional Euclidean space and n-point datasets, we develop a data structure with space n1+ρu+o(1) + O(dn) and query time nρq+o(1) + dno(1) for every ρu, ρq ≥ 0 with:
[EQUATION]
In particular, for the approximation c = 2 we get:
• Space n1.77 ... and query time no(1), significantly improving upon known data structures that support very fast queries [IM98, KOR00];
• Space n1.14... and query time n0.14..., matching the optimal data-dependent Locality-Sensitive Hashing (LSH) from [AR15];
• Space n1+o(1) and query time n0.43..., making significant progress in the regime of near-linear space, which is arguably of the most interest for practice [LJW+07].
This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor c > 1, improving upon [Kap15]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on Spherical Locality-Sensitive Filtering [BDGL16] and data-dependent hashing [AINR14, AR15].
Our matching lower bounds are of two types: conditional and unconditional. First, we prove tightness of the whole trade-off (0.1) in a restricted model of computation, which captures all known hashing-based approaches. We then show unconditional cell-probe lower bounds for one and two probes that match (0.1) for ρq = 0, improving upon the best known lower bounds from [PTW10]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than the one-probe bound. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.
Recommendations
Optimal Data-Dependent Hashing for Approximate Near Neighbors
STOC '15: Proceedings of the forty-seventh annual ACM symposium on Theory of ComputingWe show an optimal data-dependent hashing scheme for the approximate near neighbor problem. For an n-point dataset in a d-dimensional space our data structure achieves query time O(d ⋅ nρ+o(1)) and space O(n1+ρ+o(1) + d ⋅ n), where ρ=1/(2c2-1) for the ...
Optimal time-space trade-offs for non-comparison-based sorting
SODA '02: Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithmsWe study the problem of sorting n integers of w bits on a unit-cost RAM with word size w, and in particular consider the time-space trade-off (product of time and space in bits) for this problem. For comparison-based algorithms, the time-space ...
Time-space trade-offs for predecessor search
STOC '06: Proceedings of the thirty-eighth annual ACM symposium on Theory of ComputingWe develop a new technique for proving cell-probe lower bounds for static data structures. Previous lower bounds used a reduction to communication games, which was known not to be tight by counting arguments. We give the first lower bound for an ...
Comments