research-article

Optimal hashing-based time-space trade-offs for approximate near neighbors

Authors:
Alexandr Andoni (Columbia)
Thijs Laarhoven (IBM Research Zürich)
Ilya Razenshteyn (MIT CSAIL)
Erik Waingarten (Columbia)

SODA '17: Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, January 2017, Pages 47–66

Published: 16 January 2017

ABSTRACT

We show tight upper and lower bounds for time-space trade-offs for the c-approximate Near Neighbor Search problem. For the d-dimensional Euclidean space and n-point datasets, we develop a data structure with space n^{1+ρ_u+o(1)} + O(dn) and query time n^{ρ_q+o(1)} + dn^{o(1)} for every ρ_u, ρ_q ≥ 0 with:

c² √ρ_q + (c² − 1) √ρ_u = √(2c² − 1).    (0.1)

In particular, for the approximation c = 2 we get:

• Space n^{1.77...} and query time n^{o(1)}, significantly improving upon known data structures that support very fast queries [IM98, KOR00];

• Space n^{1.14...} and query time n^{0.14...}, matching the optimal data-dependent Locality-Sensitive Hashing (LSH) from [AR15];

• Space n^{1+o(1)} and query time n^{0.43...}, making significant progress in the regime of near-linear space, which is arguably of the most interest for practice [LJW⁺07].
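
The three regimes listed above can be checked numerically. The sketch below assumes the trade-off (0.1) has the form c²√ρ_q + (c² − 1)√ρ_u = √(2c² − 1), as given in the full paper; the function names `rho_u_for`, `rho_q_for`, and `balanced_rho` are hypothetical helpers introduced only for this illustration, and the o(1) terms are ignored.

```python
import math


def rho_u_for(c: float, rho_q: float) -> float:
    """Space exponent rho_u on the assumed curve (0.1) for a given query exponent rho_q."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    rhs = math.sqrt(2 * c * c - 1)
    root = (rhs - c * c * math.sqrt(rho_q)) / (c * c - 1)
    if root < -1e-12:
        raise ValueError("rho_q lies beyond the endpoint of the curve")
    return max(root, 0.0) ** 2


def rho_q_for(c: float, rho_u: float) -> float:
    """Query exponent rho_q on the assumed curve (0.1) for a given space exponent rho_u."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    rhs = math.sqrt(2 * c * c - 1)
    root = (rhs - (c * c - 1) * math.sqrt(rho_u)) / (c * c)
    if root < -1e-12:
        raise ValueError("rho_u lies beyond the endpoint of the curve")
    return max(root, 0.0) ** 2


def balanced_rho(c: float) -> float:
    """Point with rho_u == rho_q, which reduces to 1 / (2c^2 - 1)."""
    return 1.0 / (2 * c * c - 1)


if __name__ == "__main__":
    c = 2.0
    print("fast queries:      rho_u =", rho_u_for(c, 0.0))   # 7/9
    print("balanced:          rho   =", balanced_rho(c))     # 1/7
    print("near-linear space: rho_q =", rho_q_for(c, 0.0))   # 7/16
```

For c = 2 the three values are 7/9, 1/7, and 7/16, which agree with the exponents 1.77..., 0.14..., and 0.43... quoted in the list.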

This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor c > 1, improving upon [Kap15]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on Spherical Locality-Sensitive Filtering [BDGL16] and data-dependent hashing [AINR14, AR15].
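
One way to see why near-linear space is compatible with sublinear query time for every c > 1 is to set ρ_u = 0 in the trade-off and solve for ρ_q. The derivation below is a sketch that assumes (0.1) has the form c²√ρ_q + (c² − 1)√ρ_u = √(2c² − 1), as given in the full paper, and ignores the o(1) terms.

```latex
\[
\rho_u = 0
\;\Longrightarrow\;
c^2 \sqrt{\rho_q} = \sqrt{2c^2 - 1}
\;\Longrightarrow\;
\rho_q = \frac{2c^2 - 1}{c^4}
       = 1 - \frac{(c^2 - 1)^2}{c^4} < 1
\quad \text{for all } c > 1 .
\]
```

For c = 2 this gives ρ_q = 7/16 = 0.4375, consistent with the n^{0.43...} query time quoted in the abstract.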

Our matching lower bounds are of two types: conditional and unconditional. First, we prove tightness of the whole trade-off (0.1) in a restricted model of computation, which captures all known hashing-based approaches. We then show unconditional cell-probe lower bounds for one and two probes that match (0.1) for ρ_q = 0, improving upon the best known lower bounds from [PTW10]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than the one-probe bound. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.



Published in
SODA '17: Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms
January 2017
2756 pages
Program Chair: Philip N. Klein, Brown University
Publisher: Society for Industrial and Applied Mathematics, United States
Overall Acceptance Rate: 411 of 1,322 submissions, 31%