FALCONN (FAst Lookups of Cosine and Other Nearest Neighbors) is a C++ library with a Python wrapper for similarity search over high-dimensional data. It supports cosine similarity and the Euclidean distance. The main ingredient of FALCONN is a Locality-Sensitive Hashing family for cosine similarity that is:
See the GitHub repo for the source code and documentation (released under the MIT license), or just download version 1.3.1.
To install the PyPI package, simply type pip install falconn in a terminal.
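To give a feel for what locality-sensitive hashing for cosine similarity does, here is a minimal sketch in plain Python. Note the assumptions: it uses the classic random-hyperplane (sign) hash, which is a simpler textbook scheme and not the hash family implemented in FALCONN, and every name in it (make_hyperplanes, hash_vector, ToyLSHIndex) is hypothetical and not part of the FALCONN API. It is meant only to illustrate the idea of bucketing similar vectors together and comparing a query against the colliding candidates.

```python
import math
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """Hash a vector to a tuple of bits, one sign bit per hyperplane.

    Vectors separated by a small angle agree on most bits, so they are
    likely to receive the same hash.
    """
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0.0 else 0
        for h in hyperplanes
    )


def cosine(u, v):
    """Cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


class ToyLSHIndex:
    """A toy multi-table LSH index (illustrative only, not the FALCONN API)."""

    def __init__(self, points, num_tables=8, num_bits=6, seed=0):
        self.points = points
        dim = len(points[0])
        self.tables = []
        for t in range(num_tables):
            # Each table gets its own independent set of hyperplanes.
            planes = make_hyperplanes(num_bits, dim, seed=seed + t)
            buckets = {}
            for idx, p in enumerate(points):
                buckets.setdefault(hash_vector(p, planes), []).append(idx)
            self.tables.append((planes, buckets))

    def query(self, q):
        """Return the index of the most similar colliding point, or None."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(hash_vector(q, planes), []))
        if not candidates:
            return None
        # Only the candidates are compared exactly, not the whole data set.
        return max(candidates, key=lambda i: cosine(self.points[i], q))


# Small demo on random data (sizes chosen arbitrarily for illustration).
rng = random.Random(1)
points = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
index = ToyLSHIndex(points)
# An indexed point always lands in its own bucket, so it is a candidate.
print(index.query(points[3]))
```

Using several tables trades memory for recall: each additional table gives a true near neighbor another chance to collide with the query, at the cost of storing one more copy of the bucket structure.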
On data sets with about 1 million points in around 100 dimensions, FALCONN typically requires a few milliseconds per query (running on a reasonably modern desktop CPU).
For more detailed results, see Erik Bernhardsson's ann-benchmarks. Let us point out that FALCONN is especially competitive when the RAM budget is quite restrictive, which is not the regime the above benchmarks use.
The underlying algorithms are described and analyzed in the following paper:
© 2015–2016