FALCONN

FALCONN (FAst Lookups of Cosine and Other Nearest Neighbors) is a C++ library with a Python wrapper for similarity search over high-dimensional data. It supports cosine similarity and Euclidean distance. The main ingredient of FALCONN is a Locality-Sensitive Hashing (LSH) family for cosine similarity.

See the GitHub repo for the source code and documentation (released under the MIT license), or download version 1.2 directly. To install the PyPI package, run pip install falconn in a terminal.
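To give a feel for what a Locality-Sensitive Hashing family for cosine similarity does, here is a minimal sketch in plain Python. It uses the classic random-hyperplane scheme, chosen here only because it is short; it is not FALCONN's own hash family and does not use the FALCONN API. All function names (make_hyperplanes, hash_vector, build_table, query) are hypothetical and exist only for this illustration.

```python
import math
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian vectors; each one defines a hyperplane."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0.0 else 0
        for h in hyperplanes
    )


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_table(points, hyperplanes):
    """Group point indices by hash value (a single hash table)."""
    table = {}
    for idx, point in enumerate(points):
        table.setdefault(hash_vector(point, hyperplanes), []).append(idx)
    return table


def query(q, points, table, hyperplanes):
    """Return the index of the most similar point in q's bucket, or None."""
    candidates = table.get(hash_vector(q, hyperplanes), [])
    if not candidates:
        return None
    return max(candidates, key=lambda i: cosine(q, points[i]))


if __name__ == "__main__":
    rng = random.Random(1)
    points = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
    planes = make_hyperplanes(num_bits=8, dim=16)
    table = build_table(points, planes)
    print(query(points[3], points, table, planes))
```

Vectors separated by a small angle agree on most bits, so they tend to share a bucket, and only that bucket is scanned at query time. A single table like this one misses many true neighbors; practical LSH indexes typically combine several tables to raise accuracy.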

Benchmarks

The results below compare FALCONN with other open-source similarity search algorithms. The dataset consists of word vectors produced by GloVe and has 1.2M points in 100 dimensions. The results for the other algorithms are taken from ann-benchmarks, created by Erik Bernhardsson. Note that FALCONN is especially good in the high-accuracy regime (0.8 or more).

The plot shows the accuracy of retrieving the 10 closest data points against the number of queries per second.
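As a rough illustration of how the two plotted quantities could be measured, the sketch below computes accuracy and queries per second for an arbitrary search function. It assumes that "accuracy" means the fraction of the true 10 nearest neighbors that the search returns (often called recall at 10); the exact definition used by ann-benchmarks may differ. The names recall_at_k, evaluate, and search_fn are hypothetical.

```python
import time


def recall_at_k(retrieved, ground_truth, k=10):
    """Fraction of the true k nearest neighbors found in the retrieved top k."""
    truth = set(ground_truth[:k])
    if not truth:
        return 1.0  # nothing to find, so count it as fully accurate
    return len(truth.intersection(retrieved[:k])) / len(truth)


def evaluate(search_fn, queries, ground_truths, k=10):
    """Run search_fn(query, k) on every query.

    Returns (mean accuracy, queries per second). Only the search calls are
    timed; the accuracy computation happens afterwards.
    """
    start = time.perf_counter()
    results = [search_fn(q, k) for q in queries]
    elapsed = time.perf_counter() - start

    mean_recall = sum(
        recall_at_k(r, g, k) for r, g in zip(results, ground_truths)
    ) / len(queries)
    qps = len(queries) / elapsed if elapsed > 0 else float("inf")
    return mean_recall, qps
```

Here search_fn stands for any function that takes a query and k and returns a list of point identifiers, and ground_truths holds the exact nearest neighbors for each query, for example from a brute-force scan. Each algorithm setting then yields one (accuracy, queries per second) point on the plot.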

Publications

The underlying algorithms are described and analyzed in the following paper:

Authors

FALCONN is designed and implemented by: It grew out of a research project joint with: (see the paper above). If you have questions or feedback about FALCONN, write to falconn.lib@gmail.com.

© 2015–2016