... | @@ -2,26 +2,45 @@ |
... | @@ -2,26 +2,45 @@ |
|
|
|
|
|
An implementation of approximate k-nearest-neighbor search with locality-sensitive hashing (LSH). Given a set of reference points and a set of query points, this will compute the k approximate nearest neighbors of each query point in the reference set.
|
|
An implementation of approximate k-nearest-neighbor search with locality-sensitive hashing (LSH). Given a set of reference points and a set of query points, this will compute the k approximate nearest neighbors of each query point in the reference set.
|
|
|
|
|
|
|
|
```prolog
|
|
|
|
:- use_module('path/to/.../src/methods/adaboost/adaboost.pl').
|
|
|
|
|
|
|
|
%% usage example
|
|
|
|
TrainData = [5.1,3.5,1.4, 4.9,3.0,1.4, 4.7,3.2,1.3, 4.6,3.1,1.5],
|
|
|
|
TestData = [3,2,0, 5,1,4, 0,0,4, 3,3,5, 0,5,5, 2,5,5],
|
|
|
|
lsh_initAndTrainModel(TrainData, 3, 20, 10, 0.0, 99901, 500),
|
|
|
|
lsh_searchWithQuery(TestData, 3, 3, NeighborResultsList, _, DistancesList, _, 0, 0).
|
|
|
|
```
|
|
|
|
|
|
# Available Predicates
|
|
# Available Predicates
|
|
|
|
|
|
* [initModel/8](/PrologMethods/Geometry/lsh#initmodel8)
|
|
* [lsh_initAndTrainModel/7](/PrologMethods/Geometry/lsh#lsh_initandtrainmodel7)
|
|
* [computeRecall/7](/PrologMethods/Geometry/lsh#computerecall7)
|
|
* [lsh_computeRecall/5](/PrologMethods/Geometry/lsh#lsh_computerecall5)
|
|
* [searchWithQuery/12](/PrologMethods/Geometry/lsh#searchwithquery12)
|
|
* [lsh_searchWithQuery/9](/PrologMethods/Geometry/lsh#lsh_searchwithquery9)
|
|
* [searchNoQuery/9](/PrologMethods/Geometry/lsh#searchnoquery9)
|
|
* [lsh_searchNoQuery/7](/PrologMethods/Geometry/lsh#lsh_searchnoquery7)
|
|
* [train/8](/PrologMethods/Geometry/lsh#train8)
|
|
|
|
|
|
|
|
---
|
|
---
|
|
|
|
|
|
[links/resources](/PrologMethods/Geometry/lsh#connected-linksresources)
|
|
[links/resources](/PrologMethods/Geometry/lsh#connected-linksresources)
|
|
|
|
|
|
## **_initModel/8_**
|
|
## **_lsh_initAndTrainModel/7_**
|
|
|
|
|
|
Initiatzes the model and trains it.
|
|
Initiatzes the model and trains it.
|
|
|
|
|
|
```prolog
|
|
```prolog
|
|
%% part of the predicate definition
|
|
%% predicate definition
|
|
initModel( +pointer(float_array), +integer, +integer,
|
|
lsh_initAndTrainModel(ReferenceList, ReferenceRows, NumProj, NumTables, HashWidth, SecondHashSize, BucketSize) :-
|
|
+integer, +integer, +float32, +integer, +integer).
|
|
NumProj >= 0,
|
|
|
|
NumTables >= 0,
|
|
|
|
HashWidth >= 0.0,
|
|
|
|
SecondHashSize > 0,
|
|
|
|
BucketSize > 0,
|
|
|
|
convert_list_to_float_array(ReferenceList, ReferenceRows, array(Xsize, Xrownum, X)),
|
|
|
|
initAndTrainModelI(X, Xsize, Xrownum, NumProj, NumTables, HashWidth, SecondHashSize, BucketSize).
|
|
|
|
|
|
|
|
%% foreign c++ predicate definition
|
|
|
|
foreign(initAndTrainModel, c, initAndTrainModelI( +pointer(float_array), +integer, +integer,
|
|
|
|
+integer, +integer, +float32, +integer, +integer)).
|
|
```
|
|
```
|
|
|
|
|
|
### Parameters
|
|
### Parameters
|
... | @@ -36,15 +55,21 @@ initModel( +pointer(float_array), +integer, +integer, |
... | @@ -36,15 +55,21 @@ initModel( +pointer(float_array), +integer, +integer, |
|
|
|
|
|
---
|
|
---
|
|
|
|
|
|
## **_computeRecall/7_**
|
|
## **_lsh_computeRecall/5_**
|
|
|
|
|
|
Compute the recall (% of neighbors found) given the neighbors returned by searchWithQuery/12 or searchNoQuery/9 and a "ground truth" set of neighbors.
|
|
Compute the recall (% of neighbors found) given the neighbors returned by searchWithQuery/12 or searchNoQuery/9 and a "ground truth" set of neighbors.
|
|
|
|
|
|
```prolog
|
|
```prolog
|
|
%% part of the predicate definition
|
|
%% predicate definition
|
|
computeRecall( +pointer(float_array), +integer, +integer,
|
|
lsh_computeRecall(FoundNeighborsList, FoundNeighborsRows, RealNeighborsList, RealNeighborsRows, Percentage) :-
|
|
|
|
convert_list_to_float_array(FoundNeighborsList, FoundNeighborsRows, array(Xsize, Xrownum, X)),
|
|
|
|
convert_list_to_float_array(RealNeighborsList, RealNeighborsRows, array(Ysize, Yrownum, Y)),
|
|
|
|
computeRecallI(X, Xsize, Xrownum, Y, Ysize, Yrownum, Percentage).
|
|
|
|
|
|
|
|
%% foreign c++ predicate definition
|
|
|
|
foreign(computeRecall, c, computeRecallI( +pointer(float_array), +integer, +integer,
|
|
+pointer(float_array), +integer, +integer,
|
|
+pointer(float_array), +integer, +integer,
|
|
[-float32]).
|
|
[-float32])).
|
|
```
|
|
```
|
|
|
|
|
|
### Parameters
|
|
### Parameters
|
... | @@ -56,19 +81,29 @@ computeRecall( +pointer(float_array), +integer, +integer, |
... | @@ -56,19 +81,29 @@ computeRecall( +pointer(float_array), +integer, +integer, |
|
|
|
|
|
---
|
|
---
|
|
|
|
|
|
## **_searchWithQuery/12_**
|
|
## **_lsh_searchWithQuery/9_**
|
|
|
|
|
|
Compute the nearest neighbors of the points in the given query set and store the output in the given matrices.
|
|
Compute the nearest neighbors of the points in the given query set and store the output in the given matrices.
|
|
|
|
|
|
The matrices will be set to the size of n columns by k rows, where n is the number of points in the query dataset and k is the number of neighbors being searched for.
|
|
The matrices will be set to the size of n columns by k rows, where n is the number of points in the query dataset and k is the number of neighbors being searched for.
|
|
|
|
|
|
```prolog
|
|
```prolog
|
|
%% part of the predicate definition
|
|
%% predicate definition
|
|
searchWithQuery( +pointer(float_array), +integer, +integer,
|
|
lsh_searchWithQuery(QueryList, QueryRows, K, ResultingNeighborsList, YCols, DistancesList, ZCols, NumTablesToSearch, T) :-
|
|
|
|
K > 0,
|
|
|
|
NumTablesToSearch >= 0,
|
|
|
|
T >= 0,
|
|
|
|
convert_list_to_float_array(QueryList, QueryRows, array(Xsize, Xrows, X)),
|
|
|
|
searchWithQueryI(X, Xsize, Xrows, K, Y, YCols, YRows, Z, ZCols, ZRows, NumTablesToSearch, T),
|
|
|
|
convert_float_array_to_2d_list(Y, YCols, YRows, ResultingNeighborsList),
|
|
|
|
convert_float_array_to_2d_list(Z, ZCols, ZRows, DistancesList).
|
|
|
|
|
|
|
|
%% foreign c++ predicate definition
|
|
|
|
foreign(searchWithQuery, c, searchWithQueryI( +pointer(float_array), +integer, +integer,
|
|
+integer,
|
|
+integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
+integer, +integer).
|
|
+integer, +integer)).
|
|
```
|
|
```
|
|
|
|
|
|
### Parameters
|
|
### Parameters
|
... | @@ -83,18 +118,27 @@ searchWithQuery( +pointer(float_array), +integer, +integer, |
... | @@ -83,18 +118,27 @@ searchWithQuery( +pointer(float_array), +integer, +integer, |
|
|
|
|
|
---
|
|
---
|
|
|
|
|
|
## **_searchNoQuery/9_**
|
|
## **_lsh_searchNoQuery/7_**
|
|
|
|
|
|
Compute the nearest neighbors and store the output in the given matrices.
|
|
Compute the nearest neighbors and store the output in the given matrices.
|
|
|
|
|
|
The matrices will be set to the size of n columns by k rows, where n is the number of points in the query dataset and k is the number of neighbors being searched for.
|
|
The matrices will be set to the size of n columns by k rows, where n is the number of points in the query dataset and k is the number of neighbors being searched for.
|
|
|
|
|
|
```prolog
|
|
```prolog
|
|
%% part of the predicate definition
|
|
%% predicate definition
|
|
searchNoQuery( +integer,
|
|
lsh_searchNoQuery(K, ResultingNeighborsList, YCols, DistancesList, ZCols, NumTablesToSearch, T) :-
|
|
|
|
K > 0,
|
|
|
|
NumTablesToSearch >= 0,
|
|
|
|
T >= 0,
|
|
|
|
searchNoQueryI(K, Y, YCols, YRows, Z, ZCols, ZRows, NumTablesToSearch, T),
|
|
|
|
convert_float_array_to_2d_list(Y, YCols, YRows, ResultingNeighborsList),
|
|
|
|
convert_float_array_to_2d_list(Z, ZCols, ZRows, DistancesList).
|
|
|
|
|
|
|
|
%% foreign c++ predicate definition
|
|
|
|
foreign(searchNoQuery, c, searchNoQueryI( +integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
-pointer(float_array), -integer, -integer,
|
|
+integer, +integer).
|
|
+integer, +integer)).
|
|
```
|
|
```
|
|
|
|
|
|
### Parameters
|
|
### Parameters
|
... | @@ -108,36 +152,12 @@ searchNoQuery( +integer, |
... | @@ -108,36 +152,12 @@ searchNoQuery( +integer, |
|
|
|
|
|
---
|
|
---
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
## **_train/8_**
|
|
|
|
|
|
|
|
Train the LSH model on the given dataset.
|
|
|
|
|
|
|
|
```prolog
|
|
|
|
%% part of the predicate definition
|
|
|
|
train( +pointer(float_array), +integer, +integer,
|
|
|
|
+integer, +integer, +float32, +integer, +integer).
|
|
|
|
```
|
|
|
|
|
|
|
|
### Parameters
|
|
|
|
| Name | Type | Description | Default |
|
|
|
|
|------|------|-------------|---------|
|
|
|
|
| referenceSet | +matrix | Matrix containing the reference dataset. | - |
|
|
|
|
| numProj | +integer | Number of projections in each hash table (anything between 10-50 might be a decent choice). | 10-50 |
|
|
|
|
| numTables | +integer | Total number of hash tables (anything between 10-20 should suffice). | 10-20 |
|
|
|
|
| hashWidth | +float | The width of hash for every table. If 0 (the default) is provided, then the hash width is automatically obtained by computing the average pairwise distance of 25 pairs. This should be a reasonable upper bound on the nearest-neighbor distance in general. | 0.0 |
|
|
|
|
| secondHashSize | +integer | The size of the second hash table. This should be a large prime number. | 99901 |
|
|
|
|
| bucketSize | +integer | The size of the bucket in the second hash table. This is the maximum number of points that can be hashed into single bucket. A value of 0 indicates that there is no limit (so the second hash table can be arbitrarily large—be careful!). | 500 |
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
# Connected Links/Resources
|
|
# Connected Links/Resources
|
|
|
|
|
|
If you want a more detailed explanation, then go to the python documentation. There is most of the time a good explanation on how the methods work and what the parameters do.
|
|
If you want a more detailed explanation, then go to the python documentation. There is most of the time a good explanation on how the methods work and what the parameters do.
|
|
|
|
|
|
* [**MLpack::lsh_C++\_documentation**](https://www.mlpack.org/doc/stable/doxygen/classmlpack_1_1neighbor_1_1LSHSearch.html)
|
|
* [**MLpack::lsh_C++\_documentation**](https://www.mlpack.org/doc/mlpack-3.4.2/doxygen/classmlpack_1_1neighbor_1_1LSHSearch.html)
|
|
* [**MLpack::lsh_Python_documentation**](https://www.mlpack.org/doc/stable/python_documentation.html#local_coordinate_coding)
|
|
* [**MLpack::lsh_Python_documentation**](https://www.mlpack.org/doc/mlpack-3.4.2/python_documentation.html#local_coordinate_coding)
|
|
|
|
|
|
added some of the links from the python documentation
|
|
added some of the links from the python documentation
|
|
|
|
|
... | | ... | |