Large Margin Nearest Neighbors
An implementation of Large Margin Nearest Neighbors (LMNN), a distance metric learning technique. Given a labeled dataset, LMNN learns a linear transformation of the data that improves k-nearest-neighbor classification performance. This makes it useful as a preprocessing step.
:- use_module('path/to/.../src/methods/lmnn/lmnn.pl').
%% usage example
TrainData = [5.1,3.5,1.4, 4.9,3.0,1.4, 4.7,3.2,1.3, 4.6,3.1,1.5],
lmnn(amsgrad, TrainData, 3, [0,1,0,1], 1, 0.5, 0.01, 50, 100000, 0.000001, 0, 0, 50, 1, 0, DistanceList, _).
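The usage example passes the dataset as one flat list. When the data is held as a list of rows, a small helper can flatten it first. The sketch below is a hypothetical helper (`rows_to_flat/3` is not part of the library) and assumes that the third argument of `lmnn/17` is the number of values per data point, which is what the example suggests (12 values, row size 3, four labels).

```prolog
:- use_module(library(lists)).  % append/3

%% rows_to_flat(+Rows, +RowSize, -Flat)
%% Hypothetical helper, not part of the library.
%% Flattens a list of equally sized rows into one flat list.
rows_to_flat([], _, []).
rows_to_flat([Row|Rows], RowSize, Flat) :-
    length(Row, RowSize),              % fails if a row has the wrong length
    rows_to_flat(Rows, RowSize, Rest),
    append(Row, Rest, Flat).
```

For example, `rows_to_flat([[5.1,3.5,1.4],[4.9,3.0,1.4]], 3, TrainData)` binds `TrainData` to a flat list that can be passed on as `DataList`.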
Available Predicates
lmnn/17
A single predicate that initializes the LMNN model with the given parameters and then performs Large Margin Nearest Neighbors metric learning on the reference data.
%% predicate definition
lmnn(Optimizer, DataList, DataRows, LabelsList, K, Regularization, StepSize, Passes, MaxIterations, Tolerance, Center, Shuffle, BatchSize, Range, Rank, DistanceList, ZCols) :-
    K > 0,
    Regularization >= 0.0,
    StepSize >= 0.0,
    Passes >= 0,
    MaxIterations >= 0,
    Tolerance >= 0.0,
    BatchSize > 0,
    Range > 0,
    Rank >= 0,
    convert_list_to_float_array(DataList, DataRows, array(Xsize, Xrownum, X)),
    convert_list_to_float_array(LabelsList, array(Ysize, Y)),
    lmnnI(Optimizer, X, Xsize, Xrownum, Y, Ysize, K, Regularization, StepSize, Passes, MaxIterations, Tolerance, Center, Shuffle, BatchSize, Range, Rank, Z, ZCols, ZRows),
    convert_float_array_to_2d_list(Z, ZCols, ZRows, DistanceList).
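The guards in the definition above check numeric ranges but not that the data and labels have matching shapes. A caller could check this up front with something like the following sketch. `lmnn_input_ok/3` is a hypothetical helper, and it assumes that `DataRows` is the number of values per data point, as the usage example suggests.

```prolog
%% lmnn_input_ok(+DataList, +RowSize, +LabelsList)
%% Hypothetical pre-check, not part of the library.
%% Succeeds if DataList splits evenly into rows of RowSize values
%% and LabelsList holds exactly one label per row.
lmnn_input_ok(DataList, RowSize, LabelsList) :-
    integer(RowSize),
    RowSize > 0,
    length(DataList, NumValues),
    NumValues > 0,
    0 =:= NumValues mod RowSize,
    NumPoints is NumValues // RowSize,
    length(LabelsList, NumPoints).
```

Calling `lmnn_input_ok(TrainData, 3, [0,1,0,1])` before `lmnn/17` makes a shape mismatch fail in Prolog rather than inside the foreign call.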
%% foreign c++ predicate definition
foreign(lmnn, c, lmnnI( +string,
+pointer(float_array), +integer, +integer,
+pointer(float_array), +integer,
+integer,
+float32, +float32, +integer, +integer, +float32,
+integer, +integer,
+integer, +integer, +integer,
-pointer(float_array), -integer, -integer)).
Parameters
Name | Type | Description | Default |
---|---|---|---|
optimizer | +string | Optimizer to use; "amsgrad", "bbsgd", "sgd", or "lbfgs". | amsgrad |
data | +matrix | Input dataset to run LMNN on. | - |
labels | +vec | Labels for input dataset. | - |
k | +integer | Number of target neighbors to use for each datapoint. | 1 |
regularization | +float | Regularization for LMNN objective function. | 0.5 |
stepSize | +float | Step size for AMSGrad, BB_SGD and SGD (alpha). | 0.01 |
passes | +integer | Maximum number of full passes over dataset for AMSGrad, BB_SGD and SGD. | 50 |
maxIterations | +integer | Maximum number of iterations for L-BFGS (0 indicates no limit). | 100000 |
tolerance | +float | Maximum tolerance for termination of AMSGrad, BB_SGD, SGD or L-BFGS. | 1e-7 |
center | +integer(bool) | Perform mean-centering on the dataset. It is useful when the centroid of the data is far from the origin. | (0)false |
shuffle | +integer(bool) | Shuffle the order in which data points are visited for SGD or mini-batch SGD. | |
batchSize | +integer | Batch size for mini-batch SGD. | 50 |
range | +integer | Number of iterations after which impostors need to be recalculated. | 1 |
rank | +integer | Rank of distance matrix to be optimized. | 0 |
distance | -matrix | Output: the learned distance matrix. | - |
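Since `lmnn/17` takes every parameter positionally, it can be convenient to build the call from the defaults in the table above. The following is a minimal sketch: `lmnn_default_goal/5` is a hypothetical helper that only constructs the goal term, so it can be inspected before it is run. The table lists no default for `shuffle`, so the value `0` from the usage example is used here as an assumption.

```prolog
%% lmnn_default_goal(+DataList, +RowSize, +LabelsList, -DistanceList, -Goal)
%% Hypothetical helper, not part of the library.
%% Builds an lmnn/17 goal using the default values from the parameter table.
lmnn_default_goal(DataList, RowSize, LabelsList, DistanceList, Goal) :-
    Goal = lmnn(amsgrad, DataList, RowSize, LabelsList,
                1,        % k
                0.5,      % regularization
                0.01,     % stepSize
                50,       % passes
                100000,   % maxIterations
                1.0e-7,   % tolerance
                0,        % center (false)
                0,        % shuffle (assumption: value from the usage example)
                50,       % batchSize
                1,        % range
                0,        % rank
                DistanceList,
                _ZCols).
```

With the library loaded, the goal can then be run as `lmnn_default_goal(TrainData, 3, [0,1,0,1], DistanceList, Goal), call(Goal)`.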
Connected Links/Resources
For a more detailed explanation, see the Python documentation, which usually gives a good description of how the methods work and what the parameters do.