# Neighborhood Components Analysis (NCA)

An implementation of neighborhood components analysis (NCA), a distance metric learning technique that can be used for preprocessing. Given a labeled dataset, the predicate runs NCA, which seeks to improve k-nearest-neighbor classification, and returns the learned distance metric.
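
As a brief reference for what "seeks to improve k-nearest-neighbor classification" means, the objective from the NCA paper linked under Connected Links/Resources can be sketched as follows. The symbols $A$ (the learned linear transformation), $x_i$ (a data point), and $c_i$ (its label) are notation introduced here for illustration and are not predicate arguments.

$$
p_{ij} = \frac{\exp\left(-\lVert A x_i - A x_j \rVert^2\right)}{\sum_{k \neq i} \exp\left(-\lVert A x_i - A x_k \rVert^2\right)}, \qquad p_{ii} = 0
$$

$$
f(A) = \sum_i \sum_{j \,:\, c_j = c_i} p_{ij}
$$

NCA maximizes $f(A)$, the expected number of points classified correctly by a stochastic nearest-neighbor rule. The learned distance between two points is then $\lVert A x - A y \rVert^2$. It is an assumption here that the matrix returned by the predicate plays the role of $A$.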
# Available Predicates
* [nca/20](/PrologMethods/Transformation/nca#nca20)

---

[links/resources](/PrologMethods/Transformation/nca#connected-linksresources)

## **_nca/20_**
Initializes the NCA model with the selected optimizer, then performs NCA on the given data and returns the learned distance.
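```prolog
%% part of the predicate definition
%% (comments map each argument group to the parameter table; the integers
%%  following each array pointer are presumably its dimensions)
nca(    +string,                                    % optimizerType
        +float32, +integer, +float32,               % stepSize, maxIterations, tolerance
        +integer,                                   % shuffle
        +integer, +float32, +float32, +integer, +float32, +float32, +integer,
                                                    % numBasis, armijoConstant, wolfe,
                                                    % maxLineSearchTrials, minStep, maxStep, batchSize
        +pointer(float_array), +integer, +integer,  % data
        +pointer(float_array), +integer,            % labels
        -pointer(float_array), -integer, -integer)  % distance
```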
### Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| optimizerType | +string | Optimizer to use; "sgd" or "lbfgs". | sgd |
| stepSize | +float | Step size for stochastic gradient descent (alpha). | 0.01 |
| maxIterations | +integer | Maximum number of iterations for SGD or L-BFGS (0 indicates no limit). | 500000 |
| tolerance | +float | Maximum tolerance for termination of SGD or L-BFGS. | 1e-7 |
| shuffle | +integer(bool) | | |
| numBasis | +integer | Number of memory points to be stored for L-BFGS. | 5 |
| armijoConstant | +float | Armijo constant for L-BFGS. | 0.0001 |
| wolfe | +float | Wolfe condition parameter for L-BFGS. | 0.9 |
| maxLineSearchTrials | +integer | Maximum number of line search trials for L-BFGS. | 50 |
| minStep | +float | Minimum step of line search for L-BFGS. | 1e-20 |
| maxStep | +float | Maximum step of line search for L-BFGS. | 1e+20 |
| batchSize | +integer | Batch size for mini-batch SGD. | 50 |
| data | +matrix | Input dataset to run NCA on. | - |
| labels | +vector | Labels for input dataset. | - |
| distance | -matrix | Output matrix for the learned distance. | - |

---
# Connected Links/Resources
For a more detailed explanation, see the Python documentation, which usually explains how the methods work and what the parameters do.

* [MLpack::nca_C++\_documentation](https://www.mlpack.org/doc/stable/doxygen/classmlpack_1_1nca_1_1NCA.html)
* [MLpack::nca_Python_documentation](https://www.mlpack.org/doc/stable/python_documentation.html#nca)

Additional links taken from the Python documentation:

* lmnn
* [Neighbourhood components analysis on Wikipedia](https://en.wikipedia.org/wiki/Neighbourhood_components_analysis)
* [Neighbourhood components analysis (pdf)](http://papers.nips.cc/paper/2566-neighbourhood-components-analysis.pdf)