Update PrologMethods/Transformation/nca, authored by Dean Samuel Schmitz
# Neighborhood Components Analysis (NCA)
An implementation of neighborhood components analysis (NCA), a distance metric learning technique that can be used for preprocessing. Given a labeled dataset, NCA learns a distance metric that aims to improve k-nearest-neighbor classification, and this predicate returns the learned distance metric.
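
For orientation, the objective that NCA maximizes can be sketched as below. This is the standard formulation from the NCA paper linked under Connected Links/Resources. The symbols are notation assumed here for illustration and are not names used by the predicate: `A` is the learned linear transformation, `x_i` are the data points, and `c_i` are their labels.

```math
p_{ij} = \frac{\exp\left(-\lVert A x_i - A x_j \rVert^2\right)}{\sum_{k \neq i} \exp\left(-\lVert A x_i - A x_k \rVert^2\right)}, \qquad p_{ii} = 0

f(A) = \sum_{i} \sum_{j \,:\, c_j = c_i} p_{ij}
```

Here `p_ij` is the probability that point `i` picks point `j` as its neighbor under a stochastic nearest-neighbor rule, and `f(A)` is the expected number of points classified correctly. The returned distance matrix presumably plays the role of `A`.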
# Available Predicates
* [nca/20](/PrologMethods/Transformation/nca#nca20)
---
[links/resources](/PrologMethods/Transformation/nca#connected-linksresources)
## **_nca/20_**
Initializes the NCA model and the selected optimizer,
then runs NCA on the given data and returns the learned distance.
```prolog
%% part of the predicate definition
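nca( +string,                            % optimizerType
+float32, +integer, +float32,            % stepSize, maxIterations, tolerance
+integer,                                % shuffle
+integer, +float32, +float32, +integer, +float32, +float32, +integer,   % numBasis, armijoConstant, wolfe, maxLineSearchTrials, minStep, maxStep, batchSize
+pointer(float_array), +integer, +integer,   % data (array plus two size integers)
+pointer(float_array), +integer,             % labels (array plus one size integer)
-pointer(float_array), -integer, -integer)   % distance (array plus two size integers)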
```
### Parameters
| Name | Type | Description | Default |
|------|------|-------------|---------|
| optimizerType | +string | Optimizer to use; "sgd" or "lbfgs". | sgd |
| stepSize | +float | Step size for stochastic gradient descent (alpha). | 0.01 |
| maxIterations | +integer | Maximum number of iterations for SGD or L-BFGS (0 indicates no limit). | 500000 |
| tolerance | +float | Maximum tolerance for termination of SGD or L-BFGS. | 1e-7 |
| shuffle | +integer(bool) | Whether to shuffle the order in which data points are visited for SGD. | |
| numBasis | +integer | Number of memory points to be stored for L-BFGS. | 5 |
| armijoConstant | +float | Armijo constant for L-BFGS. | 0.0001 |
| wolfe | +float | Wolfe condition parameter for L-BFGS. | 0.9 |
| maxLineSearchTrials | +integer | Maximum number of line search trials for L-BFGS. | 50 |
| minStep | +float | Minimum step of line search for L-BFGS. | 1e-20 |
| maxStep | +float | Maximum step of line search for L-BFGS. | 1e+20 |
| batchSize | +integer | Batch size for mini-batch SGD. | 50 |
| data | +matrix | Input dataset to run NCA on. | - |
| labels | +vector | Labels for input dataset. | - |
| distance | -matrix | Output matrix holding the learned distance. | - |
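
As a rough illustration of how the 20 arguments line up with the table above, the following sketch wraps `nca/20` with the documented default values. `nca_with_defaults/10` and all variable names are hypothetical and not part of the library. The optimizer and shuffle values are left to the caller because this page does not state their exact term representation or the shuffle default. The code also assumes that the data and label arrays have already been created with the wrapper's own conversion helpers, and that the size integers follow the wrapper's convention.

```prolog
%% Hypothetical helper, not part of the library: calls nca/20 with the
%% default values listed in the parameter table.
%% Optimizer and Shuffle are passed in by the caller (assumption: the page
%% does not fix their representation or the shuffle default).
nca_with_defaults(Optimizer, Shuffle,
                  Data, DataDim1, DataDim2,
                  Labels, LabelsLen,
                  Dist, DistDim1, DistDim2) :-
    nca(Optimizer,
        0.01, 500000, 1.0e-7,        % stepSize, maxIterations, tolerance
        Shuffle,                     % shuffle (integer used as bool)
        5, 0.0001, 0.9, 50,          % numBasis, armijoConstant, wolfe, maxLineSearchTrials
        1.0e-20, 1.0e20,             % minStep, maxStep
        50,                          % batchSize
        Data, DataDim1, DataDim2,    % input array and its two size integers
        Labels, LabelsLen,           % label array and its size integer
        Dist, DistDim1, DistDim2).   % outputs: learned distance array and sizes
```

Keeping the defaults in one helper means callers only supply the arguments that actually vary, which makes it harder to misplace a value among the many positional arguments.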
---
# Connected Links/Resources
For a more detailed explanation, see the Python documentation. It usually gives a good explanation of how the methods work and what the parameters do.
* [MLpack::nca_C++\_documentation](https://www.mlpack.org/doc/stable/doxygen/classmlpack_1_1nca_1_1NCA.html)
* [MLpack::nca_Python_documentation](https://www.mlpack.org/doc/stable/python_documentation.html#nca)
Further links taken from the Python documentation:
* lmnn
* [Neighbourhood components analysis on Wikipedia](https://en.wikipedia.org/wiki/Neighbourhood_components_analysis)
* [Neighbourhood components analysis (pdf)](http://papers.nips.cc/paper/2566-neighbourhood-components-analysis.pdf)