# Principal Components Analysis

An implementation of several strategies for principal components analysis (PCA), a common preprocessing step. Given a dataset and a desired new dimensionality, the predicates below reduce the dimensionality of the data using the linear transformation determined by PCA.

# Available Predicates

* [pca/13](/PrologMethods/Transformation/pca#pca13)
* [pcaDimReduction/10](/PrologMethods/Transformation/pca#pcadimreduction10)
* [pcaVarianceDimReduction/10](/PrologMethods/Transformation/pca#pcavariancedimreduction10)

---

[links/resources](/PrologMethods/Transformation/pca#connected-linksresources)

## **_pca/13_**

Apply Principal Component Analysis to the provided data set.

```prolog
%% part of the predicate definition
pca(    +integer, +string,
        +pointer(float_array), +integer, +integer,
        -pointer(float_array), -integer, -integer,
        -pointer(float_array), -integer,
        -pointer(float_array), -integer, -integer).
```

### Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| scaleData | +integer(bool) | Whether or not to scale the data. | 0 (false) |
| decompositionPolicy | +string | Decomposition policy to use: "exact", "randomized", "randomized-block-krylov", "quic". | exact |
| data | +matrix | Input dataset to perform PCA on. | - |
| newDimension | +integer | Desired dimensionality of output dataset. If 0, no dimensionality reduction is performed. | 0 |
| transformedData | -matrix | Matrix to put results of PCA into. | - |
| eigenValues | -vector | Vector to put eigenvalues into. | - |
| eigenVectors | -matrix | Matrix to put eigenvectors (loadings) into. | - |

---

## **_pcaDimReduction/10_**

Use PCA for dimensionality reduction on the given dataset.

This keeps the `newDimension` largest principal components of the data and discards the rest. The returned value is the fraction of the data's variance that is retained, a value between 0 and 1. For instance, a value of 0.9 indicates that 90% of the variance present in the data was retained.

```prolog
%% part of the predicate definition
pcaDimReduction(    +integer, +string,
                    +pointer(float_array), +integer, +integer,
                    +integer,
                    -pointer(float_array), -integer, -integer,
                    [-float32]).
```

### Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| scaleData | +integer(bool) | Whether or not to scale the data. | 0 (false) |
| decompositionPolicy | +string | Decomposition policy to use: "exact", "randomized", "randomized-block-krylov", "quic". | exact |
| data | +matrix | Input dataset to perform PCA on. | - |
| newDimension | +integer | Desired dimensionality of output dataset. If 0, no dimensionality reduction is performed. | 0 |
| transformedData | -matrix | Matrix to put results of PCA into. | - |
| retainedVar | -float | Fraction of variance retained, in \[0, 1\]. | - |

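
The retained variance described for pcaDimReduction/10 is commonly computed as the sum of the kept eigenvalues divided by the sum of all eigenvalues. The sketch below illustrates only that arithmetic in plain Prolog; it assumes the eigenvalues are given as a list sorted from largest to smallest, and `retained_variance/3` is a hypothetical helper that is not part of the wrapper. Under these assumptions, the query `retained_variance([6.0, 3.0, 1.0], 2, R)` keeps 9.0 of a total variance of 10.0 and binds `R` to 0.9.

```prolog
:- use_module(library(lists)).

%% Illustrative sketch only: retained_variance/3 is a hypothetical helper
%% and not part of the PCA wrapper.
%%
%% retained_variance(+EigenValues, +K, -Retained)
%%   EigenValues : list of non-negative eigenvalues, largest first (assumed)
%%   K           : number of leading principal components that are kept
%%   Retained    : fraction of the total variance that is kept, in [0, 1]
retained_variance(EigenValues, K, Retained) :-
    length(Kept, K),                 % Kept is a list of exactly K elements ...
    append(Kept, _, EigenValues),    % ... namely the first K eigenvalues
    sum_list(Kept, KeptSum),
    sum_list(EigenValues, Total),
    Total > 0,                       % fails for an empty or all-zero list
    Retained is KeptSum / Total.
```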
---

## **_pcaVarianceDimReduction/10_**

Use PCA for dimensionality reduction on the given dataset.

This keeps as many dimensions as necessary to retain at least the given fraction of variance (specified by the parameter `varToRetain`). The fraction should be between 0 and 1; if it is 0, only 1 dimension is retained, and if it is 1, all dimensions are retained.

The predicate returns the actual fraction of variance retained, which is always greater than or equal to `varToRetain`.

```prolog
%% part of the predicate definition
pcaVarianceDimReduction(    +integer, +string,
                            +pointer(float_array), +integer, +integer,
                            +float32,
                            -pointer(float_array), -integer, -integer,
                            [-float32]).
```

### Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| scaleData | +integer(bool) | Whether or not to scale the data. | 0 (false) |
| decompositionPolicy | +string | Decomposition policy to use: "exact", "randomized", "randomized-block-krylov", "quic". | exact |
| data | +matrix | Input dataset to perform PCA on. | - |
| varToRetain | +float | Fraction of variance to retain; should be between 0 and 1. If 1, all variance is retained. | 0 |
| transformedData | -matrix | Matrix to put results of PCA into. | - |
| retainedVar | -float | Fraction of variance actually retained, in \[0, 1\]. | - |

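
The selection rule described for pcaVarianceDimReduction/10 can be pictured as adding up eigenvalue shares, largest first, until the requested fraction is reached. The sketch below shows that rule in plain Prolog under the assumption that the eigenvalues are already sorted in descending order; `components_for_variance/4` and `accumulate/7` are hypothetical names for illustration and are not part of the wrapper. At least one component is always kept, so a target of 0 yields one dimension, and the reported fraction is never smaller than the target. For example, `components_for_variance([6.0, 3.0, 1.0], 0.8, K, R)` stops after two components with `K = 2` and `R = 0.9`.

```prolog
:- use_module(library(lists)).

%% Illustrative sketch only: these predicates are hypothetical helpers
%% and not part of the PCA wrapper.
%%
%% components_for_variance(+EigenValues, +VarToRetain, -K, -Retained)
%%   EigenValues : list of non-negative eigenvalues, largest first (assumed)
%%   VarToRetain : requested fraction of variance, between 0 and 1
%%   K           : smallest number of leading components reaching the target
%%   Retained    : fraction of variance actually retained by those K
components_for_variance(EigenValues, VarToRetain, K, Retained) :-
    sum_list(EigenValues, Total),
    Total > 0,                       % fails for an empty or all-zero list
    accumulate(EigenValues, Total, VarToRetain, 0, 0.0, K, Retained).

%% accumulate(+Rest, +Total, +Target, +K0, +Sum0, -K, -Retained)
%% Adds one eigenvalue per step and stops as soon as the cumulative share
%% reaches Target, or when no eigenvalues are left.
accumulate([E|Es], Total, Target, K0, Sum0, K, Retained) :-
    K1 is K0 + 1,
    Sum1 is Sum0 + E,
    Share is Sum1 / Total,
    (   ( Share >= Target ; Es == [] )
    ->  K = K1,
        Retained = Share
    ;   accumulate(Es, Total, Target, K1, Sum1, K, Retained)
    ).
```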
---

# Connected Links/Resources

For a more detailed explanation, see the Python documentation, which usually describes how the methods work and what the parameters do.

* [MLpack::pca_C++\_documentation](https://www.mlpack.org/doc/stable/doxygen/classmlpack_1_1pca_1_1PCA.html)
* [MLpack::pca_Python_documentation](https://www.mlpack.org/doc/stable/python_documentation.html#pca)

[Further links taken from the Python documentation](https://www.mlpack.org/doc/stable/python_documentation.html#pca)

* [Principal component analysis on Wikipedia](https://en.wikipedia.org/wiki/Principal_component_analysis)