Changes

Dean Samuel Schmitz · c09ba4ff
--- a/PrologMethods/Clustering/kmeans.md
+++ b/PrologMethods/Clustering/kmeans.md
+# K-Means Clustering
+An implementation of several strategies for efficient k-means clustering. Given a dataset and a value of k, this computes and returns a k-means clustering on that data.
+# Available Predicates
+* [naiveKMeans/12](/PrologMethods/Clustering/kmeans#naivekmeans12)
+* [dualTreeKMeans/12](/PrologMethods/Clustering/kmeans#dualtreekmeans12)
+* [elkanKMeans/12](/PrologMethods/Clustering/kmeans#elkankmeans12)
+* [hamerlyKMeans/12](/PrologMethods/Clustering/kmeans#hamerlykmeans12)
+* [pellegMooreKMeans/12](/PrologMethods/Clustering/kmeans#pellegmoorekmeans12)
+---
+[links/resources](/PrologMethods/Clustering/kmeans#connected-linksresources)
+## **_naiveKMeans/12_**
+Runs kmeans with naive as the algorithm for the Lloyd iteration.
+```prolog
+%% part of the predicate definition
+naiveKMeans(  +integer, +integer, +integer, 
+              +pointer(float_array), +integer, +integer,
+              +integer, 
+              -pointer(float_array), -integer, 
+              -pointer(float_array), -integer, -integer).
+```
+### Parameters
+| Name | Type | Description | Default |
+|------|------|-------------|---------|
+| maxIterations | +integer | Maximum number of iterations before k-means terminates. | 1000 |
+| initialPartition | +string | Sets the initialPartitionPolicy : "SampleInitialzation", "RandomPartition" | SampleInitialzation |
+| emptyCluster | +string | Sets the emptyClusterPolicy: "MaxVarianceNewCluster", "KillEmptyCluster", "AllowEmptyCluster" | AllowEmptyCluster |
+| data | +matrix | Input dataset to perform clustering on. | - |
+| clusters | +integer | Number of clusters to find (0 autodetects from initial centroids). | 0 |
+| assignments | -vector | Vector to store cluster assignments in. | - |
+| centroids | -matrix | Matrix in which centroids are stored. | - |
+---
+## **_dualTreeKMeans/12_**
+Runs kmeans with dualtree as the algorithm for the Lloyd iteration.
+```prolog
+%% part of the predicate definition
+dualTreeKMeans(  +integer, +integer, +integer, 
+                 +pointer(float_array), +integer, +integer,
+                 +integer, 
+                 -pointer(float_array), -integer, 
+                 -pointer(float_array), -integer, -integer).
+```
+### Parameters
+| Name | Type | Description | Default |
+|------|------|-------------|---------|
+| maxIterations | +integer | Maximum number of iterations before k-means terminates. | 1000 |
+| initialPartition | +string | Sets the initialPartitionPolicy : "SampleInitialzation", "RandomPartition" | SampleInitialzation |
+| emptyCluster | +string | Sets the emptyClusterPolicy: "MaxVarianceNewCluster", "KillEmptyCluster", "AllowEmptyCluster" | AllowEmptyCluster |
+| data | +matrix | Input dataset to perform clustering on. | - |
+| clusters | +integer | Number of clusters to find (0 autodetects from initial centroids). | 0 |
+| assignments | -vector | Vector to store cluster assignments in. | - |
+| centroids | -matrix | Matrix in which centroids are stored. | - |
+---
+## **_elkanKMeans/12_**
+Runs kmeans with elkan as the algorithm for the Lloyd iteration.
+```prolog
+%% part of the predicate definition
+elkanKMeans(  +integer, +integer, +integer, 
+              +pointer(float_array), +integer, +integer,
+              +integer, 
+              -pointer(float_array), -integer, 
+              -pointer(float_array), -integer, -integer).
+```
+### Parameters
+| Name | Type | Description | Default |
+|------|------|-------------|---------|
+| maxIterations | +integer | Maximum number of iterations before k-means terminates. | 1000 |
+| initialPartition | +string | Sets the initialPartitionPolicy : "SampleInitialzation", "RandomPartition" | SampleInitialzation |
+| emptyCluster | +string | Sets the emptyClusterPolicy: "MaxVarianceNewCluster", "KillEmptyCluster", "AllowEmptyCluster" | AllowEmptyCluster |
+| data | +matrix | Input dataset to perform clustering on. | - |
+| clusters | +integer | Number of clusters to find (0 autodetects from initial centroids). | 0 |
+| assignments | -vector | Vector to store cluster assignments in. | - |
+| centroids | -matrix | Matrix in which centroids are stored. | - |
+---
+## **_hamerlyKMeans/12_**
+Runs kmeans with hamerly as the algorithm for the Lloyd iteration.
+```prolog
+%% part of the predicate definition
+hamerlyKMeans(  +integer, +integer, +integer, 
+                +pointer(float_array), +integer, +integer,
+                +integer, 
+                -pointer(float_array), -integer, 
+                -pointer(float_array), -integer, -integer).
+```
+### Parameters
+| Name | Type | Description | Default |
+|------|------|-------------|---------|
+| maxIterations | +integer | Maximum number of iterations before k-means terminates. | 1000 |
+| initialPartition | +string | Sets the initialPartitionPolicy : "SampleInitialzation", "RandomPartition" | SampleInitialzation |
+| emptyCluster | +string | Sets the emptyClusterPolicy: "MaxVarianceNewCluster", "KillEmptyCluster", "AllowEmptyCluster" | AllowEmptyCluster |
+| data | +matrix | Input dataset to perform clustering on. | - |
+| clusters | +integer | Number of clusters to find (0 autodetects from initial centroids). | 0 |
+| assignments | -vector | Vector to store cluster assignments in. | - |
+| centroids | -matrix | Matrix in which centroids are stored. | - |
+---
+## **_pellegMooreKMeans/12\]_**
+Runs kmeans with pelleg morre as the algorithm for the Lloyd iteration.
+```prolog
+%% part of the predicate definition
+pellegMooreKMeans(  +integer, +integer, +integer, 
+                    +pointer(float_array), +integer, +integer,
+                    +integer, 
+                    -pointer(float_array), -integer, 
+                    -pointer(float_array), -integer, -integer).
+```
+### Parameters
+| Name | Type | Description | Default |
+|------|------|-------------|---------|
+| maxIterations | +integer | Maximum number of iterations before k-means terminates. | 1000 |
+| initialPartition | +string | Sets the initialPartitionPolicy : "SampleInitialzation", "RandomPartition" | SampleInitialzation |
+| emptyCluster | +string | Sets the emptyClusterPolicy: "MaxVarianceNewCluster", "KillEmptyCluster", "AllowEmptyCluster" | AllowEmptyCluster |
+| data | +matrix | Input dataset to perform clustering on. | - |
+| clusters | +integer | Number of clusters to find (0 autodetects from initial centroids). | 0 |
+| assignments | -vector | Vector to store cluster assignments in. | - |
+| centroids | -matrix | Matrix in which centroids are stored. | - |
+---
+# Connected Links/Resources
+If you want a more detailed explanation, then go to the python documentation. There is most of the time a good explanation on how the methods work and what the parameters do.
+* [MLpack::kmeans_C++\_documentation](https://www.mlpack.org/doc/stable/doxygen/classmlpack_1_1kmeans_1_1KMeans.html)
+* [MLpack::kmeans_Python_documentation](https://www.mlpack.org/doc/stable/python_documentation.html#kmeans)
+added some of the links from the python documentation
+* dbscan()
+* [Using the triangle inequality to accelerate k-means (pdf)](http://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf)
+* [Making k-means even faster (pdf)](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.586.2554&rep=rep1&type=pdf)
+* [Accelerating exact k-means algorithms with geometric reasoning (pdf)](http://reports-archive.adm.cs.cmu.edu/anon/anon/usr/ftp/usr0/ftp/2000/CMU-CS-00-105.pdf)
+* [A dual-tree algorithm for fast k-means clustering with large k (pdf)](http://www.ratml.org/pub/pdf/2017dual.pdf)
\ No newline at end of file