Commit b4505bc7 authored by Marc Feger

Add Aufgabe 1

parent 4ba45951
# A1
## a)
![](./images/a.jpg)
## b)
The initial centroids P7, P8 and P9 were chosen.
As can be seen, this produces a different clustering.
The clusters from part a) make more sense, however, because three distinct clusters are actually visible there.
In the image below, the two lower point clouds are disjoint but still end up in the same cluster.
![](./images/b.jpg)
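
A rough way to put a number on "makes more sense" is the within-cluster sum of squared distances for both initialisations. The following is only a sketch and not part of the submitted solution: `kmeans` and `inertia` are hypothetical helper names, the data points are copied from the script in this commit, and the convergence test uses `np.allclose` as an assumption.

```python
import numpy as np

# Data points P1..P10, copied from the script in this commit
X = np.array([[1, 1], [1, 4], [2, 2], [10, 3], [11, 2],
              [11, 4], [4, 12], [6, 11], [7, 10], [8, 10]], dtype=float)


def kmeans(X, centers):
    """Plain k-means with fixed initial centers (hypothetical helper)."""
    centers = np.asarray(centers, dtype=float)
    while True:
        # Distance of every point to every center, shape (n_points, n_centers)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([X[labels == i].mean(axis=0)
                                for i in range(len(centers))])
        if np.allclose(new_centers, centers):
            return new_centers, labels
        centers = new_centers


def inertia(X, centers, labels):
    """Sum of squared distances of each point to its assigned center."""
    return float(((X - centers[labels]) ** 2).sum())


centers_a, labels_a = kmeans(X, X[[2, 3, 7]])  # P3, P4, P8 as in a)
centers_b, labels_b = kmeans(X, X[[6, 7, 8]])  # P7, P8, P9 as in b)

print("a):", labels_a, inertia(X, centers_a, labels_a))
print("b):", labels_b, inertia(X, centers_b, labels_b))
```

A lower value means the points lie closer to their centroids, so the two runs can be compared without relying on the plots alone.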
## c)
Yes, the order matters.
If the order of the data P1, ..., P10 is reversed to P10, ..., P1 while the centers P3, P4 and P8 stay the same, the cluster labels come out differently.
Reversing the data also reverses the order in which the data points are compared with the centroids.
![](./images/c.jpg)
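
To see what exactly changes when the data is reversed, the label arrays of both runs can be compared directly. The sketch below is an illustration under assumptions, not the script from this commit: `assign_until_stable` is a hypothetical helper, and the initial centers are fixed by coordinates (P3, P4, P8) rather than by index, so that both runs really start from the same centers.

```python
import numpy as np

# Data points P1..P10, copied from the script in this commit
X = np.array([[1, 1], [1, 4], [2, 2], [10, 3], [11, 2],
              [11, 4], [4, 12], [6, 11], [7, 10], [8, 10]], dtype=float)

# Initial centers fixed by coordinates: P3, P4, P8
init = X[[2, 3, 7]].copy()


def assign_until_stable(X, centers):
    """Alternate assignment and mean update until the centers stop moving."""
    centers = np.asarray(centers, dtype=float)
    while True:
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([X[labels == i].mean(axis=0)
                                for i in range(len(centers))])
        if np.allclose(new_centers, centers):
            return new_centers, labels
        centers = new_centers


_, labels_fwd = assign_until_stable(X, init)        # order P1, ..., P10
_, labels_rev = assign_until_stable(X[::-1], init)  # order P10, ..., P1

print(labels_fwd)
print(labels_rev)
# True if the second label array is just the first one mirrored
print(np.array_equal(labels_rev, labels_fwd[::-1]))
```

Note that selecting the centers by index after reversing (for example `X[2]` on the reversed array) picks different points, which changes the starting centers as well as the data order.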
A1/images/a.jpg

72.6 KiB

A1/images/b.jpg

70.8 KiB

A1/images/c.jpg

72.8 KiB

from typing import List
import matplotlib.pyplot as plt
import numpy as np


def euclidean(vector1: List, vector2: List) -> float:
    """
    Calculate the Euclidean distance between two vectors.
    :param vector1: Vector as list
    :param vector2: Vector as list
    :return: Euclidean distance between vector1 and vector2.
    """
    return np.linalg.norm(np.subtract(vector1, vector2))


def pairwise_arg_min(X: List, Y: List) -> np.ndarray:
    """
    For every point in X, return the index of the closest centroid in Y.
    :param X: Data points (feature vectors)
    :param Y: Centroids
    :return: Array with the index of the nearest centroid for each point in X.
    """
    return np.asarray([np.argmin([euclidean(x, y) for y in Y]) for x in X])
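

def find_clusters_with_fix_init_centers(X, n_clusters, centers):
    """
    Run k-means starting from fixed initial centers.
    :param centers: predefined initial centers
    :param X: Data to be clustered
    :param n_clusters: number of clusters
    :return: The final centroids and the cluster label of every point.
    """
    # Convert to an array so the result can always be sliced like centers[:, 0]
    centers = np.asarray(centers, dtype=float)
    while True:
        # Assign every point to its nearest center
        labels = pairwise_arg_min(X, centers)
        # Move every center to the mean of the points assigned to it
        new_centers = np.array([X[labels == i].mean(0) for i in range(n_clusters)])
        # Stop as soon as the centers no longer change
        if np.all(centers == new_centers):
            break
        centers = new_centers
    return centers, labels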


if __name__ == '__main__':
    X = np.array([[1, 1],
                  [1, 4],
                  [2, 2],
                  [10, 3],
                  [11, 2],
                  [11, 4],
                  [4, 12],
                  [6, 11],
                  [7, 10],
                  [8, 10]])

    # a) initial centers P3, P4, P8
    centers, labels = find_clusters_with_fix_init_centers(X, 3, [X[2], X[3], X[7]])
    plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
    plt.scatter(centers[:, 0], centers[:, 1], c='Red', marker='X')
    plt.savefig('./images/a.jpg', dpi=300, bbox_inches='tight')
    plt.close()

    # b) initial centers P7, P8, P9
    centers, labels = find_clusters_with_fix_init_centers(X, 3, [X[6], X[7], X[8]])
    plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
    plt.scatter(centers[:, 0], centers[:, 1], c='Red', marker='X')
    plt.savefig('./images/b.jpg', dpi=300, bbox_inches='tight')
    plt.close()

    # c) reversed data order; note that after reversing, the indices 2, 3, 7
    # select P8, P7 and P3, so the order of the initial centers changes too
    X = X[::-1]
    centers, labels = find_clusters_with_fix_init_centers(X, 3, [X[2], X[3], X[7]])
    plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
    plt.scatter(centers[:, 0], centers[:, 1], c='Red', marker='X')
    plt.savefig('./images/c.jpg', dpi=300, bbox_inches='tight')
    plt.close()
# Histograms
| Histogram |
|:---------------------------------------------:|
| ![](./images/chalsea/Kadse.jpeg) |
| ![](./images/chalsea/H1.jpeg) |
| ![](./images/chalsea/H5.jpeg) |
| ![](./images/chalsea/H10.jpeg) |
| ![](./images/chalsea/H15.jpeg) |
| ![](./images/chalsea/H20.jpeg) |

It can be seen that both histograms work in a similar way.
With partitioning, it is possible to see in which interval the colors accumulate.
With the simple histogram, it is only possible to inspect the spectrum as a whole.
Partitioning therefore makes it possible to identify the intervals in which particularly many colors/features accumulate.
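
The difference between the plain and the partitioned histogram can be sketched in a few lines. This is an illustration under assumptions, not the code that produced the images above: the pixel data is synthetic, "partitioning" is taken to mean splitting the intensity range 0..255 into `k` equal-width intervals, and `k = 8` is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 8-bit gray values standing in for a real image (assumption)
pixels = np.clip(rng.normal(loc=90, scale=25, size=10_000), 0, 255).astype(np.uint8)

# Plain histogram: one bin per intensity value, shows the whole spectrum
full_hist, _ = np.histogram(pixels, bins=256, range=(0, 256))

# Partitioned histogram: k equal-width intervals over the same range
k = 8
part_hist, edges = np.histogram(pixels, bins=k, range=(0, 256))

# Each partition is the sum of the fine-grained bins it covers
regrouped = full_hist.reshape(k, -1).sum(axis=1)

# The interval in which most values accumulate
busiest = int(part_hist.argmax())
print(f"most pixels fall into [{edges[busiest]:.0f}, {edges[busiest + 1]:.0f})")
```

The partitioned counts are an aggregation of the fine-grained ones, which is why the dominant interval is easier to read off from them.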
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('./images/martian/martian.jpg', 0)
# Initiate ORB detector
orb = cv.ORB_create()
# find the keypoints with ORB
kp = orb.detect(img, None)
# compute the descriptors with ORB
kp, des = orb.compute(img, kp)
key_points = [k.pt for k in kp]
# draw only keypoints location,not size and orientation
img2 = cv.drawKeypoints(img, kp, None, color=(0, 255, 0), flags=0)
# plt.scatter(*zip(*key_points))
plt.imshow(img2)
# plt.show()
from sklearn.metrics import pairwise_distances_argmin


def find_clusters(X, n_clusters, rseed=2):
    # 1. Randomly choose clusters
    rng = np.random.RandomState(rseed)
    i = rng.permutation(X.shape[0])[:n_clusters]
    centers = X[i]
    while True:
        # 2a. Assign labels based on closest center
        labels = pairwise_distances_argmin(X, centers)
        # 2b. Find new centers from means of points
        new_centers = np.array([X[labels == i].mean(0)
                                for i in range(n_clusters)])
        # 2c. Check for convergence
        if np.all(centers == new_centers):
            break
        centers = new_centers
    return centers, labels


X = np.array([list(x) for x in key_points])
centers, labels = find_clusters(X, 3)
plt.scatter(X[:, 0], X[:, 1], marker='o', c=labels,
s=50, cmap='viridis')
plt.scatter(centers[:, 0], centers[:, 1], marker='+', color='red')
plt.show()