Notes in Week 6 - Classification
To subscribe, use this key: carbon-mango-hotel-bravo-seventeen-black
Status
Last Update
Fields
Published
11/26/2024
In supervised learning, the training set contains both input features {{c1::x}} and target labels {{c2::y}}.
Published
11/26/2024
Clustering involves grouping objects such that similar objects are in the same {{c1::group}}, while dissimilar ones are in {{c2::different groups}}.
Published
11/26/2024
Clustering can assist with {{c1::anomaly detection}} by learning normal data patterns and identifying deviations as potential issues.
Published
11/26/2024
In {{c1::partitional clustering}}, data objects are divided into non-overlapping subsets, with each object belonging to only {{c2::one subset}}.
Published
11/26/2024
In {{c1::hierarchical clustering}}, data objects are organized in a tree structure, allowing points to be grouped into clusters at multiple {{c2::levels}}.
Published
11/26/2024
The K-Means algorithm requires specifying the number of clusters {{c1::K}} before running.
Published
11/26/2024
Each cluster in K-Means is associated with a {{c1::centroid}}, and each point is assigned to the cluster with the closest {{c2::centroid}}.
Published
11/26/2024
The K-Means cost function minimizes the sum of squared distances between each point and its {{c1::assigned centroid}}.
Published
11/26/2024
K-Means does not guarantee finding the {{c1::global minimum}} of the objective function due to its sensitivity to initial centroid placement.
Published
11/26/2024
Random initialization in K-Means involves randomly picking {{c1::K}} training examples to set as initial centroids.
Published
11/26/2024
A common K-Means strategy is to run the algorithm multiple times with different initializations and pick the solution with the {{c1::lowest cost}}.
Published
11/26/2024
The optimal number of clusters in K-Means is often determined by examining when the cost function {{c1::decreases slowly}}, known as the {{c2::elbow method}}.
Published
11/26/2024
K-Means struggles with clusters of varying sizes, densities, or non-{{c1::spherical shapes}}.
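Taken together, the K-Means notes above describe a complete procedure: random initialization from training examples, assignment to the nearest centroid, centroid updates, a squared-distance cost, and multiple restarts keeping the lowest-cost solution. A minimal NumPy sketch of that procedure (function and variable names are illustrative, not taken from the course material):

```python
import numpy as np

def kmeans(X, K, n_iters=100, n_restarts=10, seed=0):
    """Illustrative K-Means: X is an (n_samples, n_features) array, K the number of clusters."""
    rng = np.random.default_rng(seed)
    best_cost, best_centroids, best_labels = np.inf, None, None
    for _ in range(n_restarts):                                   # multiple random initializations
        # random initialization: pick K training examples as the initial centroids
        centroids = X[rng.choice(len(X), size=K, replace=False)]
        for _ in range(n_iters):
            # assignment step: each point goes to the cluster with the closest centroid
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # update step: move each centroid to the mean of its assigned points
            new_centroids = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                                      else centroids[k] for k in range(K)])
            if np.allclose(new_centroids, centroids):             # converged
                break
            centroids = new_centroids
        # final assignment, then the K-Means objective: sum of squared distances to assigned centroids
        labels = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
        cost = ((X - centroids[labels]) ** 2).sum()
        if cost < best_cost:                                      # keep the lowest-cost run
            best_cost, best_centroids, best_labels = cost, centroids, labels
    return best_centroids, best_labels, best_cost
```

Running this for increasing K and plotting the returned cost gives the curve used by the elbow method.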
Published
11/26/2024
Hierarchical clustering results can be visualized with a {{c1::dendrogram}}, which shows how points are grouped into clusters at each level.
Published
11/26/2024
Hierarchical clustering does not require specifying the {{c1::number of clusters}} in advance.
Published
11/26/2024
In {{c1::agglomerative}} hierarchical clustering, points start as individual clusters and merge until only one or a specified number of clusters remain.
Published
11/26/2024
In {{c1::divisive}} hierarchical clustering, all points start in one cluster and are repeatedly split until reaching individual clusters or a set number of clusters.
Published
11/26/2024
Agglomerative clustering requires a {{c1::distance metric}} to measure the similarity between points or clusters.
Published
11/26/2024
Single linkage (MIN) in hierarchical clustering connects clusters based on the {{c1::shortest distance}} between points in each cluster.
Published
11/26/2024
Complete linkage (MAX) in hierarchical clustering connects clusters based on the {{c1::largest distance}} between points in each cluster.
Published
11/26/2024
Group average (average linkage) in hierarchical clustering is a {{c1::compromise}} between MIN and MAX linkage.
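The three linkage notes above differ only in how inter-cluster distance is measured. A short SciPy sketch, on illustrative toy data, that builds a dendrogram with single (MIN), complete (MAX), and average linkage:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])  # two toy blobs

# 'single' = MIN linkage, 'complete' = MAX linkage, 'average' = group average
for method in ["single", "complete", "average"]:
    Z = linkage(X, method=method)      # merge history: one row per merge of two clusters
    plt.figure()
    dendrogram(Z)                      # tree showing how points merge into clusters at each level
    plt.title(f"{method} linkage")
plt.show()
```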
Published
11/26/2024
DBSCAN is a density-based clustering method where clusters are defined by {{c1::dense regions}} separated by sparse regions.
Published
11/26/2024
In DBSCAN, a point is a {{c1::core point}} if it has more than a specified number of neighbors (MinPts) within a radius (Eps).
Published
11/26/2024
In DBSCAN, a {{c1::border point}} has fewer than MinPts neighbors within Eps but is within the neighborhood of a core point.
Published
11/26/2024
A {{c1::noise point}} in DBSCAN is a point that is neither a core point nor a border point.
Published
11/26/2024
Compared to K-Means, DBSCAN can handle clusters of varying {{c1::densities}} and {{c2::non-spherical shapes}}.
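A brief scikit-learn sketch of DBSCAN on a non-spherical toy dataset; the eps and min_samples values are illustrative guesses, and the core/border/noise split follows the definitions in the notes above:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)   # two non-spherical clusters

db = DBSCAN(eps=0.2, min_samples=5).fit(X)    # Eps radius and MinPts threshold
labels = db.labels_                            # cluster id per point; -1 marks noise

core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True      # core points: >= min_samples neighbors within eps
noise_mask = labels == -1                       # noise points: neither core nor border
border_mask = ~core_mask & ~noise_mask          # border points: within eps of a core point, not dense themselves
print(f"core={core_mask.sum()}, border={border_mask.sum()}, noise={noise_mask.sum()}")
```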
Published
11/26/2024
Cluster cohesion measures how {{c1::closely related}} objects within a cluster are.
Published
11/26/2024
Cluster separation measures how {{c1::distinct}} or well-separated a cluster is from other clusters.
Published
11/26/2024
Cluster cohesion is often calculated as the {{c1::within-cluster sum of squares (WSS)}}, also known as SSE.
Published
11/26/2024
Cluster separation is often calculated as the {{c1::between-cluster sum of squares (BSS)}}.
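Cohesion and separation can be computed directly from the data and cluster labels. A small NumPy sketch (the helper name is made up for illustration):

```python
import numpy as np

def wss_bss(X, labels):
    """Return (WSS, BSS) for data X and integer cluster labels."""
    overall_mean = X.mean(axis=0)
    wss = 0.0   # within-cluster sum of squares: points vs. their own cluster centroid (cohesion / SSE)
    bss = 0.0   # between-cluster sum of squares: cluster centroids vs. the overall mean (separation)
    for k in np.unique(labels):
        cluster = X[labels == k]
        centroid = cluster.mean(axis=0)
        wss += ((cluster - centroid) ** 2).sum()
        bss += len(cluster) * ((centroid - overall_mean) ** 2).sum()
    return wss, bss
```

For squared Euclidean distance, WSS + BSS equals the total sum of squares around the overall mean, so lower cohesion cost implies higher separation for a fixed dataset.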
Published
11/26/2024
In cluster analysis, the {{c1::similarity matrix}} can be visualized to assess the organization of clusters.
Published
11/26/2024
Ordering the similarity matrix by cluster labels can provide hints about {{c1::cluster validity}}.
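One way to inspect cluster validity, as in the two notes above: compute a pairwise similarity matrix and reorder its rows and columns by cluster label, so that a good clustering shows blocks along the diagonal. A rough sketch (the plotting helper and similarity transform are illustrative choices):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
import matplotlib.pyplot as plt

def plot_ordered_similarity(X, labels):
    order = np.argsort(labels)                  # group rows/columns by cluster label
    dist = squareform(pdist(X))                 # pairwise distance matrix
    sim = 1.0 - dist / dist.max()               # simple similarity: 1 for identical points
    plt.imshow(sim[np.ix_(order, order)], cmap="viridis")
    plt.colorbar(label="similarity")
    plt.show()
```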
Published
11/26/2024
A key difference between supervised and unsupervised learning is that supervised learning requires {{c1::labeled data}}.
Published
11/26/2024
The K-Means algorithm repeatedly assigns points to the nearest {{c1::centroid}} and then updates centroids based on cluster points.
Published
11/26/2024
In hierarchical clustering, cutting the dendrogram at different levels provides {{c1::different numbers of clusters}}.
Published
11/26/2024
A benefit of hierarchical clustering over K-Means is that it can reveal {{c1::nested cluster structures}}.
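Cutting the dendrogram at different levels can be done programmatically; a short SciPy sketch, on illustrative random data, that extracts 2, 3, and 5 clusters from the same merge tree:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
Z = linkage(X, method="average")               # build the merge tree once

for k in (2, 3, 5):
    labels = fcluster(Z, t=k, criterion="maxclust")   # cut so that at most k clusters remain
    print(k, np.unique(labels))                        # same tree, different numbers of clusters
```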
Published
11/26/2024
The {{c1::choice of distance metric}} (e.g., Euclidean, Manhattan) affects the shape and structure of clusters in clustering algorithms.
Published
11/26/2024
In DBSCAN, the Eps parameter determines the {{c1::radius}} for defining dense regions, influencing the number and shape of clusters.
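A quick illustration of the Eps effect: sweeping eps over a toy dataset changes both the number of clusters DBSCAN finds and how many points are labeled noise (the values here are arbitrary):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
for eps in (0.05, 0.2, 0.5):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)   # ignore the noise label -1
    print(f"eps={eps}: clusters={n_clusters}, noise={np.sum(labels == -1)}")
```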