spcal.cluster

Agglomerative clustering.

spcal.cluster.agglomerative_cluster(X: ndarray, max_dist: float) → ndarray

Cluster data.

Performs agglomerative clustering by merging close clusters until no two clusters are closer than max_dist. Distances are Euclidean.

Parameters:
  • X – 2D array (samples, features)

  • max_dist – maximum distance between clusters for them to be merged

Returns:

cluster indices, sorted by size (1=largest)
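
The merging loop described above can be illustrated with a small NumPy-only sketch. It is not SPCal's implementation: the helper name `naive_agglomerative` is hypothetical, and complete linkage (the distance between two clusters is that of their farthest pair of members) is an assumption, since the text does not state which linkage is used. Only the stopping rule and the size-ordered, 1-based labels follow the description.

```python
import numpy as np


def naive_agglomerative(X, max_dist):
    """Hypothetical O(n^3) illustration; complete linkage is an assumption."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    clusters = [[i] for i in range(len(X))]  # start with one cluster per sample

    while len(clusters) > 1:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: farthest pair of members
                d = D[np.ix_(clusters[a], clusters[b])].max()
                if d < best:
                    best, pair = d, (a, b)
        if best > max_dist:
            break  # no remaining pair of clusters is close enough to merge
        a, b = pair
        clusters[a] = clusters[a] + clusters.pop(b)

    # Label clusters by size, 1 = largest
    clusters.sort(key=len, reverse=True)
    labels = np.empty(len(X), dtype=int)
    for label, members in enumerate(clusters, start=1):
        labels[members] = label
    return labels


X = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [20.0, 20.0]]
)
labels = naive_agglomerative(X, max_dist=1.0)
print(labels)
```

With this toy input the three points near the origin form the largest cluster, the pair near (5, 5) the second, and the isolated point the third.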

spcal.cluster.cluster_information(X: ndarray, T: ndarray) → tuple[ndarray, ndarray, ndarray]

Get information about a clustering result.

Clusters are sorted by size, largest to smallest.

Parameters:
  • X – 2D array (samples, features)

  • T – cluster indices

Returns:

cluster means, cluster standard deviations, cluster counts
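
As a rough illustration of what such a summary contains, the sketch below computes per-cluster means, standard deviations and counts with plain NumPy. The helper `summarize_clusters` is hypothetical and not part of SPCal; using the population standard deviation (ddof=0) is also an assumption.

```python
import numpy as np


def summarize_clusters(X, T):
    """Hypothetical helper: per-cluster statistics, largest cluster first."""
    X = np.asarray(X, dtype=float)
    T = np.asarray(T)
    labels, counts = np.unique(T, return_counts=True)
    # Order clusters from largest to smallest (stable for equal sizes)
    order = np.argsort(-counts, kind="stable")
    labels, counts = labels[order], counts[order]
    means = np.stack([X[T == lab].mean(axis=0) for lab in labels])
    # Population standard deviation (ddof=0) is an assumption
    stds = np.stack([X[T == lab].std(axis=0) for lab in labels])
    return means, stds, counts


X = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]])
T = np.array([1, 1, 2])
means, stds, counts = summarize_clusters(X, T)
```

Each returned array has one row (or entry) per cluster, so `means[0]` describes the largest cluster.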

spcal.cluster.prepare_data_for_clustering(data: ndarray) → ndarray

Prepare data by stacking into 2D array.

Takes a dictionary or structured array and creates an NxM array, where M is the number of names / keys and N the length of each array.

Parameters:

data – dictionary of {name: array}, or a structured array

Returns:

2D array, ready for agglomerative_cluster
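
The stacking step can be pictured with the following NumPy sketch. The helper `stack_columns` and the element names are hypothetical and only mirror the description above (one column per name or key, one row per array entry); any further preprocessing SPCal may apply is not shown.

```python
import numpy as np


def stack_columns(data):
    """Hypothetical helper: stack named 1D arrays into an N x M array."""
    if isinstance(data, dict):
        names = list(data.keys())
    else:
        # Structured array: field names are stored on the dtype
        names = list(data.dtype.names)
    return np.stack([np.asarray(data[n], dtype=float) for n in names], axis=1)


# Dictionary input: M = 2 names, N = 3 values each
as_dict = {"Au": [1.0, 2.0, 3.0], "Ag": [4.0, 5.0, 6.0]}
X_dict = stack_columns(as_dict)

# Structured array input with the same field names
as_struct = np.array(
    [(1.0, 4.0), (2.0, 5.0)], dtype=[("Au", float), ("Ag", float)]
)
X_struct = stack_columns(as_struct)
```

Here `X_dict` has shape (3, 2) and `X_struct` has shape (2, 2); each row is one sample and each column one name.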

spcal.cluster.prepare_results_for_clustering(results: list[SPCalProcessingResult], number_peaks: int, key: str) → tuple[ndarray, ndarray]

Prepare data by stacking into 2D array.

Convenience method for a list of results.

Parameters:
  • results – list of results, peak_indicies must be generated

  • number_peaks – number of peaks

Returns:

2D array with length number_peaks, ready for agglomerative_cluster

mask of valid (unfiltered) peaks

See also

prepare_data_for_clustering