glidertools.cleaning.data_density_filter
- glidertools.cleaning.data_density_filter(x, y, conv_matrix=None, min_count=5, return_figures=True)
Use the 2D density cloud of observations to find outliers for any variables.
The data density filter needs tuning to work well. It uses convolution to create the density cloud; you can specify either the exact convolution matrix or its shape.
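The idea of a convolution-based density cloud can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GliderTools implementation: the helper name `density_mask_sketch`, the `bins` argument, and the rule "flag a point when the smoothed cloud is zero in its bin" are all hypothetical, and the inputs are assumed to contain no NaNs.

```python
import numpy as np

def density_mask_sketch(x, y, kernel, bins=50, min_count=5):
    """Hypothetical helper: flag points that fall outside the smoothed density cloud."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    kernel = np.asarray(kernel, dtype=float)

    # 2D histogram of the observations
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)

    # ignore sparsely populated bins before smoothing
    occupied = (counts >= min_count).astype(float)

    # zero-padded 2D convolution written out by hand (kernel is flipped)
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    padded = np.pad(occupied, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    nx, ny = occupied.shape
    cloud = np.zeros_like(occupied)
    for i in range(kh):
        for j in range(kw):
            cloud += flipped[i, j] * padded[i:i + nx, j:j + ny]

    # map every observation back to its histogram bin
    ix = np.clip(np.digitize(x, xedges) - 1, 0, nx - 1)
    iy = np.clip(np.digitize(y, yedges) - 1, 0, ny - 1)

    # True where no sufficiently dense bin lies within the window
    return cloud[ix, iy] == 0

# synthetic example: a tight cluster plus one distant point
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.1, 2000), [5.0]])
y = np.concatenate([rng.normal(0, 0.1, 2000), [5.0]])
mask = density_mask_sketch(x, y, np.ones((3, 3)), bins=50, min_count=5)
```

In this sketch a larger window or a smaller `min_count` widens the cloud and flags fewer points, which is the kind of tuning the description refers to.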
- Parameters:
x (np.array / pd.Series, shape=[n, ]) – e.g. temperature
y (np.array / pd.Series, shape=[n, ]) – e.g. salinity
conv_matrix (int, list, np.array, optional) – int = size of the isotropic (round) convolution window. [int, int] = anisotropic (oval) convolution window. A 2D array is used as a weighted convolution window, e.g. a rectangle = np.ones([int, int]); more advanced anisotropic windows can also be created
min_count (int, default=5, optional) – masks the 2d histogram counts smaller than this limit when performing the convolution
return_figures (bool, default=True, optional) – returns figures of the data plotted for blob detection…
- Returns:
mask (np.array, shape=[n, ]) – a mask that returns only values
figure – only returned if return_figures is True
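To make the window shapes described for conv_matrix concrete, the sketch below builds explicit 2D arrays for a round, an oval, and a rectangular window using NumPy only. The helper `elliptical_window` is hypothetical and is not part of GliderTools; how the library constructs its own windows from an int or [int, int] may differ.

```python
import numpy as np

def elliptical_window(height, width):
    """Hypothetical helper: binary ellipse inscribed in a height-by-width array."""
    rows = np.linspace(-1, 1, height)[:, None]
    cols = np.linspace(-1, 1, width)[None, :]
    # cells inside the unit ellipse get weight 1, the rest 0
    return (rows**2 + cols**2 <= 1).astype(float)

round_window = elliptical_window(5, 5)   # isotropic (round) window
oval_window = elliptical_window(3, 7)    # anisotropic (oval) window
rectangle = np.ones([3, 7])              # rectangular window, as in the docstring
```

Any such 2D array could then be passed as conv_matrix; non-binary values would act as weights.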