glidertools.cleaning.data_density_filter

glidertools.cleaning.data_density_filter(x, y, conv_matrix=None, min_count=5, return_figures=True)

Use the 2D density cloud of observations to find outliers in any pair of variables.

The data density filter needs tuning to work well. It uses convolution to create the density cloud; you can specify either the exact convolution matrix or its shape.

Parameters:
  • x (np.array / pd.Series, shape=[n, ]) – e.g. temperature

  • y (np.array / pd.Series, shape=[n, ]) – e.g. salinity

  • conv_matrix (int, list, np.array, optional) – int = size of the isotropic (round) convolution window. [int, int] = anisotropic (oval) convolution window. A 2D array is used as a weighted convolution window; for a rectangular window, pass np.ones([int, int]). More advanced anisotropic windows can also be created this way.

  • min_count (int, default=5, optional) – 2D histogram counts smaller than this limit are masked when performing the convolution

  • return_figures (bool, default=True, optional) – returns figures of the data plotted for blob detection…
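The window shapes described for conv_matrix can be pictured with a small NumPy sketch. This is not the GliderTools implementation: make_window is a hypothetical helper, and both the ellipse rule and the axis order of [int, int] are assumptions made for illustration only.

```python
import numpy as np


def make_window(conv_matrix):
    """Hypothetical helper: build a 2D weight array from a conv_matrix-style argument."""
    if np.ndim(conv_matrix) == 2:
        # A 2D array is taken as-is as a weighted window
        return np.asarray(conv_matrix, dtype=float)

    # int -> same size on both axes (round); [int, int] -> different sizes (oval)
    ny, nx = (int(v) for v in np.broadcast_to(conv_matrix, (2,)))

    # Cell-centre coordinates measured from the middle of the window
    yy, xx = np.meshgrid(
        np.arange(ny) - (ny - 1) / 2,
        np.arange(nx) - (nx - 1) / 2,
        indexing="ij",
    )
    # Keep the cells whose centre falls inside the inscribed ellipse
    inside = (yy / (ny / 2)) ** 2 + (xx / (nx / 2)) ** 2 <= 1
    return inside.astype(float)


round_window = make_window(5)            # 5 x 5 window with the corners cut off
oval_window = make_window([3, 7])        # 3 x 7 oval window
rect_window = make_window(np.ones([2, 4]))  # rectangular window, all weights 1
```

A weighted 2D array gives the most control, since any pattern of weights (not only ellipses) can be passed directly.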

Returns:

  • mask (np.array, shape=[n, ]) – a mask that returns only values

  • figure – only returned if return_figures is True
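To illustrate the general idea of a density-cloud filter, here is a minimal NumPy-only sketch. It is not the library's code: density_mask_sketch, its bins argument, the threshold on the smoothed cloud, and the mask polarity (True = observation lies inside the cloud) are all assumptions, and the real function may differ on each of these points. The inputs are assumed to be finite (no NaNs).

```python
import numpy as np


def density_mask_sketch(x, y, window, min_count=5, bins=50):
    """Hypothetical sketch: True where a point falls inside the smoothed density cloud."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # 2D histogram of the observations; counts has shape (x bins, y bins)
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)

    # Zero out sparsely populated cells before smoothing
    counts = np.where(counts < min_count, 0.0, counts)

    # "Same"-size convolution: sum shifted copies of the zero-padded histogram
    flipped = np.asarray(window, dtype=float)[::-1, ::-1]
    ky, kx = flipped.shape
    py, px = ky // 2, kx // 2
    padded = np.pad(counts, ((py, py), (px, px)))
    n0, n1 = counts.shape
    cloud = np.zeros_like(counts)
    for i in range(ky):
        for j in range(kx):
            cloud += flipped[i, j] * padded[i:i + n0, j:j + n1]

    # Look up the histogram cell of every observation
    ix = np.clip(np.digitize(x, xedges) - 1, 0, n0 - 1)
    iy = np.clip(np.digitize(y, yedges) - 1, 0, n1 - 1)

    # Assumption: a point is kept if its cell has any smoothed density
    return cloud[ix, iy] > 0


# Synthetic data: a dense cluster, one point at its centre, one far outlier
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 2000), [0.0, 50.0]])
y = np.concatenate([rng.normal(0, 1, 2000), [0.0, 50.0]])
mask = density_mask_sketch(x, y, window=np.ones((3, 3)))
```

In this sketch, a larger window or a smaller min_count widens the cloud and flags fewer points, which is the kind of tuning the description above refers to.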