glidertools.cleaning.despike

glidertools.cleaning.despike(var, window_size, spike_method='median')

Return a smooth baseline of data and the anomalous spikes

This function is a port of Nathan Briggs’ MATLAB script as described in Briggs et al. (2011). It computes a baseline of the data using a rolling-window method (either a rolling min-max or a rolling median) and returns that baseline together with the residuals [measurements - baseline].

Parameters:
  • var (numpy.ndarray or pandas.Series) – The data variable to be cleaned.

  • window_size (int) – The length of the rolling window.

  • spike_method (str) – Either ‘minmax’ or ‘median’ (default ‘median’). ‘minmax’ first applies a rolling minimum to the dataset and then a rolling maximum to the result. This forms the baseline, and the spikes are the difference between the measurements and the baseline. ‘median’ applies a rolling median to the dataset, which forms the baseline. The spikes are the difference between the measurements and this median baseline, and are thus more likely to be negative than with ‘minmax’.

Returns:

  • baseline (numpy.ndarray or pandas.Series) – The baseline from which outliers are determined.

  • spikes (numpy.ndarray or pandas.Series) – Spikes are the residual of [measurements - baseline].
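
The two baseline methods described above can be sketched with pandas rolling windows. This is a minimal illustration, not the glidertools implementation: the function name despike_sketch is hypothetical, and the centered windows and min_periods=1 edge handling are assumptions that may differ from what the library does.

```python
import numpy as np
import pandas as pd


def despike_sketch(var, window_size, spike_method="median"):
    """Hypothetical stand-in for illustration; not the glidertools code."""
    series = pd.Series(var, dtype=float)
    # Assumption: centered windows, with partial windows allowed at the edges.
    roll = dict(window=window_size, center=True, min_periods=1)

    if spike_method == "minmax":
        # Rolling minimum first, then a rolling maximum of that result.
        baseline = series.rolling(**roll).min().rolling(**roll).max()
    elif spike_method == "median":
        # The rolling median is the baseline.
        baseline = series.rolling(**roll).median()
    else:
        raise ValueError("spike_method must be 'minmax' or 'median'")

    # Spikes are the residual [measurements - baseline].
    spikes = series - baseline
    return baseline, spikes


# A flat signal with a single outlier at index 3.
data = np.array([1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0])
baseline, spikes = despike_sketch(data, window_size=3, spike_method="median")
# baseline is 1.0 everywhere; spikes is 4.0 at index 3 and 0.0 elsewhere.
```

With this input both methods give the same result, because the outlier is narrower than the window. On noisy data they differ: the 'minmax' baseline tracks the lower envelope of the signal, while the 'median' baseline runs through its middle.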