User guide

Introduction

Time series are very common data and classifying them can be of interest in a lot of fields. However standard machine learning algorithms for classification, like Logistic Regression, Support Vector Machine or K-Nearest Neighbors with usual metrics, don’t work very well. To be more precise, these algorithms don’t work well on raw time series of real numbers. Most algorithms developed recently have been focusing on transforming the raw time series before applying a standard machine learning classification algorithm.

In the following sections we’ll present the algorithms implemented in pyts. If you want more information about the algorithms, you can have a look at the references and the Examples section.

Preprocessing

It is standard in machine learning to perform some preprocessing on raw data. Likewise it is standard to perform some preprocessing on time series. Implemented algorithms can be found in the pyts.preprocessing module.

Currently the only preprocessing tool implemented is StandardScaler It performs standardization (z-normalization) for each time series: the preprocessed time series all have zero mean and unit variance. It is implemented as pyts.preprocessing.StandardScaler.

Approximation

Time series can be of huge size or be very noisy. It can be useful to sum up the most important information of each time series. Implemented algorithms to approximate a time series can be found in the pyts.approximation module.

The first algorithm implemented is Piecewise Aggregate Approximation (PAA). The main idea of this algorithm is to apply windows along a time series and to take the mean value in each window. It is implemented as pyts.approximation.PAA.

The second algorithm implemented is Discrete Fourier Transform (DFT). The idea is to approximate a time series with a subsample of its Fourier coefficients. The selected Fourier coefficients are either the first ones (as they represent the trend of the time series) or the ones that discriminate the different classes the most if a vector of class labels is provided. It is implemented as pyts.approximation.DFT.

References

  • Eamonn J. Keogh and Michael J. Pazzani. A simple dimensionality reduction technique for fast similarity search in large time series databases. Knowledge Discovery and Data Mining ,2000.
  • Christos Faloutsos, M. Ranganathan and Yannis Manolopoulos. Fast Subsequence Matching in Time-Series Databases. ACM SIGMOD Record, 2000.

Quantization

One of the most interesting parts in time series classification is that several state-of-the-art algorithms use text mining techniques for classification and thus transform time series into bag of words. But first a time series of real numbers needs to be transformed into a sequence of letters. Implemented algorithms that quantize time series can be found in the pyts.quantization module.

The first algorithm implemented is Symbolic Aggregate approXimation (SAX). For each time series, bins are computed using gaussian or empirical quantiles. Then each datapoint is replaced by the bin it is in. It is implemented as pyts.quantization.SAX.

The second algorithm implemented is Multiple Coefficient Binning (MCB). The idea is very similar to SAX and the difference is that the quantization is applied at each timestamp. It is implemented as pyts.quantization.MCB.

The third algorithm implemented is Symbolic Fourier Approximation (SFA). It performs DFT then MCB, i.e. MCB is applied to the selected Fourier coefficients of each time series. It is implemented as pyts.quantization.SFA.

References

  • Jessica Lin, Eamonn Keogh, Li Wei, and Stefano Lonardi. Experiencing SAX: a Novel Symbolic Representation of Time Series. Data Mining and Knowledge Discovery, 2007.
  • Patrick Schäfer and Mikael Högqvist. SFA: A Symbolic Fourier Approximation and Index for Similarity Search in High Dimensional Datasets. ACM International Conference Proceeding Series, 2012.

Bag of Words

Now that you know how you can transform a time series of real numbers into a sequence of letters, it’s time to create bag of words. These algorithms are can be found in the pyts.bow module.

The only algorithm implemented for the moment is Bag of Words (BOW). It applies a sliding window of fixed length along the sequence of letters to create words. It is implemented as pyts.bow.BOW.

Transformation

The pyts.transformation module consists of more complex algorithms that transform a dataset of raw time series with shape [n_samples, n_timestamps] into a more standard dataset of features with shape [n_samples, n_features] that can be used as input data for a standard machine learning classification algorithm.

The first algorithm implemented is Bag-of-SFA Symbols (BOSS). Each time series is first transformed into a bag of words using SFA and BOW. After this transformation the features that are created are the frequencies of each word. It is implemented as pyts.transformation.BOSS.

The second algorithm implemented is Word ExtrAction for time SEries cLassification (WEASEL). The idea is similar to BOSS: first transform each time series into a bag of words then compute the frequencies of each word. WEASEL is more sophisticated in the sense that the selected Fourier coefficients are the most discrimative ones (based on the one-way ANOVA test), several lengths for the sliding window are used and the most discriminative features (i.e. words) are kept (based on the chi-2 test). It is implemented as pyts.transformation.WEASEL.

References

  • Patrick Schäfer. The BOSS is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery, 2015.
  • Patrick Schäfer and Ulf Leser. Fast and Accurate Time Series Classification with WEASEL. CoRR, 2017.

Classification

The pyts.classification module consists of several classification algorithms.

The first algorithm implemented is K-Nearest Neighbors (KNN). For time series classification it is the go-to algorithm for a good baseline. The most common metrics used for time series classification are the Euclidean distance and the Dynamic Time Warping distance. It is implemented as pyts.classification.KNNClassifier.

The second algorithm implemented is SAX-VSM. The outline of this algorithm is to first transform raw time series into bags of words using SAX and BOW, then merge, for each class label, all bags of words for this class label into only one bag of words, and finally compute tf-idf for each bag of words. This leads to a tf-idf vector for each class label. To predict an unlabeled time series, this time series if first transformed into a term frequency vector, then the predicted label is the one giving the highest cosine similarity among the tf-idf vectors learned in the training phase. It is implemented as pyts.classification.SAXVSMClassifier.

The third algorithm implemented is Bag-of-SFA Symbols in Vector Space (BOSSVS). The outline of this algorithm is quite similar to the one of SAX-VSM but words are created using SFA instead of SAX. It is implemented as pyts.classification.BOSSVSClassifier.

References

  • Meinard Müller. Dynamic Time Warping (DTW). Information Retrieval for Music and Motion, 2007.
  • Senin Pavel and Malinchik Sergey. SAX-VSM: Interpretable Time Series Classification Using SAX and Vector Space Model. Data Mining (ICDM), 2013 IEEE 13th International Conference on, pp.1175,1180, 2013.
  • Patrick Schäfer. Scalable Time Series Classification. DMKD and ECML/PKDD, 2016.

Image

Instead of transforming a time series into a bag of words, it is also possible to transform it into an image ! The pyts.image module consists of several algorithms that perform that kind of transformation.

The first algorithm implemented is Recurrence Plot. It transforms a time series into a matrix where each value corresponds to the distance between two trajectories (a trajectory is a sub time series, i.e. a subsequence of back-to-back values of a time series). The matrix can be binarized using a threshold. It is implemented as pyts.image.RecurrencePlots.

The second algorithm implemented is Gramian Angular Field (GAF). First a time series is represented as polar coordinates. Then the time series can be transformed into a Gramian Angular Summation Field (GASF) when the cosine of the sum of the angular coordinates is computed or a Gramian Angular Difference Field (GADF) when the sine of the difference of the angular coordinates is computed. It is implemented as pyts.image.GASF and pyts.image.GADF.

The third algorithm implemented is Markov Transition Field (MTF). The outline of the algorithm is to first quantize a time series using SAX, then to compute the Markov transition matrix (the quantized time series is seen as a Markov chain) and finally to compute the Markov transition field from the transition matrix. It is implemented as pyts.image.MTF.

References

  • J.-P. Eckmann, S. Oliffson Kamphorst and D. Ruelle. Recurrence Plots of Dynamical Systems. Europhysics Letters, 1987.
  • Zhiguang Wang and Tim Oates. Imaging time-series to improve classification and imputation. Proceedings of the 24th International Conference on Artificial Intelligence, 2015.

Decomposition

The pyts.decomposition module consists of algorithms that decompose a time series into several time series. The idea is to distinguish the different parts of time series, such as the trend, the noise, etc.

The only algorithm implemented currently is Singular Spectrum Analysis (SSA). The outline of the algorithm is to first compute a matrix from a time series using lagged vectors, then compute the eigenvalues and eigenvectors of this matrix multiplied by its transpose, after compute the eigenmatrices and finally compute the time series for each eigenmatrice. It is implemented as pyts.decomposition.SSA.

References

  • Nina Golyandina and Anatoly Zhigljavsky. Singular Spectrum Analysis for Time Series. 2013