Transformation

The documentation of the transformation module.

The pyts.transformation module includes transformation algorithms.

class pyts.transformation.BOSS(n_coefs, window_size, anova=False, norm_mean=True, norm_std=True, n_bins=4, quantiles=u'empirical', variance_selection=False, variance_threshold=0.0, numerosity_reduction=True)

Bag-of-SFA Symbols.

Parameters:
n_coefs : None or int (default = None)

The number of Fourier coefficients to keep. If n_coefs=None, all Fourier coefficients are returned. If n_coefs is an integer, the n_coefs most significant Fourier coefficients are returned if anova=True; otherwise the first n_coefs Fourier coefficients are returned. An even number is required (for real and imaginary values) if anova=False.

window_size : int

The size of the window.

anova : bool (default = False)

If True, the Fourier coefficients selection is done via a one-way ANOVA test. If False, the first Fourier coefficients are selected.

norm_mean : bool (default = True)

If True, center the data before scaling. If norm_mean=True and anova=False, the first Fourier coefficient will be dropped.

norm_std : bool (default = True)

If True, scale the data to unit variance.

n_bins : int (default = 4)

The number of bins. Ignored if quantiles='entropy'.

quantiles : {‘gaussian’, ‘empirical’} (default = ‘empirical’)

The way to compute quantiles. If ‘gaussian’, quantiles from a gaussian distribution N(0,1) are used. If ‘empirical’, empirical quantiles are used.

variance_selection : bool (default = False)

If True, the Fourier coefficients with low variance are removed.

variance_threshold : float (default = 0.)

Fourier coefficients with a training-set variance lower than this threshold will be removed. Ignored if variance_selection=False.

numerosity_reduction : bool (default = True)

If True, numerosity reduction is applied: When the same word occurs several times in a row, only one instance of this word is kept.
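The quantiles and numerosity_reduction options described above can be illustrated with a minimal standard-library sketch. It is not the pyts implementation: the helper names (gaussian_edges, empirical_edges, to_symbols, numerosity_reduction) are hypothetical, the empirical edges use only one of several possible interpolation rules, and the lowercase letters are an arbitrary choice of alphabet.

```python
from bisect import bisect_right
from itertools import groupby
from statistics import NormalDist, quantiles


def gaussian_edges(n_bins):
    """Interior bin edges taken from the quantiles of N(0, 1)."""
    return [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]


def empirical_edges(values, n_bins):
    """Interior bin edges taken from the quantiles of the observed values.

    Assumption: the "inclusive" interpolation rule; the library may use another.
    """
    return quantiles(values, n=n_bins, method="inclusive")


def to_symbols(values, edges, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Map each value to the letter of the bin it falls into (at most 26 bins)."""
    return "".join(alphabet[bisect_right(edges, v)] for v in values)


def numerosity_reduction(words):
    """Keep a single instance of each run of identical consecutive words."""
    return [word for word, _ in groupby(words)]


edges = gaussian_edges(4)                        # three edges for four bins
word = to_symbols([-1.0, -0.2, 0.3, 1.5], edges)
reduced = numerosity_reduction(["ab", "ab", "cd", "ab", "ab", "ab"])
```

Note that numerosity reduction only collapses consecutive repeats: a word that reappears after a different word is kept again.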

Attributes:
vocabulary_ : dict

A mapping of feature indices to terms.
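To make the relationship between vocabulary_ and the document-term matrix returned by transform concrete, here is a small sketch in plain Python. The functions build_vocabulary and document_term_rows are hypothetical helpers, the alphabetical index order is an assumption, and dense lists stand in for the sparse matrix that the library returns.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each feature index to a term (assumption: terms sorted alphabetically)."""
    terms = sorted({word for doc in documents for word in doc})
    return dict(enumerate(terms))


def document_term_rows(documents, vocabulary):
    """Count, for each document, how often each vocabulary term occurs."""
    rows = []
    for doc in documents:
        counts = Counter(doc)
        rows.append([counts[vocabulary[i]] for i in range(len(vocabulary))])
    return rows


# Each "document" is the list of words extracted from one time series.
docs = [["ab", "cd", "ab"], ["cd", "dd"]]
vocab = build_vocabulary(docs)
matrix = document_term_rows(docs, vocab)
```

Column i of the resulting matrix counts the occurrences of the term vocab[i], so the number of columns equals the number of distinct words (n_words).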

Methods

fit(X[, y, overlapping]) Fit the model according to the given training data.
fit_transform(X[, y, overlapping]) Fit the data then transform it.
get_params([deep]) Get parameters for this estimator.
set_params(**params) Set the parameters of this estimator.
transform(X) Transform the provided data.

fit(X, y=None, overlapping=True)

Fit the model according to the given training data.

Parameters:
X : array-like, shape = [n_samples, n_features]

Training vector, where n_samples is the number of samples and n_features is the number of features.

y :

Ignored.

overlapping : boolean (default = True)

Whether overlapping windows are used for the training phase.

Returns:
self : object
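
The effect of the overlapping option can be pictured with the following sketch. It assumes that overlapping windows advance one point at a time and that non-overlapping windows advance by window_size, dropping any incomplete trailing window; the library may treat the remainder differently. sliding_windows is a hypothetical helper, not part of pyts.

```python
def sliding_windows(series, window_size, overlapping=True):
    """Split a series into windows of length window_size.

    Assumption: step 1 when overlapping, step window_size otherwise;
    a trailing remainder shorter than window_size is dropped.
    """
    step = 1 if overlapping else window_size
    last_start = len(series) - window_size
    return [series[i:i + window_size] for i in range(0, last_start + 1, step)]


series = [1, 2, 3, 4, 5, 6]
with_overlap = sliding_windows(series, 3, overlapping=True)
without_overlap = sliding_windows(series, 3, overlapping=False)
```

Under these assumptions, overlapping windows yield many more subsequences per series (four versus two in this example).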

fit_transform(X, y=None, overlapping=True)

Fit the data then transform it.

Parameters:
X : array-like, shape = [n_samples, n_features]
Returns:
X_new : sparse matrix, shape [n_samples, n_words]

Document-term matrix.

transform(X)

Transform the provided data.

Parameters:
X : array-like, shape = [n_samples, n_features]
Returns:
X_new : sparse matrix, shape [n_samples, n_words]

Document-term matrix.

class pyts.transformation.WEASEL(n_coefs, window_sizes, norm_mean=True, norm_std=True, n_bins=4, variance_selection=False, variance_threshold=0.0, pvalue_threshold=0.9)

Word ExtrAction for time SEries cLassification.

Parameters:
n_coefs : int

The number of Fourier coefficients to keep. The n_coefs most significant Fourier coefficients are returned.

window_sizes : array-like

The size of the windows.

anova : bool (default = False)

If True, the Fourier coefficients selection is done via a one-way ANOVA test. If False, the first Fourier coefficients are selected.

norm_mean : bool (default = True)

If True, center the data before scaling. If norm_mean=True and anova=False, the first Fourier coefficient will be dropped.

norm_std : bool (default = True)

If True, scale the data to unit variance.

n_bins : int (default = 4)

The number of bins (also known as the size of the alphabet).

variance_selection : bool (default = False)

If True, the Fourier coefficients with low variance are removed.

variance_threshold : float (default = 0.)

Fourier coefficients with a training-set variance lower than this threshold will be removed. Ignored if variance_selection=False.

pvalue_threshold : float (default = 0.9)

Threshold for the feature selection. Features whose Chi-2 test p-value is above pvalue_threshold are kept.
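
As an illustration of variance_selection and variance_threshold, the sketch below drops coefficient columns whose variance on the training set falls below a threshold. It follows the wording above literally (strictly lower than the threshold is removed) and uses the population variance; both choices are assumptions, and select_by_variance is a hypothetical helper rather than a pyts function.

```python
from statistics import pvariance


def select_by_variance(coefs, threshold=0.0):
    """Return the kept column indices and the reduced rows.

    coefs holds one row of Fourier coefficients per training sample.
    Assumption: a column is removed only if its population variance
    is strictly lower than the threshold.
    """
    columns = list(zip(*coefs))
    kept = [j for j, col in enumerate(columns) if pvariance(col) >= threshold]
    reduced = [[row[j] for j in kept] for row in coefs]
    return kept, reduced


train_coefs = [
    [1.0, 5.0, 0.0],
    [1.0, 7.0, 0.1],
    [1.0, 9.0, -0.1],
]
kept, reduced = select_by_variance(train_coefs, threshold=0.01)
```

With this literal reading, a threshold of 0.0 removes nothing, because a variance is never negative.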

Attributes:
vocabulary_ : dict

A mapping of feature indices to terms.

Methods

fit(X, y[, overlapping]) Fit the model according to the given training data.
fit_transform(X[, y]) Fit to data, then transform it.
get_params([deep]) Get parameters for this estimator.
set_params(**params) Set the parameters of this estimator.
transform(X) Transform the provided data.

fit(X, y, overlapping=False)

Fit the model according to the given training data.

Parameters:
X : array-like, shape = [n_samples, n_features]

Training vector, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples]

Class labels for each data sample.

overlapping : boolean (default = False)

If True, overlapping windows are used.

Returns:
self : object

transform(X)

Transform the provided data.

Parameters:
X : array-like, shape [n_samples, n_features]

The data to transform.

Returns:
X_new : sparse matrix, shape [n_samples, n_relevant_features]

Document-term matrix with relevant features only.