Friday, August 15, 2014

Symbolic Aggregate approXimation: A symbolic time series representation

I used this time series representation a lot in my research some years ago, and I still think it is an elegant way of representing time series. The algorithm is easy to use and converts a one-dimensional time series into a string. Given a time series, you split it into w equally sized segments and estimate the sample mean in each segment, so we end up with a time series of length w. We then divide the Y axis into k regions using split points (thresholds) and assign a unique symbol to each region. For each sample mean we check which region it falls into and read off the symbol, so we end up with a string of length w.
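
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `sax` and `gaussian_thresholds`, the lowercase alphabet, and the way uneven segment lengths are handled are all my own assumptions. The thresholds are passed in explicitly. As far as I recall, the original paper z-normalizes the series first and picks the split points so that the regions are equally probable under a standard normal distribution, which is what the hypothetical helper `gaussian_thresholds` approximates.

```python
from bisect import bisect_right
from statistics import NormalDist, mean


def gaussian_thresholds(k):
    """Hypothetical helper: k - 1 split points that cut a standard normal
    distribution into k equally probable regions (assumes the series was
    z-normalized beforehand)."""
    return [NormalDist().inv_cdf(i / k) for i in range(1, k)]


def sax(series, w, thresholds, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Convert a one-dimensional series into a string of length w.

    thresholds must be sorted in ascending order; they define
    k = len(thresholds) + 1 regions on the Y axis.
    """
    n = len(series)
    if not 1 <= w <= n:
        raise ValueError("w must be between 1 and len(series)")
    if len(thresholds) + 1 > len(alphabet):
        raise ValueError("not enough symbols for the number of regions")

    # Segment boundaries. If w does not divide n, segment lengths differ
    # by at most one sample (an assumption; the post says "equally sized").
    bounds = [i * n // w for i in range(w + 1)]

    # Sample mean of each segment: a reduced series of length w.
    means = [mean(series[bounds[i]:bounds[i + 1]]) for i in range(w)]

    # bisect_right counts the thresholds <= the mean, which is the index
    # of the region the mean falls into.
    return "".join(alphabet[bisect_right(thresholds, m)] for m in means)


if __name__ == "__main__":
    series = [0.0, 0.2, 1.5, 1.7, -1.2, -1.4, 0.1, -0.1]
    # w = 4 segments, k = 3 regions split at -0.5 and 0.5
    print(sax(series, w=4, thresholds=[-0.5, 0.5]))  # prints "bcab"
```

In the example the four segment means are 0.1, 1.6, -1.3 and 0.0, which fall into the middle, upper, lower and middle regions respectively, giving the string "bcab".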

The original paper reports the performance on multiple time series data sets. Furthermore, there is a very efficient way to index massive data sets using this representation.
