Time series analysis 23 DTW (time series similarity measurement algorithm)

Time series analysis - 23 DTW (time series similarity measurement algorithm)

Preliminary study on DTW

brief introduction

In time series analysis, DTW(Dynamic Time Warping) is an algorithm used to detect the similarity of two time series, which is usually represented by a distance. For example, the following two sequences,
a = [ 1 , 2 , 3 ] a=[1,2,3] a=[1,2,3]
b = [ 3 , 2 , 2 ] b=[3,2,2] b=[3,2,2]
How do we measure the distance between the two sequences? An obvious method is to calculate the distance of the elements in the two sequences one by one according to their positions, and finally add it up or average it after adding it up. It sounds like a relatively simple and easy to understand method. We can observe the following figure to understand the measurement of temporal similarity.

But if it's the following sequence,
a = [ 1 , 2 , 3 ] a=[1,2,3] a=[1,2,3]
b = [ 2 , 2 , 2 , 3 , 4 ] b=[2,2,2,3,4] b=[2,2,2,3,4]
What should we do?
DTW is one of the most commonly used algorithms to solve such problems.
In fact, this algorithm can not solve the problem that the length of the two sequences is not the same, even if the length of the sequence is not the same.

In short, DTW algorithm provides the elastic alignment of two sequences, finds the best alignment, and calculates the distance based on the best alignment.

application

DTW has a very wide range of applications. For example, it can be used to find similar walking patterns, even if one is walking fast, the other is walking slowly, or one is decelerating or accelerating.

DTW has been applied to a variety of serializable data such as video, audio and image, such as identifying speakers according to audio, signature recognition and so on.

Another typical application is in the time series analysis of the stock market. In the stock market, people always hope to predict the future, but the general machine learning algorithm is not very effective in this field. Because the prediction task of machine learning generally requires that its training set and test set have the same dimensional feature attributes. But we know that even if the very same pattern of a stock appears, the reflection length of their K-line and other indicators is not the same.

DTW algorithm rules

DTW is to calculate the optimal matching of two sequences under certain rules and constraints:

Each location element of the first sequence must match one or more location elements of the second sequence, and vice versa.
The first element of the first sequence must match the first element of the second sequence, but it can also match other locations.
The last element of the first sequence must match the last element of the second sequence, but it can also match other locations.
The position mapping relationship between two sequences must be monotonically increasing, and vice versa. For example, if the 1,3 position of the first sequence is mapped to the 2,3 position of the second sequence, it is OK, but if it is mapped to the 3,2 position, it is not possible.

Optimal matching is the matching with minimum loss under all the above constraints. The minimum loss is the sum of the distances of the values on the mapping position. To sum up, the head and tail must match. There is no cross mapping and no residue subsequence.

Let's look at the pseudo code first:  
int DTWDistance( s: array [1..n],  t: array [1..m] ) {  
    DTW := array [0..n, 0..m]
    
    for i := 0 to n
        for j := 0 to m
            DTW[i, j] := infinity
    DTW[0, 0] := 0
    
    for i := 1 to n
        for j := 1 to m
            cost := d(s[i], t[j])
            DTW[i, j] := cost + minimum(DTW[i-1, j  ],    // insertion
                                        DTW[i  , j-1],    // deletion
                                        DTW[i-1, j-1])    // match
    
    return DTW[n, m]
}

here, d ( x , y ) = ∣ x − y ∣ d(x,y)=|x-y| d(x,y) = ∣ x − y ∣ and D T W [ i , j ] DTW[i,j] DTW[i,j] represents the sequence under the best alignment s [ 1 : i ] s[1:i] s[1:i] and t [ 1 : j ] t[1:j] Distance between t[1:j].
The core of this algorithm is:

DTW[i, j] := cost + minimum(DTW[i-1, j ], //insertion
                   DTW[i  , j-1],  //deletion
                   DTW[i-1, j-1])  //match

Obviously, this is a dynamic programming algorithm. Two sequences s [ 1 : i ] s[1:i] s[1:i] and t [ 1 : j ] t[1:j] The distance between t[1:j] is i i i and j j The distance of j plus the minimum distance of the following three subproblems

Sub question 1: DTW[i-1,j] is understood as s s s sequence inserts a point
Sub question 2: DTW[i,j-1] is understood as t t t sequence deletes a point
Sub problem 3: DTW[i-1,j-1] is understood as two sequences matching at the last time point

Python code is as follows

def dtw(s, t):
    n, m = len(s), len(t)
    dtw_matrix = np.zeros((n+1, m+1))
    for i in range(n+1):
        for j in range(m+1):
            dtw_matrix[i, j] = np.inf
    dtw_matrix[0, 0] = 0
    
    for i in range(1, n+1):
        for j in range(1, m+1):
            cost = abs(s[i-1] - t[j-1])
            # take last min from a square box
            last_min = np.min([dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1]])
            dtw_matrix[i, j] = cost + last_min
    return dtw_matrix

for instance

import numpy as np
a = [1,2,3]
b = [2,2,2,3,4]
dtw(a,b)

array([[ 0., inf, inf, inf, inf, inf],
[inf, 1., 2., 3., 5., 8.],
[inf, 1., 1., 1., 2., 4.],
[inf, 2., 2., 2., 1., 2.]])
The value of the last element is the distance between the two sequences, which is 2.

Increase window limit

One problem with the above algorithm is that there is no limit to the number of elements that allow one element in one sequence to match another. In this case, if one of the sequences is too long, it will lead to performance problems. For example,
a = [ 1 , 2 , 3 ] a=[1,2,3] a=[1,2,3]
b = [ 1 , 2 , 2 , 2 , 2 , 2 , . . . , 5 ] b=[1,2,2,2,2,2,...,5] b=[1,2,2,2,2,2,...,5]
To avoid this problem, we can add a window parameter to limit the number of elements that one timing point can match in another timing point.

The pseudo code is as follows

int DTWDistance( s: array [1..n], t: array [1..m], w: int ) {   
    
    DTW := array [0..n, 0..m]

    w := max(w, abs(n-m)) // adapt window size (*)

    for i := 0 to n
        for j:= 0 to m
            DTW[i, j] := infinity
    DTW[0, 0] := 0
    for i := 1 to n
        for j := max(1, i-w) to min(m, i+w)
            DTW[i, j] := 0

    for i := 1 to n
        for j := max(1, i-w) to min(m, i+w)
            cost := d(s[i], t[j])
            DTW[i, j] := cost + minimum(DTW[i-1, j  ],    // insertion
                               DTW[i  , j-1],    // deletion
                               DTW[i-1, j-1])    // match

    return DTW[n, m]
}

Python code is as follows

def dtw(s, t, window):
    n, m = len(s), len(t)
    w = np.max([window, abs(n-m)])
    dtw_matrix = np.zeros((n+1, m+1))
    
    for i in range(n+1):
        for j in range(m+1):
            dtw_matrix[i, j] = np.inf
    dtw_matrix[0, 0] = 0
    
    for i in range(1, n+1):
        for j in range(np.max([1, i-w]), np.min([m, i+w])+1):
            dtw_matrix[i, j] = 0
    
    for i in range(1, n+1):
        for j in range(np.max([1, i-w]), np.min([m, i+w])+1):
            cost = abs(s[i-1] - t[j-1])
            # take last min from a square box
            last_min = np.min([dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1]])
            dtw_matrix[i, j] = cost + last_min
    return dtw_matrix

a = [1,2,3,3,5]
b = [1,2,2,2,2,2,2,4]
dtw(a,b,window=3)

array([[ 0., inf, inf, inf, inf, inf, inf, inf, inf],
[inf, 0., 1., 2., 3., inf, inf, inf, inf],
[inf, 1., 0., 0., 0., 0., inf, inf, inf],
[inf, 3., 1., 1., 1., 1., 1., inf, inf],
[inf, 5., 2., 2., 2., 2., 2., 2., inf],
[inf, inf, 5., 5., 5., 5., 5., 5., 3.]])

Under the control of window parameters, the result is 3

Python package

The previous example code adopts recursion to realize dynamic planning. Although this method is simple to implement, its operation efficiency is very low. Therefore, in the actual scene, some optimization algorithms are generally used, such as
PrunedDTW, SparseDTW, FastDTW and MultiscaleDTW
Here we use FastDTW to audition

from fastdtw import fastdtw
from scipy.spatial.distance import euclidean

x = np.array([1, 2, 3, 3, 7])
y = np.array([1, 2, 2, 2, 2, 2, 2, 4])

distance, path = fastdtw(x, y, dist=euclidean)

print(distance)
print(path)

# 5.0
# [(0, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 5), (3, 6), (4, 7)]

5.0
[(0, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 7), (3, 7), (4, 7)]

Unfinished, to be continued

Keywords: Python Algorithm

Added by mrjcfreak on Wed, 02 Feb 2022 21:02:26 +0200

Programming VIP