A system for keyword search on textual streams
Vagelis Hristidis, Oscar Valdivia, et al.
SDM 2007
While most time series data mining research has concentrated on providing solutions for a single distance function, in this work we motivate the need for an index structure that can support multiple distance measures. Our specific area of interest is the efficient retrieval and analysis of similar trajectories. Trajectory datasets are very common in environmental applications, mobility experiments, and video surveillance and are especially important for the discovery of certain biological patterns. Our primary similarity measure is based on the longest common subsequence (LCSS) model that offers enhanced robustness, particularly for noisy data, which are encountered very often in real-world applications. However, our index is able to accommodate other distance measures as well, including the ubiquitous Euclidean distance and the increasingly popular dynamic time warping (DTW). While other researchers have advocated one or other of these similarity measures, a major contribution of our work is the ability to support all these measures without the need to restructure the index. Our framework guarantees no false dismissals and can also be tailored to provide much faster response time at the expense of slightly reduced precision/recall. The experimental results demonstrate that our index can help speed up the computation of expensive similarity measures such as the LCSS and the DTW.
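To make the three measures named in the abstract concrete, here is a minimal sketch of each for one-dimensional sequences. It is not the paper's index or implementation: the function names, the matching threshold `eps`, the time window `delta`, and the normalisation by the shorter sequence length are all assumptions chosen for illustration, and the dynamic programs are the plain quadratic versions with no indexing or pruning.

```python
import math


def lcss_similarity(a, b, eps, delta):
    """LCSS-based similarity in [0, 1] (illustrative formulation).

    Two points match when their values differ by less than eps and
    their positions differ by at most delta. Unmatched points are
    skipped, which is what makes the measure tolerant of outliers.
    """
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) < eps and abs(i - j) <= delta:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Normalise by the shorter length (an assumption of this sketch).
    return table[n][m] / min(n, m)


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained dynamic time warping distance (squared point costs)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats a match
                                 cost[i][j - 1],      # b[j-1] repeats a match
                                 cost[i - 1][j - 1])  # both advance
    return math.sqrt(cost[n][m])


clean = [1.0, 2.0, 3.0, 4.0]
noisy = [1.0, 2.0, 100.0, 4.0]  # one outlier at the third position

print(lcss_similarity(clean, noisy, eps=0.5, delta=1))
print(euclidean(clean, noisy))
print(dtw(clean, noisy))
```

In this toy example the single outlier dominates both the Euclidean and DTW distances, while the LCSS similarity only drops to 0.75 because the outlier is left unmatched. That is the kind of robustness to noise the abstract attributes to the LCSS model.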
Francesco Fusco, Xenofontas Dimitropoulos, et al.
Computer Communication Review
Aris Anagnostopoulos, Michail Vlachos, et al.
KDD 2006
Demetrios Zeinalipour-Yazti, Christos Laoudias, et al.
IEEE TKDE