Top-k Similarity Search over Gaussian Distributions Based on KL-Divergence

Abstract

Similarity search is a crucial task in many real-world applications such as multimedia databases, data mining, and bioinformatics. In this work, we investigate similarity search over uncertain data modeled as Gaussian distributions. Using the Kullback-Leibler divergence (KL-divergence) to measure the dissimilarity between two Gaussian distributions, our goal is to search a database for the top-k Gaussian distributions most similar to a given query Gaussian distribution. In particular, we consider non-correlated Gaussian distributions, in which there are no correlations between dimensions and the covariance matrices are diagonal. To support query processing, we propose two types of novel approaches based on the notions of rank aggregation and skyline queries. The efficiency and effectiveness of our approaches are demonstrated through a comprehensive experimental performance study.
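
To make the problem setting concrete, the following is a minimal sketch of the closed-form KL-divergence between two Gaussian distributions with diagonal covariance matrices, together with a brute-force linear scan for the top-k answer. It is only an illustrative baseline, not the rank-aggregation or skyline-based approaches proposed in this work. The function names, the toy data, and the choice of direction KL(entry || query) are assumptions made for illustration; KL-divergence is asymmetric, and the abstract does not say which direction is used.

```python
import heapq
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for Gaussians with diagonal covariance.

    Each argument is a sequence of per-dimension means or variances.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension term: log(vq/vp) + (vp + (mp - mq)^2) / vq - 1
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def top_k_linear_scan(database, query, k):
    """Hypothetical baseline: score every entry and keep the k smallest.

    Assumes similarity is measured as KL(entry || query); swap the
    arguments of kl_diag_gaussians to use the other direction.
    Returns a list of (divergence, index) pairs in ascending order.
    """
    q_mu, q_var = query
    scored = (
        (kl_diag_gaussians(mu, var, q_mu, q_var), idx)
        for idx, (mu, var) in enumerate(database)
    )
    return heapq.nsmallest(k, scored)


# Toy database of (means, variances) pairs; the values are made up.
database = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([5.0, 5.0], [2.0, 0.5]),
]
query = ([0.0, 0.0], [1.0, 1.0])
result = top_k_linear_scan(database, query, k=2)
```

A linear scan costs one divergence evaluation per database entry, which is the cost that an index-based or pruning-based approach would aim to reduce.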
