Kun Hou
Master's student. Now at Teradata.




  • Approximate String Search in Spatial Databases (Project Website), Talk
    By Bin Yao, Feifei Li, Marios Hadjieleftheriou, Kun Hou
    In Proceedings of the 26th IEEE International Conference on Data Engineering (ICDE 2010), pages 4-15, Long Beach, California, March 2010.

    This work presents a novel index structure, MHR-tree, for efficiently answering approximate string match queries in large spatial databases. The MHR-tree is based on the R-tree augmented with the min-wise signature and the linear hashing technique. The min-wise signature for an index node u keeps a concise representation of the union of q-grams from strings under the sub-tree of u. We analyze the pruning functionality of such signatures based on set resemblance between the query string and the q-grams from the sub-trees of index nodes. MHR-tree supports a wide range of query predicates efficiently, including range and nearest neighbor queries. We also discuss how to estimate range query selectivity accurately. We present a novel adaptive algorithm for finding balanced partitions using both the spatial and string information stored in the tree. Extensive experiments on large real data sets demonstrate the efficiency and effectiveness of our approach.
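
    The signature idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the padding characters, the CRC32 encoding of q-grams, the hash form (a*x + b) mod p, and all function names are assumptions chosen for illustration. It shows how a q-gram set is reduced to a min-wise signature, how a node's signature for the union of its strings follows from an element-wise minimum over its children's signatures, and how two signatures give a rough estimate of set resemblance.

```python
import random
import zlib

PRIME = 2_147_483_647  # 2^31 - 1, used as the modulus of the hash family (assumption)


def qgrams(s, q=2):
    """Return the set of q-grams of s, padded with '#' in front and '$' behind."""
    padded = "#" * (q - 1) + s + "$" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def make_hash_family(k, seed=0):
    """Draw k linear hash functions, each stored as a pair (a, b)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def minwise_signature(grams, family):
    """For each hash function, keep the smallest hash value over the q-gram set."""
    codes = [zlib.crc32(g.encode("utf-8")) for g in grams]
    return [min(((a * c + b) % PRIME for c in codes), default=PRIME)
            for a, b in family]


def merge_signatures(signatures):
    """Signature of a union of sets: the element-wise minimum of their signatures."""
    return [min(values) for values in zip(*signatures)]


def estimate_resemblance(sig_a, sig_b):
    """Fraction of positions where two signatures agree (a resemblance estimate)."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical strings stored under one index node.
family = make_hash_family(k=64)
names = ["theatre", "theater", "museum"]
leaf_signatures = [minwise_signature(qgrams(n), family) for n in names]
node_signature = merge_signatures(leaf_signatures)
```

    Because the minimum over a union equals the minimum of the per-set minima, a parent signature can be maintained from child signatures alone, without revisiting the strings. The pruning analysis and query algorithms described in the abstract are not reproduced here.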