文件名称:Indexing for Subtree Similarity-Search using Edit Distance
文件大小:2.44MB
文件格式:PDF
更新时间:2017-03-10 03:02:31
Similarity Search
sigmod2013;ABSTRACT Given a tree Q and a large set of trees T = fT1; : : : ; Tng, the subtree similarity-search problem is that of nding the subtrees of trees among T that are most similar to Q, using the tree edit distance metric. Determining similarity using tree edit distance has been proven useful in a variety of application areas. While subtree similarity-search has been studied in the past, solutions required traversal of all of T , which poses a severe bottleneck in processing time, as T grows larger. This paper proposes the rst index structure for subtree similarity-search, provided that the unit cost function is used. Extensive experimentation and comparison to previous work shows the huge improvement gained when using the proposed index structure and processing algorithm.