[File properties]:
File name: Similarity Join Processing on Uncertain Data Streams
File size: 1.22 MB
File format: PDF
Updated: 2017-03-10 03:00:36
join; data stream
IEEE paper
Abstract: Similarity join processing in the streaming environment has many practical applications such as sensor networks, object
tracking and monitoring, and so on. Previous works usually assume that stream processing is conducted over precise data. In this
paper, we study an important problem of similarity join processing on stream data that inherently contain uncertainty (so-called
uncertain data streams), where the incoming data at each time stamp are uncertain and imprecise. Specifically, we formalize this
problem as join on uncertain data streams (USJ), which can guarantee the accuracy of USJ answers over uncertain data. To tackle the
challenges with respect to efficiency and effectiveness, such as limited memory and short response time, we propose effective pruning
methods on both object and sample levels to filter out false alarms. We integrate the proposed pruning methods into an efficient query
procedure that can incrementally maintain the USJ answers. Most importantly, we further design a novel strategy, namely, adaptive
superset prejoin (ASP), to maintain a superset of USJ candidate pairs. ASP is guided by our proposed formal cost model, so that the
average USJ processing cost is minimized. We have conducted extensive experiments to demonstrate the efficiency and effectiveness
of our proposed approaches.
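
The abstract does not spell out the join predicate or the pruning rules, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes each uncertain object is a list of equally likely sample points, and that two objects join when the fraction of sample pairs within distance eps is at least alpha. The names eps, alpha, bounding_box, min_box_distance, match_probability, and is_join_pair are all hypothetical. A cheap object-level check on bounding boxes discards pairs that cannot match before the sample-level computation runs.

```python
import math

def bounding_box(samples):
    """Axis-aligned bounding box (lower corner, upper corner) of the samples."""
    dims = range(len(samples[0]))
    lo = [min(s[d] for s in samples) for d in dims]
    hi = [max(s[d] for s in samples) for d in dims]
    return lo, hi

def min_box_distance(box_a, box_b):
    """Smallest possible Euclidean distance between points of two boxes."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    gaps = [max(la - hb, lb - ha, 0.0)
            for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)]
    return math.sqrt(sum(g * g for g in gaps))

def match_probability(a, b, eps):
    """Fraction of sample pairs within eps (assumes equally likely samples)."""
    hits = sum(1 for p in a for q in b if math.dist(p, q) <= eps)
    return hits / (len(a) * len(b))

def is_join_pair(a, b, eps, alpha):
    """Assumed predicate: P(dist(a, b) <= eps) >= alpha."""
    # Object-level pruning: if even the closest corners of the two boxes
    # are farther apart than eps, no sample pair can match.
    if min_box_distance(bounding_box(a), bounding_box(b)) > eps:
        return False
    # Sample-level check on the pairs that survive pruning.
    return match_probability(a, b, eps) >= alpha

a = [(0.0, 0.0), (1.0, 0.0)]
b = [(0.0, 0.5), (5.0, 5.0)]
c = [(10.0, 10.0), (11.0, 11.0)]
print(is_join_pair(a, b, eps=1.0, alpha=0.25))
print(is_join_pair(a, c, eps=1.0, alpha=0.25))
```

In this sketch the pair (a, c) is rejected by the bounding-box test alone, so no sample distances are computed for it. A streaming version would additionally have to update the answer set incrementally as objects enter and leave the window, which is not shown here.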