File name: Hashing Searching Sketching
File size: 889 KB
File format: PDF
Last updated: 2015-11-26 23:50:47
Algorithm
The Information Age has enabled the search for information in ways never imagined before. The search criteria may be exact, where the input query is expected to match the search object exactly, or fuzzy (for instance image search, news search, and similar-document search), which makes the search problem much harder. Hashing is a simple and effective method for exact search that uses a random hash function to map items into buckets, often viewed as throwing balls into bins. This book studies algorithms for different kinds of search using hashing and sketching, and some fundamental limits of what these approaches can achieve. For exact search, we will see how variants of balls-and-bins processes can be used to derive space-efficient methods for maintaining hash tables. For fuzzy search, we will see a variant of a special type of hashing, called locality-sensitive hashing, that uses linear space, and how the underlying ideas can be used in the kd-tree data structure for improved performance. We will also probe the fundamental limits of some of these approaches by showing lower bounds on their performance.
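The balls-into-bins view of exact search can be illustrated with a minimal Python sketch. It is not taken from the book: the class name `BucketHashTable`, the hash family h(x) = ((a*x + b) mod p) mod m, the fixed seed, and the restriction to non-negative integer keys are all assumptions made for illustration. Each key is a ball, each bucket is a bin, and the largest bucket is the maximum load.

```python
import random


class BucketHashTable:
    """Exact-match lookup: a random hash function sends each key to a bucket.

    Hypothetical illustration; keys are assumed to be non-negative integers.
    """

    PRIME = (1 << 61) - 1  # a Mersenne prime, larger than any key used here

    def __init__(self, num_buckets, seed=0):
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        self.num_buckets = num_buckets
        # Random coefficients for h(x) = ((a*x + b) mod p) mod m.
        self.a = rng.randrange(1, self.PRIME)
        self.b = rng.randrange(0, self.PRIME)
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # Index of the bin that this ball (key) lands in.
        return ((self.a * key + self.b) % self.PRIME) % self.num_buckets

    def insert(self, key):
        bucket = self.buckets[self._bucket(key)]
        if key not in bucket:  # store each key at most once
            bucket.append(key)

    def contains(self, key):
        # Exact search only inspects the single bucket the key hashes to.
        return key in self.buckets[self._bucket(key)]

    def max_load(self):
        # Size of the fullest bin.
        return max(len(bucket) for bucket in self.buckets)


table = BucketHashTable(num_buckets=1000)
for key in range(1000):
    table.insert(key)

print(table.contains(42), table.contains(5000), table.max_load())
```

A lookup touches only one bucket, so its cost is governed by how full that bucket is. That is why the maximum load of a balls-and-bins process matters when reasoning about hash tables.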