MapReduce-based Assembly Clone Search for Reverse Engineering.pdf

Time: 2022-06-08 16:38:51
[File properties]:

File name: MapReduce-based Assembly Clone Search for Reverse Engineering.pdf

File size: 1.07MB

File format: PDF

Last updated: 2022-06-08 16:38:51


A paper published in 2016. ABSTRACT: Assembly code analysis is one of the critical processes for detecting and proving software plagiarism and software patent infringements when the source code is unavailable. It is also a common practice for discovering exploits and vulnerabilities in existing software. However, it is a manually intensive and time-consuming process even for experienced reverse engineers. An effective and efficient assembly code clone search engine can greatly reduce the effort of this process, since it can identify the cloned parts that have been previously analyzed. The assembly code clone search problem belongs to the field of software engineering. However, it strongly depends on practical nearest neighbor search techniques in data mining and databases. By closely collaborating with reverse engineers and Defence Research and Development Canada (DRDC), we study the concerns and challenges that make existing assembly code clone approaches not practically applicable from the perspective of data mining. We propose a new variant of the LSH scheme and incorporate it with graph matching to address these challenges. We implement an integrated assembly clone search engine called Kam1n0. It is the first clone search engine that can efficiently identify the given query assembly function's subgraph clones from a large assembly code repository. Kam1n0 is built upon the Apache Spark computation framework and Cassandra-like key-value distributed storage. A deployed demo system is publicly available. Extensive experimental results suggest that Kam1n0 is accurate, efficient, and scalable for handling large volumes of assembly code.
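The abstract says the clone search relies on nearest neighbor search and on a variant of locality-sensitive hashing (LSH), but it does not describe that variant. As a rough illustration of the general idea only, the sketch below uses the textbook random-hyperplane LSH for cosine similarity, which is not Kam1n0's scheme. The class name `LshIndex`, the parameters, and the feature vectors (counts of the mnemonics mov, add, cmp, jmp per basic block) are all hypothetical and chosen for the example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


class LshIndex:
    """Hypothetical multi-table LSH index (illustrative, not Kam1n0's design)."""

    def __init__(self, dim, num_tables=8, bits_per_table=6, seed=0):
        # Each table has its own hyperplanes and its own bucket map.
        self.tables = [
            (make_hyperplanes(bits_per_table, dim, seed + t), defaultdict(set))
            for t in range(num_tables)
        ]

    def add(self, key, vec):
        for planes, buckets in self.tables:
            buckets[signature(vec, planes)].add(key)

    def candidates(self, vec):
        """Return keys sharing a bucket with vec in at least one table."""
        found = set()
        for planes, buckets in self.tables:
            found |= buckets.get(signature(vec, planes), set())
        return found


# Assumed toy features: counts of (mov, add, cmp, jmp) in a basic block.
index = LshIndex(dim=4)
block_a = [5.0, 2.0, 1.0, 1.0]
index.add("block_a", block_a)

same_direction = [2.0 * x for x in block_a]  # same mnemonic ratios, longer block
opposite = [-x for x in block_a]             # artificial worst case

hits_same = index.candidates(same_direction)
hits_opposite = index.candidates(opposite)
```

Vectors pointing in the same direction always fall into the same buckets, while dissimilar vectors rarely collide, so only the returned candidates need an exact comparison. A real system would follow this filtering step with a precise similarity check, which in the paper's case is combined with graph matching.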

