文件名称:Improving Nested Loop Pipelining on Coarse-Grained Reconfigurable Architectures
文件大小:4.42MB
文件格式:PDF
更新时间:2018-07-22 09:16:32
nested loop
Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and backend placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable.