File name: 英特尔® C++编译器Cilk语言扩展.pdf
File size: 1.15 MB
File format: PDF
Last updated: 2014-05-04 05:19:50
Intel C++ Compiler Cilk
Intel® C++ Compiler Cilk Language Extensions
1. Introduction
  1.1 Target Audience
  1.2 Prerequisites
  1.3 Typographic Conventions
  1.4 Additional Resources and Information
2. Getting Started
  2.1 Building and Running a Cilk Example
    2.1.1 Building qsort
    2.1.2 Running qsort
    2.1.3 Observing Speedup on a Multicore System
  2.2 Converting a C++ Program
    2.2.1 Start with a Serial Program
    2.2.2 Add Parallelism Using _Cilk_spawn
    2.2.3 Build, Run, and Test
3. Building, Running, and Debugging Cilk Programs
  3.1 Setting the Number of Worker Threads
    3.1.1 Environment Variable
    3.1.2 Program Control
  3.2 Serialization
    3.2.1 How to Create a Serialization
  3.3 Debugging Strategies
4. Cilk Language Features
5. Cilk Keywords
  5.1 cilk_spawn
  5.2 cilk_sync
  5.3 cilk_for
    5.3.1 Serial/Parallel Structure of cilk_for
    5.3.2 Serial/Parallel Structure When Spawning Within a Serial Loop
    5.3.3 cilk_for Body
    5.3.4 cilk_for Type Requirements
    5.3.5 cilk_for Restrictions
    5.3.6 cilk_for Grain Size
  5.4 Preprocessor Macros
6. Cilk Execution Model
  6.1 Strands
  6.2 Work and Span
  6.3 Mapping Strands to Worker Threads
  6.4 Exception Handling
7. Reducers
  7.1 Using Reducers: A Simple Example
  7.2 How Reducers Work
  7.3 Safety and Performance Considerations
    7.3.1 Safety
    7.3.2 Determinism
    7.3.3 Performance
  7.4 Reducer Library
  7.5 Using Reducers: Another Example
    7.5.1 String Reducer
    7.5.2 List Reducer (with a User-Defined Type)
    7.5.3 Reducers in Recursive Functions
8. Operating System Considerations
  8.1 Using Other Tools with Cilk Programs
  8.2 General Interaction with Operating System Threads
  8.3 Microsoft Foundation Classes and Cilk Programs
9. Cilk Runtime System API
  9.1 __cilkrts_set_param
  9.2 __cilkrts_get_nworkers
  9.3 __cilkrts_get_worker_number
  9.4 __cilkrts_get_total_workers
10. Understanding Race Conditions
  10.1 Data Races
  10.2 Benign Races
  10.3 Resolving Data Races
    10.3.1 Fix a Bug in the Program
    10.3.2 Use Local Variables Instead of Global Variables
    10.3.3 Restructure the Code
    10.3.3 Change the Algorithm
    10.3.4 Use a Reducer
    10.3.5 Use a Lock
11. Considerations for Using Locks
  11.1 Determinacy Races Caused by Locks
  11.2 Deadlocks
  11.3 Effect of Lock Contention on Parallelism
  11.4 Locks Held Across Strand Boundaries
12. Performance Considerations for Cilk Programs
  12.1 Granularity
  12.2 Optimize the Serial Program First
  12.3 Timing Programs and Program Segments
  12.4 Common Performance Pitfalls
  12.5 Cache Efficiency and Memory Bandwidth
  12.6 False Sharing
  12.7 Memory Allocation Bottlenecks
Appendix A. How to Write a New Reducer
  Components of a Reducer
  The Identity Value
  The Monoid
  Writing Reducers: A "Holder" Example
Appendix B. Further Reading
  General Cilk Information
  Serial Semantics
  Examples
  Race Conditions