Book section | Book section title | Translated article no. | Translated article title | Link |
4.1 | Joining | Hadoop(1) | MapReduce joins: Repartition join | http://www.cnblogs.com/datacloud/p/3578509.html |
4.1.1 | Repartition join | Hadoop(1) | MapReduce joins: Repartition join | http://www.cnblogs.com/datacloud/p/3578509.html |
4.1.2 | Replicated joins | Hadoop(2) | MapReduce joins: Replicated join | http://www.cnblogs.com/datacloud/p/3579333.html |
4.1.3 | Semi-joins | Hadoop(3) | MapReduce joins: Semi-join | http://www.cnblogs.com/datacloud/p/3579975.html |
4.1.4 | Picking the best join strategy for your data | Hadoop(4) | MapReduce joins: Choosing the best join strategy | http://www.cnblogs.com/datacloud/p/3582113.html |
4.2 | Sorting | Hadoop(5) | MapReduce sorting: Secondary sort | http://www.cnblogs.com/datacloud/p/3584640.html |
4.2.1 | Secondary sort | Hadoop(5) | MapReduce sorting: Secondary sort | http://www.cnblogs.com/datacloud/p/3584640.html |
4.2.2 | Total order sorting | Hadoop(6) | MapReduce sorting: Total order sorting | http://www.cnblogs.com/datacloud/p/3586761.html |
4.3 | Sampling | Hadoop(7) | MapReduce: Sampling | http://www.cnblogs.com/datacloud/p/3588120.html |
6.1 | Measuring MapReduce and your environment | Hadoop(8) | MapReduce performance tuning: Measuring performance | http://www.cnblogs.com/datacloud/p/3589875.html |
6.2 | Determining the cause of your performance woes | Hadoop(9) | MapReduce performance tuning: Understanding performance bottlenecks and diagnosing map bottlenecks | http://www.cnblogs.com/datacloud/p/3591981.html |
6.2.1 | Understanding what can impact MapReduce job performance | Hadoop(9) | MapReduce performance tuning: Understanding performance bottlenecks and diagnosing map bottlenecks | http://www.cnblogs.com/datacloud/p/3591981.html |
6.2.2 | Map woes | Hadoop(9) | MapReduce performance tuning: Understanding performance bottlenecks and diagnosing map bottlenecks | http://www.cnblogs.com/datacloud/p/3591981.html |
6.2.3 | Reducer woes | Hadoop(10) | MapReduce performance tuning: Diagnosing reduce bottlenecks | http://www.cnblogs.com/datacloud/p/3595682.html |
6.2.4 | General task woes | Hadoop(11) | MapReduce performance tuning: Diagnosing general task bottlenecks | http://www.cnblogs.com/datacloud/p/3596294.html |
6.2.5 | Hardware woes | Hadoop(12) | MapReduce performance tuning: Diagnosing hardware bottlenecks | http://www.cnblogs.com/datacloud/p/3597909.html |
6.4.3 | Optimizing the shuffle and sort phase | Hadoop(13) | MapReduce performance tuning: Optimizing the shuffle and sort phase | http://www.cnblogs.com/datacloud/p/3599920.html |
6.4.4 | Skew mitigation | Hadoop(14) | MapReduce performance tuning: Reducing the performance cost of data skew | http://www.cnblogs.com/datacloud/p/3601624.html |
6.4.5 | Optimizing user space Java in MapReduce | Hadoop(15) | MapReduce performance tuning: Optimizing user-space Java code in MapReduce | http://www.cnblogs.com/datacloud/p/3603191.html |
6.4.6 | Data serialization | Hadoop(16) | MapReduce performance tuning: Optimizing data serialization | http://www.cnblogs.com/datacloud/p/3608591.html |
6.5 | Chapter summary | Hadoop(16) | MapReduce performance tuning: Optimizing data serialization | http://www.cnblogs.com/datacloud/p/3608591.html |
5.1 | Working with small files | Hadoop(17) | MapReduce file processing: Small files | http://www.cnblogs.com/datacloud/p/3611459.html |
5.2 | Efficient storage with compression (tech 25, 26) | Hadoop(18) | MapReduce file processing: Efficient storage with compression (Part 1) | http://www.cnblogs.com/datacloud/p/3612817.html |
5.2 | Efficient storage with compression (tech 27) | Hadoop(19) | MapReduce file processing: Efficient storage with compression (Part 2) | http://www.cnblogs.com/datacloud/p/3616544.html |
Appendix A.10 | LZOP | Hadoop(20) | Appendix A.10: Compiling, installing, and configuring the LZOP compression format | http://www.cnblogs.com/datacloud/p/3617586.html |
Appendix D.1 | An optimized repartition join framework | Hadoop(21) | Appendix D.1: An optimized repartition framework | http://www.cnblogs.com/datacloud/p/3617079.html |
Appendix D.2 | A replicated join framework | Hadoop(22) | Appendix D.2: A replicated join framework | http://www.cnblogs.com/datacloud/p/3617078.html |