Using the LZO compression format in Hadoop

Date: 2023-03-08 17:14:30

Setting up the LZO environment for Hadoop:

  1. wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.06.tar.gz
  2. tar -zxf lzo-2.06.tar.gz && cd lzo-2.06
  3. export CFLAGS=-m64
  4. ./configure --enable-shared --prefix=/usr/local/hadoop/lzo/
  5. make && make test && make install
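
After the install step it can help to confirm that the shared library actually landed under the chosen prefix. The following is a minimal sketch, not part of the original procedure: the function name check_lzo_install is hypothetical, and it assumes the usual Linux layout in which the library is installed as <prefix>/lib/liblzo2.so*.

```shell
#!/bin/sh
# Hypothetical helper: check that an LZO install prefix contains the
# shared library. Assumes the usual Linux layout: <prefix>/lib/liblzo2.so*
check_lzo_install() {
    prefix=$1
    for lib in "$prefix"/lib/liblzo2.so*; do
        # If the glob matched nothing, $lib is the literal pattern
        # and the -e test below fails.
        if [ -e "$lib" ]; then
            echo "found: $lib"
            return 0
        fi
    done
    echo "no liblzo2 shared library under $prefix/lib" >&2
    return 1
}
```

With the prefix used above, the call would be: check_lzo_install /usr/local/hadoop/lzo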

In hadoop-env.sh, add:

export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib  

Add the following inside the <configuration> element of core-site.xml:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>

<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

Add the following inside the <configuration> element of mapred-site.xml:

<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>

<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

<property>
  <name>mapred.child.env</name>
  <value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>
</property>
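
Once both files are edited, a quick sanity check is to read the property values back and confirm that they name the LZO codec. The sketch below is one possible way to do that with standard text tools. The helpers conf_value and uses_codec are hypothetical names, not Hadoop commands, and the parsing assumes each <name> is directly followed by its <value> with no nested markup. It also relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> of a named property from a
# Hadoop XML config file. Sketch only: assumes <name> is immediately
# followed by <value> (whitespace allowed) and no nested markup.
conf_value() {
    conf_file=$1
    prop_name=$2
    # Join all lines so a property split across lines still matches.
    tr -d '\n' < "$conf_file" \
        | grep -o "<name>$prop_name</name>[[:space:]]*<value>[^<]*</value>" \
        | sed -e 's/.*<value>//' -e 's/<\/value>.*//'
}

# Hypothetical helper: succeed if the property's value mentions the
# given codec class name (matched as a fixed string).
uses_codec() {
    conf_value "$1" "$2" | grep -qF -- "$3"
}
```

For example, assuming the configuration files live in a directory referred to here as $HADOOP_CONF_DIR: uses_codec "$HADOOP_CONF_DIR/mapred-site.xml" mapred.map.output.compression.codec com.hadoop.compression.lzo.LzoCodec && echo "map output uses LZO"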