Last updated: 2022-08-25 21:25:44
Parquet MR

Parquet-MR contains the Java implementation of the Parquet format. Parquet is a columnar storage format for Hadoop; it provides efficient storage and encoding of data. Parquet uses the record shredding and assembly algorithm described in the Dremel paper to represent nested structures. You can find some details about the format and intended use cases in our Hadoop Summit 2013 presentation.

Building

Parquet-MR uses Maven to build and depends on the Thrift compiler (protoc is now manage