行式存储和列式存储的比较

时间:2025-04-12 07:07:47

行式存储的优点:

同一行数据存放在同一个block块里面,select * from table_name;数据能直接获取出来;

 INSERT/UPDATE比较方便

行式存储的缺点:

不同类型数据存放在同一个block块里面,压缩性能不好;

select id,name from table_name;这种类型的列查询,所有数据都要读取,而不能跳过。


列式存储的优点:

同类型数据存放在同一个block块里面,压缩性能好;

任何列都能作为索引。

列式存储的缺点:

select * from table_name;这类全表查询,需要数据重组;

INSERT/UPDATE比较麻烦。


create table page_views_orc_zlib
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS ORC 
TBLPROPERTIES(""="ZLIB")
as select * from page_views;
#默认是zlib,写不写都一样

create table page_views_orc_snappy
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS ORC 
TBLPROPERTIES(""="SNAPPY")
as select * from page_views;



create table page_views_parquet
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS PARQUET 
as select * from page_views;


set =gzip;
create table page_views_parquet_gzip
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
STORED AS PARQUET 
as select * from page_views;


【来自@若泽大数据】