行列存储-Greenplum 数据库基础培训V1.2

时间:2024-05-15 15:14:48
【文件属性】:

文件名称:行列存储-Greenplum 数据库基础培训V1.2

文件大小:7.92MB

文件格式:PPT

更新时间:2024-05-15 15:14:48

Greenplum 中文

行列存储 Greenplum支持行或列存储模式 列模式目前只支持Append Only 如果常用的查询只取表中少量字段,则列模式效率更高,如查询需要取表中的大量字段,行模式效率更高 语法: CREATE TABLE sales2 (LIKE sales) WITH (appendonly=true, orientation=column); 效率比较测试: 测试1:需要去表中所有字段,此时行存储更快。 select * from dw_ods.s1_sema_scmcaccp_row where crdacct_nbr= '4033930000166380411'; 41秒 select * from dw_ods.s1_sema_scmcaccp_col where crdacct_nbr= '4033930000166380411'; 116秒 测试2:只取表中少量字段,列存储更快 select crdacct_status from dw_ods.s1_sema_scmcaccp_row where crdacct_nbr= '4033930000166380411'; 35秒 select crdacct_status from dw_ods.s1_sema_scmcaccp_col where crdacct_nbr= '4033930000166380411'; 3秒 Instructor Notes Table partitioning addresses the problem of supporting very large tables, such as fact tables, by allowing you to divide them into smaller and more manageable pieces. The advantages of partitioning are to improve query performance by scanning only the relevant data needed to satisfy a given query. It can also be used to facilitate database maintenance tasks such as rolling old data out of the data warehouse or speeding up the update of indexes. GPDB supports both range partitioning (division of data based on a numerical range, such as date or price) or list partitioning (division of data based on a list of values, such as gender or region), or a combination of both types. Table partitioning in GPDB works using table inheritance and constraints. Table inheritance creates a persistent relationship between a child table and its parent table(s), so that all of the schema information from the parent table propagates to its children. CHECK constraints limit the data a table can contain based on some defining criteria. These constraints are also used at runtime to determine which tables to scan in order to satisfy a given query. In Greenplum DB partitioned tables are also distributed across the segments as is any non-partitioned table. Partitioning is more of a tool to logically divide big tables to improve query performance and maintenance. It does not effect the physical distribution of the table data.


网友评论