How to create an index in Amazon Redshift

Time: 2021-12-03 23:04:49

I'm trying to create indexes in Amazon Redshift, but I received an error:

create index on session_log(UserId);

UserId is an integer field.

3 answers

#1


36  

If you try and create an index (with a name) on a Redshift table:

create index IX1 on "SomeTable"("UserId");

You'll receive the error

An error occurred when executing the SQL command: create index IX1 on "SomeTable"("UserId") ERROR: SQL command "create index IX1 on "SomeTable"("UserId")" not supported on Redshift tables.

This is because, like other data warehouses, Redshift uses columnar storage, and as a result, many of the indexing techniques (like adding non-clustered indexes) used in other RDBMS aren't applicable.

You do, however, have the option of defining a single sort key per table (it can span several columns), and you can also influence performance by choosing a distribution key to shard your data across nodes and by selecting appropriate compression encodings for each column to minimize storage and I/O overhead.

For example, in your case, you may elect to use UserId as a sort key:

create table if not exists "SomeTable"
(
    "UserId" int,
    "Name" text
)
sortkey("UserId");
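The distribution key and compression encodings mentioned above can be declared in the same statement. The following is only a sketch: the "EventTime" and "Page" columns and the chosen encodings are assumptions made for illustration, and the right choices depend on your data and query patterns.

```sql
-- Sketch only: session_log and UserId come from the question;
-- "EventTime", "Page" and the encodings are hypothetical.
create table if not exists session_log
(
    "UserId"    int          encode raw,   -- leading sort key column, often left uncompressed
    "EventTime" timestamp    encode az64,
    "Page"      varchar(256) encode zstd
)
diststyle key
distkey("UserId")                -- rows with the same UserId land on the same slice
sortkey("UserId", "EventTime");  -- a compound sort key spanning two columns
```

Queries that filter on UserId, or join to another table distributed on the same column, are the ones most likely to benefit from this layout.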

You might want to read a few primers like these

#2


3  

Redshift allows you to define a primary key:

-- "user" is a reserved word, so the table is named users here
create table users (
    id int,
    phone_number int,
    primary key(id)
);

However, since Redshift does not enforce this constraint, the primary key column accepts duplicate values.
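As a rough illustration of the point about duplicates, the sketch below inserts two rows with the same key and then looks for them. The table name users_demo and the sample values are made up for this example.

```sql
-- Hypothetical table and values, for illustration only.
create table users_demo (
    id int,
    phone_number bigint,
    primary key (id)
);

-- Both rows are accepted even though they share the same id,
-- because Redshift does not enforce the primary key.
insert into users_demo values (1, 5550100), (1, 5550101);

-- Find key values that appear more than once.
select id, count(*) as copies
from users_demo
group by id
having count(*) > 1;
```

A check like this is one way for a load process to verify uniqueness itself, since the database will not do it.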

See this article on the issue:

http://www.sqlhaven.com/amazon-redshift-what-you-need-to-think-before-defining-primary-key/

#3


1  

You can define constraints, but they are informational only. As Amazon says, they are not enforced by Amazon Redshift. Nonetheless, primary keys and foreign keys are used as planning hints, and they should be declared if your ETL process or some other process in your application enforces their integrity.

Some services, such as pipelines with insert mode (REPLACE_EXISTING), need a primary key defined on your table.

For other performance purposes, Stuart's answer is correct.
