Connecting AWS Lambda to Redshift - times out after 60 seconds

Date: 2021-08-16 23:07:54

I created an AWS Lambda function that:

  • logs onto Redshift via a JDBC URL

  • runs a query

Locally, using Node, I can successfully connect to the Redshift instance via JDBC, and execute a query.

var pg = require('pg');

var conString = "postgresql://USER_NAME:PASSWORD@JDBC_URL";
var client = new pg.Client(conString);
client.connect(function(err) {
  if (err) {
    console.log('could not connect to redshift', err);
    return;
  }
  // query omitted due to the connection error above
});

However, when I execute the function on AWS Lambda (where it's wrapped in an async#waterfall block), the AWS CloudWatch logs tell me that the Lambda function timed out after 60 seconds.
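
One way to make this failure mode visible, instead of waiting for Lambda's 60-second limit, is to put an explicit deadline on the connection attempt. The helper below is only a sketch; the `withTimeout` name and the 5000 ms value are my own choices, not anything from pg or AWS:

```javascript
// Sketch: race a promise against an explicit deadline so a silently
// dropped TCP connection (e.g. a security group discarding packets)
// surfaces as a loggable error instead of a Lambda timeout.
function withTimeout(promise, ms, label) {
  var timer;
  var deadline = new Promise(function (resolve, reject) {
    timer = setTimeout(function () {
      reject(new Error(label + ' timed out after ' + ms + ' ms'));
    }, ms);
  });
  return Promise.race([promise, deadline]).finally(function () {
    clearTimeout(timer);
  });
}

// Hypothetical usage with the pg client from the question
// (pg's connect() returns a promise when called without a callback):
// withTimeout(client.connect(), 5000, 'redshift connect')
//   .catch(function (err) { console.log('connect failed fast:', err); });
```

With this in place, a blocked connection fails after a few seconds with a descriptive error rather than silently consuming the whole Lambda timeout.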

Any ideas on why my function is not able to connect?

3 Answers

#1


3  

I find that you either open your Redshift security group to all sources, or to none, because a Lambda function doesn't run on a fixed address or even a fixed range of IP addresses; that is completely transparent to users (hence "serverless").

I just saw that Amazon announced a new Lambda feature yesterday that supports VPC. I guess that if we can run a Redshift cluster in a VPC, this could solve the problem.
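
For reference, attaching an existing Lambda function to the VPC that hosts the Redshift cluster can be sketched with the AWS CLI roughly as follows; the function name, subnet IDs, and security group ID are placeholders, not values from this question:

```shell
# Sketch only: place the Lambda function inside the Redshift VPC.
# All IDs below are placeholders.
aws lambda update-function-configuration \
  --function-name my-redshift-function \
  --vpc-config SubnetIds=subnet-aaaa1111,subnet-bbbb2222,SecurityGroupIds=sg-cccc3333
```

Once in the VPC, the function reaches Redshift over private networking, so the cluster's security group no longer needs to be open to the world.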

#2


1  

If you are using serverless-framework v1.5.0, you should add:

iamRoleStatements:
  - Effect: Allow
    Action:
      - ec2:CreateNetworkInterface
    Resource: '*'
  - Effect: Allow
    Action:
      - ec2:DeleteNetworkInterface
      - ec2:DescribeNetworkInterfaces
    Resource: 'arn:aws:ec2:${self:provider.region}:*:network-interface/*'

You should also add all the securityGroupIds to the inbound rules of the Redshift security group.
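
As an illustration, allowing the Lambda function's security group to reach Redshift on its default port might look like the following AWS CLI call; both group IDs are placeholders:

```shell
# Sketch only: permit inbound traffic from the Lambda security group
# to Redshift on port 5439. Group IDs are placeholders.
aws ec2 authorize-security-group-ingress \
  --group-id sg-redshift0000 \
  --protocol tcp \
  --port 5439 \
  --source-group sg-lambda0000
```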

More info: https://serverless.com/framework/docs/providers/aws/guide/functions/#vpc-configuration

#3


1  

Taking a step back, I would recommend using Kinesis Firehose[1] to connect Lambda and Redshift. This is a better approach, as suggested in the docs[2].

Kinesis Firehose can use S3 as intermediate storage and push the data to Redshift automatically using the COPY command.

"A COPY command is the most efficient way to load a table. You can also add data to your tables using INSERT commands, though it is much less efficient than using COPY"

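
For a concrete picture, a Redshift COPY from S3 looks roughly like this; the table name, bucket path, and IAM role ARN are placeholder values, not something from this question:

```sql
-- Sketch only: bulk-load staged S3 data into a Redshift table.
-- Table, bucket path, and role ARN are placeholders.
COPY my_table
FROM 's3://my-bucket/staging/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
FORMAT AS JSON 'auto';
```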

Footnotes: [1] http://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html

[2] http://docs.aws.amazon.com/redshift/latest/dg/t_Loading_data.html
