JavaScript AWS SDK S3 upload method: Body stream produces an empty file

Date: 2021-12-22 23:06:11

I'm trying to use the S3 upload method with a ReadableStream from the fs module.

The documentation says that a ReadableStream can be used as the Body param:

Body — (Buffer, Typed Array, Blob, String, ReadableStream) Object data.

Also, the description of the upload method says:

Uploads an arbitrarily sized buffer, blob, or stream, using intelligent concurrent handling of parts if the payload is large enough.

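For context, the "concurrent handling of parts" is the multipart upload behaviour; upload also takes a second options argument to control part size and concurrency. A minimal sketch (the bucket, key, and path are just the placeholders used below):

const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')

const s3 = new S3()

// The second argument tunes how upload() splits the payload into parts.
s3.upload({
  Bucket: 'test-bucket',
  Key: 'output.txt',
  Body: fs.createReadStream('/home/osman/Downloads/input.txt'),
}, {
  partSize: 10 * 1024 * 1024, // upload in 10 MB parts
  queueSize: 4,               // up to 4 parts in flight at once
}, (err, data) => {
  if (err) throw err
  console.log(data.Location)
})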

Also, in Upload pdf generated to AWS S3 using nodejs aws sdk, @shivendra says he can use a ReadableStream and it works.

This is my code:

const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')

const s3 = new S3()

const send = async () => {
  const rs = fs.createReadStream('/home/osman/Downloads/input.txt')
  rs.on('open', () => {
    console.log('OPEN')
  })
  rs.on('end', () => {
    console.log('END')
  })
  rs.on('close', () => {
    console.log('CLOSE')
  })
  rs.on('data', (chunk) => {
    console.log('DATA: ', chunk)
  })

  console.log('START UPLOAD')

  const response = await s3.upload({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: rs,
  }).promise()

  console.log('response:')
  console.log(response)
}

send().catch(err => { console.log(err) })

I'm getting this output:

START UPLOAD
OPEN
DATA: <Buffer 73 6f 6d 65 74 68 69 6e 67>
END
CLOSE
response:
{ ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
  Location: 'https://test-bucket.s3.amazonaws.com/output.txt',
  key: 'output.txt',
  Key: 'output.txt',
  Bucket: 'test-bucket' }

The problem is that my file generated at S3 (output.txt) has 0 bytes.

Does anyone know what I'm doing wrong?

If I pass a Buffer as the Body, it works.

Body: Buffer.alloc(8 * 1024 * 1024, 'something'), 

But that's not what I want to do. I'd like to use a stream: generate a file and pipe that stream to S3 as I generate it.
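
For context, what I have in mind is roughly this sketch: write the generated data into a PassThrough stream and hand that stream to s3.upload while it is still being written (the generator loop is just a stand-in):

const { PassThrough } = require('stream')
const S3 = require('aws-sdk/clients/s3')

const s3 = new S3()

const pass = new PassThrough()

// Start the upload first; it consumes whatever gets written to `pass`.
const uploading = s3.upload({
  Bucket: 'test-bucket',
  Key: 'output.txt',
  Body: pass,
}).promise()

// Generate content and push it into the stream as it becomes available.
for (let i = 0; i < 10; i++) {
  pass.write(`line ${i}\n`)
}
pass.end() // signal that no more data is coming

uploading
  .then(response => console.log(response.Location))
  .catch(err => console.log(err))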

2 Answers

#1

It's an API interface issue with Node.js ReadableStreams. Just commenting out the code that listens to the 'data' event solves the problem.

const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')

const s3 = new S3()

const send = async () => {
  const rs = fs.createReadStream('/home/osman/Downloads/input.txt')
  rs.on('open', () => {
    console.log('OPEN')
  })
  rs.on('end', () => {
    console.log('END')
  })
  rs.on('close', () => {
    console.log('CLOSE')
  })
  // rs.on('data', (chunk) => {
  //   console.log('DATA: ', chunk)
  // })

  console.log('START UPLOAD')

  const response = await s3.upload({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: rs,
  }).promise()

  console.log('response:')
  console.log(response)
}

send().catch(err => { console.log(err) })

Though it's a strange API: when we listen to the 'data' event, the ReadableStream switches to flowing mode (listening to an event changes the publisher/EventEmitter state? Yes, very error-prone...). For some reason S3 needs a paused ReadableStream. If we attach rs.on('data', ...) after await s3.upload(...), it works. If we call rs.pause() after rs.on('data', ...) and before await s3.upload(...), it also works.
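
A minimal sketch of that second workaround (same file and bucket as above, pausing the stream before handing it to upload):

const fs = require('fs')
const S3 = require('aws-sdk/clients/s3')

const s3 = new S3()

const send = async () => {
  const rs = fs.createReadStream('/home/osman/Downloads/input.txt')

  // Attaching a 'data' listener switches the stream to flowing mode...
  rs.on('data', (chunk) => {
    console.log('DATA: ', chunk)
  })
  // ...so switch it back to paused mode before s3.upload consumes it.
  rs.pause()

  const response = await s3.upload({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: rs,
  }).promise()

  console.log(response)
}

send().catch(err => { console.log(err) })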

Now, why does this happen? I don't know yet...

But the problem is solved, even if it isn't completely explained.

#2

  1. Check whether the file /home/osman/Downloads/input.txt actually exists and is accessible by the node.js process.
  2. Consider using the putObject method.

Example:

const fs = require('fs');
const S3 = require('aws-sdk/clients/s3');

const s3 = new S3();

s3.putObject({
  Bucket: 'test-bucket',
  Key: 'output.txt',
  Body: fs.createReadStream('/home/osman/Downloads/input.txt'),
}, (err, response) => {
  if (err) {
    throw err;
  }
  console.log('response:')
  console.log(response)
});

Not sure how this will work with async..await; better to make the upload to AWS S3 work first, then change the flow.
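
For reference, the promise-based form of putObject would look roughly like this sketch (same placeholders as above, untested here):

const fs = require('fs');
const S3 = require('aws-sdk/clients/s3');

const s3 = new S3();

const send = async () => {
  // .promise() turns the AWS request into a Promise that can be awaited.
  const response = await s3.putObject({
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: fs.createReadStream('/home/osman/Downloads/input.txt'),
  }).promise();

  console.log('response:');
  console.log(response);
};

send().catch(err => { console.log(err); });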


UPDATE: Try implementing the upload directly via ManagedUpload.

const fs = require('fs');
const S3 = require('aws-sdk/clients/s3');

const s3 = new S3();

const upload = new S3.ManagedUpload({
  service: s3,
  params: {
    Bucket: 'test-bucket',
    Key: 'output.txt',
    Body: fs.createReadStream('/home/osman/Downloads/input.txt')
  }
});

upload.send((err, response) => {
  if (err) {
    throw err;
  }
  console.log('response:')
  console.log(response)
});
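
As a side note, ManagedUpload also has a promise() method, so the same upload object from the snippet above could be awaited instead of calling send() (a sketch):

(async () => {
  // promise() sends the managed upload and resolves with the response object.
  const response = await upload.promise();
  console.log('response:');
  console.log(response);
})().catch(err => { console.log(err); });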
