Delete all files in a "folder" or with a given prefix in a Google Cloud Storage bucket from Java

Date: 2023-01-14 11:13:22

I know the idea of 'folders' is sort of nonexistent or different in Google Cloud Storage, but I need a way to delete all objects in a 'folder', or all objects with a given prefix, from Java.

The GcsService has a delete function, but as far as I can tell it only takes a single GcsFilename object and does not honor wildcards (i.e., "folderName/**" did not work).

Any tips?

2 solutions

#1


7  

The API only supports deleting a single object at a time. You can only request many deletions using many HTTP requests or by batching many delete requests. There is no API call to delete multiple objects using wildcards or the like. In order to delete all of the objects with a certain prefix, you'd need to list the objects, then make a delete call for each object that matches the pattern.
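The list-then-delete pattern described above can be sketched without any cloud dependency. The `InMemoryBucket` class below is a hypothetical stand-in for a bucket, not a real client API; it only models the two operations the answer relies on (list by prefix, delete one object at a time), so the names and signatures are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a bucket (object name -> contents). Not a real client API.
class InMemoryBucket {
    private final TreeMap<String, byte[]> objects = new TreeMap<>();

    void put(String name, byte[] data) {
        objects.put(name, data);
    }

    // Returns a snapshot of the names of all objects whose name starts with the prefix.
    List<String> list(String prefix) {
        List<String> names = new ArrayList<>();
        for (String name : objects.tailMap(prefix, true).keySet()) {
            if (!name.startsWith(prefix)) {
                break; // keys are sorted, so we are past the prefix range
            }
            names.add(name);
        }
        return names;
    }

    // Deletes exactly one object; returns false if it did not exist.
    boolean delete(String name) {
        return objects.remove(name) != null;
    }

    int size() {
        return objects.size();
    }
}

class PrefixDeleteSketch {
    // Lists first, then issues one delete per listed object. Returns the number deleted.
    static int deleteByPrefix(InMemoryBucket bucket, String prefix) {
        int deleted = 0;
        for (String name : bucket.list(prefix)) {
            if (bucket.delete(name)) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        InMemoryBucket bucket = new InMemoryBucket();
        bucket.put("dir/a.txt", new byte[0]);
        bucket.put("dir/sub/b.txt", new byte[0]);
        bucket.put("other/c.txt", new byte[0]);
        System.out.println(deleteByPrefix(bucket, "dir/")); // prints 2
        System.out.println(bucket.size());                  // prints 1
    }
}
```

With a real client, each `delete` call would be a separate request (or one entry in a batch), which is why the cost grows with the number of objects under the prefix.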

The command-line utility, gsutil, does exactly that when you ask it to delete the path "gs://bucket/dir/**". It fetches a list of objects matching that pattern, then it makes a delete call for each of them.

If you need a quick solution, you could always have your Java program exec gsutil.
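A minimal sketch of shelling out to gsutil from Java might look like the following. It assumes gsutil is installed, on the PATH, and already authenticated; the class and method names (`GsutilDelete`, `buildCommand`, `run`) are made up for this example. The `-m` flag asks gsutil to run the deletions in parallel and can be dropped.

```java
import java.io.IOException;
import java.util.List;

class GsutilDelete {
    // Builds the argument list for "gsutil -m rm" on every object under the folder.
    static List<String> buildCommand(String bucket, String folder) {
        // Drop leading and trailing slashes so the URL has exactly one separator.
        String trimmed = folder.replaceAll("^/+|/+$", "");
        return List.of("gsutil", "-m", "rm", "gs://" + bucket + "/" + trimmed + "/**");
    }

    // Runs gsutil and returns its exit code (assumption: gsutil is on the PATH).
    static int run(String bucket, String folder) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(bucket, folder))
                .inheritIO() // forward gsutil's stdout and stderr to this process
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) to actually execute it.
        System.out.println(String.join(" ", buildCommand("bucket", "dir/")));
    }
}
```

Passing the arguments as a list means no shell is involved, so the `**` wildcard reaches gsutil unexpanded.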

Here is the code that corresponds to the above answer in case anyone else wants to use it:

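import com.google.appengine.tools.cloudstorage.GcsFilename;
import com.google.appengine.tools.cloudstorage.GcsService;
import com.google.appengine.tools.cloudstorage.ListItem;
import com.google.appengine.tools.cloudstorage.ListOptions;
import com.google.appengine.tools.cloudstorage.ListResult;
import java.io.IOException;

// gcsService is a GcsService field; CoultNotDeleteFile is the poster's own exception type.
public void deleteFolder(String bucket, String folderName) throws CoultNotDeleteFile {
  try
  {
    // List every object whose name starts with folderName, including nested ones.
    ListResult list = gcsService.list(bucket, new ListOptions.Builder().setPrefix(folderName).setRecursive(true).build());

    while(list.hasNext())
    {
      ListItem item = list.next();
      // One delete call per listed object, in the same bucket that was listed.
      gcsService.delete(new GcsFilename(bucket, item.getName()));
    }
  }
  catch (IOException e)
  {
    //Error handling
  }
}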

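Note that when this answer was written, object listing was eventually consistent, which means that this approach was not guaranteed to catch objects that had been created very recently. Cloud Storage now documents object listing as strongly consistent.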

#2


0  

I realise this is an old question, but I just stumbled upon the same issue and found a different way to resolve it.

The Storage class in the Google Cloud Java Client for Storage includes a method to list the blobs in a bucket, which can also accept an option to set a prefix to filter results to blobs whose names begin with the prefix.

For example, deleting all the files with a given prefix from a bucket can be achieved like this:

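import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

Storage storage = StorageOptions.getDefaultInstance().getService();
Iterable<Blob> blobs = storage.list("bucket_name", Storage.BlobListOption.prefix("prefix")).iterateAll();
for (Blob blob : blobs) {
    // generationMatch() makes the delete fail if the blob changed after it was listed.
    blob.delete(Blob.BlobSourceOption.generationMatch());
}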
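Because the filter is a plain name prefix, a prefix such as "dir" would also match objects like "dir2/file.txt". A small helper along these lines can guard against that when the intent is to delete one "folder". `FolderPrefix` and `asFolderPrefix` are hypothetical names, and rejecting an empty folder name is a design choice of this sketch, not something the client library requires.

```java
class FolderPrefix {
    // Turns a folder name into a prefix that only matches objects inside that folder.
    static String asFolderPrefix(String folder) {
        if (folder == null || folder.isEmpty()) {
            // An empty prefix would match every object in the bucket, so refuse it here.
            throw new IllegalArgumentException("folder must not be empty");
        }
        return folder.endsWith("/") ? folder : folder + "/";
    }

    public static void main(String[] args) {
        String prefix = asFolderPrefix("dir");
        System.out.println(prefix);                              // prints dir/
        System.out.println("dir/a.txt".startsWith(prefix));      // prints true
        System.out.println("dir2/file.txt".startsWith(prefix));  // prints false
    }
}
```

The result would then be passed as the prefix argument in place of the raw folder name.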
