The only way I have come up with to delete a file from a zipfile is to create a temporary zipfile without the file to be deleted and then rename it to the original filename.
In Python 2.4 the ZipInfo class had an attribute file_offset, so it was possible to create a second zip file and copy the data into it without decompressing and recompressing.
This file_offset is missing in Python 2.6, so is there another option besides creating another zipfile by uncompressing every file and then recompressing it?
Is there maybe a direct way of deleting a file in the zipfile? I searched and didn't find anything.
3 solutions
#1
37
The following snippet worked for me (deletes all *.exe files from a Zip archive):
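import zipfile

# Copy every entry except the *.exe files into a new archive.
zin = zipfile.ZipFile('archive.zip', 'r')
zout = zipfile.ZipFile('archive_new.zip', 'w')
for item in zin.infolist():
    if not item.filename.endswith('.exe'):
        # Passing the ZipInfo keeps the entry's name, date and compression type.
        zout.writestr(item, zin.read(item.filename))
zout.close()
zin.close()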
If you read everything into memory, you can eliminate the need for a second file. However, this snippet recompresses everything.
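The remark about reading everything into memory could look roughly like the sketch below. The function name delete_from_zip and its should_delete callback are made up for illustration, and it assumes the whole archive fits in memory. It still decompresses and recompresses every entry that is kept.

```python
import io
import zipfile


def delete_from_zip(path, should_delete):
    """Rewrite the archive at `path` without the entries for which
    should_delete(name) is true. Returns the names that were dropped.

    Hypothetical helper, not part of the zipfile module.
    """
    removed = []
    buf = io.BytesIO()
    with zipfile.ZipFile(path, 'r') as zin, zipfile.ZipFile(buf, 'w') as zout:
        for item in zin.infolist():
            if should_delete(item.filename):
                removed.append(item.filename)
            else:
                # Decompress and recompress; the ZipInfo carries over the
                # entry's name, timestamp and compression method.
                zout.writestr(item, zin.read(item.filename))
    # Both archives are closed at this point, so the buffer holds a
    # complete zip and the original file can be overwritten.
    with open(path, 'wb') as f:
        f.write(buf.getvalue())
    return removed
```

It would be called as delete_from_zip('archive.zip', lambda name: name.endswith('.exe')). Because the original is overwritten only after the new archive is complete, a failure while reading leaves the old file untouched, but a crash during the final write could still damage it.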
On closer inspection, ZipInfo.header_offset is the offset of the entry's header from the start of the file. The name is misleading, but the main Zip header is actually stored at the end of the file. My hex editor confirms this.
So the problem you'll run into is the following: You need to delete the directory entry in the main header as well or it will point to a file that doesn't exist anymore. Leaving the main header intact might work if you keep the local header of the file you're deleting as well, but I'm not sure about that. How did you do it with the old module?
Without modifying the main header I get an error "missing X bytes in zipfile" when I open it. This might help you to find out how to modify the main header.
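To see the layout described here without a hex editor, something like the following sketch can help. It builds a tiny archive in memory (the entry names are arbitrary), prints each entry's header_offset, and then parses the 22-byte end-of-central-directory record, which is what the ZIP format calls the trailer that points at the "main header" (the central directory). It assumes a plain, non-Zip64 archive without an archive comment.

```python
import io
import struct
import zipfile

# Build a small archive in memory so the example is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('a.txt', b'first')
    zf.writestr('b.exe', b'second')
raw = buf.getvalue()

# header_offset is where each entry's local file header starts.
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    offsets = {info.filename: info.header_offset for info in zf.infolist()}

# The end-of-central-directory record sits at the end of the file and
# starts with the signature PK\x05\x06.
eocd_pos = raw.rfind(b'PK\x05\x06')
(sig, disk_no, cd_disk, entries_here, entries_total,
 cd_size, cd_offset, comment_len) = struct.unpack(
    '<4s4H2LH', raw[eocd_pos:eocd_pos + 22])

print('local header offsets:', offsets)
print('central directory: offset', cd_offset, 'size', cd_size,
      'entries', entries_total)
```

The central directory starts after the last entry's data, and every record in it stores the offset of its local header. That is why removing an entry's bytes means both dropping its directory record and adjusting the offsets of the entries that follow it.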
#2
2
Not very elegant but this is how I did it:
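import subprocess
import zipfile

zip_filename = 'archive.zip'  # placeholder: path to the archive to modify
# Collect the entries to delete; close the archive before zip modifies it.
with zipfile.ZipFile(zip_filename) as z:
    files_to_del = [f for f in z.namelist() if f.endswith('.exe')]
cmd = ['zip', '-d', zip_filename] + files_to_del
subprocess.check_call(cmd)
# reload the modified archive
z = zipfile.ZipFile(zip_filename)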
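A slightly more defensive variant of the same idea is sketched below. The function names are invented for this example. It skips the external call when nothing matches and checks that a zip executable is on PATH before running it; it assumes the Info-ZIP command-line tool, as in the snippet above.

```python
import shutil
import subprocess
import zipfile


def zip_delete_command(zip_filename, suffix='.exe'):
    """Return the `zip -d` command that would delete matching entries,
    or None if no entry name ends with `suffix`. Hypothetical helper."""
    with zipfile.ZipFile(zip_filename) as z:
        targets = [name for name in z.namelist() if name.endswith(suffix)]
    if not targets:
        return None
    return ['zip', '-d', zip_filename] + targets


def delete_with_zip_tool(zip_filename, suffix='.exe'):
    """Run `zip -d` on the archive. Returns False if nothing matched."""
    cmd = zip_delete_command(zip_filename, suffix)
    if cmd is None:
        return False  # nothing to delete, so do not call zip at all
    if shutil.which('zip') is None:
        raise RuntimeError('the zip command-line tool is not on PATH')
    subprocess.check_call(cmd)
    return True
```

Splitting the command construction from the call makes it possible to inspect or log what will be deleted before the archive is touched.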
#3
0
The routine delete_from_zip_file from ruamel.std.zipfile¹ allows you to delete a file based on its full path within the ZIP, or based on (re) patterns. E.g. you can delete all of the .exe files from test.zip using:
from ruamel.std.zipfile import delete_from_zip_file
delete_from_zip_file('test.zip', pattern='.*.exe')
(please note the dot before the *).
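Since the pattern is a regular expression rather than a shell glob, it may be worth trying it against a few names with the re module first. The sketch below uses re.match purely as an assumption; how the package itself applies the pattern is not shown here. The file names are invented.

```python
import re

names = ['setup.exe', 'docs/readme.txt', 'bin/tool.exe', 'notes_exe']

# In a regular expression "." matches any character, so the "." before
# "exe" in this pattern is not limited to a literal dot.
loose = re.compile('.*.exe')
# Escaping the dot and anchoring the end only accepts a real ".exe" suffix.
strict = re.compile(r'.*\.exe$')

print([n for n in names if loose.match(n)])
# → ['setup.exe', 'bin/tool.exe', 'notes_exe']
print([n for n in names if strict.match(n)])
# → ['setup.exe', 'bin/tool.exe']
```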
This works similarly to mdm's solution (including the need for recompression), but recreates the ZIP file in memory (using the class InMemZipFile()), overwriting the old file after it is fully read.
¹ Disclaimer: I am the author of that package.