I know, I know: who would want to compress or uncompress large files in Java? Completely unreasonable. For the moment, suspend disbelief and assume I have a good reason for uncompressing a large zip file.
Issue 1: ZipFile has a bug (bug #6280693), which Sun has fixed in Java 1.6 (Mustang). The fix isn't helpful because our software needs to support Java 1.4. The bug, as I understand it, works like this: when the following code is run, Java allocates a chunk of memory large enough to hold the entire file.
ZipFile zipFile = new ZipFile("/tmp/myFile.zip");
If /tmp/myFile.zip is 4 GB, Java allocates 4 GB. This causes an out-of-memory (heap space) error. A heap size of more than 4 GB is unfortunately not an acceptable solution. =(
Solution to issue 1: Use ZipInputStream to process the file as a stream, which reduces and bounds the memory footprint.
byte[] buf = new byte[1024];
FileInputStream fs = new FileInputStream("/tmp/myFile.zip");
ZipInputStream zipIn = new ZipInputStream(fs);
ZipEntry ze = zipIn.getNextEntry();
while (ze != null) {
    int cr; // number of bytes read in this pass
    while ((cr = zipIn.read(buf, 0, buf.length)) > -1)
        System.out.write(buf, 0, cr);
    ze = zipIn.getNextEntry();
}
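For reference, the streaming approach can be sketched as a small self-contained class. This is only a sketch under assumptions: the names ZipStreamDemo, streamAll, and sampleZip are made up for illustration, and the archive is built in memory so the example runs without /tmp/myFile.zip. It avoids generics and try-with-resources so that it should stay close to what Java 1.4 accepts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipStreamDemo {

    // Copies the uncompressed bytes of every entry to 'out' through one
    // fixed-size buffer and returns the total number of bytes written.
    static long streamAll(InputStream rawIn, OutputStream out) throws IOException {
        byte[] buf = new byte[1024];
        long total = 0;
        ZipInputStream zipIn = new ZipInputStream(rawIn);
        try {
            while (zipIn.getNextEntry() != null) {
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zipIn.read(buf, 0, buf.length)) != -1) {
                    out.write(buf, 0, n);
                    total += n;
                }
            }
        } finally {
            zipIn.close();
        }
        return total;
    }

    // Hypothetical helper: builds a tiny two-entry archive in memory.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ZipOutputStream zipOut = new ZipOutputStream(bytes);
        zipOut.putNextEntry(new ZipEntry("a.txt"));
        zipOut.write("hello ".getBytes("UTF-8"));
        zipOut.closeEntry();
        zipOut.putNextEntry(new ZipEntry("b.txt"));
        zipOut.write("world".getBytes("UTF-8"));
        zipOut.closeEntry();
        zipOut.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long n = streamAll(new ByteArrayInputStream(sampleZip()), sink);
        System.out.println(n + " bytes: " + sink.toString("UTF-8"));
    }
}
```

For a real archive, a FileInputStream on the zip path would take the place of the ByteArrayInputStream; the buffer size, not the archive size, then determines how much data is held at once.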
Issue 2: I would like to access the ZipEntries randomly. That is, I would like to uncompress only one ZipEntry without having to search through the entire stream. Currently I am building up a list of ZipEntries, called zes:
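// ZipInputStream wraps an InputStream; it cannot be constructed from a path string
ZipInputStream zin = new ZipInputStream(new FileInputStream("/tmp/myFile.zip"));
ZipEntry ze = zin.getNextEntry();
List<ZipEntry> zes = new ArrayList<ZipEntry>();
while (ze != null) {
    zes.add(ze);
    ze = zin.getNextEntry();
}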
Then, when I need to uncompress a particular ZipEntry, I iterate through all the ZipEntries until I find the matching one, which I then uncompress.
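// zin must be a freshly opened ZipInputStream; the one used to build zes is already exhausted
ZipEntry ze = zin.getNextEntry();
while (ze != null && !ze.getName().equals(queryZe.getName())) {
    ze = zin.getNextEntry();
}
int cr;
// ze is null here if no entry matched, in which case nothing is written
while (ze != null && (cr = zin.read(buf)) > -1)
    System.out.write(buf, 0, cr);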
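The lookup above can be wrapped in a helper that reports whether the entry was found. This is a sketch only: ZipEntryFinder, copyEntry, and sampleZip are illustrative names, and the archive is again built in memory. Note that this is still a linear scan; ZipInputStream has to read past every earlier entry before it reaches the requested one, so it does not provide true random access.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryFinder {

    // Scans the stream for an entry called 'name' and copies its uncompressed
    // bytes to 'out'. Returns false (writing nothing) if no entry matches.
    static boolean copyEntry(InputStream rawIn, String name, OutputStream out)
            throws IOException {
        ZipInputStream zin = new ZipInputStream(rawIn);
        try {
            ZipEntry ze;
            while ((ze = zin.getNextEntry()) != null) {
                if (ze.getName().equals(name)) {
                    byte[] buf = new byte[1024];
                    int n;
                    while ((n = zin.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return true;
                }
            }
            return false;
        } finally {
            zin.close();
        }
    }

    // Hypothetical helper: a two-entry archive held in memory.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ZipOutputStream zipOut = new ZipOutputStream(bytes);
        zipOut.putNextEntry(new ZipEntry("first.txt"));
        zipOut.write("first".getBytes("UTF-8"));
        zipOut.closeEntry();
        zipOut.putNextEntry(new ZipEntry("second.txt"));
        zipOut.write("second".getBytes("UTF-8"));
        zipOut.closeEntry();
        zipOut.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        boolean found = copyEntry(new ByteArrayInputStream(sampleZip()), "second.txt", sink);
        System.out.println(found + ": " + sink.toString("UTF-8"));
    }
}
```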
Question: ZipFile has the ability to randomly access ZipEntries:
new BufferedInputStream(zipFile.getInputStream(zipEntry));
How can I get this same ability without using ZipFile?
Note that ZipInputStream has some rather strange behavior.
Especially good documentation on Java and zip files can be found here:
http://commons.apache.org/compress/zip.html
Notes on replacing Sun's ZipFile with the Apache Commons ZipFile, as suggested in the answers:
- Sun's ZipFile.entries() always returns the ZipEntries in the order in which they occur in the file, whereas the Apache Commons ZipFile.getEntries() returns them in no particular order. This caused an interesting bug, because some code was assuming that the entries would be "in order".
2 Answers
#1
For this task, you may want to look at Apache Commons Compress, Apache Commons VFS, or TrueZip. All of these should be Java 1.4 compatible, and probably support the features you need.
#2
You could look at Apache Commons Compress, which works with 1.4+, but I don't know if it exposes the same bug under the hood or not.