How can I recover files from a corrupted .tar.gz archive?

Date: 2023-01-15 07:49:05

I have a large number of files in a .tar.gz archive. Checking the file type with the command

file SMS.tar.gz

gives the response

gzip compressed data - deflate method , max compression

When I try to extract the archive with gunzip, after a delay I receive the message

gunzip: SMS.tar.gz: unexpected end of file

Is there any way to recover even part of the archive?

3 answers

#1


14  

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read the gzip Recovery Toolkit page.

#2


31  

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.
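As a rough illustration of the truncated case, the sketch below builds a throwaway archive, cuts it in half, and salvages what remains. Everything in it (the temporary directory, the file names, the sizes) is made up for the demo; only the gzip and tar invocations correspond to the command above.

```shell
# Hypothetical demo data: five compressible text files in a scratch directory
work=$(mktemp -d)
mkdir "$work/src"
for i in 1 2 3 4 5; do
  head -c 20000 /dev/urandom | base64 > "$work/src/file$i.txt"
done
tar -czf "$work/demo.tar.gz" -C "$work" src

# Simulate truncation by keeping only the first half of the archive
size=$(wc -c < "$work/demo.tar.gz")
head -c $((size / 2)) "$work/demo.tar.gz" > "$work/broken.tar.gz"

# gzip exits non-zero on the truncated input but still writes what it decoded
gzip -dc < "$work/broken.tar.gz" > "$work/broken.tar.partial" || true

# tar also complains at the cut-off point, after extracting the earlier members
mkdir "$work/out"
tar -xvf "$work/broken.tar.partial" -C "$work/out" || true
ls "$work/out/src"
```

The member that straddles the cut is usually extracted too, but incomplete, so it is worth checking the last file by hand.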

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover, but it requires quite a bit of custom programming. It's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question.
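Before investing in custom code, a quick check can hint at whether line-ending translation happened at all. The sketch below is only a heuristic, and count_newlines is a hypothetical helper name: it counts LF bytes that are preceded by CR versus LF bytes that stand alone. In real compressed data both kinds normally occur, so a large file with many CR LF pairs and no lone LF at all suggests an LF-to-CRLF conversion.

```shell
# Hypothetical helper: count CR LF pairs and lone LF bytes in a file
count_newlines() {
  od -An -v -tx1 "$1" | tr -s ' \n' '\n' | awk '
    $0 == "0a" { if (prev == "0d") crlf++; else lone++ }
    $0 != ""   { prev = $0 }
    END        { printf "crlf=%d lone_lf=%d\n", crlf, lone }'
}

# Small made-up sample: two lone LFs and one CR LF pair
sample=$(mktemp)
printf 'a\nb\r\nc\n' > "$sample"
count_newlines "$sample"

# On the real archive one would run, for example:
#   count_newlines SMS.tar.gz
```

A result with zero lone LFs is a hint, not proof; small files can lack lone LFs by chance.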

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus; they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.

#3


3  

Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress; trying to unzip it gave the error:

gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple, using the Unix dos2unix utility:

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt 
....etc.

It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.
