压缩和解压缩java中的字符串数据

时间:2022-12-16 07:40:24

I am using the following code to compress and decompress string data, but the problem which I am facing is, it is easily getting compressed without error, but decompress method throws following error.

我使用以下代码来压缩和解压缩字符串数据,但我面临的问题是,它很容易被压缩而没有错误,但解压缩方法会引发跟随错误。

Exception in thread "main" java.io.IOException: Not in GZIP format

线程“main”中的异常java.io.IOException:不是GZIP格式

public static void main(String[] args) throws Exception {
        String string = "I am what I am hhhhhhhhhhhhhhhhhhhhhhhhhhhhh"
                + "bjggujhhhhhhhhh"
                + "rggggggggggggggggggggggggg"
                + "esfffffffffffffffffffffffffffffff"
                + "esffffffffffffffffffffffffffffffff"
                + "esfekfgy enter code here`etd`enter code here wdd"
                + "heljwidgutwdbwdq8d"
                + "skdfgysrdsdnjsvfyekbdsgcu"
                +"jbujsbjvugsduddbdj";

       System.out.println("after compress:");
        String compressed = compress(string);
        System.out.println(compressed);
        System.out.println("after decompress:");
        String decomp = decompress(compressed);
        System.out.println(decomp);
    }


     public static String compress(String str) throws Exception {
        if (str == null || str.length() == 0) {
            return str;
        }
        System.out.println("String length : " + str.length());
        ByteArrayOutputStream obj=new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(obj);
        gzip.write(str.getBytes("UTF-8"));
        gzip.close();
        String outStr = obj.toString("UTF-8");
        System.out.println("Output String length : " + outStr.length());
        return outStr;
     }

      public static String decompress(String str) throws Exception {
        if (str == null || str.length() == 0) {
            return str;
        }
        System.out.println("Input String length : " + str.length());
        GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(str.getBytes("UTF-8")));
        BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
        String outStr = "";
        String line;
        while ((line=bf.readLine())!=null) {
          outStr += line;
        }
        System.out.println("Output String lenght : " + outStr.length());
        return outStr;
     }

Still couldn't figure out how to fix this issue!!!

仍然无法弄清楚如何解决这个问题!

6 个解决方案

#1


33  

This is because of

这是因为

String outStr = obj.toString("UTF-8");

Send the byte[] which you can get from your ByteArrayOutputStream and use it as such in your ByteArrayInputStream to construct your GZIPInputStream. Following are the changes which need to be done in your code.

发送你可以从ByteArrayOutputStream获取的byte []并在ByteArrayInputStream中使用它来构造你的GZIPInputStream。以下是您的代码中需要进行的更改。

byte[] compressed = compress(string); //In the main method

public static byte[] compress(String str) throws Exception {
    ...
    ...
    return obj.toByteArray();
}

public static String decompress(byte[] bytes) throws Exception {
    ...
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
    ...
}

#2


20  

The above Answer solves our problem but in addition to that. if we are trying to decompress a uncompressed("not a zip format") byte[] . we will get "Not in GZIP format" exception message.

上述答案解决了我们的问题,但除此之外。如果我们试图解压缩未压缩的(“不是zip格式”)byte []。我们将获得“不是GZIP格式”的异常消息。

For solving that we can add addition code in our Class.

为了解决这个问题,我们可以在我们的类中添加额外代

public static boolean isCompressed(final byte[] compressed) {
    return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
}

My Complete Compression Class with compress/decompress would look like:

我的压缩/解压缩完整压缩类看起来像:

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GZIPCompression {
  public static byte[] compress(final String str) throws IOException {
    if ((str == null) || (str.length() == 0)) {
      return null;
    }
    ByteArrayOutputStream obj = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(obj);
    gzip.write(str.getBytes("UTF-8"));
    gzip.flush();
    gzip.close();
    return obj.toByteArray();
  }

  public static String decompress(final byte[] compressed) throws IOException {
    final StringBuilder outStr = new StringBuilder();
    if ((compressed == null) || (compressed.length == 0)) {
      return "";
    }
    if (isCompressed(compressed)) {
      final GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
      final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8"));

      String line;
      while ((line = bufferedReader.readLine()) != null) {
        outStr.append(line);
      }
    } else {
      outStr.append(compressed);
    }
    return outStr.toString();
  }

  public static boolean isCompressed(final byte[] compressed) {
    return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
  }
}

#3


12  

If you ever need to transfer the zipped content via network or store it as text, you have to use Base64 encoder(such as apache commons codec Base64) to convert the byte array to a Base64 String, and decode the string back to byte array at remote client. Found an example at Use Zip Stream and Base64 Encoder to Compress Large String Data!

如果您需要通过网络传输压缩内容或将其存储为文本,则必须使用Base64编码器(例如apache commons编解码器Base64)将字节数组转换为Base64字符串,并将字符串解码回字节数组at远程客户端。在Use Zip Stream和Base64 Encoder上找到一个压缩大字符串数据的例子!

#4


7  

The problem is this line:

问题是这一行:

    String outStr = obj.toString("UTF-8");

The byte array obj contains arbitrary binary data. You can't "decode" arbitrary binary data as if it was UTF-8. If you try you will get a String that cannot then be "encoded" back to bytes. Or at least, the bytes you get will be different to what you started with ... to the extent that they are no longer a valid GZIP stream.

字节数组obj包含任意二进制数据。您无法“解码”任意二进制数据,就像它是UTF-8一样。如果你尝试,你将得到一个不能被“编码”回字节的字符串。或者至少,你得到的字节将与你开始的字节不同......只要它们不再是有效的GZIP流。

The fix is to store or transmit the contents of the byte array as-is. Don't try to convert it into a String. It is binary data, not text.

修复是按原样存储或传输字节数组的内容。不要尝试将其转换为String。它是二进制数据,而不是文本。

#5


0  

You can't convert binary data to String. As a solution you can encode binary data and then convert to String. For example, look at this How do you convert binary data to Strings and back in Java?

您无法将二进制数据转换为String。作为解决方案,您可以编码二进制数据,然后转换为String。例如,看看这个如何将二进制数据转换为字符串并返回Java?

#6


0  

Another example of correct compression and decompression:

正确压缩和解压缩的另一个例子:

@Slf4j
public class GZIPCompression {
    public static byte[] compress(final String stringToCompress) {
        if (isNull(stringToCompress) || stringToCompress.length() == 0) {
            return null;
        }

        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            final GZIPOutputStream gzipOutput = new GZIPOutputStream(baos)) {
            gzipOutput.write(stringToCompress.getBytes(UTF_8));
            gzipOutput.finish();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while compression!", e);
        }
    }

    public static String decompress(final byte[] compressed) {
        if (isNull(compressed) || compressed.length == 0) {
            return null;
        }

        try (final GZIPInputStream gzipInput = new GZIPInputStream(new ByteArrayInputStream(compressed));
             final StringWriter stringWriter = new StringWriter()) {
            IOUtils.copy(gzipInput, stringWriter, UTF_8);
            return stringWriter.toString();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while decompression!", e);
        }
    }
}

#1


33  

This is because of

这是因为

String outStr = obj.toString("UTF-8");

Send the byte[] which you can get from your ByteArrayOutputStream and use it as such in your ByteArrayInputStream to construct your GZIPInputStream. Following are the changes which need to be done in your code.

发送你可以从ByteArrayOutputStream获取的byte []并在ByteArrayInputStream中使用它来构造你的GZIPInputStream。以下是您的代码中需要进行的更改。

byte[] compressed = compress(string); //In the main method

public static byte[] compress(String str) throws Exception {
    ...
    ...
    return obj.toByteArray();
}

public static String decompress(byte[] bytes) throws Exception {
    ...
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
    ...
}

#2


20  

The above Answer solves our problem but in addition to that. if we are trying to decompress a uncompressed("not a zip format") byte[] . we will get "Not in GZIP format" exception message.

上述答案解决了我们的问题,但除此之外。如果我们试图解压缩未压缩的(“不是zip格式”)byte []。我们将获得“不是GZIP格式”的异常消息。

For solving that we can add addition code in our Class.

为了解决这个问题,我们可以在我们的类中添加额外代

public static boolean isCompressed(final byte[] compressed) {
    return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
}

My Complete Compression Class with compress/decompress would look like:

我的压缩/解压缩完整压缩类看起来像:

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GZIPCompression {
  public static byte[] compress(final String str) throws IOException {
    if ((str == null) || (str.length() == 0)) {
      return null;
    }
    ByteArrayOutputStream obj = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(obj);
    gzip.write(str.getBytes("UTF-8"));
    gzip.flush();
    gzip.close();
    return obj.toByteArray();
  }

  public static String decompress(final byte[] compressed) throws IOException {
    final StringBuilder outStr = new StringBuilder();
    if ((compressed == null) || (compressed.length == 0)) {
      return "";
    }
    if (isCompressed(compressed)) {
      final GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
      final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8"));

      String line;
      while ((line = bufferedReader.readLine()) != null) {
        outStr.append(line);
      }
    } else {
      outStr.append(compressed);
    }
    return outStr.toString();
  }

  public static boolean isCompressed(final byte[] compressed) {
    return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
  }
}

#3


12  

If you ever need to transfer the zipped content via network or store it as text, you have to use Base64 encoder(such as apache commons codec Base64) to convert the byte array to a Base64 String, and decode the string back to byte array at remote client. Found an example at Use Zip Stream and Base64 Encoder to Compress Large String Data!

如果您需要通过网络传输压缩内容或将其存储为文本,则必须使用Base64编码器(例如apache commons编解码器Base64)将字节数组转换为Base64字符串,并将字符串解码回字节数组at远程客户端。在Use Zip Stream和Base64 Encoder上找到一个压缩大字符串数据的例子!

#4


7  

The problem is this line:

问题是这一行:

    String outStr = obj.toString("UTF-8");

The byte array obj contains arbitrary binary data. You can't "decode" arbitrary binary data as if it was UTF-8. If you try you will get a String that cannot then be "encoded" back to bytes. Or at least, the bytes you get will be different to what you started with ... to the extent that they are no longer a valid GZIP stream.

字节数组obj包含任意二进制数据。您无法“解码”任意二进制数据,就像它是UTF-8一样。如果你尝试,你将得到一个不能被“编码”回字节的字符串。或者至少,你得到的字节将与你开始的字节不同......只要它们不再是有效的GZIP流。

The fix is to store or transmit the contents of the byte array as-is. Don't try to convert it into a String. It is binary data, not text.

修复是按原样存储或传输字节数组的内容。不要尝试将其转换为String。它是二进制数据,而不是文本。

#5


0  

You can't convert binary data to String. As a solution you can encode binary data and then convert to String. For example, look at this How do you convert binary data to Strings and back in Java?

您无法将二进制数据转换为String。作为解决方案,您可以编码二进制数据,然后转换为String。例如,看看这个如何将二进制数据转换为字符串并返回Java?

#6


0  

Another example of correct compression and decompression:

正确压缩和解压缩的另一个例子:

@Slf4j
public class GZIPCompression {
    public static byte[] compress(final String stringToCompress) {
        if (isNull(stringToCompress) || stringToCompress.length() == 0) {
            return null;
        }

        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            final GZIPOutputStream gzipOutput = new GZIPOutputStream(baos)) {
            gzipOutput.write(stringToCompress.getBytes(UTF_8));
            gzipOutput.finish();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while compression!", e);
        }
    }

    public static String decompress(final byte[] compressed) {
        if (isNull(compressed) || compressed.length == 0) {
            return null;
        }

        try (final GZIPInputStream gzipInput = new GZIPInputStream(new ByteArrayInputStream(compressed));
             final StringWriter stringWriter = new StringWriter()) {
            IOUtils.copy(gzipInput, stringWriter, UTF_8);
            return stringWriter.toString();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while decompression!", e);
        }
    }
}