Storing a boolean array in a file and reading it back quickly in Java

Time: 2021-05-29 21:22:39

I need to store a boolean array with 80,000 items in a file. I don't care how long saving takes; I'm only interested in how long it takes to load the array. I didn't try storing it with DataOutputStream because that requires accessing each value individually.

I tried three approaches:

  1. serialize the boolean array

  2. use a BitSet instead of a boolean array and serialize it

  3. convert the boolean array into a byte array, where 1 is true and 0 is false, and write it through a FileChannel using a ByteBuffer
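The third approach above can be sketched roughly as follows. The question does not show its code, so this is only an illustration under assumptions: the class name `ByteFileBooleans`, the method names, and the temp file are hypothetical, and the loading side uses a read-only `MappedByteBuffer` as mentioned in the results.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class; not taken from the original post.
class ByteFileBooleans {
    // Writes one byte per boolean (1 = true, 0 = false) through a FileChannel.
    static void save(boolean[] values, Path path) throws IOException {
        byte[] bytes = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bytes[i] = (byte) (values[i] ? 1 : 0);
        }
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(bytes);
            // A single write() call is not guaranteed to drain the buffer.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    // Maps the file read-only and converts each byte back to a boolean.
    static boolean[] load(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            boolean[] values = new boolean[mapped.remaining()];
            for (int i = 0; i < values.length; i++) {
                values[i] = mapped.get(i) != 0;
            }
            return values;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("bools", ".bin");
        boolean[] data = new boolean[80_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = Math.random() > 0.5;
        }
        save(data, path);
        System.out.println("loaded " + load(path).length + " values");
    }
}
```

The file ends up exactly as long as the array, so the array length does not need to be stored separately.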

To test reading from the files, I ran each approach 1,000 times in a loop and got the following results:

  1. deserialization of the boolean array: 574 ms

  2. deserialization of the BitSet: 379 ms

  3. getting the byte array from the FileChannel through a MappedByteBuffer: 170 ms

The first and second approaches take too long, and the third is perhaps not a real approach at all.

Perhaps there is a better way to accomplish this, so I would appreciate your advice.

EDIT

With each method run only once, the times are, respectively:

  1. 13.8 ms
  2. 8.71 ms
  3. 6.46 ms
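If the earlier figures are totals over 1,000 runs, a single cold run is far slower than the per-iteration average (13.8 ms against roughly 0.57 ms for the first approach). A plausible but unverified explanation is JVM warm-up and operating-system file caching. A small harness that reports both the first run and the average makes the difference visible. The class `LoadTimer` and the interface `Loader` below are hypothetical names introduced for illustration, and the loader in `main` is only a placeholder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical timing harness; not part of the original post.
class LoadTimer {
    // Lets any loading strategy be passed in and timed the same way.
    interface Loader {
        boolean[] load(Path path) throws IOException;
    }

    // Returns { first-run nanoseconds, average nanoseconds over all runs }.
    static long[] time(Loader loader, Path path, int runs) throws IOException {
        long first = 0;
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            boolean[] result = loader.load(path);
            long elapsed = System.nanoTime() - start;
            // Use the result so the load cannot be skipped as dead code.
            if (result == null) {
                throw new IllegalStateException("loader returned null");
            }
            if (i == 0) {
                first = elapsed;
            }
            total += elapsed;
        }
        return new long[] { first, total / runs };
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("bools", ".bin");
        Files.write(path, new byte[80_000]); // placeholder data: all false
        long[] t = time(p -> {
            byte[] bytes = Files.readAllBytes(p);
            boolean[] out = new boolean[bytes.length];
            for (int i = 0; i < bytes.length; i++) {
                out[i] = bytes[i] != 0;
            }
            return out;
        }, path, 1000);
        System.out.printf("first run: %.3f ms, average: %.3f ms%n",
                t[0] / 1e6, t[1] / 1e6);
    }
}
```

For serious measurements a benchmarking framework would be more reliable than a hand-written loop; this sketch only separates the cold first run from the steady-state average.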

1 answer

#1


What about writing one byte per boolean and developing a custom parser? This will probably be one of the fastest methods. If you want to save space, you could also pack 8 booleans into one byte, but that would require some bit-shifting operations.
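The 8-booleans-per-byte variant mentioned above is not shown in the answer. A minimal sketch of the bit shifting it would need might look like this; the class `PackedBooleans` and the bit order (lowest bit first) are assumptions made for illustration.

```java
// Hypothetical helper class; the bit layout is an arbitrary choice.
class PackedBooleans {
    // Packs 8 booleans per byte: bit (i % 8) of byte (i / 8) holds values[i].
    static byte[] pack(boolean[] values) {
        byte[] packed = new byte[(values.length + 7) / 8];
        for (int i = 0; i < values.length; i++) {
            if (values[i]) {
                packed[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return packed;
    }

    // The element count is passed in because the last byte may be only partly used.
    static boolean[] unpack(byte[] packed, int count) {
        boolean[] values = new boolean[count];
        for (int i = 0; i < count; i++) {
            values[i] = (packed[i >> 3] & (1 << (i & 7))) != 0;
        }
        return values;
    }
}
```

For 80,000 booleans this shrinks the file from 80,000 bytes to 10,000 bytes, at the cost of the shift and mask per element when loading. The element count has to be known or stored alongside the data whenever it is not a multiple of 8.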

Here is a short example of the byte-per-boolean approach:

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public void save() throws IOException
{
    boolean[] testData = new boolean[80000];
    for (int x = 0; x < testData.length; x++)
    {
        testData[x] = Math.random() > 0.5;
    }

    // try-with-resources closes the stream even if a write fails
    try (FileOutputStream stream = new FileOutputStream(new File("test.bin")))
    {
        for (boolean item : testData)
        {
            stream.write(item ? 1 : 0);
        }
    }
}

public boolean[] load() throws IOException
{
    long start = System.nanoTime();
    File file = new File("test.bin");
    int fileLength = (int) file.length();

    byte[] data = new byte[fileLength];
    boolean[] output = new boolean[fileLength];

    try (DataInputStream inputStream = new DataInputStream(new FileInputStream(file)))
    {
        // readFully keeps reading until the array is filled; a plain read() may return early
        inputStream.readFully(data);
    }
    for (int x = 0; x < data.length; x++)
    {
        // any non-zero byte counts as true
        output[x] = data[x] != 0;
    }
    long elapsed = System.nanoTime() - start;
    System.out.println("Time: " + elapsed + " ns");
    return output;
}

It takes about 2 ms to load 80,000 booleans. Tested with JDK 1.8.0_45.