How to efficiently store small byte arrays in Java?

Date: 2022-05-01 21:29:37

By small byte arrays I mean arrays of bytes with length from 10 up to 30.

By store I mean keeping them in RAM, not serializing and persisting them to the filesystem.

System: macOS 10.12.6, Oracle jdk1.8.0_141 64-bit, JVM args: -Xmx1g

Example: the expected behavior for new byte[200 * 1024 * 1024] is ≈200 MB of heap space.

public static final int TARGET_SIZE = 200 * 1024 * 1024;
public static void main(String[] args) throws InterruptedException {
    byte[] arr = new byte[TARGET_SIZE];
    System.gc();
    System.out.println("Array size: " + arr.length);
    System.out.println("HeapSize: " + Runtime.getRuntime().totalMemory());
    Thread.sleep(60000);
}
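
One caveat about the snippet above: Runtime.totalMemory() reports the heap the JVM has currently reserved, not the part occupied by live objects. A minimal sketch of measuring the used heap instead is shown here. It assumes that System.gc() actually triggers a collection (it is only a hint), and the class name HeapUsage is hypothetical, not part of the original question.

```java
// Hypothetical helper, not part of the original question.
final class HeapUsage {
    // Approximate number of heap bytes occupied by objects after requesting a GC.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // only a hint; the JVM is free to ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        byte[] arr = new byte[16 * 1024 * 1024];
        long after = usedHeapBytes();
        // arr is still referenced here, so it is counted in "after"
        System.out.println("Array size: " + arr.length);
        System.out.println("Used heap delta: " + (after - before));
    }
}
```

The delta is only an approximation, since other allocations and collections can happen between the two measurements.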

[image not preserved]

However, for smaller arrays the math is not so simple:

public static final int TARGET_SIZE = 200 * 1024 * 1024;
public static void main(String[] args) throws InterruptedException {
    final int oneArraySize = 20;
    final int numberOfArrays = TARGET_SIZE / oneArraySize;
    byte[][] arrays = new byte[numberOfArrays][];
    for (int i = 0; i < numberOfArrays; i++) {
        arrays[i] = new byte[oneArraySize];
    }
    System.gc();
    System.out.println("Arrays size: " + arrays.length);
    System.out.println("HeapSize: " + Runtime.getRuntime().totalMemory());
    Thread.sleep(60000);
}

[image not preserved]

And even worse:

[image not preserved]

The question is:

Where is this overhead coming from? How can I efficiently store and work with small byte arrays (chunks of data)?

Update 1

For new byte[200*1024*1024][1] it eats: [image not preserved]

Basic math says that new byte[1] weighs 24 bytes.

Update 2

According to "What is the memory consumption of an object in Java?", the minimum size of an object in Java is 16 bytes. From my previous "measurements": 24 bytes - 16 bytes (minimum object size) - 4 bytes for the int length - 1 byte of my actual data = 3 bytes of some other padding.

2 answers

#1


3  

The answer by Eugene explains why you are observing such an increase in memory consumption for a large number of arrays. The question in the title, "How to efficiently store small byte arrays in Java?", may then be answered with: Not at all. [1]

However, there probably are ways to achieve your goals. As usual, the "best" solution here will depend on how this data is going to be used. A very pragmatic approach would be: Define an interface for your data structure.

In the simplest case, this interface could just be

interface ByteArray2D 
{
    int getNumRows();
    int getNumColumns();
    byte get(int r, int c);
    void set(int r, int c, byte b);
}

This offers a basic abstraction of a "2D byte array". Depending on the application, it may be beneficial to offer additional methods here. The patterns that could be employed are the ones commonly found in matrix libraries, which handle "2D matrices" (usually of float values) and often offer methods like these:

interface Matrix {
    Vector getRow(int row);
    Vector getColumn(int column);
    ...
}

However, when the main purpose here is to handle a set of byte[] arrays, methods for accessing each array (that is, each row of the 2D array) could be sufficient:

ByteBuffer getRow(int row);

Given this interface, it is simple to create different implementations. For example, you could create a simple implementation that just stores a 2D byte[][] array internally:

class SimpleByteArray2D implements ByteArray2D 
{
    private final byte[][] array;
    ...
}

Alternatively, you could create an implementation that stores a 1D byte[] array, or analogously, a ByteBuffer internally:

class CompactByteArray2D implements ByteArray2D
{
    private final ByteBuffer buffer;
    ...
}

This implementation then just has to compute the (1D) index when calling one of the methods for accessing a certain row/column of the 2D array.
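
As a rough illustration of that index computation, here is a minimal sketch that maps a row of a row-major buffer to a view without copying any bytes. The class and method names (RowViewDemo, rowView) are made up for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; the names are illustrative only.
final class RowViewDemo {
    // Returns a view of one row of a row-major buffer; no bytes are copied.
    static ByteBuffer rowView(ByteBuffer buffer, int cols, int row) {
        ByteBuffer d = buffer.duplicate(); // own position/limit, shared content
        d.limit(row * cols + cols);        // end of the row (exclusive)
        d.position(row * cols);            // start of the row
        return d.slice();
    }

    public static void main(String[] args) {
        int rows = 2;
        int cols = 3;
        ByteBuffer buffer = ByteBuffer.allocate(rows * cols);

        ByteBuffer row1 = rowView(buffer, cols, 1);
        row1.put(0, (byte) 7); // write through the view

        System.out.println(buffer.get(1 * cols + 0)); // prints 7
        System.out.println(row1.capacity());          // prints 3
    }
}
```

Because the slice shares its content with the backing buffer, a write through the row view is visible at index r * cols + c of the underlying storage.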

Below you will find an MCVE that shows this interface and the two implementations, demonstrates the basic usage of the interface, and does a memory footprint analysis using JOL.

The output of this program is:

For 10 rows and 1000 columns:
Total size for SimpleByteArray2D : 10240
Total size for CompactByteArray2D: 10088

For 100 rows and 100 columns:
Total size for SimpleByteArray2D : 12440
Total size for CompactByteArray2D: 10088

For 1000 rows and 10 columns:
Total size for SimpleByteArray2D : 36040
Total size for CompactByteArray2D: 10088

This shows that

  • the SimpleByteArray2D implementation that is based on a simple 2D byte[][] array requires more memory when the number of rows increases (even if the total size of the array remains constant)

  • the memory consumption of the CompactByteArray2D is independent of the structure of the array

The whole program:

// (package declaration elided in the source)

import java.nio.ByteBuffer;

import org.openjdk.jol.info.GraphLayout;

public class EfficientByteArrayStorage
{
    public static void main(String[] args)
    {
        showExampleUsage();
        analyzeMemoryFootprint();
    }

    private static void analyzeMemoryFootprint()
    {
        testMemoryFootprint(10, 1000);
        testMemoryFootprint(100, 100);
        testMemoryFootprint(1000, 10);
    }

    private static void testMemoryFootprint(int rows, int cols)
    {
        System.out.println("For " + rows + " rows and " + cols + " columns:");

        ByteArray2D b0 = new SimpleByteArray2D(rows, cols);
        GraphLayout g0 = GraphLayout.parseInstance(b0);
        System.out.println("Total size for SimpleByteArray2D : " + g0.totalSize());
        //System.out.println(g0.toFootprint());

        ByteArray2D b1 = new CompactByteArray2D(rows, cols);
        GraphLayout g1 = GraphLayout.parseInstance(b1);
        System.out.println("Total size for CompactByteArray2D: " + g1.totalSize());
        //System.out.println(g1.toFootprint());
    }

    // Shows an example of how to use the different implementations
    private static void showExampleUsage()
    {
        System.out.println("Using a SimpleByteArray2D");
        ByteArray2D b0 = new SimpleByteArray2D(10, 10);
        exampleUsage(b0);

        System.out.println("Using a CompactByteArray2D");
        ByteArray2D b1 = new CompactByteArray2D(10, 10);
        exampleUsage(b1);
    }

    private static void exampleUsage(ByteArray2D byteArray2D)
    {
        // Reading elements of the array
        System.out.println(byteArray2D.get(2, 4));

        // Writing elements of the array
        byteArray2D.set(2, 4, (byte)123);
        System.out.println(byteArray2D.get(2, 4));

        // Bulk access to rows
        ByteBuffer row = byteArray2D.getRow(2);
        for (int c = 0; c < row.capacity(); c++)
        {
            System.out.println(row.get(c));
        }

        // (Commented out for this MCVE: Writing one row to a file)
        /*/
        try (FileChannel fileChannel = 
            new FileOutputStream(new File("example.dat")).getChannel())
        {
            fileChannel.write(byteArray2D.getRow(2));
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
        //*/
    }

}


interface ByteArray2D 
{
    int getNumRows();
    int getNumColumns();
    byte get(int r, int c);
    void set(int r, int c, byte b);

    // Bulk access to rows, for convenience and efficiency
    ByteBuffer getRow(int row);
}

class SimpleByteArray2D implements ByteArray2D 
{
    private final int rows;
    private final int cols;
    private final byte[][] array;

    public SimpleByteArray2D(int rows, int cols)
    {
        this.rows = rows;
        this.cols = cols;
        this.array = new byte[rows][cols];
    }

    @Override
    public int getNumRows()
    {
        return rows;
    }

    @Override
    public int getNumColumns()
    {
        return cols;
    }

    @Override
    public byte get(int r, int c)
    {
        return array[r][c];
    }

    @Override
    public void set(int r, int c, byte b)
    {
        array[r][c] = b;
    }

    @Override
    public ByteBuffer getRow(int row)
    {
        return ByteBuffer.wrap(array[row]);
    }
}

class CompactByteArray2D implements ByteArray2D
{
    private final int rows;
    private final int cols;
    private final ByteBuffer buffer;

    public CompactByteArray2D(int rows, int cols)
    {
        this.rows = rows;
        this.cols = cols;
        this.buffer = ByteBuffer.allocate(rows * cols);
    }

    @Override
    public int getNumRows()
    {
        return rows;
    }

    @Override
    public int getNumColumns()
    {
        return cols;
    }

    @Override
    public byte get(int r, int c)
    {
        return buffer.get(r * cols + c);
    }

    @Override
    public void set(int r, int c, byte b)
    {
        buffer.put(r * cols + c, b);
    }

    @Override
    public ByteBuffer getRow(int row)
    {
        ByteBuffer r = buffer.slice();
        r.position(row * cols);
        r.limit(row * cols + cols);
        return r.slice();
    }
}

Again, this is mainly intended as a sketch, to show one possible approach. The details of the interface will depend on the intended application pattern.
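
Since the question mentions chunks of 10 to 30 bytes, one possible variation is to pack chunks of different lengths into a single byte[] and to keep only an int offset per chunk. The following is a minimal sketch of that idea, not a tuned implementation. The class PackedByteChunks and its methods are hypothetical names, and it assumes that the total payload stays well below the maximum array size.

```java
import java.util.Arrays;

// Hypothetical class: packs variable-length chunks into one byte[] plus an offset table.
final class PackedByteChunks {
    private byte[] data = new byte[64];
    private int[] offsets = new int[9]; // chunk i occupies data[offsets[i] .. offsets[i + 1])
    private int count = 0;

    // Appends a copy of the chunk and returns its index.
    int add(byte[] chunk) {
        int start = offsets[count];
        int end = start + chunk.length;
        if (end > data.length) {
            data = Arrays.copyOf(data, Math.max(end, data.length * 2));
        }
        if (count + 2 > offsets.length) {
            offsets = Arrays.copyOf(offsets, offsets.length * 2);
        }
        System.arraycopy(chunk, 0, data, start, chunk.length);
        offsets[++count] = end;
        return count - 1;
    }

    int size() {
        return count;
    }

    int length(int index) {
        checkIndex(index);
        return offsets[index + 1] - offsets[index];
    }

    // Returns a copy of the chunk stored at the given index.
    byte[] get(int index) {
        checkIndex(index);
        return Arrays.copyOfRange(data, offsets[index], offsets[index + 1]);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + count);
        }
    }

    public static void main(String[] args) {
        PackedByteChunks store = new PackedByteChunks();
        store.add(new byte[] {1, 2, 3});
        store.add(new byte[30]);
        System.out.println(store.size());                     // prints 2
        System.out.println(store.length(1));                  // prints 30
        System.out.println(Arrays.toString(store.get(0)));    // prints [1, 2, 3]
    }
}
```

With this layout, the per-chunk bookkeeping is one int in the offset table instead of a separate array object per chunk. The price is that get copies the bytes, so whether it pays off depends on the access pattern.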

[1] A side note:

The problem of the memory overhead is similar in other languages. For example, in C/C++, the structure that most closely resembles a "2D Java array" would be an array of manually allocated pointers:

char** array;
array = new char*[numRows];
array[0] = new char[numCols];
...

In this case, you also have an overhead that is proportional to the number of rows - namely, one pointer (usually 4 bytes on 32-bit and 8 bytes on 64-bit platforms) for each row.

#2


9  

OK, so if I understood correctly (please ask if not - will try to answer), there are a couple of things here. First is that you need the right tool for measurements and JOL is the only one I trust.

Let's start simple:

byte[] one = new byte[1];
System.out.println(GraphLayout.parseInstance(one).toFootprint());

This will show 24 bytes: 16 bytes of array header (12 bytes for the mark and class words plus 4 bytes for the array length), 1 byte for the actual value and 7 bytes of padding (objects are 8-byte aligned).

Taking this into consideration, this should be a predictable output:

byte[] eight = new byte[8];
System.out.println(GraphLayout.parseInstance(eight).toFootprint()); // 24 bytes

byte[] nine = new byte[9];
System.out.println(GraphLayout.parseInstance(nine).toFootprint()); // 32 bytes
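
The arithmetic behind these numbers can be sketched as a small formula. This assumes a 64-bit HotSpot JVM with compressed class pointers (a 16-byte array header and 8-byte object alignment), which matches the values shown here but is not guaranteed on other JVMs or settings. The class name ByteArraySize is hypothetical.

```java
// Hypothetical helper; assumes 64-bit HotSpot with compressed class pointers.
final class ByteArraySize {
    static final int ARRAY_HEADER = 16; // 8 mark word + 4 class word + 4 array length
    static final int ALIGNMENT = 8;     // objects start on 8-byte boundaries

    // Rounds a size up to the next multiple of the alignment.
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated heap footprint of a byte[] with the given length.
    static long byteArraySize(int length) {
        return alignUp(ARRAY_HEADER + (long) length);
    }

    public static void main(String[] args) {
        System.out.println(byteArraySize(1));  // prints 24
        System.out.println(byteArraySize(8));  // prints 24
        System.out.println(byteArraySize(9));  // prints 32
        System.out.println(byteArraySize(20)); // prints 40
    }
}
```

For the 20-byte arrays from the question, this estimate gives 40 bytes per array, i.e. twice the payload, before counting the reference that points to each array.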

Now let's move to two dimensional arrays:

byte[][] ninenine = new byte[9][9];    
System.out.println(GraphLayout.parseInstance(ninenine).toFootprint()); // 344 bytes

System.out.println(ClassLayout.parseInstance(ninenine).toPrintable());

Java does not have true two-dimensional arrays; every nested array is itself an Object (byte[]) that has headers and content. Thus a single byte[9] takes 32 bytes: 16 bytes of header (12 for the mark and class words + 4 for the array length) and 16 bytes for content (9 bytes of actual content + 7 bytes of padding).

The ninenine object has 56 bytes total: 16 headers + 36 for keeping the references to the nine objects + 4 bytes for padding.

Compare these two allocations:

byte[][] left = new byte[10000][10];
System.out.println(GraphLayout.parseInstance(left).toFootprint()); // 360016 bytes

byte[][] right = new byte[10][10000];
System.out.println(GraphLayout.parseInstance(right).toFootprint()); // 100216 bytes

That's a 260% increase (about 3.6 times the memory); so simply by laying the data out the other way around, you can save a lot of space.
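
The two totals can be reproduced with a back-of-the-envelope estimate. The sketch below assumes a 64-bit HotSpot JVM with compressed references (16-byte array headers, 4-byte references, 8-byte alignment). The class and method names are hypothetical, and JOL remains the tool to trust for real measurements.

```java
// Hypothetical helper; assumes 64-bit HotSpot with compressed references.
final class NestedArraySize {
    static final int ARRAY_HEADER = 16; // mark word + class word + array length
    static final int REFERENCE = 4;     // compressed reference to each row
    static final int ALIGNMENT = 8;

    // Rounds a size up to the next multiple of the alignment.
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated footprint of new byte[rows][cols]: one outer array of
    // references plus one separate byte[cols] object per row.
    static long nestedByteArraySize(long rows, long cols) {
        long outer = alignUp(ARRAY_HEADER + REFERENCE * rows);
        long inner = alignUp(ARRAY_HEADER + cols);
        return outer + rows * inner;
    }

    public static void main(String[] args) {
        System.out.println(nestedByteArraySize(9, 9));      // prints 344
        System.out.println(nestedByteArraySize(10000, 10)); // prints 360016
        System.out.println(nestedByteArraySize(10, 10000)); // prints 100216
    }
}
```

Applied to the setup from the question (10,485,760 arrays of 20 bytes each), the same estimate yields 461,373,456 bytes, roughly 440 MB for 200 MB of payload.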

But the deeper problem is that every single Object in Java has those headers; there are no headerless objects yet. They might appear in the future under the name Value Types. Maybe once that is implemented, arrays of primitives at least would not have this overhead.
