Java:多维数组与一维数组

时间:2023-01-28 12:32:25

For example:

例如:

  • a) int [x][y][z]

    一)int[x][y][z]

    vs

    vs

  • b) int[x*y*z]

    b)int[x * y * z]

Initially thought i'd go with a) for simplicity

一开始我以为我会选择a)为了简单

I know that Java doesn't store arrays linearly in memory like C does. But what implications does this have for my program?

我知道Java不像C那样线性地存储数组。但是这对我的程序有什么影响呢?

4 个解决方案

#1


61  

Usually the best thing to do when searching anwers for such questions is to see how the choices are compiled into JVM bytecode:

在搜索此类问题时,通常最好的做法是查看如何将这些选项编译成JVM字节码:

multi = new int[50][50];
single = new int[2500];

This is translated into:

这是翻译成:

BIPUSH 50
BIPUSH 50
MULTIANEWARRAY int[][] 2
ASTORE 1
SIPUSH 2500
NEWARRAY T_INT
ASTORE 2

So, as you can see, the JVM already knows that we are talking about a multi dimensional array.

因此,正如您所看到的,JVM已经知道我们正在讨论多维数组。

Keeping it further:

让它进一步指出:

for (int i = 0; i < 50; ++i)
    for (int j = 0; j < 50; ++j)
    {
        multi[i][j] = 20;
        single[i*50+j] = 20;
    }

This is translated (skipping the cycles) into:

这被翻译成:

ALOAD 1: multi
ILOAD 3: i
AALOAD
ILOAD 4: j
BIPUSH 20
IASTORE

ALOAD 2: single
ILOAD 3: i
BIPUSH 50
IMUL
ILOAD 4: j
IADD
BIPUSH 20
IASTORE

So, as you can see, the multi-dimensional array is treated internally in the VM, no overhead generated by useless instructions, while using a single one uses more instructions since offset is calculated by hand.

因此,如您所见,多维数组在VM内部处理,没有无用指令产生的开销,而使用单个指令则使用更多指令,因为偏移量是手工计算的。

I don't think that performance will be such an issue.

我不认为性能会是一个问题。

EDIT:

编辑:

I did some simple benchmarks to see what's going down here. I chose to try different examples: linear read, linear write, and random access. Times are expressed in millisecs (and calculated using System.nanoTime(). Here are the results:

我做了一些简单的基准来看看下面是什么。我选择尝试不同的例子:线性读、线性写和随机访问。时间以毫秒表示(并使用System.nanoTime()计算。这里是结果:

Linear write

线性写

  • Size: 100x100 (10000) Multi: 5.786591 Single: 6.131748
  • 尺寸:100x100(10000)多:5.786591单:6.131748
  • Size: 200x200 (40000) Multi: 1.216366 Single: 0.782041
  • 尺寸:200x200(40000)多:1.216366单:0.782041
  • Size: 500x500 (250000) Multi: 7.177029 Single: 3.667017
  • 尺寸:500x500(250000)多:7.177029单:3.667017
  • Size: 1000x1000 (1000000) Multi: 30.508131 Single: 18.064592
  • 尺寸:1000x1000(1000000)多:30.508131单:18.064592
  • Size: 2000x2000 (4000000) Multi: 185.3548 Single: 155.590313
  • 尺寸:2000x2000 (4000000) Multi: 185.3548单:155.590313
  • Size: 5000x5000 (25000000) Multi: 955.5299 Single: 923.264417
  • 尺寸:5000x5000 (25000000) Multi: 955.5299单:923.264417
  • Size: 10000x10000 (100000000) Multi: 4084.798753 Single: 4015.448829
  • 尺寸:10000x10000 (100000000) Multi: 4084.798753单:4015.448829

Linear read

线性阅读

  • Size: 100x100 (10000) Multi: 5.241338 Single: 5.135957
  • 尺寸:100x100(10000)多:5.241338单:5.135957
  • Size: 200x200 (40000) Multi: 0.080209 Single: 0.044371
  • 尺寸:200x200 (40000) multi: 0.080209单:0.044371
  • Size: 500x500 (250000) Multi: 0.088742 Single: 0.084476
  • 尺寸:500x500(250000)多:0.088742单:0.084476
  • Size: 1000x1000 (1000000) Multi: 0.232095 Single: 0.167671
  • 尺寸:1000x1000(1000000)多:0.232095单:0.167671
  • Size: 2000x2000 (4000000) Multi: 0.481683 Single: 0.33321
  • 尺寸:2000x2000(4000000)多:0.481683单:0.33321
  • Size: 5000x5000 (25000000) Multi: 1.222339 Single: 0.828118 Size: 10000x10000 (100000000) Multi: 2.496302 Single: 1.650691
  • 尺寸:5000x5000 (25000000) Multi: 1.222339 Single: 0.828118 Size: 10000x10000 (100000000) Multi: 2.496302 Single: 1.650691

Random read

随机读取

  • Size: 100x100 (10000) Multi: 22.317393 Single: 8.546134
  • 尺寸:100x100(10000)多:22.317393单:8.546134
  • Size: 200x200 (40000) Multi: 32.287669 Single: 11.022383
  • 尺寸:200x200(40000)多:32.287669单:11.022383
  • Size: 500x500 (250000) Multi: 189.542751 Single: 68.181343
  • 尺寸:500x500(250000)多:189.542751单:68.181343
  • Size: 1000x1000 (1000000) Multi: 1124.78609 Single: 272.235584
  • 尺寸:1000x1000(1000000)多:1124.78609单:272.235584
  • Size: 2000x2000 (4000000) Multi: 6814.477101 Single: 1091.998395
  • 尺寸:2000x2000(4000000)多:6814.477101单:1091.998395。
  • Size: 5000x5000 (25000000) Multi: 50051.306239 Single: 7028.422262
  • 尺寸:5000x5000 (25000000) Multi: 50051.306239单:7028.422262

The random one is a little misleading since it generates 2 random numbers for multi-dimensional array while just one for single dimensional (and PNRGs may consume some CPU).

随机的一个有点误导,因为它为多维数组生成两个随机数,而为单个维数组生成一个随机数(而PNRGs可能会消耗一些CPU)。

Mind that I tried to let JIT work by benchmarking only after the 20th run of the same loop. For completeness my java VM is the following:

注意,我只是在相同循环的第20次运行之后才尝试通过基准测试来让JIT工作。为了完整起见,我的java VM如下:

java version "1.6.0_17" Java(TM) SE Runtime Environment (build 1.6.0_17-b04) Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

java版本“1.6.0_17”java (TM) SE运行时环境(构建1.6.0_17-b04) java HotSpot(TM) 64位服务器VM(构建14.3-b01,混合模式)

#2


22  

On current CPUs, non-cached memory access is hundreds of times slower than arithmetics (see this presentation and read What every programmer should know about memory). The a) option will result in about 3 memory lookups whereas the b) option will result in about 1 memory lookup. Also the CPU's prefetching algorithms might not work as well. So the b) option can be faster in some situations (it's a hot spot and the array does not fit into the CPU's cache). How much faster? - that will depend on the application.

在当前的cpu上,非缓存的内存访问要比算术运算慢数百倍(参见这个演示并阅读每个程序员应该了解的内存)。a)选项将导致大约3个内存查找,而b)选项将导致大约1个内存查找。此外,CPU的预取算法可能也不能正常工作。所以b)选项在某些情况下可以更快(这是一个热点,数组不适合CPU的缓存)。快多少呢?-这取决于申请。

Personally I would first use the a) option, because it will result in simpler code. If a profiler shows that array access to be a bottleneck, then I would convert it to the b) option, so that there is a pair of helper methods for reading and writing array values (that way the messy code will be restricted to those two methods).

就我个人而言,我将首先使用a)选项,因为它将导致更简单的代码。如果分析器显示数组访问是瓶颈,那么我将把它转换为b)选项,这样就有一对用于读取和写入数组值的辅助方法(这样一来,混乱的代码将仅限于这两个方法)。

I made a benchmark for comparing 3-dimensional int arrays ("Multi" column) to the equivalent 1-dimensional int arrays ("Single" column). The code is here and tests here. I ran it on 64-bit jdk1.6.0_18, Windows 7 x64, Core 2 Quad Q6600 @ 3.0 GHz, 4 GB DDR2, using the JVM options -server -Xmx3G -verbose:gc -XX:+PrintCompilation (I have removed the debug output from the following results). The results were:

我做了一个基准来比较三维int数组(“Multi”列)和等效的一维int数组(“Single”列)。代码在这里,测试在这里。我在64位的jdk1.6.0_18、Windows 7 x64、Core 2 Quad Q6600 @ 3.0 GHz、4 GB DDR2上运行它,使用JVM选项-server -Xmx3G -verbose:gc -XX:+ printcompile(我从以下结果中删除了调试输出)。结果:

Out of 20 repeats, the minimum time in milliseconds is reported.

Array dimensions: 100x100x100 (1000000)
            Multi   Single
Seq Write   1       1
Seq Read    1       1
Random Read 99      90    (of which generating random numbers 59 ms)

Array dimensions: 200x200x200 (8000000)
            Multi   Single
Seq Write   14      13
Seq Read    11      8
Random Read 1482    1239    (of which generating random numbers 474 ms)

Array dimensions: 300x300x300 (27000000)
            Multi   Single
Seq Write   53      46
Seq Read    34      24
Random Read 5915    4418    (of which generating random numbers 1557 ms)

Array dimensions: 400x400x400 (64000000)
            Multi   Single
Seq Write   123     111
Seq Read    71      55
Random Read 16326   11144    (of which generating random numbers 3693 ms)

This shows the 1-dimensional array to be faster. Though the differences are so small, that for 99% applications it won't be noticable.

这表明一维数组的速度更快。虽然差异很小,但是对于99%的应用程序来说,差异是不明显的。

I did also some measurements to estimate the overhead of generating the random numbers in the Random Read benchmark by replacing preventOptimizingAway += array.get(x, y, z); with preventOptimizingAway += x * y * z; and added the measurements to the above results table by hand. Generating the random numbers takes 1/3 or less of the total time of the Random Read benchmark, so the memory access dominates the benchmark as expected. It would be interesting to repeat this benchmark with arrays of 4 and more dimensions. Probably it would make the speed difference bigger, because the multidimensional array's topmost levels will fit into the CPU's cache, and only the other levels will require a memory lookup.

我还做了一些测量,通过替换preventOptimizingAway +=数组来估计在随机读基准中生成随机数的开销。得到(x,y,z);用preventOptimizingAway += x * y * z;并将测量结果手工添加到上述结果表中。生成随机数只占用随机读取基准的1/3或更少的时间,因此内存访问控制了基准测试的预期值。用4个或更多维度的数组重复这个基准会很有趣。它可能会使速度差异更大,因为多维数组的最顶层将适合于CPU的缓存,只有其他级别需要进行内存查找。

#3


4  

Use first variant (3-dimensional) because it's easier for understanding and there are less chances to make some logical error (especially if you're using it for modeling 3-dimensional space)

使用第一种变体(3维),因为它更容易理解,并且很少有可能出现逻辑错误(特别是如果您使用它建模3维空间)

#4


2  

If you choose the latter route then you're going to have to perform arithmetic for every single array access. That's going to be a pain and error prone (unless you wrap it in a class providing this functionality).

如果你选择后一种路径,那么你必须对每个数组访问执行算术运算。这将是一种痛苦和错误倾向(除非您将其封装在提供此功能的类中)。

I don't believe that there's any (significant) optimisation in choosing your flat array (especially given the arithmetic taken to index into it). As always with optimisations, you would need to perform some measurements and determine if it's really worth it.

我不相信在选择平面数组时有任何(重要的)优化(特别是考虑到索引的算法)。与对优化的一贯做法一样,您需要执行一些度量并确定它是否真正值得。

#1


61  

Usually the best thing to do when searching anwers for such questions is to see how the choices are compiled into JVM bytecode:

在搜索此类问题时,通常最好的做法是查看如何将这些选项编译成JVM字节码:

multi = new int[50][50];
single = new int[2500];

This is translated into:

这是翻译成:

BIPUSH 50
BIPUSH 50
MULTIANEWARRAY int[][] 2
ASTORE 1
SIPUSH 2500
NEWARRAY T_INT
ASTORE 2

So, as you can see, the JVM already knows that we are talking about a multi dimensional array.

因此,正如您所看到的,JVM已经知道我们正在讨论多维数组。

Keeping it further:

让它进一步指出:

for (int i = 0; i < 50; ++i)
    for (int j = 0; j < 50; ++j)
    {
        multi[i][j] = 20;
        single[i*50+j] = 20;
    }

This is translated (skipping the cycles) into:

这被翻译成:

ALOAD 1: multi
ILOAD 3: i
AALOAD
ILOAD 4: j
BIPUSH 20
IASTORE

ALOAD 2: single
ILOAD 3: i
BIPUSH 50
IMUL
ILOAD 4: j
IADD
BIPUSH 20
IASTORE

So, as you can see, the multi-dimensional array is treated internally in the VM, no overhead generated by useless instructions, while using a single one uses more instructions since offset is calculated by hand.

因此,如您所见,多维数组在VM内部处理,没有无用指令产生的开销,而使用单个指令则使用更多指令,因为偏移量是手工计算的。

I don't think that performance will be such an issue.

我不认为性能会是一个问题。

EDIT:

编辑:

I did some simple benchmarks to see what's going down here. I chose to try different examples: linear read, linear write, and random access. Times are expressed in millisecs (and calculated using System.nanoTime(). Here are the results:

我做了一些简单的基准来看看下面是什么。我选择尝试不同的例子:线性读、线性写和随机访问。时间以毫秒表示(并使用System.nanoTime()计算。这里是结果:

Linear write

线性写

  • Size: 100x100 (10000) Multi: 5.786591 Single: 6.131748
  • 尺寸:100x100(10000)多:5.786591单:6.131748
  • Size: 200x200 (40000) Multi: 1.216366 Single: 0.782041
  • 尺寸:200x200(40000)多:1.216366单:0.782041
  • Size: 500x500 (250000) Multi: 7.177029 Single: 3.667017
  • 尺寸:500x500(250000)多:7.177029单:3.667017
  • Size: 1000x1000 (1000000) Multi: 30.508131 Single: 18.064592
  • 尺寸:1000x1000(1000000)多:30.508131单:18.064592
  • Size: 2000x2000 (4000000) Multi: 185.3548 Single: 155.590313
  • 尺寸:2000x2000 (4000000) Multi: 185.3548单:155.590313
  • Size: 5000x5000 (25000000) Multi: 955.5299 Single: 923.264417
  • 尺寸:5000x5000 (25000000) Multi: 955.5299单:923.264417
  • Size: 10000x10000 (100000000) Multi: 4084.798753 Single: 4015.448829
  • 尺寸:10000x10000 (100000000) Multi: 4084.798753单:4015.448829

Linear read

线性阅读

  • Size: 100x100 (10000) Multi: 5.241338 Single: 5.135957
  • 尺寸:100x100(10000)多:5.241338单:5.135957
  • Size: 200x200 (40000) Multi: 0.080209 Single: 0.044371
  • 尺寸:200x200 (40000) multi: 0.080209单:0.044371
  • Size: 500x500 (250000) Multi: 0.088742 Single: 0.084476
  • 尺寸:500x500(250000)多:0.088742单:0.084476
  • Size: 1000x1000 (1000000) Multi: 0.232095 Single: 0.167671
  • 尺寸:1000x1000(1000000)多:0.232095单:0.167671
  • Size: 2000x2000 (4000000) Multi: 0.481683 Single: 0.33321
  • 尺寸:2000x2000(4000000)多:0.481683单:0.33321
  • Size: 5000x5000 (25000000) Multi: 1.222339 Single: 0.828118 Size: 10000x10000 (100000000) Multi: 2.496302 Single: 1.650691
  • 尺寸:5000x5000 (25000000) Multi: 1.222339 Single: 0.828118 Size: 10000x10000 (100000000) Multi: 2.496302 Single: 1.650691

Random read

随机读取

  • Size: 100x100 (10000) Multi: 22.317393 Single: 8.546134
  • 尺寸:100x100(10000)多:22.317393单:8.546134
  • Size: 200x200 (40000) Multi: 32.287669 Single: 11.022383
  • 尺寸:200x200(40000)多:32.287669单:11.022383
  • Size: 500x500 (250000) Multi: 189.542751 Single: 68.181343
  • 尺寸:500x500(250000)多:189.542751单:68.181343
  • Size: 1000x1000 (1000000) Multi: 1124.78609 Single: 272.235584
  • 尺寸:1000x1000(1000000)多:1124.78609单:272.235584
  • Size: 2000x2000 (4000000) Multi: 6814.477101 Single: 1091.998395
  • 尺寸:2000x2000(4000000)多:6814.477101单:1091.998395。
  • Size: 5000x5000 (25000000) Multi: 50051.306239 Single: 7028.422262
  • 尺寸:5000x5000 (25000000) Multi: 50051.306239单:7028.422262

The random one is a little misleading since it generates 2 random numbers for multi-dimensional array while just one for single dimensional (and PNRGs may consume some CPU).

随机的一个有点误导,因为它为多维数组生成两个随机数,而为单个维数组生成一个随机数(而PNRGs可能会消耗一些CPU)。

Mind that I tried to let JIT work by benchmarking only after the 20th run of the same loop. For completeness my java VM is the following:

注意,我只是在相同循环的第20次运行之后才尝试通过基准测试来让JIT工作。为了完整起见,我的java VM如下:

java version "1.6.0_17" Java(TM) SE Runtime Environment (build 1.6.0_17-b04) Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

java版本“1.6.0_17”java (TM) SE运行时环境(构建1.6.0_17-b04) java HotSpot(TM) 64位服务器VM(构建14.3-b01,混合模式)

#2


22  

On current CPUs, non-cached memory access is hundreds of times slower than arithmetics (see this presentation and read What every programmer should know about memory). The a) option will result in about 3 memory lookups whereas the b) option will result in about 1 memory lookup. Also the CPU's prefetching algorithms might not work as well. So the b) option can be faster in some situations (it's a hot spot and the array does not fit into the CPU's cache). How much faster? - that will depend on the application.

在当前的cpu上,非缓存的内存访问要比算术运算慢数百倍(参见这个演示并阅读每个程序员应该了解的内存)。a)选项将导致大约3个内存查找,而b)选项将导致大约1个内存查找。此外,CPU的预取算法可能也不能正常工作。所以b)选项在某些情况下可以更快(这是一个热点,数组不适合CPU的缓存)。快多少呢?-这取决于申请。

Personally I would first use the a) option, because it will result in simpler code. If a profiler shows that array access to be a bottleneck, then I would convert it to the b) option, so that there is a pair of helper methods for reading and writing array values (that way the messy code will be restricted to those two methods).

就我个人而言,我将首先使用a)选项,因为它将导致更简单的代码。如果分析器显示数组访问是瓶颈,那么我将把它转换为b)选项,这样就有一对用于读取和写入数组值的辅助方法(这样一来,混乱的代码将仅限于这两个方法)。

I made a benchmark for comparing 3-dimensional int arrays ("Multi" column) to the equivalent 1-dimensional int arrays ("Single" column). The code is here and tests here. I ran it on 64-bit jdk1.6.0_18, Windows 7 x64, Core 2 Quad Q6600 @ 3.0 GHz, 4 GB DDR2, using the JVM options -server -Xmx3G -verbose:gc -XX:+PrintCompilation (I have removed the debug output from the following results). The results were:

我做了一个基准来比较三维int数组(“Multi”列)和等效的一维int数组(“Single”列)。代码在这里,测试在这里。我在64位的jdk1.6.0_18、Windows 7 x64、Core 2 Quad Q6600 @ 3.0 GHz、4 GB DDR2上运行它,使用JVM选项-server -Xmx3G -verbose:gc -XX:+ printcompile(我从以下结果中删除了调试输出)。结果:

Out of 20 repeats, the minimum time in milliseconds is reported.

Array dimensions: 100x100x100 (1000000)
            Multi   Single
Seq Write   1       1
Seq Read    1       1
Random Read 99      90    (of which generating random numbers 59 ms)

Array dimensions: 200x200x200 (8000000)
            Multi   Single
Seq Write   14      13
Seq Read    11      8
Random Read 1482    1239    (of which generating random numbers 474 ms)

Array dimensions: 300x300x300 (27000000)
            Multi   Single
Seq Write   53      46
Seq Read    34      24
Random Read 5915    4418    (of which generating random numbers 1557 ms)

Array dimensions: 400x400x400 (64000000)
            Multi   Single
Seq Write   123     111
Seq Read    71      55
Random Read 16326   11144    (of which generating random numbers 3693 ms)

This shows the 1-dimensional array to be faster. Though the differences are so small, that for 99% applications it won't be noticable.

这表明一维数组的速度更快。虽然差异很小,但是对于99%的应用程序来说,差异是不明显的。

I did also some measurements to estimate the overhead of generating the random numbers in the Random Read benchmark by replacing preventOptimizingAway += array.get(x, y, z); with preventOptimizingAway += x * y * z; and added the measurements to the above results table by hand. Generating the random numbers takes 1/3 or less of the total time of the Random Read benchmark, so the memory access dominates the benchmark as expected. It would be interesting to repeat this benchmark with arrays of 4 and more dimensions. Probably it would make the speed difference bigger, because the multidimensional array's topmost levels will fit into the CPU's cache, and only the other levels will require a memory lookup.

我还做了一些测量,通过替换preventOptimizingAway +=数组来估计在随机读基准中生成随机数的开销。得到(x,y,z);用preventOptimizingAway += x * y * z;并将测量结果手工添加到上述结果表中。生成随机数只占用随机读取基准的1/3或更少的时间,因此内存访问控制了基准测试的预期值。用4个或更多维度的数组重复这个基准会很有趣。它可能会使速度差异更大,因为多维数组的最顶层将适合于CPU的缓存,只有其他级别需要进行内存查找。

#3


4  

Use first variant (3-dimensional) because it's easier for understanding and there are less chances to make some logical error (especially if you're using it for modeling 3-dimensional space)

使用第一种变体(3维),因为它更容易理解,并且很少有可能出现逻辑错误(特别是如果您使用它建模3维空间)

#4


2  

If you choose the latter route then you're going to have to perform arithmetic for every single array access. That's going to be a pain and error prone (unless you wrap it in a class providing this functionality).

如果你选择后一种路径,那么你必须对每个数组访问执行算术运算。这将是一种痛苦和错误倾向(除非您将其封装在提供此功能的类中)。

I don't believe that there's any (significant) optimisation in choosing your flat array (especially given the arithmetic taken to index into it). As always with optimisations, you would need to perform some measurements and determine if it's really worth it.

我不相信在选择平面数组时有任何(重要的)优化(特别是考虑到索引的算法)。与对优化的一贯做法一样,您需要执行一些度量并确定它是否真正值得。