为什么在F#中处理数组比列表更快

时间:2022-01-27 17:04:52

I was checking the performance of F# lists and arrays. Given the code:

我正在检查F#列表和数组的性能。鉴于代码:

let list = [ 1.. 100000 ]
for i in 1 .. 100 do ignore ( list|>List.map(fun n -> n))

let array = [| 1.. 100000 |]
for i in 1 .. 100 do ignore ( array|>Array.map(fun n -> n))

I would suspect both to run in very similar time. Actually it turned out that arrays are over 10 times faster : array takes 28 ms while list takes 346 ms! Why is that? I understand the concept of list in F# and the fact that for example appending values to list or taking subsequence is time consuming but in this code it just iterates through all elements so I thought the timing will be very comparable.

我怀疑两者都会在非常相似的时间内运行。实际上,结果表明阵列的速度提高了10倍以上:数组需要28毫秒而列表需要346毫秒!这是为什么?我理解F#中列表的概念以及例如将值附加到列表或采用子序列的事实是耗时的,但是在这段代码中它只是遍历所有元素,所以我认为时间将是非常可比的。

Tests under release mode in Visual Studio 2012 (in Debug mode arrays are about 5 times faster).

Visual Studio 2012中的发布模式下的测试(在调试模式阵列中的测试速度提高了约5倍)。

1 个解决方案

#1


14  

The main difference is that arrays are allocated as a continuous block of memory - the Array.map function knows the size of the input array and can perform just a single allocation to get all memory for the resulting array. On the other hand, the List.map function needs to separately allocate a single "cons cell" for each element of the input array.

主要区别在于数组被分配为连续的内存块 - Array.map函数知道输入数组的大小,并且只能执行单个分配以获取结果数组的所有内存。另一方面,List.map函数需要为输入数组的每个元素单独分配一个“cons单元”。

The difference is particularly visible for the map function, because that can really benefit from the up-front allocation. However, arrays are generally faster for larger data sets. If you change the code to use filtering (where the array needs to be reallocated during construction), then the array version is ~2x faster:

这种差异对于map函数尤其明显,因为这可以从前期分配中获益。但是,对于较大的数据集,数组通常更快。如果您更改代码以使用过滤(在构造期间需要重新分配阵列),那么阵列版本的速度要快2倍:

for i in 1 .. 100 do ignore ( list|>List.filter(fun n -> n%5=1))
for i in 1 .. 100 do ignore ( array|>Array.map(fun n -> n%5=1))

Using lists has other benefits - because they are immutable, you can safely share references to lists. Also, cloning a list and adding an element to the front is super efficient (single cell allocation), while doing similar operation with array would be really slow (copy the whole array).

使用列表还有其他好处 - 因为它们是不可变的,您可以安全地共享对列表的引用。此外,克隆列表并向前面添加元素是超级高效的(单细胞分配),而对数组执行类似操作则非常慢(复制整个数组)。

#1


14  

The main difference is that arrays are allocated as a continuous block of memory - the Array.map function knows the size of the input array and can perform just a single allocation to get all memory for the resulting array. On the other hand, the List.map function needs to separately allocate a single "cons cell" for each element of the input array.

主要区别在于数组被分配为连续的内存块 - Array.map函数知道输入数组的大小,并且只能执行单个分配以获取结果数组的所有内存。另一方面,List.map函数需要为输入数组的每个元素单独分配一个“cons单元”。

The difference is particularly visible for the map function, because that can really benefit from the up-front allocation. However, arrays are generally faster for larger data sets. If you change the code to use filtering (where the array needs to be reallocated during construction), then the array version is ~2x faster:

这种差异对于map函数尤其明显,因为这可以从前期分配中获益。但是,对于较大的数据集,数组通常更快。如果您更改代码以使用过滤(在构造期间需要重新分配阵列),那么阵列版本的速度要快2倍:

for i in 1 .. 100 do ignore ( list|>List.filter(fun n -> n%5=1))
for i in 1 .. 100 do ignore ( array|>Array.map(fun n -> n%5=1))

Using lists has other benefits - because they are immutable, you can safely share references to lists. Also, cloning a list and adding an element to the front is super efficient (single cell allocation), while doing similar operation with array would be really slow (copy the whole array).

使用列表还有其他好处 - 因为它们是不可变的,您可以安全地共享对列表的引用。此外,克隆列表并向前面添加元素是超级高效的(单细胞分配),而对数组执行类似操作则非常慢(复制整个数组)。