将数组初始化为0的时间是多长?

时间:2022-01-06 21:29:40

I am confused whether
int arr[n]={0} takes constant time i.e. O(1), or O(n)?

我很困惑int arr [n] = {0}是否需要恒定时间,即O(1),还是O(n)?

2 个解决方案

#1


2  

You should expect O(N) time, but there are caveats:

你应该期待O(N)时间,但有一些警告:

  • It is O(1) if memory occupied by array is smaller than the word size. (it may be O(1) all the way to the cache line size on modern CPUs)
  • 如果数组占用的内存小于字大小,则为O(1)。 (它可能是O(1)一直到现代CPU上的缓存行大小)

  • It is O(N) if array fits within a single tier in the memory.
  • 如果数组适合内存中的单个层,则为O(N)。

  • It is complicated if array pushes through the tiers: There are multiple tiers on all modern computers (registers, L0 cache, L1 cache, L3 cache?, NUMA on multi-CPU machines, Virtual memory(mapping to swap), ...). If array can't fit in one - there will be a severe performance penalty.
  • 如果数组推送层,则很复杂:所有现代计算机上都有多个层(寄存器,L0缓存,L1缓存,L3缓存?,多CPU机器上的NUMA,虚拟内存(映射到交换),...) 。如果数组不能合二为一,那将会造成严重的性能损失。

CPU cache architecture impacts the time needed to zero out memory quite severely. In practice calling it O(N) is somewhat misleading given that going from 100 to 101 may increase time 10x if it falls on a cache boundary (line or whole). It may be even more dramatic if swapping is involved. Beware of the tiered memory model...

CPU缓存架构会严重影响将内存清零所需的时间。在实践中,调用O(N)有点误导,因为如果它落在缓存边界(线或整体)上,从100到101可能会增加10倍的时间。如果涉及交换,可能会更加引人注目。小心分层内存模型......

#2


1  

Generally initialization to zero of non-static storage is linear in the size of the storage. If you are reusing the array, it will need to be zero'ed each time. There are computer architectures that try to make this free via maintaining bit masks on pages or cache lines and returning zeroes at some point in the cache fill or load unit machinery. These are somewhat rare, but the effect can often be replicated in software if performance is paramount.

通常,非静态存储的初始化为零是存储器大小的线性。如果您正在重用该数组,则每次都需要将其置零。有些计算机体系结构试图通过在页面或缓存行上保持位掩码并在缓存填充或加载单元机器中的某个点返回零来使其*。这些有点罕见,但如果性能至关重要,通常可以在软件中复制效果。

Arguably zeroed static storage is free but in practice the pages to back it up will be faulted in and there will be some cost to zero them on most architectures.

可以说零归一化静态存储是免费的,但实际上备份它的页面会出现故障,并且在大多数架构上都会有一些成本归零。

(One can end up in situations where the cost of the faults to provide zero filled pages is quite noticeable. E.g. repeated malloc/free of blocks bigger than some threshold can result in the address space backing the allocation being returned to the OS at each deallocation. The OS then has to zero it for security reasons even though malloc isn't guaranteed to return zero filled storage. Worse case the program then writes zeroes into the same block after it is returned from malloc so it ends up being twice the cost.)

(最终可能会出现提供零填充页面的故障成本非常明显的情况。例如,重复的malloc /没有大于某个阈值的块可能导致支持分配的地址空间在每次释放时返回到OS操作系统然后必须出于安全原因将其归零,即使malloc不能保证返回零填充存储。更糟糕的是,程序然后在从malloc返回后将零写入同一块中,因此它最终成本的两倍。 )

For cases where large arrays of mostly zeroes will be accessed in a sparse fashion, the zero fill on demand behavior mentioned above can reduce the cost to linear in the number of pages actually used, not the total size. Arranging to do this usually requires using mmap or similar directly, not just allocating an array in C and zero initializing it.

对于以稀疏方式访问大多数零的大数组的情况,上面提到的零填充按需行为可以将实际使用的页数的成本降低到线性,而不是总大小。安排这样做通常需要直接使用mmap或类似的,而不仅仅是在C中分配一个数组而不是初始化它。

#1


2  

You should expect O(N) time, but there are caveats:

你应该期待O(N)时间,但有一些警告:

  • It is O(1) if memory occupied by array is smaller than the word size. (it may be O(1) all the way to the cache line size on modern CPUs)
  • 如果数组占用的内存小于字大小,则为O(1)。 (它可能是O(1)一直到现代CPU上的缓存行大小)

  • It is O(N) if array fits within a single tier in the memory.
  • 如果数组适合内存中的单个层,则为O(N)。

  • It is complicated if array pushes through the tiers: There are multiple tiers on all modern computers (registers, L0 cache, L1 cache, L3 cache?, NUMA on multi-CPU machines, Virtual memory(mapping to swap), ...). If array can't fit in one - there will be a severe performance penalty.
  • 如果数组推送层,则很复杂:所有现代计算机上都有多个层(寄存器,L0缓存,L1缓存,L3缓存?,多CPU机器上的NUMA,虚拟内存(映射到交换),...) 。如果数组不能合二为一,那将会造成严重的性能损失。

CPU cache architecture impacts the time needed to zero out memory quite severely. In practice calling it O(N) is somewhat misleading given that going from 100 to 101 may increase time 10x if it falls on a cache boundary (line or whole). It may be even more dramatic if swapping is involved. Beware of the tiered memory model...

CPU缓存架构会严重影响将内存清零所需的时间。在实践中,调用O(N)有点误导,因为如果它落在缓存边界(线或整体)上,从100到101可能会增加10倍的时间。如果涉及交换,可能会更加引人注目。小心分层内存模型......

#2


1  

Generally initialization to zero of non-static storage is linear in the size of the storage. If you are reusing the array, it will need to be zero'ed each time. There are computer architectures that try to make this free via maintaining bit masks on pages or cache lines and returning zeroes at some point in the cache fill or load unit machinery. These are somewhat rare, but the effect can often be replicated in software if performance is paramount.

通常,非静态存储的初始化为零是存储器大小的线性。如果您正在重用该数组,则每次都需要将其置零。有些计算机体系结构试图通过在页面或缓存行上保持位掩码并在缓存填充或加载单元机器中的某个点返回零来使其*。这些有点罕见,但如果性能至关重要,通常可以在软件中复制效果。

Arguably zeroed static storage is free but in practice the pages to back it up will be faulted in and there will be some cost to zero them on most architectures.

可以说零归一化静态存储是免费的,但实际上备份它的页面会出现故障,并且在大多数架构上都会有一些成本归零。

(One can end up in situations where the cost of the faults to provide zero filled pages is quite noticeable. E.g. repeated malloc/free of blocks bigger than some threshold can result in the address space backing the allocation being returned to the OS at each deallocation. The OS then has to zero it for security reasons even though malloc isn't guaranteed to return zero filled storage. Worse case the program then writes zeroes into the same block after it is returned from malloc so it ends up being twice the cost.)

(最终可能会出现提供零填充页面的故障成本非常明显的情况。例如,重复的malloc /没有大于某个阈值的块可能导致支持分配的地址空间在每次释放时返回到OS操作系统然后必须出于安全原因将其归零,即使malloc不能保证返回零填充存储。更糟糕的是,程序然后在从malloc返回后将零写入同一块中,因此它最终成本的两倍。 )

For cases where large arrays of mostly zeroes will be accessed in a sparse fashion, the zero fill on demand behavior mentioned above can reduce the cost to linear in the number of pages actually used, not the total size. Arranging to do this usually requires using mmap or similar directly, not just allocating an array in C and zero initializing it.

对于以稀疏方式访问大多数零的大数组的情况,上面提到的零填充按需行为可以将实际使用的页数的成本降低到线性,而不是总大小。安排这样做通常需要直接使用mmap或类似的,而不仅仅是在C中分配一个数组而不是初始化它。