Will a non-atomic load to the same cache line as an atomic variable cause the atomic variable to fail?

Time: 2022-01-31 21:01:25

Given something like this on an ARMv8 CPU (though this may apply to many others as well):

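#include <cstdint>

class abcxzy
{
  // Aligned to a cache line (64 bytes assumed) so both members sit on the same line.
  alignas(64) uint32_t atomic_data;
  uint32_t data;

  void foo()
  {
    asm volatile (
      // Plain (non-exclusive) load and store to data.
      "   ldr w0, [%[d]]\n"
      "   // Do stuff with data in w0...\n"
      "   str w0, [%[d]]\n"

      // Exclusive load/store retry loop on atomic_data; w2 receives the stxr status.
      "1: ldaxr w0, [%[a]]\n"
      "   add w1, w0, #0x1\n"
      "   stxr w2, w1, [%[a]]\n"
      "   cbnz w2, 1b\n"
      :
      : [d] "r" (&data), [a] "r" (&atomic_data)
      : "x0", "x1", "x2", "memory"
    );
  }
};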

Assume proper clobbers and such are set on the inline asm, so that C and asm can coexist happily in a world of rainbow ponies and sunshine.

In a multi-CPU situation, with all CPUs running this code at the same time, will the stores to data cause the atomic load/store to atomic_data to fail? From what I've read, the ARM atomic stuff works on a cache-line basis, but it is not clear whether a non-atomic store will affect the atomic. I hope that it doesn't (and assume that it does...), but I am looking to see if anyone else can confirm this.

1 answer

#1


2  

Ok, finally found what I needed, though I don't like it:

According to the ARM documentation, it is IMPLEMENTATION DEFINED whether a non-exclusive store to the same cache line as the exclusive store causes the exclusive store to fail. Thanks, ARM. Appreciate that wonderful non-conclusive info.
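
Since the behaviour cannot be relied on either way, one way to sidestep it is to keep the plain data off the atomic's line altogether. A minimal sketch, assuming a 64-byte line; `PaddedPair`, `kAssumedLineSize` and `member_distance` are names made up for this illustration. As far as I know, the architecture allows the exclusives reservation granule to be anywhere from 16 to 2048 bytes, so 64 may be too small on some parts and should be checked for the target CPU.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache line / reservation granule. Not taken from the thread.
constexpr std::size_t kAssumedLineSize = 64;

// Hypothetical layout: each member starts its own kAssumedLineSize-byte block.
struct PaddedPair {
    alignas(kAssumedLineSize) std::atomic<std::uint32_t> atomic_data{0};
    alignas(kAssumedLineSize) std::uint32_t data = 0;
};

// Byte distance from atomic_data to data inside one object.
std::size_t member_distance(const PaddedPair& p) {
    auto a = reinterpret_cast<std::uintptr_t>(&p.atomic_data);
    auto d = reinterpret_cast<std::uintptr_t>(&p.data);
    return static_cast<std::size_t>(d - a);
}
```

With this layout a plain store to `data` should no longer land in the block that holds `atomic_data`, at the cost of some wasted memory per object.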

Edit:

By fail, I mean the stxr instruction did not write to memory and returned a "1" in its status register (w2 in the example above): the "your atomic data was updated and needs a new RMW" status.
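
The same "failed, go round again" shape shows up in portable C++ as a `compare_exchange_weak` loop, which is allowed to fail spuriously. On ARMv8 without the newer atomic instructions, compilers typically lower it to an exclusive load/store pair, though that is an assumption about the toolchain. A rough sketch; `increment_with_retry` is a hypothetical helper, not something from the thread:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: adds 1 to `counter` and returns how many attempts
// were needed. More than one attempt means at least one store failed.
std::uint32_t increment_with_retry(std::atomic<std::uint32_t>& counter) {
    std::uint32_t attempts = 0;
    std::uint32_t expected = counter.load(std::memory_order_relaxed);
    bool stored = false;
    while (!stored) {
        ++attempts;
        // On failure, `expected` is refreshed with the current value,
        // so the next attempt starts a new read-modify-write.
        stored = counter.compare_exchange_weak(
            expected, expected + 1,
            std::memory_order_acquire, std::memory_order_relaxed);
    }
    return attempts;
}
```

The returned attempt count plays the role of the stxr status here: anything above 1 means the loop had to redo the RMW at least once.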

To answer other statements:

  • Yes, atomic critical areas should be as small as possible. The docs even give numbers on how small, and they are very reasonable indeed. I hope that my sections never span 1k or more...

  • And yes, any situation where you would need to worry about this kind of contention killing performance, or worse, means your code is "doing it wrong." The ARM docs state this in a roundabout manner :)

  • As to putting the non-atomic loads and stores inside the atomics: my pseudo test above was just demonstrating a random access to the same cache line as an example. In real code, you obviously should avoid this. I was just trying to get a feeling for how "bad" it might be if, say, a high-speed hardware timer store was hitting the same cache line as a lock. Again, don't do this...
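
To get a feeling for how "bad" it is on a particular core, a small probe along these lines could count failed attempts while a second thread keeps storing to a neighbour on the same line. This is only a sketch: `SameLine`, `ProbeResult`, `probe_same_line` and the 64-byte line size are all assumptions made for illustration, and the failure count depends entirely on the hardware and on how the compiler lowers the atomic (it will usually be 0 on targets with a native compare-and-swap).

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical layout: both members share one (assumed 64-byte) line.
struct SameLine {
    alignas(64) std::atomic<std::uint32_t> lock_word{0};
    volatile std::uint32_t timer_like = 0;  // stands in for the timer store
};

struct ProbeResult {
    std::uint32_t final_value;
    std::uint64_t failed_attempts;
};

// Increments lock_word `iterations` times from the calling thread while a
// second thread keeps storing to timer_like. Only the calling thread ever
// modifies lock_word, so every failed compare_exchange_weak is spurious.
ProbeResult probe_same_line(std::uint32_t iterations) {
    SameLine s;
    std::atomic<bool> stop{false};
    std::thread noisy([&] {
        std::uint32_t tick = 0;
        while (!stop.load(std::memory_order_relaxed)) {
            s.timer_like = ++tick;  // plain store, touched by this thread only
        }
    });

    std::uint64_t failures = 0;
    for (std::uint32_t i = 0; i < iterations; ++i) {
        std::uint32_t expected = s.lock_word.load(std::memory_order_relaxed);
        while (!s.lock_word.compare_exchange_weak(
                   expected, expected + 1,
                   std::memory_order_acquire, std::memory_order_relaxed)) {
            ++failures;  // store did not go through; retry the RMW
        }
    }

    stop.store(true, std::memory_order_relaxed);
    noisy.join();
    return {s.lock_word.load(std::memory_order_relaxed), failures};
}
```

Running the same probe with `timer_like` moved onto its own line and comparing `failed_attempts` would give a rough measure of the cost on that specific implementation, nothing more general than that.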
