How do you ensure that code runs without variability in execution time caused by the cache?

Date: 2022-03-26 21:29:29

In an embedded application (written in C, on a 32-bit processor) with hard real-time constraints, the execution time of critical code (especially interrupt handlers) needs to be constant.

How do you ensure that no timing variability is introduced into the execution of the code, specifically by the processor's caches (be it L1, L2 or L3)?

Note that we are concerned with cache behavior because of the huge effect it has on execution speed (sometimes more than 100:1 compared with accessing RAM). Variability introduced by other features of a specific processor architecture is nowhere near the magnitude of that introduced by the cache.

7 answers

#1


2  

If you can get your hands on the hardware, or work with someone who can, you can turn off the cache. Some CPUs have a pin that, if wired to ground instead of power (or maybe the other way around), disables all internal caches. That will give you predictability but not speed!

Failing that, code could be written in certain places in the software to deliberately fill the cache with junk, so that whatever happens next is guaranteed to be a cache miss. Done right, that can give predictability, and because it need only be done in certain places, speed may be better than with the caches totally disabled.
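
A minimal sketch of the "fill the cache with junk" idea, assuming a data cache only. The names `CACHE_BYTES`, `CACHE_LINE_BYTES` and `cache_fill_with_junk` are made up for illustration, and the sizes are placeholders to be replaced with the figures from the CPU's datasheet. With a random or pseudo-LRU replacement policy a single sweep may not displace every line, so treat this as a starting point rather than a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes: replace with the values from your CPU's datasheet. */
#define CACHE_BYTES      (32u * 1024u)  /* assumed data-cache capacity */
#define CACHE_LINE_BYTES 32u            /* assumed cache-line length */

/* Twice the assumed cache size, to make it likely that the sweep
   displaces whatever was cached before. */
static volatile uint8_t junk[2u * CACHE_BYTES];

/* Read one byte from every cache line of the junk buffer. The buffer is
   volatile so the compiler cannot drop the reads; the checksum is
   returned only to give the function an observable result. */
uint32_t cache_fill_with_junk(void)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < sizeof junk; i += CACHE_LINE_BYTES) {
        sum += junk[i];
    }
    return sum;
}
```

Calling `cache_fill_with_junk()` just before the critical section would put that section in the "always a miss" case for data; the instruction cache is not touched by this sketch.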

Finally, if speed does matter, carefully design the software and data as in the old days of programming for an ancient 8-bit CPU: keep it all small enough to fit in L1 cache. I'm always amazed at how on-board caches these days are bigger than all of the RAM on a minicomputer back in (mumble-decade). But this will be hard work and takes cleverness. Good luck!
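
One way to keep the "small enough to fit in L1" rule from eroding over time is a compile-time check. This is only a sketch: `L1_DATA_BUDGET_BYTES` and `struct critical_state` are hypothetical, and fitting within the cache's capacity does not by itself rule out conflict misses in a set-associative cache.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical budget: the share of L1 data cache reserved for this data. */
#define L1_DATA_BUDGET_BYTES (16u * 1024u)

/* Hypothetical working set of the time-critical code. */
struct critical_state {
    int16_t  samples[4096];
    int16_t  coeffs[1024];
    uint32_t counters[64];
};

/* C11: the build fails as soon as the working set outgrows the budget. */
_Static_assert(sizeof(struct critical_state) <= L1_DATA_BUDGET_BYTES,
               "critical_state no longer fits in the L1 budget");

static struct critical_state state;

/* Size of the working set, for logging or for a run-time sanity check. */
size_t critical_state_bytes(void)
{
    return sizeof state;
}
```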

#2


2  

Two possibilities:

Disable the cache entirely. The application will run slower, but without any variability.

Pre-load the code in the cache and "lock it in". Most processors provide a mechanism to do this.
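
The locking step is processor-specific, so the sketch below shows only the pre-load half, and for data rather than code. `CACHE_LINE_BYTES`, `lookup_table`, `cache_preload` and `preload_lookup_table` are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u   /* assumed cache-line length */

/* Hypothetical lookup table used by the time-critical code. */
static uint8_t lookup_table[1024];

/* Touch one byte in every cache line of [base, base + len) so the lines
   are fetched before the critical code needs them. */
uint32_t cache_preload(const void *base, size_t len)
{
    const volatile uint8_t *p = (const volatile uint8_t *)base;
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i += CACHE_LINE_BYTES) {
        sum += p[i];
    }
    if (len > 0) {
        sum += p[len - 1];   /* covers the last line of an unaligned range */
    }
    /* A processor-specific cache-lock operation would follow here; it is
       omitted because it differs from one CPU to the next. */
    return sum;
}

uint32_t preload_lookup_table(void)
{
    return cache_preload(lookup_table, sizeof lookup_table);
}
```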

#3


2  

It seems that you are referring to the x86 processor family, which is not built with real-time systems in mind, so there is no real guarantee of constant-time execution (the CPU may reorder micro-instructions, and then there are branch prediction and the instruction prefetch queue, which is flushed each time the CPU mispredicts a conditional jump...)

#4


0  

This answer will sound snide, but it is intended to make you think:

Only run the code once.

The reason I say that is that so many things will make the timing variable, and you might not even have control over them. And what is your definition of time? Suppose the operating system decides to put your process in the wait queue.

Next you have unpredictability due to cache performance, memory latency, disk I/O, and so on. These all boil down to one thing: sometimes it takes time to get information into the processor where your code can use it, including the time it takes to fetch and decode the code itself.

Also, how much variance is acceptable to you? It could be that you're okay with 40 milliseconds, or that you can only tolerate 10 nanoseconds.

Depending on the application domain, you can go further and simply mask or hide the variance. Computer graphics people have been rendering to off-screen buffers for years to hide variance in the time it takes to render each frame.
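
The off-screen buffer trick can be sketched as a plain double buffer. Everything here (`FRAME_BYTES`, `render_frame`, `present_frame`, `visible_frame`) is a hypothetical illustration, not taken from any graphics API: the variable-time work writes to the hidden buffer, and a fixed-rate tick makes it visible.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_BYTES 64u   /* hypothetical frame size */

static uint8_t buffers[2][FRAME_BYTES];
static volatile unsigned front = 0;   /* index of the buffer being shown */

/* Variable-time work: fill the back buffer (the one not being shown). */
void render_frame(uint8_t value)
{
    memset(buffers[front ^ 1u], value, FRAME_BYTES);
}

/* Call at a fixed tick (for example from a timer interrupt). The swap is
   a handful of instructions, so its cost does not depend on how long
   rendering took. */
void present_frame(void)
{
    front ^= 1u;
}

/* What the consumer (display, output stage) sees right now. */
const uint8_t *visible_frame(void)
{
    return buffers[front];
}
```

A real system would also need a "frame ready" flag so that a frame still being rendered at the tick is not presented half-finished.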

The traditional solutions just remove as many known sources of variable timing as possible: load files into RAM, warm up the cache, and avoid I/O.

#5


0  

If you make all the function calls in the critical code 'inline' and minimize the number of variables you have, you can give them the 'register' storage class. This should improve the running time of your program. (You will probably have to compile it in a special way, since compilers these days tend to disregard 'register' hints.)

I'm assuming that you have enough memory that loading something from memory does not cause page faults. Page faults can take a lot of time.

You could also take a look at the generated assembly code to see whether there are lots of branches and memory instructions that could change your running time.
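
As an illustration of what to look for, here is a data-dependent branch next to a branch-free rewrite. Both functions are hypothetical examples. A compiler may turn the first into a conditional move or reintroduce a branch in the second, so the generated assembly (for example from `gcc -S`) is what actually has to be checked.

```c
#include <stdint.h>

/* Branchy version: which path runs depends on the data. */
int32_t clamp_branchy(int32_t x, int32_t limit)
{
    if (x > limit) {
        return limit;
    }
    return x;
}

/* Branch-free version: build an all-ones or all-zeros mask from the
   comparison and use it to select one of the two values. */
int32_t clamp_branchless(int32_t x, int32_t limit)
{
    /* mask is 0xFFFFFFFF when x > limit, otherwise 0 */
    uint32_t mask = (uint32_t)0 - (uint32_t)(x > limit);
    uint32_t r = ((uint32_t)limit & mask) | ((uint32_t)x & ~mask);
    /* Converting back to int32_t is implementation-defined for values
       above INT32_MAX; mainstream compilers wrap as two's complement. */
    return (int32_t)r;
}
```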

If an interrupt happens while your code is executing, it WILL take longer. Do you have interrupts/exceptions enabled?

#6


-1  

Preallocate memory, and make sure interrupts do not affect the cache (impossible, right?).

/Allan

#7


-1  

Understand your worst-case runtime for complex operations, and use timers.
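
One reading of this advice is to pad every run out to the measured worst case, so the caller always sees the same duration. The sketch below uses the standard C `clock()` as a stand-in for a hardware timer; `sample_work` and `run_padded` are hypothetical names, and the budget would come from your own worst-case measurements.

```c
#include <time.h>

/* Stand-in for the critical routine. */
static volatile unsigned sink;

void sample_work(void)
{
    for (unsigned i = 0; i < 1000u; ++i) {
        sink += i;
    }
}

/* Run work(), then spin until `budget` clock ticks have passed since
   entry, so every call takes the same time as seen by the caller.
   Returns 0 on success, -1 if work() alone overran the budget. */
int run_padded(void (*work)(void), clock_t budget)
{
    clock_t start = clock();
    work();
    if (clock() - start > budget) {
        return -1;   /* the assumed worst case was too optimistic */
    }
    while (clock() - start < budget) {
        /* busy-wait; on a target this would poll a hardware timer */
    }
    return 0;
}
```

For example, `run_padded(sample_work, CLOCKS_PER_SEC / 100)` holds each call to roughly 10 ms of processor time. The price is that every run costs as much as the worst one.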
