Strange results from simple floating point operations - bad FPU internal state?

Time: 2022-03-12 20:44:13

I have a software project in which I sometimes get strange results from small, simple floating point operations. I assume there is something I have missed, and would like some tips about how to debug the following problems:


(the compiler used is MS VC 6.0, that is version 12 of the Microsoft C compiler)

First anomaly:

extern double Time, TimeStamp, TimeStep;  // History terms, updated elsewhere
void timer_controlled_code( void );       // The timed code, defined elsewhere

void timer_evaluation_function( void ) {
    if ( ( Time - TimeStamp ) >= TimeStep ) {
        TimeStamp += TimeStep;
        timer_controlled_code( );
    }
}

For some reason, the timer evaluation failed and the timed code never executed. In the debugger it was easy to see that the trigger condition was in fact true, but the FPU refused to produce a positive result. The following code segment had no problems although it performed the same operations. I sidestepped the problem by inserting a bogus evaluation which could be allowed to fail.


I'm guessing the FPU state is somehow tainted by earlier operations. Are there any compiler flags that would help?
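One possible reading of the failed comparison, offered as a sketch rather than a diagnosis: if the FPU produced an indefinite value (a NaN, which the MSVC debugger shows as #IND) for `Time - TimeStamp`, then `>=` is false regardless of what the variables hold in memory, because every ordered comparison involving a NaN is false. The names `timer_due` and `elapsed_is_valid` are hypothetical. `isnan` is C99 and is not available in VC 6.0, where `_isnan` from `<float.h>` would be the counterpart.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper mirroring the timer test from the question:
   returns 1 when at least one full step has elapsed, 0 otherwise.
   A NaN in 'elapsed' always yields 0. */
int timer_due(double elapsed, double time_step)
{
    assert(time_step > 0.0);   /* the step itself must be a sane value */
    return elapsed >= time_step;
}

/* Returns 1 when the elapsed value is usable, 0 when it is a NaN. */
int elapsed_is_valid(double elapsed)
{
    return !isnan(elapsed);
}
```

Calling `elapsed_is_valid( Time - TimeStamp )` just before the comparison would show whether the subtraction itself is going wrong, as opposed to the comparison.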


Second anomaly:

double K, Kp = 1.0, Ti = 0.02;
void timed_code( void ) {
    K = ( Kp * ( float ) 2000 ) / ( ( float ) 2000 - 2.0F * Ti * 1e6 );
}

The result is #IND, even though the debugger evaluates the equation to approx 0.05. The #IND value appears in the FPU stack when the 2.0F value is loaded into the FPU with the fld instruction. The previous instruction loads the integer value 2000 as a double float using the fild instruction. Once the FPU stack contains the #IND value all is lost, but once again the debugger has no problem evaluating the formula. Later on, these operations return the expected results.
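For reference, the expression can be worked through by hand: 2.0 * 0.02 * 1e6 = 40000, so the denominator is 2000 - 40000 = -38000 and K = 2000 / -38000, roughly -0.0526. The magnitude matches the value reported from the debugger. A minimal sketch that evaluates the same expression and checks the result is below; `compute_gain` and `gain_is_finite` are hypothetical names, and `isfinite` is C99 (VC 6.0 would need `_finite` from `<float.h>` instead).

```c
#include <assert.h>
#include <math.h>

/* Same expression as in timed_code(), with the operands passed in.
   The float operands are promoted to double before the arithmetic. */
double compute_gain(double kp, double ti)
{
    double denominator = (float)2000 - 2.0F * ti * 1e6;
    assert(denominator != 0.0);   /* guard against division by zero */
    return (kp * (float)2000) / denominator;
}

/* Returns 1 for an ordinary finite result, 0 for NaN or infinity. */
int gain_is_finite(double k)
{
    return isfinite(k) != 0;
}
```

With Kp = 1.0 and Ti = 0.02 nothing in this calculation can produce a NaN from valid inputs, which supports the suspicion that the FPU state, not the arithmetic, is at fault.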

Also, once again the FPU problems occur directly after the function call. Should I insert floating point operations that clear the FPU state at the start of each new function? Is there a compiler flag that could affect the FPU in some way?


I'm grateful for any and all tips and tricks at this point.


EDIT: I've managed to avoid the problem by executing the EMMS assembly instruction as the first thing in the top-level function. That way the FPU is cleared of any MMX-related garbage that may or may not have been created in the environment my code is called from. It seems that the state of the FPU is not something to take for granted.
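A minimal sketch of that workaround, assuming a GCC or Clang style compiler where `_mm_empty()` from `<mmintrin.h>` emits EMMS (in VC 6.0 the equivalent would be the inline statement `__asm emms`). The names `reset_mmx_state` and `guarded_ratio` are hypothetical, and on targets without MMX the wrapper compiles to a no-op.

```c
#include <assert.h>

#if defined(__MMX__)
#include <mmintrin.h>            /* _mm_empty() emits the EMMS instruction */
#define HAVE_MMX_INTRINSICS 1
#else
#define HAVE_MMX_INTRINSICS 0
#endif

/* Hypothetical wrapper: leave MMX mode so that the x87 register stack
   is marked empty again before any floating point work is done. */
void reset_mmx_state(void)
{
#if HAVE_MMX_INTRINSICS
    _mm_empty();
#endif
}

/* Hypothetical entry point called from an environment that may have
   used MMX: clear the state first, then do the floating point work. */
double guarded_ratio(double numerator, double denominator)
{
    reset_mmx_state();
    assert(denominator != 0.0);
    return numerator / denominator;
}
```

Putting the call at the entry point, rather than before every expression, keeps the cost to a single instruction per call from the untrusted environment.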



6 Answers


If you're using the Windows QueryPerformanceCounter and QueryPerformanceFrequency functions on a system that supports MMX, try inserting the femms instruction after querying the frequency/counter and before the computation.

__asm femms

I've run into trouble with these functions before, where they were doing 64-bit computation using MMX and not clearing the floating point flags/state.


This situation could also happen if there is any 64-bit arithmetic between the floating point operations.



No idea what the problem could be, but on x86 the FINIT instruction resets the FPU. To test your theory, you can insert this somewhere in your code:


__asm {
    finit
}


It's not really an answer to your question, but you might want to look at two of Raymond Chen's articles regarding strange FPU behaviour. Having read your question and re-read the articles, I don't immediately see a link, but they may help if the code you've pasted isn't complete, or if they give you an idea about some surrounding behaviour that caused the issue, specifically if you're loading a DLL anywhere nearby.

Uninitialized floating point variables can be deadly


How did the invalid floating point operand exception get raised when I disabled it?



While I am not providing you with an exact solution, I suggest you start by reading this article that describes the different optimizations that one can use.



re: timestamps--

Where are you getting your timestamps from? Something sounds suspicious. Try logging them to a file.
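A minimal sketch of such logging, assuming the three doubles from the question are passed in by the caller; `log_timer_state` is a hypothetical name. The `%.17g` format prints enough digits to reproduce each double exactly, so the log can be compared against what the debugger shows.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical logger: writes one line per timer evaluation containing
   the current time, the time stamp, the step, and the elapsed time.
   Returns the number of characters written, or a negative value on error. */
int log_timer_state(FILE *out, double now, double time_stamp, double time_step)
{
    int written;
    assert(out != NULL);
    written = fprintf(out, "%.17g %.17g %.17g %.17g\n",
                      now, time_stamp, time_step, now - time_stamp);
    fflush(out);   /* keep the log intact if the program misbehaves later */
    return written;
}
```

A NaN in the last column while the first three look sane would point at the subtraction, and therefore at the FPU state, and not at the source of the timestamps.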



If the bad value is loaded by a fld that should load 2.0, I'd check the memory where this value is loaded from - it might just be a compiler/linker problem.
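One way to make that check concrete, as a sketch under the assumption of IEEE 754 single precision: the constant 2.0F should be stored as the bit pattern 0x40000000, so dumping the four bytes at the address the fld reads from shows immediately whether the constant is intact. `float_bits` and `bits_are_nan` are hypothetical names, and `<stdint.h>` does not ship with VC 6.0, where `unsigned long` (32 bits on that compiler) would stand in for `uint32_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a single-precision value.
   memcpy avoids the aliasing problems of a pointer cast. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);   /* assumes a 32-bit float */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Returns 1 if the pattern encodes a NaN: exponent all ones and a
   non-zero fraction. The x86 "indefinite" value is 0xFFC00000. */
int bits_are_nan(uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u
        && (bits & 0x007FFFFFu) != 0;
}
```

If the memory holds 0x40000000 and the FPU register still shows #IND after the load, the constant is fine and the load itself is failing.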


