I am trying to convert a double to a string in a native NT application, i.e. an application that depends only on ntdll.dll. Unfortunately, ntdll's version of vsnprintf does not support %f et al., forcing me to implement the conversion on my own.
The aforementioned ntdll.dll exports only a few of the math.h functions (floor, ceil, log, pow, ...). However, I am reasonably sure that I can implement any of the unavailable math.h functions if necessary.
There is an implementation of floating point conversion in GNU's libc, but the code is extremely dense and difficult to comprehend (the GNU indentation style does not help here).
I've already implemented the conversion by normalizing the number (i.e. multiplying/dividing the number by 10 until it's in the interval [1, 10)) and then generating each digit by cutting the integral part off with modf and multiplying the fractional part by 10. This works, but there is a loss of precision (only the first 15 digits are correct). The loss of precision is, of course, inherent to the algorithm.
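For illustration, the approach described above might look roughly like the following. This is only a sketch under assumptions: the helper name `NaiveDigits` and its signature are made up here, the input is assumed to be finite and greater than zero, and `std::string` is used for brevity even though a native NT application would more likely write into a caller-supplied buffer. Every division or multiplication by 10 during normalization can round, which is where the trailing digits go wrong.

```cpp
#include <cmath>
#include <string>

// Hypothetical helper (not from the original post): returns `count`
// significant digits of `value` and stores the decimal exponent in `exp10`.
// Assumes `value` is finite and > 0.
std::string NaiveDigits(double value, int count, int& exp10)
{
    exp10 = 0;
    // Normalize into [1, 10). Each step may round, so error accumulates.
    while (value >= 10.0) { value /= 10.0; ++exp10; }
    while (value < 1.0)   { value *= 10.0; --exp10; }

    std::string digits;
    for (int i = 0; i < count; ++i)
    {
        double ipart;
        double frac = std::modf(value, &ipart);   // split off the leading digit
        digits += static_cast<char>('0' + static_cast<int>(ipart));
        value = frac * 10.0;                      // shift the next digit into place
    }
    return digits;
}
```

For example, 625.0 normalizes exactly to 6.25, so asking for four digits yields "6250" with a decimal exponent of 2.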
I'd settle for 17 digits, but an algorithm that can generate an arbitrary number of digits correctly would be preferred.
Could you please suggest an algorithm or point me to a good resource?
7 solutions
#1
5
Double-precision numbers do not have more than 15 significant (decimal) figures of precision. There is absolutely no way you can get "an arbitrary number of digits correctly"; doubles are not bignums.
Since you say you're happy with 17 significant figures, use long double; on Windows, I think, that will give you 19 significant figures.
#2
4
I've thought about this a bit more. You lose precision because you normalize by multiplying by some power of 10 (you chose [1,10) rather than [0,1), but that's a minor detail). If you normalized with a power of 2 instead, you'd lose no precision, but then you'd get "decimal digits"*2^e; you could implement BCD arithmetic and compute the product yourself, but that doesn't sound like fun.
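To make the power-of-2 idea concrete, here is a rough sketch of doing the product in decimal digit arithmetic. It is an illustration under assumptions rather than a tuned implementation: the names `MulSmall` and `ExactDecimal` are invented for this example, the input is assumed to be finite, `frexp`/`ldexp` are assumed to be available (or reimplemented by bit manipulation), and `std::vector`/`std::string` stand in for whatever buffers a native application would use. It relies on the identity m*2^-k = m*5^k / 10^k, so a negative binary exponent becomes repeated multiplication by 5 plus a shift of the decimal point.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Multiply a little-endian vector of decimal digits by a small factor in place.
static void MulSmall(std::vector<uint8_t>& digits, unsigned factor)
{
    unsigned carry = 0;
    for (std::size_t i = 0; i < digits.size(); ++i)
    {
        unsigned v = digits[i] * factor + carry;
        digits[i] = static_cast<uint8_t>(v % 10);
        carry = v / 10;
    }
    while (carry != 0)
    {
        digits.push_back(static_cast<uint8_t>(carry % 10));
        carry /= 10;
    }
}

// Hypothetical helper: exact decimal expansion of a finite double.
std::string ExactDecimal(double value)
{
    std::string out;
    if (std::signbit(value)) { out += '-'; value = -value; }

    int exp2 = 0;
    double frac = std::frexp(value, &exp2);                       // value = frac * 2^exp2
    uint64_t mant = static_cast<uint64_t>(std::ldexp(frac, 53));  // exact 53-bit integer
    exp2 -= 53;                                                   // value = mant * 2^exp2
    if (mant == 0) exp2 = 0;
    // Drop trailing zero bits so that no superfluous fraction digits are produced.
    while (mant != 0 && (mant & 1) == 0 && exp2 < 0) { mant >>= 1; ++exp2; }

    std::vector<uint8_t> digits;                                  // least significant digit first
    for (uint64_t m = mant; m != 0; m /= 10)
        digits.push_back(static_cast<uint8_t>(m % 10));
    if (digits.empty()) digits.push_back(0);

    std::size_t fracDigits = 0;
    if (exp2 >= 0)
    {
        for (int i = 0; i < exp2; ++i) MulSmall(digits, 2);       // mant * 2^exp2
    }
    else
    {
        fracDigits = static_cast<std::size_t>(-exp2);             // mant * 5^k / 10^k
        for (int i = 0; i < -exp2; ++i) MulSmall(digits, 5);
    }
    while (digits.size() <= fracDigits) digits.push_back(0);      // ensure a leading "0."

    for (std::size_t i = digits.size(); i-- > 0; )
    {
        out += static_cast<char>('0' + digits[i]);
        if (i == fracDigits && fracDigits != 0) out += '.';
    }
    return out;
}
```

With this sketch, 0.5 comes out as "0.5" and -2.25 as "-2.25", while 0.1 produces all 55 fractional digits of the nearest double. Rounding the exact digit string to 17 (or any other number of) significant digits is a separate step that is not shown here.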
I'm pretty confident that you could split the double g=m*2^e into two parts: h=floor(g*10^k) and i=modf(g*10^k) for some k, and then separately convert to decimal digits and then stitch them together, but how about a simpler approach: use "long double" (80 bits, but I've heard that Visual C++ may not support it?) with your current approach and stop after 17 digits.
_gcvt should do it (edit: it's not in ntdll.dll, it's in some msvcrt*.dll?)
As for decimal digits of precision, IEEE binary64 stores 52 significand bits (53 counting the implicit leading bit). 52*log10(2)=15.65... (edit: as you pointed out, to round trip, you need more than 16 digits)
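The two digit counts mentioned in this thread (15 and 17) can be derived from the significand width. The sketch below assumes `double` is IEEE binary64 and uses made-up function names; the formulas are the usual ones behind `std::numeric_limits<double>::digits10` and `max_digits10`.

```cpp
#include <cmath>
#include <limits>

// Decimal digits that always survive decimal -> double -> decimal.
// For a p-bit significand this is floor((p - 1) * log10(2)).
int GuaranteedDecimalDigits()
{
    const int p = std::numeric_limits<double>::digits;   // 53 for binary64
    return static_cast<int>(std::floor((p - 1) * std::log10(2.0)));
}

// Decimal digits needed so that double -> decimal -> double is lossless.
// For a p-bit significand this is ceil(1 + p * log10(2)).
int RoundTripDecimalDigits()
{
    const int p = std::numeric_limits<double>::digits;
    return static_cast<int>(std::ceil(1 + p * std::log10(2.0)));
}
```

For binary64 these evaluate to 15 and 17 respectively, which is why 15 digits are always trustworthy while 17 are needed to reproduce the exact value when reading the string back.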
#3
3
After a lot of research, I found a paper titled Printing Floating-Point Numbers Quickly and Accurately. It uses exact rational arithmetic to avoid precision loss. It cites a slightly older paper, How to Print Floating-Point Numbers Accurately, which, however, seems to require an ACM subscription to access.
Since the former paper was reprinted in 2006, I am inclined to believe that it is still current. The exact rational arithmetic (which requires dynamic allocation) seems to be a necessary evil.
#4
2
A complete implementation of the C code for the fastest known (as of today) algorithm: http://code.google.com/p/double-conversion/downloads/list
It even includes a test suite.
This is the C code behind the algorithm described in this PDF: Printing Floating-Point Numbers Quickly and Accurately http://www.cs.indiana.edu/~burger/FP-Printing-PLDI96.pdf
#5
2
#include <cstdint>
#include <cmath>    // modf, pow, log10, floor, round
#include <limits>   // std::numeric_limits
#include <float.h>  // _fpclass, _FPCLASS_*
#include <stdlib.h> // __max
#include <tchar.h>  // TCHAR, _T, _tcscpy_s
// --------------------------------------------------------------------------
// Return number of decimal-digits of a given unsigned-integer
// N is uint8_t/uint16_t/uint32_t/uint64_t
template <class N> inline uint8_t GetUnsignedDecDigits(const N n)
{
    static_assert(std::numeric_limits<N>::is_integer && !std::numeric_limits<N>::is_signed,
                  "GetUnsignedDecDigits: unsigned integer type expected" );
    const uint8_t anMaxDigits[]= {3, 5, 8, 10, 13, 15, 17, 20};
    const uint8_t nMaxDigits = anMaxDigits[sizeof(N)-1];
    uint8_t nDigits= 1;
    N nRoof = 10;
    while ((n >= nRoof) && (nDigits<nMaxDigits))
    {
        nDigits++;
        nRoof*= 10;
    }
    return nDigits;
}
// --------------------------------------------------------------------------
// Convert floating-point value to NULL-terminated string representation
TCHAR* DoubleToStr(double f        , // [i  ]
                   TCHAR* pczStr   , // [i/o] caller should allocate enough space
                   int    nDigitsI , // [i  ] digits of integer part including sign / <1: auto
                   int    nDigitsF ) // [i  ] digits of fractional part / <0: auto
{
    switch (_fpclass(f))
    {
        case _FPCLASS_SNAN:
        case _FPCLASS_QNAN: _tcscpy_s(pczStr, 5, _T("NaN" )); return pczStr;
        case _FPCLASS_NINF: _tcscpy_s(pczStr, 5, _T("-INF")); return pczStr;
        case _FPCLASS_PINF: _tcscpy_s(pczStr, 5, _T("+INF")); return pczStr;
    }
    if (nDigitsI> 18) nDigitsI= 18; if (nDigitsI< 1) nDigitsI= -1;
    if (nDigitsF> 18) nDigitsF= 18; if (nDigitsF< 0) nDigitsF= -1;
    bool bNeg= (f<0);
    if (f<0)
        f= -f;
    int nE= 0; // exponent (displayed if != 0)
    if ( ((-1 == nDigitsI) && (f >= 1e18 )) || // large value: switch to scientific representation
         ((-1 != nDigitsI) && (f >= pow(10., nDigitsI))) )
    {
        nE= (int)log10(f);
        f/= (double)pow(10., nE);
        if (-1 != nDigitsF)
            nDigitsF= __max(nDigitsF, nDigitsI+nDigitsF-(bNeg?2:1)-4);
        nDigitsI= (bNeg?2:1);
    }
    else if (f>0)
        if ((-1 == nDigitsF) && (f <= 1e-10)) // small value: switch to scientific representation
        {
            nE= (int)floor(log10(f)); // floor (not truncation) so that exact powers of 10 normalize to 1.0 rather than 10.0
            f/= (double)pow(10., nE);
            if (-1 != nDigitsF)
                nDigitsF= __max(nDigitsF, nDigitsI+nDigitsF-(bNeg?2:1)-4);
            nDigitsI= (bNeg?2:1);
        }
    double fI;
    double fF= modf(f, &fI); // fI: integer part, fF: fractional part
    if (-1 == nDigitsF) // figure out number of meaningful digits in fF
    {
        double fG, fGI, fGF;
        do
        {
            nDigitsF++;
            fG = fF*pow(10., nDigitsF);
            fGF= modf(fG, &fGI);
        }
        while ((fGF > 1e-10) && (nDigitsF < 18)); // the cap keeps the afPower10 index below in range
    }
    const double afPower10[20]= {1e0 , 1e1 , 1e2 , 1e3 , 1e4 , 1e5 , 1e6 , 1e7 , 1e8 , 1e9 ,
                                 1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19 };
    uint64_t uI= (uint64_t)round(fI );
    uint64_t uF= (uint64_t)round(fF*afPower10[nDigitsF]);
    if (uF)
        if (GetUnsignedDecDigits(uF) > nDigitsF) // X.99999 was rounded to X+1
        {
            uF= 0;
            uI++;
            if (nE)
            {
                uI/= 10;
                nE++;
            }
        }
    uint8_t nRealDigitsI= GetUnsignedDecDigits(uI);
    if (bNeg)
        nRealDigitsI++;
    int nPads= 0;
    if (-1 != nDigitsI)
    {
        nPads= nDigitsI-nRealDigitsI;
        for (int i= nPads-1; i>=0; i--) // leading spaces
            pczStr[i]= _T(' ');
    }
    if (bNeg) // minus sign
    {
        pczStr[nPads]= _T('-');
        nRealDigitsI--;
        nPads++;
    }
    for (int j= nRealDigitsI-1; j>=0; j--) // digits of integer part
    {
        pczStr[nPads+j]= (uint8_t)(uI%10) + _T('0');
        uI /= 10;
    }
    nPads+= nRealDigitsI;
    if (nDigitsF)
    {
        pczStr[nPads++]= _T('.'); // decimal point
        for (int k= nDigitsF-1; k>=0; k--) // digits of fractional part
        {
            pczStr[nPads+k]= (uint8_t)(uF%10)+ _T('0');
            uF /= 10;
        }
    }
    nPads+= nDigitsF;
    if (nE)
    {
        pczStr[nPads++]= _T('e'); // exponent sign
        if (nE<0)
        {
            pczStr[nPads++]= _T('-');
            nE= -nE;
        }
        else
            pczStr[nPads++]= _T('+');
        for (int l= 2; l>=0; l--) // digits of exponent (always three)
        {
            pczStr[nPads+l]= (uint8_t)(nE%10) + _T('0');
            nE /= 10;
        }
        pczStr[nPads+3]= 0;
    }
    else
        pczStr[nPads]= 0;
    return pczStr;
}
#6
1
Does vsnprintf support I64?
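double x = SOME_VAL; // allowed to be from -1.e18 to 1.e18
bool sign = (x < 0);
if ( sign ) x = -x;
__int64 i = static_cast<__int64>( x );
double xm = x - static_cast<double>( i );
__int64 w = static_cast<__int64>( xm*pow(10.0, DIGITS_VAL) ); // DIGITS_VAL indicates how many digits after the decimal point you want to get
char out[100];
// vsnprintf takes a va_list, so the variadic _snprintf is used here (assuming ntdll exports it);
// the I64 size prefix also needs a conversion specifier (d).
// Caveat: w must be zero-padded to DIGITS_VAL digits, otherwise leading zeros of the fraction are lost.
_snprintf( out, sizeof out, "%s%I64d.%I64d", (sign?"-":""), i, w );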
Another option is to try to find an implementation of gcvt.
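If the available printf family turns out not to handle 64-bit integers or zero padding, the same integer/fraction split can be formatted by hand. The following is a minimal sketch with assumptions stated up front: `AppendUnsigned` and `FormatFixed` are hypothetical names, `digits` is assumed to be in the range 0 to 18, the magnitude of `x` is assumed to be below 1e18, and rounding is a simple round-half-up on the scaled fraction.

```cpp
#include <cmath>
#include <cstdint>
#include <string>

// Append v in decimal, left-padded with zeros to at least minWidth characters.
static void AppendUnsigned(std::string& out, uint64_t v, int minWidth)
{
    char buf[20];                                   // uint64_t has at most 20 decimal digits
    int n = 0;
    do { buf[n++] = static_cast<char>('0' + v % 10); v /= 10; } while (v != 0);
    for (int i = n; i < minWidth; ++i) out += '0';
    while (n > 0) out += buf[--n];
}

// Hypothetical helper: fixed notation with `digits` places after the decimal point.
std::string FormatFixed(double x, int digits)
{
    std::string out;
    if (x < 0) { out += '-'; x = -x; }

    double ipart;
    double fpart = std::modf(x, &ipart);            // integer and fractional parts

    uint64_t scale = 1;
    for (int i = 0; i < digits; ++i) scale *= 10;   // 10^digits

    uint64_t whole = static_cast<uint64_t>(ipart);
    uint64_t fraction = static_cast<uint64_t>(fpart * static_cast<double>(scale) + 0.5);
    if (fraction >= scale) { fraction -= scale; ++whole; }  // 0.999... rounded up to 1

    AppendUnsigned(out, whole, 1);
    if (digits > 0)
    {
        out += '.';
        AppendUnsigned(out, fraction, digits);      // zero-padded, so 0.0625 keeps its leading zero
    }
    return out;
}
```

As an example, `FormatFixed(1.0625, 4)` gives "1.0625", where the unpadded variant would print "1.625". Like the snippet above, this is limited to about 15 to 17 meaningful digits in total.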