What is the proper way of implementing a good "itoa()" function?

Date: 2022-07-14 14:28:33

I was wondering if my implementation of an "itoa" function is correct. Maybe you can help me get it a bit more "correct"; I'm pretty sure I'm missing something. (Maybe there is already a library doing the conversion the way I want it done, but I couldn't find any.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

char * itoa(int i) {
  char * res = malloc(8*sizeof(int));
  sprintf(res, "%d", i);
  return res;
}

int main(int argc, char *argv[]) {
 ...

11 answers

#1


6  

The only actual error is that you don't check the return value of malloc for null.
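
As a minimal sketch of that fix, the question's function could check the allocation before writing to it. The name itoa_checked is hypothetical, and snprintf is used here only to bound the write; the buffer size is the one from the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name: the question's itoa() plus a check of malloc's result. */
char *itoa_checked(int i) {
    size_t size = 8 * sizeof(int);      /* same size as in the question */
    char *res = malloc(size);
    if (res == NULL)
        return NULL;                    /* allocation failed; the caller must check */
    snprintf(res, size, "%d", i);
    return res;                         /* the caller owns the string and must free() it */
}
```

The caller still has to test the returned pointer for NULL and free it when done.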

The name itoa is kind of already taken for a function that's non-standard, but not that uncommon. It doesn't allocate memory, rather it writes to a buffer provided by the caller:

char *itoa(int value, char * str, int base);

If you don't want to rely on your platform having that, I would still advise following the pattern. String-handling functions which return newly allocated memory in C are generally more trouble than they're worth in the long run, because most of the time you end up doing further manipulation, and so you have to free lots of intermediate results. For example, compare:

void delete_temp_files() {
    char filename[20];
    strcpy(filename, "tmp_");
    char *endptr = filename + strlen(filename);
    for (int i = 0; i < 10; ++i) {
        itoa(i, endptr, 10); // itoa doesn't allocate memory
        unlink(filename);
    }
}

vs.

void delete_temp_files() {
    char filename[20];
    strcpy(filename, "tmp_");
    char *endptr = filename + strlen(filename);
    for (int i = 0; i < 10; ++i) {
        char *number = itoa(i, 10); // itoa allocates memory
        strcpy(endptr, number);
        free(number);
        unlink(filename);
    }
}

If you had reason to be especially concerned about performance (for instance if you're implementing a stdlib-style library including itoa), or if you were implementing bases that sprintf doesn't support, then you might consider not calling sprintf. But if you want a base 10 string, then your first instinct was right. There's absolutely nothing "incorrect" about the %d format specifier.

Here's a possible implementation of itoa, for base 10 only:

char *itobase10(char *buf, int value) {
    sprintf(buf, "%d", value);
    return buf;
}

Here's one which incorporates the snprintf-style approach to buffer lengths:

int itobase10n(char *buf, size_t sz, int value) {
    return snprintf(buf, sz, "%d", value);
}
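
Because snprintf returns the length the full result would have had, a caller can detect truncation by comparing that value with the buffer size. A small sketch of that check follows; itobase10n_fits is a hypothetical helper, not part of the answer.

```c
#include <stddef.h>
#include <stdio.h>

int itobase10n(char *buf, size_t sz, int value) {
    return snprintf(buf, sz, "%d", value);
}

/* Hypothetical helper: returns 1 if the whole number fit into buf, 0 otherwise. */
int itobase10n_fits(char *buf, size_t sz, int value) {
    int n = itobase10n(buf, sz, value);
    /* n < 0 is an output error; n >= sz means the text was cut short */
    return n >= 0 && (size_t)n < sz;
}
```

With a 4-byte buffer, 123 fits, while 12345 is cut to "123" and reported as not fitting.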

#2


10  

// Yet another itoa implementation
// returns: the length of the number string (not counting the terminating '\0')
int itoa(int value, char *sp, int radix)
{
    char tmp[sizeof(int) * 8]; // room for every digit in base 2, assuming 8-bit bytes
    char *tp = tmp;
    unsigned i;
    unsigned v;

    int sign = (radix == 10 && value < 0);
    if (sign)
        v = 0u - (unsigned)value; // unsigned negation is well defined, even for INT_MIN
    else
        v = (unsigned)value;

    while (v || tp == tmp)
    {
        i = v % radix;
        v /= radix;
        if (i < 10)
          *tp++ = i+'0';
        else
          *tp++ = i + 'a' - 10;
    }

    int len = tp - tmp;

    if (sign)
    {
        *sp++ = '-';
        len++;
    }

    while (tp > tmp)
        *sp++ = *--tp;
    *sp = '\0'; // terminate the string; the caller's buffer needs len + 1 bytes

    return len;
}

// Usage Example:
char int_str[15]; // be careful with the length of the buffer
int n = 56789;
int len = itoa(n,int_str,10);

#3


3  

I think you are allocating perhaps too much memory. malloc(8*sizeof(int)) will give you 32 bytes on most machines, which is probably excessive for a text representation of an int.

#4


2  

I'm not quite sure where you get 8*sizeof(int) as the maximum possible number of characters -- ceil(8 / (log(10) / log(2))) yields a multiplier of 3*. Additionally, under C99 and some older POSIX platforms you can create an accurately-allocating version with snprintf():

char *
itoa(int i) 
{
    int n = snprintf(NULL, 0, "%d", i) + 1;
    char *s = malloc(n);

    if (s != NULL)
        snprintf(s, n, "%d", i);
    return s;
}

HTH

#5


2  

I found an interesting resource that deals with several different issues in itoa implementations; you might want to look it up too: "itoa() implementations with performance tests".

#6


1  

You should use a function in the printf family for this purpose. If you'll be writing the result to stdout or a file, use printf/fprintf. Otherwise, use snprintf with a buffer big enough to hold 3*sizeof(type)+2 bytes or more.
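
The 3*sizeof(type)+2 bound works because one 8-bit byte contributes at most about 2.41 decimal digits, rounded up to 3, with 2 more bytes for the sign and the terminator. A sketch of how that bound might be used for int, assuming 8-bit bytes; INT_STR_MAX, int_to_buf and longest_int_len are hypothetical names.

```c
#include <limits.h>
#include <stdio.h>

/* Bound from the answer: 3 characters per byte, plus sign and '\0'. */
#define INT_STR_MAX (3 * sizeof(int) + 2)

/* Hypothetical helper: formats value into a buffer of INT_STR_MAX bytes. */
int int_to_buf(char *buf, int value) {
    return snprintf(buf, INT_STR_MAX, "%d", value);
}

/* Length of the longest possible result, i.e. the text of INT_MIN. */
int longest_int_len(void) {
    char buf[INT_STR_MAX];
    return int_to_buf(buf, INT_MIN);
}
```

The bound is slightly generous, which is the point: it can be computed at compile time without knowing the exact width of int.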

#7


1  

A good itoa()

Works for [INT_MIN...INT_MAX], base [2...36]
Does not require 2's complement.
Does not require unsigned to have a greater positive range than int - it does not use unsigned.
Does not assume int size.

Note: Uses a '-' for negative numbers, even when base != 10.

Tailor the error handling as needed.

#include <limits.h>  // CHAR_BIT
#include <stdio.h>   // fprintf
#include <string.h>  // memcpy

char* itostr(char *dest, size_t size, int a, int base) {
  static char buffer[sizeof a * CHAR_BIT + 1 + 1];
  static const char digits[36] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

  if (base < 2 || base > 36) {
    fprintf(stderr, "Invalid base");
    return NULL;
  }

  char* p = &buffer[sizeof(buffer) - 1];
  *p = '\0';

  int an = a < 0 ? a : -a;  

  // Works with negative `int`
  do {
    *(--p) = digits[-(an % base)];
    an /= base;
  } while (an);

  if (a < 0) {
    *(--p) = '-';
  }

  size_t size_used = &buffer[sizeof(buffer)] - p;
  if (size_used > size) {
    fprintf(stderr, "Scant buffer %zu > %zu", size_used , size);
    return NULL;
  }
  return memcpy(dest, p, size_used);
}
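
The key trick in this answer is doing the arithmetic on the negative side: -INT_MAX always fits in an int, while -INT_MIN may not. Below is a sketch of the same technique with a local scratch buffer instead of a static one, so that concurrent calls do not share state. The name itostr_local is hypothetical, and the error messages are dropped for brevity.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of itostr(): same negative-side arithmetic,
   but the scratch buffer lives on the stack of each call. */
char *itostr_local(char *dest, size_t size, int a, int base) {
    char buffer[sizeof a * CHAR_BIT + 1 + 1];   /* base-2 digits + sign + '\0' */
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char *p = &buffer[sizeof buffer - 1];

    if (base < 2 || base > 36)
        return NULL;
    *p = '\0';

    int an = a < 0 ? a : -a;            /* an <= 0 from here on */
    do {
        *(--p) = digits[-(an % base)];  /* remainder is <= 0, so negate it */
        an /= base;
    } while (an);
    if (a < 0)
        *(--p) = '-';

    size_t size_used = (size_t)(&buffer[sizeof buffer] - p);
    if (size_used > size)
        return NULL;                    /* destination too small */
    return memcpy(dest, p, size_used);
}
```

This relies on C99 division semantics, where the quotient truncates toward zero and the remainder takes the sign of the dividend.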

#8


0  

This should work:

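#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>

char * itoa_alloc(int x) {
   int s = x < 0 ? 1 : 0; // space for a '-'
   // digit count is floor(log10(|x|)) + 1; log10(0) is undefined, so 0 is handled separately
   size_t len = x == 0 ? 1 : (size_t) floor( log10( fabs((double)x) ) ) + 1;
   char * str = malloc(len+s + 1);

   if (str != NULL)
      sprintf(str, "%i", x);

   return str;
}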

If you don't want to have to use the math/floating point functions (and have to link in the math libraries) you should be able to find non-floating point versions of log10 by searching the Web and do:

size_t len = my_log10( abs(x) ) + 1;

That might give you 1 more byte than you needed, but you'd have enough.
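
One possible integer-only my_log10 is sketched below. It is an assumption about what such a helper could look like, not a library function: it rounds down and returns 0 for an input of 0, so adding 1 gives the number of decimal digits.

```c
/* Hypothetical integer log10: floor(log10(v)) for v > 0, and 0 for v == 0. */
unsigned my_log10(unsigned v) {
    unsigned n = 0;
    while (v >= 10) {   /* each division by 10 strips one decimal digit */
        v /= 10;
        n++;
    }
    return n;
}
```

The loop runs once per digit, so it costs at most ten iterations for a 32-bit value and needs no math library.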

#9


0  

There are a couple of suggestions I might make. You can use a static buffer and strdup to avoid repeatedly allocating too much memory on subsequent calls. I would also add some error checking.

char *itoa(int i)
{
  static char buffer[12];

  if (snprintf(buffer, sizeof(buffer), "%d", i) < 0)
    return NULL;

  return strdup(buffer);
}

If this will be called in a multithreaded environment, remove "static" from the buffer declaration.
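
A sketch of that thread-friendlier variant, with the buffer as an ordinary local. Note the swapped-in pieces: since strdup is POSIX rather than ISO C (before C23), this version copies with malloc and memcpy instead, and it sizes the buffer from sizeof(int) rather than hard-coding 12. The name itoa_copy is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: formats into a per-call buffer, then returns a heap copy. */
char *itoa_copy(int i) {
    char buffer[3 * sizeof(int) + 2];   /* automatic storage: one per call */
    int n = snprintf(buffer, sizeof buffer, "%d", i);
    if (n < 0 || (size_t)n >= sizeof buffer)
        return NULL;                    /* output error or truncation */

    char *copy = malloc((size_t)n + 1); /* exactly the bytes the text needs */
    if (copy != NULL)
        memcpy(copy, buffer, (size_t)n + 1);
    return copy;                        /* NULL if malloc failed */
}
```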

#10


0  

sprintf is quite slow; if performance matters, it is probably not the best solution.

If the base argument is a power of 2, the conversion can be done with shifting and masking, and one can avoid reversing the string by recording the digits from the highest positions first. For instance, something like this for base=16:

/* fragment: assumes `unsigned value`, a caller-provided `char result[]` and `int len = 0` */
const int bits_per_digit = 4;            /* base 16 = 2^4 */
int  num_iter = sizeof(unsigned) * 2;    /* two hex digits per 8-bit byte */

const char digits[] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

/* skip zeros in the highest positions, but always keep the lowest digit */
int i = num_iter - 1;
for (; i > 0; i--)
{
    int digit = (value >> (bits_per_digit*i)) & 15;
    if ( digit > 0 )  break;
}

for (; i >= 0; i--)
{
    int digit = (value >> (bits_per_digit*i)) & 15;
    result[len++] = digits[digit];
}
result[len] = '\0';
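
Wrapped into a complete function, the shift-and-mask idea might look like the following sketch. The name utohex and its signature are hypothetical; the caller is assumed to pass a buffer with room for every hex digit of an unsigned plus the terminator.

```c
#include <limits.h>

/* Hypothetical helper: writes value in lowercase hex, returns the length. */
int utohex(unsigned value, char *result) {
    static const char digits[] = "0123456789abcdef";
    const int bits_per_digit = 4;                       /* 16 = 2^4 */
    int i = (int)(sizeof value * CHAR_BIT / bits_per_digit) - 1;
    int len = 0;

    /* skip leading zero digits, but stop at the lowest one so 0 prints "0" */
    while (i > 0 && ((value >> (bits_per_digit * i)) & 15u) == 0)
        i--;

    /* emit the remaining digits from the highest position down: no reversal */
    for (; i >= 0; i--)
        result[len++] = digits[(value >> (bits_per_digit * i)) & 15u];

    result[len] = '\0';
    return len;
}
```

Taking an unsigned argument keeps the right shifts well defined for every input.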

For decimal there is a nice idea: use a static array big enough to record the digits in reverse order.
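
One way to read that idea is to fill a scratch array from its end, so the digits come out already in reading order. The sketch below is an assumption about the approach, not the linked code: utoa10 is a hypothetical name, it handles unsigned values only, and it uses a local array rather than a static one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts value to decimal text in dest, returns dest. */
char *utoa10(unsigned value, char *dest) {
    char tmp[3 * sizeof value + 1];     /* at most 3 digits per byte, plus '\0' */
    char *p = tmp + sizeof tmp - 1;

    *p = '\0';
    do {
        *--p = (char)('0' + value % 10);    /* lowest digit goes in last place */
        value /= 10;
    } while (value);

    /* copy the digits and the terminator in one go */
    return memcpy(dest, p, (size_t)(tmp + sizeof tmp - p));
}
```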

#11


-1  

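#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int i=1234;
  char stmp[12]; // room for any 32-bit int: sign, 10 digits, '\0'
#ifdef _MSC_VER
  puts(_itoa(i,stmp,10));
#else
  puts((sprintf(stmp,"%d",i),stmp));
#endif
  return 0;
}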
