Is there potential data corruption when a struct containing an array of pointers is malloc'ed?

Time: 2022-09-06 12:01:10

I want to take in three arbitrary-length buffers of doubles. Below is a short example:

#include <stdio.h>
#include <stdlib.h>

struct Data
{
  double *foo[3];
};

int main(void)
{
  double bar1[] = {1.0, 2.0, 3.0};
  double bar2[] = {1.0, 2.0, 3.0, 4.0};
  double bar3[] = {1.0, 2.0, 3.0, 4.0, 5.0};

  struct Data *data = (struct Data*)malloc(sizeof(struct Data));
  if (data == NULL)
    return 1;

  data->foo[0] = bar1;
  data->foo[1] = bar2;
  data->foo[2] = bar3;

  printf("%lf %lf %lf\n", data->foo[0][0], data->foo[0][1], data->foo[0][2]);
  printf("%lf %lf %lf %lf\n", data->foo[1][0], data->foo[1][1],
         data->foo[1][2], data->foo[1][3]);
  printf("%lf %lf %lf %lf %lf\n", data->foo[2][0], data->foo[2][1],
         data->foo[2][2], data->foo[2][3], data->foo[2][4]);

  free(data);
  return 0;
}

My concern is that if I malloc Data in the manner above I run the risk of corrupt data. If I allocate memory on the heap for an array of pointers to double buffers (or essentially an arbitrarily sized two-dimensional array of doubles) without knowing the size, is the data protected in any way? I feel like it runs the possibility of overwritten data. Am I correct in this thinking? This compiles and prints, but I'm not sure I trust it in a much larger scale implementation.

2 solutions

#1

As long as you do not assign wrong values, there is no data corruption. You have to be aware of where the data lives and how long it remains valid. For example:

/* !!!! broken code ahead !!!! */
#include <stdio.h>
#include <stdlib.h>

struct Data
{
  double *foo[3];
};

void initData(struct Data* data) {
  /* Automatic arrays: their lifetime ends when initData() returns. */
  double bar1[] = {1.0, 2.0, 3.0};
  double bar2[] = {1.0, 2.0, 3.0, 4.0};
  double bar3[] = {1.0, 2.0, 3.0, 4.0, 5.0};
  data->foo[0] = bar1;
  data->foo[1] = bar2;
  data->foo[2] = bar3;
}

int main(void)
{
  struct Data *data = (struct Data*)malloc(sizeof(struct Data));
  if (data == NULL)
    return 1;
  initData(data);

  /* BUG: data->foo[0..2] are dangling pointers from here on. */
  printf("%lf %lf %lf\n", data->foo[0][0], data->foo[0][1], data->foo[0][2]);
  printf("%lf %lf %lf %lf\n", data->foo[1][0], data->foo[1][1],
         data->foo[1][2], data->foo[1][3]);
  printf("%lf %lf %lf %lf %lf\n", data->foo[2][0], data->foo[2][1],
         data->foo[2][2], data->foo[2][3], data->foo[2][4]);

  free(data);
  return 0;
}

This would be a bad idea:

  • data is heap-allocated and "lives" until you call free()

  • bar1..3 are stack-allocated and only live inside initData()

  • data->foo[0..2] point to bar1..3, so they are only valid inside initData()

  • the printf calls might appear to work (I haven't tested), but the code is broken: reading through those pointers after initData() returns is undefined behavior

Getting this right is the hardest part of C. If you develop on Linux, you should take a look at valgrind to catch this type of bug (the one in my example is obvious, but such bugs can get really hard to find).

#2

Certainly the malloc() itself does not contribute to the risk of data corruption. Whatever risk there is would be at least as great if the struct in question were an automatic variable allocated on the stack.

What you seem really to be asking about is the data structure itself, and basically pointers in general. Yes, if you have a pointer then it is possible to attempt an invalid memory access via that pointer, outside the bounds of the object to which the pointer points. C provides no protection against such attempts; it addresses the issue by declaring that the behavior of a program that attempts such an action is undefined.

It is incumbent on the programmer to ensure that his program does not attempt such an action. For pointers to arrays, one generally approaches that problem either by keeping track separately of the length of the pointed-to array, or by marking the end of the array with a sentinel value that cannot appear as normal data.
