How do I dynamically allocate memory and determine array size in C?

Time: 2022-01-12 07:22:18

I am trying to teach myself C, coming from a Python background. My current mini-problem is to hard-code fewer things like array lengths and to allocate memory dynamically based on input.

I've written the following program. I was hoping for suggestions from the community for modifying it in the following ways:

1.) Make the first and last elements of Name variable length. Currently their length is hardcoded as MAX_NAME_LENGTH. This will involve changing both the Name struct declaration and the way I'm assigning values to its elements.

2.) Bonus: Figure out some way to progressively add new elements to the name_list array without having to determine its length beforehand. Basically make it an expandable list.

/* namelist.c 

   Loads up a list of names from a file to then do something with them.

*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define DATAFILE "name_list.txt"
#define DATAFILE_FORMAT "%[^,]%*c%[^\n]%*c"
#define MAX_NAME_LENGTH 100

typedef struct {
  char first[MAX_NAME_LENGTH];
  char last[MAX_NAME_LENGTH];
} Name;


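int main(void) {
  FILE *fp = fopen(DATAFILE, "r");
  if (fp == NULL) {
    perror(DATAFILE);
    return EXIT_FAILURE;
  }

  // Get the number of names in DATAFILE at runtime.
  Name aName;
  int lc = 0;
  // fscanf returns the number of fields converted; 2 means a full record.
  while (fscanf(fp, DATAFILE_FORMAT, aName.last, aName.first) == 2) lc++;
  Name *name_list[lc];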

  // Now actually pull the data out of the file
  rewind(fp);
  int n = 0;
  while (n < lc && fscanf(fp, DATAFILE_FORMAT, aName.last, aName.first) == 2)
  {
    Name *newName = malloc(sizeof(Name));
    if (newName == NULL) {
      puts("Warning: Was not able to allocate memory for ``Name`` ``newName`` on the heap.");
      break;  // Do not copy into a NULL pointer.
    }
    memcpy(newName, &aName, sizeof(Name));
    name_list[n] = newName;
    n++;
  }

  int i = 1;
  for (--n; n >= 0; n--, i++) {
    printf("%d: %s %s\n", i, name_list[n]->first, name_list[n]->last);
    free(name_list[n]);
    name_list[n] = NULL;
  }

  fclose(fp);
  return 0;
}

Sample contents of name_list.txt:

Washington,George
Adams,John 
Jefferson,Thomas
Madison,James

Update 1:

I've implemented a linked list and some helper functions as @Williham suggested; the results are below.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define DATAFILE "name_list.txt"
#define MAX_NAME_LENGTH 30
#define DATAFILE_FORMAT "%29[^,]%*c%29[^\n]%*c"

static const int INPUT_BUFFER_SIZE_DEFAULT = sizeof(char) * MAX_NAME_LENGTH;

typedef struct _Name Name;

struct _Name {
  char *first;
  char *last;
  Name *next;
};

int get_charcount(char *str);

Name * create_name_list(char *filename);
void print_name_list(Name *name);
void free_name_list (Name *name);

int main() {

  // Read a list of names into memory and 
  // return the head of the linked list.
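  Name *head = create_name_list(DATAFILE);
  if (head == NULL) {
    fprintf(stderr, "No names could be read from %s\n", DATAFILE);
    return EXIT_FAILURE;
  }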

  // Now do something with all this data.
  print_name_list(head);

  // If you love something, let it go.
  free_name_list(head);
  head = NULL;
  return 0;
}

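// Returns the number of bytes needed to store str, including the
// terminating '\0', so the result can be passed straight to malloc.
int get_charcount (char *str) 
{
  int input_length = 0;
  while (str[input_length] != '\0')
  {
    input_length++;
  }
  return input_length + 1;
}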

Name * create_name_list(char *filename)
{
  FILE *fp = fopen(filename, "r");
  if (fp == NULL)
    return NULL;
  char *first_input_buffer = malloc(INPUT_BUFFER_SIZE_DEFAULT);
  char *last_input_buffer = malloc(INPUT_BUFFER_SIZE_DEFAULT);

  // Both pointers start as NULL so the first iteration can tell that
  // the list is still empty.
  Name *firstNamePtr = NULL;
  Name *previousNamePtr = NULL;
  while (fscanf(fp, DATAFILE_FORMAT, last_input_buffer, first_input_buffer) == 2)
  {
    Name *newNamePtr = malloc(sizeof(Name));
    if (newNamePtr == NULL)
      break;
    newNamePtr->next = NULL;  // Every new node is the current tail.

    if (previousNamePtr) 
    {
      previousNamePtr->next = newNamePtr;
      previousNamePtr = newNamePtr;
    } 
    else 
    {
      firstNamePtr = previousNamePtr = newNamePtr;
    }

    // Copy each field straight into its own allocation; no
    // intermediate buffer is needed.
    newNamePtr->first = malloc(get_charcount(first_input_buffer));
    strcpy(newNamePtr->first, first_input_buffer);

    newNamePtr->last = malloc(get_charcount(last_input_buffer));
    strcpy(newNamePtr->last, last_input_buffer);
  }
  free(first_input_buffer);
  free(last_input_buffer);
  fclose(fp);

  return firstNamePtr;
}

void print_name_list (Name *name)
{
  static int first_iteration = 1;
  if (first_iteration) 
  {
    printf("\nList of Names\n");
    printf("=============\n");
    first_iteration--;
  }
  printf("%s %s\n",name->first, name->last);
  if (name->next)
    print_name_list(name->next);
  else
    printf("\n");
}

void free_name_list (Name *name)
{
  if (name->next)
    free_name_list(name->next);
  free(name->first);
  free(name->last);
  name->first = NULL;
  name->last = NULL;
  name->next = NULL;
  free(name);
  name = NULL;
}

5 Solutions

#1


5  

A very simple approach is to not use an array at all, but rather a linked list:

This can be done in a number of ways, but the simplest is probably to modify the Name struct as follows:

typedef struct _Name Name;

struct _Name {
  char *first;
  char *last;
  Name *next;
};

Using char * instead of char[] will necessitate some strcpying, but that's really neither here nor there. To expand the list, you can now simply malloc these elements one at a time and set next appropriately.

Note: Remember to set next to NULL when you create new tail elements.
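
As a rough illustration of the approach above, here is a minimal sketch of appending to such a list. The helper names copy_string and name_append, and the idea of tracking a separate tail pointer, are assumptions made for this example and are not part of the original answer.

```c
#include <stdlib.h>
#include <string.h>

typedef struct _Name Name;

struct _Name {
  char *first;
  char *last;
  Name *next;
};

/* Hypothetical helper: returns a heap copy of s, or NULL on failure. */
static char *copy_string(const char *s)
{
  char *copy = malloc(strlen(s) + 1);  /* +1 for the terminating '\0' */
  if (copy != NULL)
    strcpy(copy, s);
  return copy;
}

/* Hypothetical helper: appends a new node after *tail, or starts the
   list if it is still empty. Returns the new node, or NULL on failure. */
Name *name_append(Name **head, Name **tail, const char *first, const char *last)
{
  Name *node = malloc(sizeof *node);
  if (node == NULL)
    return NULL;

  node->first = copy_string(first);
  node->last = copy_string(last);
  node->next = NULL;  /* a new tail element always ends the list */

  if (node->first == NULL || node->last == NULL) {
    free(node->first);
    free(node->last);
    free(node);
    return NULL;
  }

  if (*tail != NULL)
    (*tail)->next = node;
  else
    *head = node;
  *tail = node;
  return node;
}
```

Starting with head and tail both set to NULL, each call such as name_append(&head, &tail, "George", "Washington") adds one element without the total count being known in advance.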

#2


4  

There isn't a way to expand an array in C. All that you can do is allocate a larger block of memory and copy the items over. This is what Python does under the covers.

There's also no way to determine the size of a dynamically allocated array from its pointer, since the block holds only the items and no additional information (Python lists store the length alongside the elements). You could put a marker at the end of the array and count elements until you hit the marker; this is how strings work (the null character '\0' is the marker).
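
Both conventions can be sketched in a few lines. The type IntArray and the functions int_array_sum and count_until are hypothetical names chosen for this example; the first keeps the length next to the pointer, the second counts up to a sentinel the way strlen does.

```c
#include <stddef.h>

/* Hypothetical type: a pointer bundled with its element count, since
   the pointer alone carries no size information. */
typedef struct {
  int *items;
  size_t length;
} IntArray;

/* Sums the elements using the stored length. */
long int_array_sum(const IntArray *arr)
{
  long total = 0;
  for (size_t i = 0; i < arr->length; i++)
    total += arr->items[i];
  return total;
}

/* Counts elements up to (not including) a sentinel value, the same way
   strlen counts characters up to '\0'. The caller must guarantee that
   the sentinel is actually present. */
size_t count_until(const int *items, int sentinel)
{
  size_t n = 0;
  while (items[n] != sentinel)
    n++;
  return n;
}
```

The sentinel approach only works when one value can be reserved as "never valid data"; storing the length explicitly has no such restriction.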

#3


2  

You'll have to have a maximum-length array to take the input, because you don't know how long the input is going to be. However, you only need one array of that length. Once you've got the input, you can get its length and allocate an array of just the right size to store the name.
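
One way this could look, as a sketch only: the function read_line_exact and the bound MAX_INPUT_LENGTH are assumptions for the example, and it reads whole lines with fgets rather than using the question's fscanf format.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INPUT_LENGTH 100  /* assumed upper bound for one line */

/* Hypothetical helper: reads one line from fp into a fixed buffer, then
   returns a heap copy sized to the text that was read.
   Returns NULL at end of file or if allocation fails. */
char *read_line_exact(FILE *fp)
{
  char buffer[MAX_INPUT_LENGTH];

  if (fgets(buffer, sizeof buffer, fp) == NULL)
    return NULL;

  buffer[strcspn(buffer, "\n")] = '\0';  /* drop the newline, if any */

  size_t length = strlen(buffer);
  char *exact = malloc(length + 1);      /* +1 for the terminating '\0' */
  if (exact != NULL)
    memcpy(exact, buffer, length + 1);
  return exact;
}
```

Only the single stack buffer has the maximum size; every stored string occupies just strlen + 1 bytes. A line longer than the buffer would be returned in pieces across successive calls, so the bound still has to be chosen with the data in mind.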

As for growing your list, you can use realloc or, of course, a linked list.

#4


1  

Have you started reading about linked lists? It may help you to have a dynamic structure.

You can also look into the difference between an array and a linked list. A char pointer can be a member of your linked-list node, and the memory it points to can be allocated dynamically.

Check this link from * for more information on the same.

#5


1  

Since no one has mentioned it yet: if you use the *scanf functions to read untrusted input strings, you should always use maximum field width specifiers. For example:

scanf("%19s", str);

Note that this specifies the maximum string length not including the terminating NUL, so in this example str should be 20 chars long.
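
To make the relationship between the width and the buffer size concrete, here is a small sketch. The helper read_word and the constant WORD_SIZE are assumptions for this example, and sscanf is used in place of scanf so the input can be supplied as a string.

```c
#include <stdio.h>

#define WORD_SIZE 20  /* 19 characters plus the terminating '\0' */

/* Hypothetical helper: extracts the first whitespace-delimited word of
   input into word, which must have room for WORD_SIZE chars.
   Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *input, char word[WORD_SIZE])
{
  /* The width 19 is WORD_SIZE - 1: the scanf-family functions store at
     most that many characters and then append '\0'. */
  return sscanf(input, "%19s", word) == 1;
}
```

Given a 26-character input word, only the first 19 characters are stored, so the 20-byte buffer is never overrun.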

If you don't limit scanf input conversions like this, on buffer overflow you will not only get undefined behaviour, but also a security vulnerability.

Back to your question about dynamically growing buffers. Imagine you are reading some input composed of lines, and each line can be at most 78 chars wide. In situations like this, you know the maximum length of each line, but have no idea of the maximum length of your input as a whole. A common way to handle this is to allocate some default space and, whenever you need more, double it, so you don't need to call realloc() many times.
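
A minimal sketch of that doubling strategy, using ints for brevity: the type IntVec, the functions int_vec_push and int_vec_free, and the initial capacity of 4 are all assumptions made for this example.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints; all names here are illustrative. */
typedef struct {
  int *items;
  size_t length;    /* elements in use */
  size_t capacity;  /* elements allocated */
} IntVec;

/* Appends value, doubling the capacity when the array is full.
   Returns 1 on success, 0 if memory could not be allocated. */
int int_vec_push(IntVec *vec, int value)
{
  if (vec->length == vec->capacity) {
    size_t new_capacity = vec->capacity ? vec->capacity * 2 : 4;
    /* Assign to a temporary: if realloc fails it returns NULL and the
       original block is still valid and must not be lost. */
    int *grown = realloc(vec->items, new_capacity * sizeof *grown);
    if (grown == NULL)
      return 0;
    vec->items = grown;
    vec->capacity = new_capacity;
  }
  vec->items[vec->length++] = value;
  return 1;
}

/* Releases the storage and resets the vector to empty. */
void int_vec_free(IntVec *vec)
{
  free(vec->items);
  vec->items = NULL;
  vec->length = vec->capacity = 0;
}
```

Starting from IntVec vec = { NULL, 0, 0 }, ten pushes trigger only three reallocations (capacities 4, 8 and 16). The same pattern would apply to an array of Name pointers.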
