How should I declare a string in a C struct?

Hello, I am new to this site and would like help understanding what is considered the "norm" when coding structures in C that require a string. Basically, I am wondering which of the following approaches is considered the "industry standard" for keeping track of ALL of the memory a structure requires:

1) Fixed Size String:

typedef struct
{
    int damage;
    char name[40];
} Item;

I can now get the size using sizeof(Item).

2) Character Array Pointer

typedef struct
{
    int damage;
    char *name;
} Item;

I know I can store the size of name in a second variable, but is there another way?

i) Is there any other advantage to using the fixed size (1)

char name[40];

versus doing the following and using a pointer to a char array (2)?

char *name;

and if so, what is the advantage?

ii) Also, will a string accessed through a pointer to a char array (2) be stored sequentially, immediately after the structure (right after the pointer to the string), or will it be stored somewhere else in memory?

iii) How can one find the length of a char * string variable (without using a size_t or other integer value to store the length)?

3 Solutions

#1

There are basically three common conventions for strings. All three are found in the wild, both for in-memory representation and for storage/transmission.

Fixed size. Access is very efficient, but if the actual length varies you both waste space and need one of the methods below to determine the end of the "real" content.

Length prefixed. Extra space is included in the dynamic allocation to hold the length. From the pointer you can find both the character content and the length immediately preceding it. Example: BSTR. Sometimes the length is encoded to be more space efficient for short strings. Example: ASN.1.

Terminated. The string extends until the first occurrence of the termination character (typically NUL), and the content cannot contain that character. Some variations make the terminator two NULs in sequence, which allows individual NUL characters to exist in the string; the string is then often treated as a packed list of strings. Other variations use an encoding such as byte stuffing (UTF-8 would also work) to guarantee that some code reserved for termination can never appear in the encoded version of the content.
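
To illustrate the double-NUL variation described above, here is a minimal sketch that walks a packed list of strings. The function name packed_list_count is hypothetical, and the sketch assumes the list is well formed, meaning each entry ends with one NUL and an empty entry ends the list.

```c
#include <stddef.h>
#include <string.h>

/* Counts the entries in a packed list such as "ab\0cd\0\0".
   Each entry ends with one NUL; an empty entry marks the end of the list.
   Assumption: the caller guarantees the list is properly double-NUL terminated. */
size_t packed_list_count(const char *list)
{
    size_t count = 0;

    while (*list != '\0') {
        list += strlen(list) + 1;   /* skip this entry and its terminator */
        count++;
    }
    return count;
}
```

Note that a C string literal already carries one implicit NUL at its end, so a literal written as "sword\0axe\0bow\0" ends with the required two NULs.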

In the third case, a function such as strlen searches for the terminator to find the length.
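
As a rough illustration of the difference between the capacity of a fixed-size buffer and the length of the terminated text stored in it, consider the sketch below. The type name FixedItem and both helper functions are made up for this example; only sizeof and strlen come from the language and the standard library.

```c
#include <stddef.h>
#include <string.h>

typedef struct
{
    int damage;
    char name[40];   /* fixed-size buffer holding NUL-terminated text */
} FixedItem;

/* Capacity of the buffer: a compile-time constant (40 here).
   sizeof does not evaluate its operand, so no pointer is dereferenced. */
size_t fixed_item_capacity(void)
{
    return sizeof(((FixedItem *)0)->name);
}

/* Length of the text actually stored: scans for the terminating NUL. */
size_t fixed_item_name_length(const FixedItem *item)
{
    return strlen(item->name);
}
```

For an item initialized with the name "Sword", the capacity stays 40 while the stored length is 5. The same strlen call works on a char * member, provided the pointed-to data is NUL-terminated.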

Both cases that use pointers can point to data immediately following the fixed portion of the structure, if you carefully allocate it that way. If you want to force this, use a flexible array member at the end of your structure (no pointer needed), like this:

typedef struct
{
    int damage;
    char name[]; // terminated
} Item;

typedef struct
{
    int damage;
    int length_of_name;
    char name[]; // length prefixed
} Item;
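
One way the terminated variant might be allocated is sketched below, assuming a C99 or later compiler. FlexItem and flex_item_create are hypothetical names chosen for this example, not part of the answer above.

```c
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int damage;
    char name[];     /* flexible array member, NUL-terminated */
} FlexItem;

/* Allocates the fixed part and the name in ONE block.
   Returns NULL if the allocation fails. */
FlexItem *flex_item_create(int damage, const char *name)
{
    size_t len = strlen(name);
    FlexItem *item = malloc(sizeof *item + len + 1);   /* +1 for the terminator */

    if (item == NULL)
        return NULL;

    item->damage = damage;
    memcpy(item->name, name, len + 1);   /* copies the terminator too */
    return item;
}
```

Because the structure and the string share a single allocation, one call to free on the returned pointer releases both. The trade-off is that sizeof on the type no longer accounts for the name, so the total size has to be computed or stored separately.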

#2

1) is there any other advantage to using the fixed size (1)

char name[40];

versus doing the following and using a pointer to a char array (2)?

char *name;

and if so, what is the advantage?

With your array declared as char name[40];, space for name is already allocated and you are free to copy information into name[0] through name[39]. A char *name;, however, is simply a character pointer: it can point to an existing string in memory, but on its own it cannot receive copied data until you allocate memory to hold it. So if you have a 30-character string that you want to copy into name declared as char *name;, you must first use malloc to allocate 30 characters plus one additional character for the null terminator:

char *name;
name = malloc (sizeof (char) * (30 + 1));

Then you are free to copy information to/from name. An advantage of dynamic allocation is that you can realloc the memory for name if the information you are storing in it grows beyond 30 characters. An additional requirement is that, after allocating memory for name, you are responsible for freeing it when it is no longer needed. That's a rough outline of the pros, cons, and requirements of using one as opposed to the other.
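
The allocate, grow, and free cycle described above could look roughly like the following sketch. PtrItem, ptr_item_set_name, and ptr_item_clear are hypothetical names introduced only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int damage;
    char *name;      /* points to separately allocated storage, or NULL */
} PtrItem;

/* Replaces item->name with a heap copy of new_name, growing or shrinking
   the existing allocation as needed.
   Returns 0 on success, -1 if allocation fails (the old name is kept).
   Assumption: new_name does not point into item->name. */
int ptr_item_set_name(PtrItem *item, const char *new_name)
{
    size_t size = strlen(new_name) + 1;      /* +1 for the terminator */
    char *buf = realloc(item->name, size);   /* realloc(NULL, n) acts like malloc(n) */

    if (buf == NULL)
        return -1;

    memcpy(buf, new_name, size);
    item->name = buf;
    return 0;
}

/* Frees the name and resets the pointer so a second call is harmless. */
void ptr_item_clear(PtrItem *item)
{
    free(item->name);
    item->name = NULL;
}
```

Starting from an item whose name is NULL, the first call allocates and later calls resize. Assigning the result of realloc to a temporary first avoids losing the original pointer if the call fails.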

#3

If you know the maximum length of the string you need, then you can use a character array. It does mean, though, that you will typically use more memory than you would with dynamically allocated character arrays. Also, take a look at CString if you are using C++. You can find the length of the string stored in the character array using strlen. In the case of static allocation, I believe the array will be part of the variable itself. Dynamically allocated memory can be anywhere on the heap.
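
The point about where the characters live can be checked with a small sketch like the one below. The type names FixedItem and PtrItem and the helper points_inside are hypothetical, and the sketch assumes the platform provides uintptr_t, which is optional in the C standard but widely available.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int damage; char name[40]; } FixedItem;   /* array stored inside the struct */
typedef struct { int damage; char *name;    } PtrItem;     /* only the pointer is inside */

/* Returns 1 if p lies within the size bytes occupied by the object at base,
   0 otherwise. Addresses are compared as integers via uintptr_t. */
int points_inside(const void *base, size_t size, const void *p)
{
    uintptr_t b = (uintptr_t)base;
    uintptr_t q = (uintptr_t)p;

    return q >= b && q < b + size;
}
```

For a FixedItem, the address of name falls inside the structure, which is why sizeof(FixedItem) covers the whole string buffer. For a PtrItem whose name came from malloc, the characters sit in a separate heap block, and sizeof(PtrItem) counts only the pointer.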
