What is the difference between char array and char pointer in C?

Time: 2021-06-25 13:30:12

I am trying to understand pointers in C, but I am currently confused by the following:

  • char *p = "hello"

    This is a char pointer pointing at the character array, starting at h.

  • char p[] = "hello"

    This is an array that stores hello.

What is the difference when I pass both of these variables into this function?

void printSomething(char *p)
{
    printf("p: %s", p);
}

7 answers

#1


179  

char* and char[] are different types, but this is not immediately apparent in all cases. This is because arrays decay into pointers: if an expression of type char[] is provided where one of type char* is expected, the compiler automatically converts the array into a pointer to its first element.

Your example function printSomething expects a pointer, so if you try to pass an array to it like this:

char s[10] = "hello";
printSomething(s);

The compiler pretends that you wrote this:

char s[10] = "hello";
printSomething(&s[0]);

#2


62  

Let's see:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char *p = "hello";
    char q[] = "hello"; // no need to count this

    printf("%zu\n", sizeof(p)); // => size of pointer to char -- 4 on x86, 8 on x86-64
    printf("%zu\n", sizeof(q)); // => size of char array in memory -- 6 on both

    // size_t strlen(const char *s) and we don't get any warnings here:
    printf("%zu\n", strlen(p)); // => 5
    printf("%zu\n", strlen(q)); // => 5

    return 0;
}

foo* and foo[] are different types, and the compiler handles them differently: a pointer is an address together with the type of the object it points to, while an array is the storage for its elements together with its length, if known (for example, when the array is declared with a fixed size). The details can be found in the standard. At runtime there is almost no difference between them: once an array has decayed, both are used as plain addresses.

Also, there is a related question in the C FAQ:

Q: What is the difference between these initializations?

char a[] = "string literal";
char *p  = "string literal";

My program crashes if I try to assign a new value to p[i].

A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

  1. As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
  2. Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.

Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).

See also questions 1.31, 6.1, 6.2, 6.8, and 11.8b.

References: K&R2 Sec. 5.5 p. 104
ISO Sec. 6.1.4, Sec. 6.5.7
Rationale Sec. 3.1.4
H&S Sec. 2.7.4 pp. 31-2

#3


22  

C99 N1256 draft

There are two completely different uses of character string literals:

  1. Initialize char[]:

    char c[] = "abc";

    This is "more magic", and described at 6.7.8/14 "Initialization":

    An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

    So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};

    Like any other regular array, c can be modified.

  2. Everywhere else: it generates an unnamed array of char with static storage duration, and modifying that array gives UB.

    So when you write:

    char *c = "abc";

    This is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;

    Note the implicit conversion from char[] to char *, which is always legal.

    Then if you modify c[0], you also modify __unnamed, which is UB.

    This is documented at 6.4.5 "String literals":

    5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...]

    6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

6.7.8/32 "Initialization" gives a direct example:

EXAMPLE 8: The declaration

char s[] = "abc", t[3] = "abc";

defines "plain" char array objects s and t whose elements are initialized with character string literals.

This declaration is identical to

char s[] = { 'a', 'b', 'c', '\0' },
     t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration

char *p = "abc";

defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

GCC 4.8 x86-64 ELF implementation

Program:

#include <stdio.h>

int main() {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

Compile and decompile:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

Output contains:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00
        c: R_X86_64_32S .rodata

Conclusion: GCC stores the string that the char* points to in the .rodata section, not in .text.

If we do the same for char[]:

 char s[] = "abc";

we obtain:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

so it gets stored on the stack (relative to %rbp).

Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:

readelf -l a.out

which contains:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata

#4


7  

You're not allowed to change the contents of a string constant, which is what the first p points to. The second p is an array initialized with a string constant, and you can change its contents.


#5


5  

For cases like this, the effect is the same: you end up passing the address of the first character in a string of characters.

The declarations are obviously not the same, though.

The following sets aside memory for a string and also for a character pointer, and then initializes the pointer to point to the first character in the string:

char *p = "hello";

The following, by contrast, sets aside memory just for the string, so it can actually use less memory:

char p[10] = "hello";

#6


2  

As far as I can remember, indexing an array is actually pointer arithmetic on its elements. For example

p[1] == *(p + 1)

is a true statement.

#7


0  

char p[3] = "hello"? It should be char p[6] = "hello"; remember that there is a '\0' char at the end of a "string" in C.

Anyway, an array in C behaves in most expressions like a pointer to the first of a run of adjacent objects in memory. The only differences are in semantics: while you can change the value of a pointer so that it points to a different location in memory, an array, once created, always refers to the same location.
Also, when you use an array, the allocation and release of its storage (the "new" and "delete") are done for you automatically.
