How are C strings allocated in memory?

Time: 2021-02-27 21:17:26

Say I have a simple function that returns a C string this way:

const char * getString()
{
  const char * ptr = "blah blah";
  return ptr; 
}

and I call getString() from main() this way:

  const char * s = getString();

1) According to gdb, the variable ptr is stored on the stack, but the string pointed to by ptr is not:

(gdb) p &ptr
$1 = (const char **) 0x7fffffffe688

(gdb) p ptr
$2 = 0x4009fc "blah blah"

Does this mean that "blah blah" is not a local variable inside getString()?

I guess that if it were a local variable, I would not be able to pass it to my main() function... But if it's not, where is it stored? On the heap? Is that a "kind of" dynamic memory allocation performed by the OS every time it encounters a string, or what?

2) If I use an array instead of a pointer, this way:

const char *getString2()
{
  const char a[] = "blah blah blah";
  return a;
}

the compiler warns me that:

warning: address of local variable ‘a’ returned

(and of course the program compiles, but it doesn't work).

Actually, if I ask gdb, I get

(gdb) p &a
$2 = (const char (*)[15]) 0x7fffffffe690

But I thought that const char * ptr and const char a[] were basically the same thing. Looks like they're not.

Am I wrong? What exactly is the difference between the two versions?

Thank you!

5 answers

#1


7  

When you write

const char *ptr = "blah blah";

then the following happens: the compiler generates a string literal (an array of type char[10], counting the terminating '\0') with the contents "blah blah" and stores it somewhere in the executable's static data, typically a read-only section. It has static storage duration, the same as variables declared using the static keyword.

Then, the address of this string, which is valid throughout the lifetime of the program, is stored in the ptr pointer, which is then returned. All is fine.
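
To make this concrete, here is a minimal sketch of the pointer version. The names get_string and check_get_string are made up for illustration. It relies only on the fact that a string literal has static storage duration, so every call hands back the address of the same object and nothing is allocated per call.

```c
#include <assert.h>
#include <string.h>

/* Returns the address of a string literal. Only the pointer variable ptr
   is local; the literal itself lives for the whole run of the program. */
const char *get_string(void)
{
    const char *ptr = "blah blah";
    return ptr; /* the address is copied out; the literal stays where it is */
}

/* Hypothetical self-check: two calls yield the same, still valid address. */
void check_get_string(void)
{
    const char *first = get_string();
    const char *second = get_string();

    assert(first == second);                 /* same literal object both times */
    assert(strcmp(first, "blah blah") == 0); /* contents intact after return */
}
```

Calling check_get_string() from main() should pass silently; the actual address you see in a debugger depends on your compiler and platform.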

Does this mean that "blah blah" is not a local variable inside getString()?

Let me respond with a broken English sentence: yes, it isn't.

However, when you declare an array, as in

const char a[] = "blah blah";

then the compiler doesn't just hand you the address of a static string. (Indeed, this is a somewhat special case when initializing arrays with string literals.) Instead, it generates code that reserves a big enough piece of stack memory for the array a (it's not a pointer!) and fills it with the bytes of the string. Here a really is a local variable, and using its address after the function has returned results in undefined behavior.
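
If you do need an array-style copy that the caller can keep, two common alternatives are sketched below. Both function names (get_string_into, get_string_copy) are hypothetical and not from the question; treat this as one possible approach, not the only one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: the caller owns the storage. Copies the text into buf and
   returns buf, or NULL if the buffer is too small. */
char *get_string_into(char *buf, size_t size)
{
    static const char text[] = "blah blah blah";

    assert(buf != NULL);
    if (size < sizeof text)
        return NULL;
    memcpy(buf, text, sizeof text); /* sizeof text includes the '\0' */
    return buf;
}

/* Option 2: a heap copy. Returns NULL if malloc fails;
   otherwise the caller must free() the result. */
char *get_string_copy(void)
{
    static const char text[] = "blah blah blah";
    char *copy = malloc(sizeof text);

    if (copy != NULL)
        memcpy(copy, text, sizeof text);
    return copy;
}
```

With option 1 the caller would write something like char buf[32]; get_string_into(buf, sizeof buf); and the data lives as long as buf does. With option 2 the data lives until free() is called on the returned pointer.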

So...

But I thought that const char *ptr and const char a[] were basically the same thing.

No, not at all, because arrays are not pointers.
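
One quick way to see that the two declarations are different kinds of object is sizeof. In this small sketch (the function names are invented for illustration), the array reports the size of the whole string, while the pointer reports only the size of an address.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof applied to the array gives the size of the whole object. */
size_t array_size(void)
{
    const char a[] = "blah blah blah";
    return sizeof a; /* 14 characters plus the terminating '\0' */
}

/* sizeof applied to the pointer gives the size of an address only. */
size_t pointer_size(void)
{
    const char *ptr = "blah blah blah";
    return sizeof ptr; /* platform dependent, commonly 8 on 64-bit systems */
}

/* Hypothetical self-check of the two claims above. */
void check_sizes(void)
{
    assert(array_size() == 15);
    assert(pointer_size() == sizeof(const char *));
}
```

The 15 here matches the type gdb printed in the question, const char (*)[15].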

#2


3  

I guess that if it were a local variable, I would not be able to pass it to my main() function... But if it's not, where is it stored?

String literals are usually stored in a read-only data section (.rodata). The C standard only says that they have static storage duration. Therefore you can return a pointer to such a literal, but the same is not true of a local array.

In the following example, the object pointed to by p1 has static storage duration, whereas the array p2 has automatic storage duration.

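const char *f(void)
{
    const char *p1 = "hello, world"; /* points to a literal: static storage duration */
    char p2[] = "hello, world";      /* local array: automatic storage duration */

    return p1; /* allowed */
    return p2; /* forbidden (and unreachable here; shown only for contrast) */
}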

#3


2  

You are right that they are not the same thing. char a[] is an array created on the stack and then filled with "blah..": inside the function, you essentially have the equivalent of `char a[15]; strcpy(a, "blah blah blah");`

The const char *ptr = "blah blah blah"; on the other hand is simply a pointer (the pointer itself is on the stack), and the pointer points to the string "blah blah blah", which is stored somewhere else [in "read only data" most likely].

You will notice a big difference if you try to alter something, e.g. a[2] = 'e'; vs ptr[2] = 'e'; (with the const qualifiers removed so that both compile). The first one will succeed, because you are modifying a value on the stack, whereas the second will (probably) fail, because you are modifying a read-only piece of memory; formally, modifying a string literal is undefined behavior.
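
A small sketch of the safe half of that experiment follows. The function name modify_array_copy and its out parameter are assumptions made for this example. The pointer case is left as a comment on purpose, since writing through it is undefined behavior and would typically crash on systems that place literals in read-only memory.

```c
#include <assert.h>
#include <string.h>

/* Builds a writable local copy of the literal, changes one character,
   and copies the result into out, which must hold at least 15 bytes. */
void modify_array_copy(char *out)
{
    char a[] = "blah blah blah"; /* non-const array: a private, writable copy */

    assert(out != NULL);
    a[2] = 'e';                  /* fine: this changes the copy on the stack */
    memcpy(out, a, sizeof a);    /* hand the result back before a disappears */
}

/* Deliberately not compiled:
 *     char *ptr = "blah blah blah";
 *     ptr[2] = 'e';    undefined behavior: attempts to modify a string literal
 */
```

After modify_array_copy(out), the buffer holds "bleh blah blah", while the original literal is untouched.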

#4


2  

In your function, the scope of the array a[] is limited to the function getString2(). It's a local array variable.

const char *getString2()
{
  const char a[] = "blah blah blah";
  return a;
}  

In the above case, the string "blah blah blah" is first copied into a[], and you are trying to return that array with the return a; statement, not the constant string.

Whereas in the first code, getString(), ptr = "blah blah"; makes ptr point to memory that exists for the whole run of the program.

const char * getString()
{
  const char * ptr = "blah blah";
  return ptr; 
}

In this case you return the address of the constant string "blah blah", which is legal to do.

So actually it's a scope problem (more precisely, a lifetime problem).

It is helpful to learn about the memory layout of C programs and variable scope in C.
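
As a side note on scope versus lifetime, here is a hedged sketch using a name not found in the question (get_static). Declaring the array static keeps its scope local to the function but gives it static storage duration, so returning its address becomes valid.

```c
#include <assert.h>
#include <string.h>

/* The name a is only visible inside this function (block scope), but the
   array itself exists for the whole program (static storage duration). */
const char *get_static(void)
{
    static const char a[] = "blah blah blah";
    return a; /* valid: the array is not destroyed when the function returns */
}

/* Hypothetical self-check: the address is stable and the contents survive. */
void check_get_static(void)
{
    assert(get_static() == get_static());
    assert(strcmp(get_static(), "blah blah blah") == 0);
}
```

This compiles without the "address of local variable returned" warning, because a no longer has automatic storage.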

#5


1  

They are not the same.

The first is a pointer to a string literal. The pointer itself is in automatic storage. The string is in static, read-only memory. It's immutable.

The second is an automatic (stack) char array (and that return is, as the warning says, not legal).
