Basic C-style string memory allocation

Time: 2021-09-06 01:28:18

I am working on a project with existing code which uses mainly C++ but with c-style strings. Take the following:

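#include <iostream>
using namespace std;  // needed for the unqualified cout and endl below

int main(int argc, char *argv[])
{
    // Binding a string literal to a non-const char* is deprecated in C++03
    // and ill-formed since C++11; it is kept because the question is about it.
    char* myString = "this is a test";
    myString = "this is a very very very very very very very very very very very long string";
    cout << myString << endl;
    return 0;
}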

This compiles and runs fine with the output being the long string.

However I don't understand WHY it works. My understanding is that

char* myString 

is a pointer to an area of memory big enough to hold the string literal "this is a test". If that's the case, then how am I able to then store a much longer string in the same location? I expected it to crash when doing this due to trying to cram a long string into a space set aside for the shorter one.

Obviously there's a basic misunderstanding of what's going on here so I appreciate any help understanding this.

6 answers

#1


14  

You're not changing the content of the memory, you're changing the value of the pointer to point to a different area of memory which holds "this is a very very very very very very very very very very very long string".

Note that char* myString only allocates enough bytes for the pointer itself (usually 4 or 8 bytes). The text "this is a test" was placed in the executable image by the compiler before your program even started. When char* myString = "this is a test"; runs, it just allocates the pointer and makes it point to that memory, which was already set aside at compile time.

So if you like diagrams:

char* myString = "this is a test";

(allocate memory for myString)

              ---> "this is a test"
            / 
myString---

                   "this is a very very very very very very very very very very very long string"

Then

myString = "this is a very very very very very very very very very very very long string";

                   "this is a test"

myString---
            \
              ---> "this is a very very very very very very very very very very very long string"

#2


5  

There are two strings in memory. The first is "this is a test", and let's say it begins at address 0x1000. The second is "this is a very very ... long string", and it begins at address 0x1200.

By

char* myString = "this is a test";

you create a variable called myString and assign address 0x1000 to it. Then, by

myString = "this is a very very ... long string";

you assign 0x1200. By

cout << myString << endl;

you just print the string beginning at 0x1200.

#3


2  

You have two string literals of type const char[n]. These can be assigned to a variable of type char* (a conversion that was deprecated in C++03 and removed in C++11, though compilers such as GCC and Clang still accept it with a warning), which is nothing more than a pointer to a char. Whenever you declare a variable of type pointer-to-T you are only declaring the pointer, and not the memory to which it points.

The compiler reserves memory for both literals and you just take your pointer variable and point it at those literals one after the other. String literals are read-only and their allocation is taken care of by the compiler. Typically they are stored in the executable image in protected read-only memory. A string literal typically has a lifetime equal to that of the program itself.

Now, it would be undefined behavior (UB) if you attempted to modify the contents of a literal, but you don't. To help prevent yourself from attempting modifications by mistake, you would be wise to declare your variable as const char*.

#4


2  

Before main even runs, a block of memory containing "this is a test" is set up as part of the program image, and the first statement stores the address of the first character in that block in the myString variable. In the next line, the address of the first character of a separate block of memory containing "this is a very very..." is assigned to myString, replacing the address it used to store with the address of the "very very long" string.

Just for illustration, let's say the first block of memory looks like this:

[t][h][i][s][ ][i][s][ ][a][ ][t][e][s][t][\0] and let's just say the address of the first 't' character in this array of characters is 0x100. So after the first assignment, the myString variable contains the address 0x100, which points to the first letter of "this is a test".

Then, a totally different block of memory contains:

[t][h][i][s][ ][i][s][ ][a][ ][v][e][r][y]... and let's just say that the address of this first 't' character is 0x200. So after the second assignment, the myString variable now contains the address 0x200, which points to the first letter of "this is a very very very...".

Since myString is just a pointer to a character (hence its type, "char *"), it only stores the address of a character. It has no concern for how big the array is supposed to be; it doesn't even know that it is pointing to an "array", only that it is storing the address of a character.

For example, you could legally do this:

    char myChar = 'C';
    /* Assign the address of the location in memory
       in which 'C' is stored to the myString variable. */
    myString = &myChar;

Hopefully that was clear enough. If so, upvote/accept answer. If not, please comment so that I may clarify.

#5


1  

String literals do not require any allocation on your part: they are stored as-is in the program and can be used directly. Essentially, myString was a pointer to one string literal and was changed to point to another.

#6


0  

char* means a pointer to a block of memory that holds a character.

C-style string functions take a pointer to the start of a string. They assume there's a sequence of characters that ends with a null character ('\0', the byte with value 0).

So what the << operator actually does is loop from that first character position until it finds a null character.
