When I compile this:
// external definitions
int value1 = 0;
static int value2 = 0;
gcc generates the following assembly:
.globl value1
.bss
.align 4
.type value1, @object
.size value1, 4
value1:
.zero 4
.local value2
.comm value2,4,4
However, when I initialize the variables to a value other than zero, such as:
// external definitions
int value1 = 1;
static int value2 = 1;
gcc generates the following:
.globl value1
.data
.align 4
.type value1, @object
.size value1, 4
value1:
.long 1
.align 4
.type value2, @object
.size value2, 4
value2:
.long 1
My questions are:
- Why are the variables allocated in the bss segment in the first case, but in the data segment in the second?
- Why is value2 defined with .local and .comm in the first case, but not in the second?
3 answers
#1
11
Generally speaking, the bss section contains uninitialized values and the data section contains initialized values. However, gcc places values that are initialized to zero into the bss section instead of the data section. The bss section is zeroed out at run time anyway, so storing zeros in the data section makes little sense, and leaving them out saves some disk space. From man gcc:
-fno-zero-initialized-in-bss If the target supports a BSS section, GCC by default puts variables that are initialized to zero into BSS. This can save space in the resulting code. This option turns off this behavior because some programs explicitly rely on variables going to the data section
I'm not sure why .comm is used for a variable with static storage, which is local to the object file; possibly the preceding .local keeps the symbol out of the linker's merging, so the pair only reserves zeroed space. .comm is usually used to declare common symbols that, if not defined or initialized, are merged by the linker with symbols of the same name from other object files. That is why it is not used in the second example, where the variables are initialized. From the as manual:
.comm declares a common symbol named symbol. When linking, a common symbol in one object file may be merged with a defined or common symbol of the same name in another object file
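To make the merging behaviour concrete, here is a minimal sketch. The file name and the names shared_counter, defined_counter and bump_shared_counter are made up for illustration. A file-scope declaration without an initializer is a tentative definition; with gcc's -fcommon option it is typically emitted as a .comm common symbol, while GCC 10 and later default to -fno-common and emit an ordinary bss definition instead. The exact assembly depends on the gcc version and target.

```c
/* common_demo.c: hypothetical file name, for illustration only. */

/* Tentative definition: no initializer. With -fcommon, gcc typically
   emits this as a common symbol (.comm) that the linker may merge with
   a same-named symbol from another object file. */
int shared_counter;

/* Explicit non-zero initializer: a full definition, emitted in .data. */
int defined_counter = 5;

/* Hypothetical helper so that both variables are actually used. */
int bump_shared_counter(void)
{
    shared_counter += defined_counter;
    return shared_counter;
}
```

Compiling this once with gcc -S -fcommon and once with gcc -S -fno-common should show the difference in how shared_counter is emitted, while defined_counter should stay in .data in both cases.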
#2
5
The first case is because you initialized the values with zero. It's part of the C standard (section 6.7.8) that a global integer gets initialized with 0 if no initializer is specified. So file formats made a provision to keep binaries smaller by having a special section these get placed in: bss. If you take a look at the ELF specification (on page I-15), you'll find this:
.bss This section holds uninitialized data that contribute to the program's memory image. By definition, the system initializes the data with zeros when the program begins to run. The section occupies no file space, as indicated by the section type, SHT_NOBITS.
In the first case, the compiler made an optimization. It doesn't need to take up room in the actual binary to store the initializer, since it can use the bss segment and get the zero value you want for free.
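The zero-by-default rule can be checked with a small sketch. The variable names and the helper all_start_zeroed are hypothetical; the point is only that an omitted initializer and an explicit "= 0" give the same starting value for objects with static storage duration, which is why the compiler is free to treat both the same way.

```c
#include <stddef.h>

/* No initializer: static storage duration means these start as zero
   (or a null pointer), just as if "= 0" had been written. */
static int implicit_zero;
static int explicit_zero = 0;
static int *implicit_null;

/* Hypothetical helper: returns 1 when all three hold their zero value. */
int all_start_zeroed(void)
{
    return implicit_zero == 0 && explicit_zero == 0 && implicit_null == NULL;
}
```

Since nothing here needs a stored initial value, gcc would normally place all three objects in bss, although the exact placement depends on the target and options.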
Now, the fact that you have a static among your external definitions is a bit interesting (it's not typically done). Within the module being compiled, though, it should not be shared with other modules, so it is marked with .local. I suspect the compiler does it this way because there is no actual value to be stored for the initializer.
In the second example, because you've given a non-zero initializer, the variable now resides in the initialized data segment, data. value1 looks very similar, but for value2 the compiler needs to reserve space for the initializer. In this case, it doesn't need to be marked as .local because it can just lay down the value and be done with it. It's not global because there is no .globl statement for it.
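The visibility difference can be inspected on a small translation unit that mirrors the question's second case. This is only a sketch: the accessor sum_values is a hypothetical addition so that both variables are used, since an optimizing build may drop an unused static.

```c
/* Hypothetical translation unit mirroring the question's second case. */
int value1 = 1;          /* external linkage: gcc emits ".globl value1" */
static int value2 = 1;   /* internal linkage: no .globl, symbol stays local */

/* Hypothetical accessor so that both variables are referenced. */
int sum_values(void)
{
    return value1 + value2;
}
```

After compiling with gcc -c at -O0, running nm on the object file should list value1 with an uppercase D (global, initialized data) and value2 with a lowercase d (local), which matches the presence and absence of .globl in the assembly.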
BTW, http://refspecs.linuxbase.org/ is a good place to visit for some of the low-level details about binary formats and such.
#3
3
BSS is the segment containing data that is initialized at run time, whereas the data segment contains data initialized in the program binary.
Static variables are always initialized, whether or not the program does so explicitly. But there are two separate categories: initialized (DS) and uninitialized (BSS) statics.
All values in BSS are those that are not initialized in the program's code. They are therefore initialized when the program is loaded at run time: to 0 for integers, null for pointers, and so on.
So when you initialize with 0, the variable goes to BSS, whereas any other value places the variable in the data segment.
An interesting consequence is that the size of the data in BSS is not included in the program binary, whereas the size of the data in the data segment is.
Try allocating a large static array and using it in a program. Check the executable size when the array is not initialized explicitly in code. Then initialize it with non-zero values, like:
static int arr[1000] = {2};
The executable in the latter case will be significantly larger.
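A sketch of that experiment, with both variants in one file so they can be compared directly. The names ARRAY_LEN, zeroed, filled, sum_ints, sum_zeroed and sum_filled are hypothetical, and the sizes mentioned afterwards assume a 4-byte int.

```c
#include <stddef.h>

#define ARRAY_LEN 1000

/* No initializer: expected to land in bss and take no file space. */
static int zeroed[ARRAY_LEN];

/* Non-zero first element: the whole array is stored in .data. */
static int filled[ARRAY_LEN] = {2};

/* Hypothetical helper: sums an int array so that both arrays are used. */
static long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

long sum_zeroed(void) { return sum_ints(zeroed, ARRAY_LEN); }
long sum_filled(void) { return sum_ints(filled, ARRAY_LEN); }
```

Running the binutils size tool on the compiled object file should report roughly 4000 bytes under data and roughly 4000 under bss. Only the data portion adds to the size of the file on disk.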