C/C++ header files and implementation files: how do they work?

Time: 2021-08-21 02:06:20

This is probably a stupid question, but I've searched for quite a while now here and on the web and couldn't come up with a clear answer (did my due diligence googling).

So I'm new to programming... My question is, how does the main function know about function definitions (implementations) in a different file?

For example, say I have three files:

  • main.cpp
  • myfunction.cpp
  • myfunction.hpp

//main.cpp

#include "myfunction.hpp"
int main() {
  int A = myfunction( 12 );
  ...
}

- - - - - -

//myfunction.cpp

#include "myfunction.hpp"
int myfunction( int x ) {
  return x * x;
}

- - - - - -

//myfunction.hpp

int myfunction( int x );

- - - - - -

I get how the preprocessor includes the header code, but how do the header and main function even know the function definition exists, much less utilize it?

I apologize if this isn't clear or if I'm vastly mistaken about something; I'm new here.

7 answers

#1


55  

The header file declares functions/classes - i.e. tells the compiler when it is compiling a .cpp file what functions/classes are available.

The .cpp file defines those functions - i.e. the compiler compiles the code and therefore produces the actual machine code to perform those actions that are declared in the corresponding .hpp file.

In your example, main.cpp includes a .hpp file. The preprocessor replaces the #include with the contents of the .hpp file. This file tells the compiler that the function myfunction is defined elsewhere and it takes one parameter (an int) and returns an int.

So when you compile main.cpp into an object file (.o extension), the compiler makes a note in that file that it requires the function myfunction. When you compile myfunction.cpp into an object file, that object file carries a note saying it contains the definition of myfunction.

Then, when you link the two object files together into an executable, the linker ties up the loose ends: the call in main.o is connected to myfunction as defined in myfunction.o.
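The same declaration/definition split can be seen inside a single file. The following is a minimal sketch, not the asker's actual build: the helper call_site is a hypothetical stand-in for main, and the three sections stand in for myfunction.hpp, main.cpp and myfunction.cpp. The caller compiles with only the declaration in view; the definition appears later, much as it would in a separate object file.

```cpp
// Stand-in for myfunction.hpp: a declaration only. It tells the compiler
// the name, parameter type and return type, but provides no body.
int myfunction(int x);

// Stand-in for main.cpp (call_site is a hypothetical name for illustration).
// At this point the compiler has seen only the declaration, which is
// enough to type-check the call and emit a reference to "myfunction".
int call_site() {
    return myfunction(12);
}

// Stand-in for myfunction.cpp: the definition. In a multi-file build this
// lives in another object file and the linker connects the call to it.
int myfunction(int x) {
    return x * x;
}
```

With the real three-file layout and GCC, for instance, each source file would be compiled on its own (g++ -c main.cpp, g++ -c myfunction.cpp) and the resulting object files linked afterwards (g++ main.o myfunction.o).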

I hope that helps

#2


14  

You have to understand that, from the user's point of view, compilation is a two-step operation.


Step 1: Object compilation

During this step, your *.cpp files are individually compiled into separate object files. This means that when main.cpp is compiled, the compiler knows nothing about your myfunction.cpp. The only thing it knows is that you have declared that a function with the signature int myfunction( int x ) exists, presumably in another object file.

The compiler keeps a reference to this call and records it in the object file. In effect the object file says: "I have to call myfunction with an int, and it will return an int to me." It keeps an index of all such external calls so that it can be linked with other object files afterwards.


Step 2: Linking

During this step, the linker looks at the indexes of all your object files and tries to resolve the dependencies between them. If a definition is missing, you get the famous "undefined symbol XXX" error. Otherwise, the linker translates those references into real memory addresses in a result file: either an executable binary or a library.


You may then ask how this is possible for a gigantic program like an office suite, which has tons of methods and objects. Well, such programs use the shared library mechanism. You know shared libraries as the '.dll' files on a Windows workstation and the '.so' files on a Unix one. This mechanism postpones the resolution of undefined symbols until the program is run.

It even allows undefined symbols to be resolved on demand, with the dl* functions (such as dlopen and dlsym).

#3


5  

1. The principle

When you write:

int A = myfunction(12);

This is translated to:

int A = @call(myfunction, 12);

where @call can be seen as a dictionary look-up. And if you think about the dictionary analogy, you can certainly know about a word (smogashboard?) before knowing its definition. All you need is that, at runtime, the definition is in the dictionary.

2. A point on the ABI

How does this @call work? Because of the ABI (Application Binary Interface). The ABI describes many things, among them how to perform a call to a given function (depending on its parameters). The call contract is simple: it says where each of the function arguments can be found (some will be in the processor's registers, others on the stack).

Therefore, @call actually does:

@push 12, reg0
@invoke myfunction

And the function definition knows that its first argument (x) is located in reg0.

3. But I thought dictionaries were for dynamic languages?

And you are right, to an extent. Dynamic languages are typically implemented with a hash table for symbol lookup that is dynamically populated.

For C++, the compiler transforms a translation unit (roughly speaking, a preprocessed source file) into an object file (generally .o or .obj). Each object file contains a table of the symbols it references but whose definitions it does not know:

.undefined
[0]: myfunction

Then the linker brings the object files together and reconciles the symbols. There are two kinds of symbols at this point:

  • those which are within the library, and can be referenced through an offset (the final address is still unknown)
  • those which are outside the library, and whose address is completely unknown until runtime.

Both can be treated in the same fashion.

.dynamic
[0]: myfunction at <undefined-address>

And then the code will reference the look-up entry:

@invoke .dynamic[0]

When the library is loaded (with dlopen or LoadLibrary, for example), the runtime finally knows where the symbol is mapped in memory, and overwrites the <undefined-address> with the real address (for this run).
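The look-up table idea can be imitated in ordinary C++ with a table of function pointers. This is only an analogy under stated assumptions, not how a real dynamic loader is implemented: dynamic_table, load_library and invoke_slot are hypothetical names, and the "library" is just a function in the same file.

```cpp
#include <array>
#include <cstddef>

// Signature shared by every entry in this toy table.
using IntFunction = int (*)(int);

// A toy ".dynamic" table: one slot per external symbol. A null slot plays
// the role of <undefined-address>.
std::array<IntFunction, 1> dynamic_table = {nullptr};

// The function that the "library" provides.
int myfunction(int x) {
    return x * x;
}

// Plays the role of the loader: once the library is mapped, write the
// real address into the slot.
void load_library() {
    dynamic_table[0] = &myfunction;
}

// Plays the role of "@invoke .dynamic[0]": the call site only knows the
// slot index, not the address. Calling it before load_library() would
// dereference a null pointer, much like an unresolved symbol.
int invoke_slot(std::size_t slot, int argument) {
    return dynamic_table[slot](argument);
}
```

The call site never changes; only the table entry does, which is why the same compiled code can work wherever the library ends up in memory on a given run.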

#4


4  

As suggested in Matthieu M.'s comment, it is the linker's job to find the right "function" at the right place. The compilation steps are, roughly:

  1. The compiler is invoked for each .cpp file and translates it to an object file (binary code) with a symbol table that associates function names (names are mangled in C++) with their locations in the object file.
  2. The linker is invoked only once, with every object file as a parameter. It resolves function call locations from one object file to another thanks to the symbol tables. Exactly one main() function MUST exist somewhere. Finally, a binary executable file is produced once the linker has found everything it needs.
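A short sketch of why names are mangled: overloads share one source-level name, so the symbol table needs something more specific. The mangled names in the comments are what GCC and Clang typically produce on Linux under the Itanium C++ ABI; other compilers use different schemes. The name c_square is hypothetical and only there for contrast.

```cpp
// Two overloads share the source-level name "myfunction". For the linker
// to tell them apart, the compiler encodes the parameter types into the
// symbol name. Typical Itanium-ABI symbols: _Z10myfunctioni and
// _Z10myfunctiond.
int myfunction(int x) { return x * x; }
double myfunction(double x) { return x * x; }

// extern "C" turns mangling off: the symbol is plain "c_square", which is
// why a function with C linkage cannot be overloaded this way.
extern "C" int c_square(int x) { return x * x; }
```

This is also the reason a declaration whose parameter types differ from the definition's compiles fine but fails at link time: the two mangle to different symbols.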

#5


4  

The preprocessor includes the content of the header files into the .cpp files (a .cpp file together with everything it includes is called a translation unit). When you compile the code, each translation unit is checked separately for semantic and syntactic errors. Whether function definitions exist in other translation units is not considered. The .obj files are generated after compilation.

In the next step, the .obj files are linked. The definitions of the functions (and member functions of classes) that are used are searched for, and linking happens. If a function is not found, a linker error is reported.

In your example, if the function was not defined in myfunction.cpp, compilation would still go on with no problem. An error would be reported in the linking step.

#6


2  

int myfunction(int); is the function prototype. You declare the function with it so that the compiler knows which function you are calling when you write myfunction(0);.

And how do the header and main function even know the function definition exists?
Well, this is the job of the linker.

#7


1  

When you compile a program, the preprocessor adds the source code of each header file to the file that included it. The compiler then compiles EVERY .cpp file. The result is a number of .obj files.
After that comes the linker. The linker takes all the .obj files, starting from your main file. Whenever it finds a reference that has no definition (e.g. a variable, function or class), it tries to locate the respective definition in the other .obj files created at the compile stage or supplied to the linker at the beginning of the linking stage.
Now to answer your question: each .cpp file is compiled into a .obj file containing machine-code instructions. When you include a .hpp file and use some function that's defined in another .cpp file, at the linking stage the linker looks for that function's definition in the respective .obj file. That's how it finds it.
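The same mechanism applies to variables, not just functions. The sketch below assumes hypothetical names (call_count, counted_square, square_twice) and keeps everything in one file for brevity; in a real project the declarations would sit in a header and the definitions in a separate .cpp file.

```cpp
// Declarations, as a header would provide them. "extern" on a variable
// means "defined somewhere else"; no storage is reserved here.
extern int call_count;       // hypothetical variable, for illustration
int counted_square(int x);   // hypothetical function, for illustration

// A caller that compiles against the declarations alone.
int square_twice(int x) {
    return counted_square(counted_square(x));
}

// Definitions, as a separate .cpp file would provide them. The linker
// matches each reference above to exactly one of these.
int call_count = 0;

int counted_square(int x) {
    ++call_count;
    return x * x;
}
```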
