So I finished my first C++ programming assignment and received my grade. But according to the grading, I lost marks for including cpp files instead of compiling and linking them. I'm not too clear on what that means.
Taking a look back at my code, I chose not to create header files for my classes, but did everything in the cpp files (it seemed to work fine without header files...). I'm guessing that the grader meant that I wrote '#include "mycppfile.cpp";' in some of my files.
My reasoning for #include'ing the cpp files was:
- Everything that was supposed to go into the header file was in my cpp file, so I pretended it was like a header file
- In monkey-see-monkey-do fashion, I saw that other header files were #include'd in the files, so I did the same for my cpp file.
So what exactly did I do wrong, and why is it bad?
13 Answers
#1
126
To the best of my knowledge, the C++ standard knows no difference between header files and source files. As far as the language is concerned, any text file with legal code is the same as any other. However, although not illegal, including source files into your program will pretty much eliminate any advantages you would've got from separating your source files in the first place.
Essentially, what #include does is tell the preprocessor to take the entire file you've specified, and copy it into your active file before the compiler gets its hands on it. So when you include all the source files in your project together, there is fundamentally no difference between what you've done, and just making one huge source file without any separation at all.
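To make the copy-and-paste behaviour concrete, here is a minimal sketch of what the compiler effectively sees once the preprocessor has expanded a line like `#include "square.cpp"` inside a `main.cpp`. Both file names and both functions are hypothetical, made up purely for illustration.

```cpp
// Sketch: the single translation unit the compiler receives after the
// preprocessor has pasted a (hypothetical) square.cpp into main.cpp.

// ---- contents pasted in from square.cpp ----
int square(int x)
{
    return x * x;
}

// ---- the remainder of main.cpp ----
// square() is callable here only because its full definition was pasted in
// above; editing either file means recompiling all of this text again.
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Nothing here is separate any more: from the compiler's point of view there is just one file.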
"Oh, that's no big deal. If it runs, it's fine," I hear you cry. And in a sense, you'd be correct. But right now you're dealing with a tiny tiny little program, and a nice and relatively unencumbered CPU to compile it for you. You won't always be so lucky.
If you ever delve into the realms of serious computer programming, you'll be seeing projects with line counts that can reach millions, rather than dozens. That's a lot of lines. And if you try to compile one of these on a modern desktop computer, it can take a matter of hours instead of seconds.
"Oh no! That sounds horrible! However can I prevent this dire fate?!" Unfortunately, there's not much you can do about that. If it takes hours to compile, it takes hours to compile. But that only really matters the first time -- once you've compiled it once, there's no reason to compile it again.
Unless you change something.
Now, if you had two million lines of code merged together into one giant behemoth, and need to do a simple bug fix such as, say, x = y + 1, that means you have to compile all two million lines again in order to test this. And if you find out that you meant to do a x = y - 1 instead, then again, two million lines of compile are waiting for you. That's many hours of time wasted that could be better spent doing anything else.
"But I hate being unproductive! If only there was some way to compile distinct parts of my codebase individually, and somehow link them together afterwards!" An excellent idea, in theory. But what if your program needs to know what's going on in a different file? It's impossible to completely separate your codebase unless you want to run a bunch of tiny tiny .exe files instead.
"But surely it must be possible! Programming sounds like pure torture otherwise! What if I found some way to separate interface from implementation? Say by taking just enough information from these distinct code segments to identify them to the rest of the program, and putting them in some sort of header file instead? And that way, I can use the #include preprocessor directive to bring in only the information necessary to compile!"
Hmm. You might be on to something there. Let me know how that works out for you.
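A minimal sketch of where that idea leads, assuming a hypothetical three-file layout (geometry.h, geometry.cpp, room.cpp). It is written as one listing so that it compiles as is, with comments marking which file each part would live in. The include guard is a common convention rather than something this answer mentions.

```cpp
// Sketch of a three-file layout, shown as one listing.
// All file names and functions are hypothetical.

// ---- geometry.h: the interface, just enough for callers to compile ----
#ifndef GEOMETRY_H
#define GEOMETRY_H
int rectangle_area(int width, int height);   // declaration only, no body
#endif

// ---- geometry.cpp: the implementation, compiled once into its own object file ----
// (in a real project this file would start with: #include "geometry.h")
int rectangle_area(int width, int height)
{
    return width * height;
}

// ---- room.cpp: a caller; it needs only the declaration from geometry.h ----
// (in a real project this file would start with: #include "geometry.h")
int floor_area(int width, int depth)
{
    return rectangle_area(width, depth);
}
```

With GCC, for instance, each .cpp file could be compiled on its own with `g++ -c` and the resulting object files linked afterwards. A change to the body of `rectangle_area` would then mean recompiling geometry.cpp only, followed by a relink.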
#2
35
This is probably a more detailed answer than you wanted, but I think a decent explanation is justified.
In C and C++, one source file, together with everything it #includes, forms one translation unit. By convention, header files hold function declarations, type definitions and class definitions. The actual function implementations reside in translation units, i.e. .cpp files.
The idea behind this is that functions and class/struct member functions are compiled and assembled once, then other functions can call that code from one place without making duplicates. Your function prototypes are declared as "extern" implicitly.
/* Function prototype, usually found in headers. */
/* Implicitly 'extern', i.e. the symbol is visible everywhere, not just locally. */
int add(int, int);

/* Function body, or function definition. */
int add(int a, int b)
{
    return a + b;
}
If you want a function to be local to a translation unit, you define it as 'static'. What does this mean? It means that if you include source files with extern functions, you will get redefinition errors, because the compiler or the linker comes across the same implementation more than once. So, you want all your translation units to see the function prototype but not the function body.
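As a rough illustration of the difference between the two kinds of linkage, here is a sketch of a single .cpp file. The file name stats.cpp and both functions are hypothetical.

```cpp
// Sketch of one .cpp file (hypothetical name: stats.cpp).

// Internal linkage: 'static' keeps this helper private to this translation
// unit, so another .cpp file may define its own function named clamp_percent
// without a clash at link time.
static int clamp_percent(int value)
{
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

// External linkage (the default): exactly one definition may exist in the
// whole program. Other files see only a prototype such as
//     int percent_of(int part, int whole);
// from a header. #include'ing this .cpp file in two places would create two
// definitions of percent_of and a duplicate-definition error.
int percent_of(int part, int whole)
{
    if (whole == 0) return 0;          // avoid dividing by zero
    return clamp_percent(part * 100 / whole);
}
```

Only `percent_of` would get a prototype in a header; `clamp_percent` stays an implementation detail of its own file.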
So how does it all get mashed together at the end? That is the linker's job. A linker reads all the object files which are generated by the assembler stage and resolves symbols. A symbol is just a name, for example the name of a variable or a function. When translation units which call functions or declare types do not know the implementation for those functions or types, those symbols are said to be unresolved. The linker resolves each unresolved symbol by connecting the translation unit which holds the undefined symbol with the one which contains the implementation. Phew. This is true for all externally visible symbols, whether they are implemented in your code or provided by an additional library. A library is really just an archive with reusable code.
There are two notable exceptions. First, if you have a small function, you can make it inline. This means that the generated machine code does not generate an extern function call, but is literally concatenated in-place. Since they usually are small, the size overhead does not matter. You can imagine them to be static in the way they work. So it is safe to implement inline functions in headers. Function implementations inside a class or struct definition are also often inlined automatically by the compiler.
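A small sketch of the inline case, assuming a hypothetical header called temperature.h. The function and struct names are invented for illustration.

```cpp
// ---- temperature.h (hypothetical header) ----
#ifndef TEMPERATURE_H
#define TEMPERATURE_H

// 'inline' permits the same definition to appear in every translation unit
// that includes this header without a duplicate-definition error at link time.
inline double celsius_to_fahrenheit(double celsius)
{
    return celsius * 9.0 / 5.0 + 32.0;
}

// Member functions defined inside the class body are implicitly inline.
struct Thermometer
{
    double celsius;
    double fahrenheit() const { return celsius_to_fahrenheit(celsius); }
};

#endif
```

Any number of .cpp files could include this header and call both functions, because each definition is marked (explicitly or implicitly) as inline.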
The other exception is templates. Since the compiler needs to see the whole template definition when instantiating a template, it is not possible to decouple the implementation from the declaration as with standalone functions or normal classes. The "export" keyword was meant to make this possible, but it never gained widespread compiler support and was removed from the language in C++11. So in practice, translation units get their own local copies of instantiated templated types and functions, similar to how inline functions work.
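For the template case, a minimal sketch of a header that carries the whole implementation. The header name largest.h and the function `largest` are hypothetical.

```cpp
// ---- largest.h (hypothetical header) ----
#ifndef LARGEST_H
#define LARGEST_H

// The full template body lives in the header: each translation unit that
// calls largest<T> needs to see it in order to instantiate it for its own T.
template <typename T>
T largest(T a, T b)
{
    return (a < b) ? b : a;
}

#endif
```

A caller that writes `largest(3, 7)` causes the compiler to generate the `int` version in that caller's own translation unit, which is why a bare declaration in the header would not be enough.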
For the two exceptions, some people find it "nicer" to put the implementations of inline functions, templated functions and templated types in .cpp files, and then #include the .cpp file. Whether this is a header or a source file doesn't really matter; the preprocessor does not care, and the distinction is just a convention.
A quick summary of the whole process from C++ code (several files) to a final executable:
- The preprocessor is run, which parses all the directives which start with a '#'. The #include directive, for example, pastes the contents of the included file into the including file. It also does macro-replacement and token-pasting.
- The actual compiler runs on the intermediate text file after the preprocessor stage, and emits assembler code.
- The assembler runs on the assembly file and emits machine code. The result is usually called an object file and follows the binary format of the operating system in question. For example, Windows uses the PE (Portable Executable) format, while Linux uses the Unix System V ELF format, with GNU extensions. At this stage, symbols defined in other translation units are still marked as undefined.
- Finally, the linker is run. All the previous stages were run on each translation unit in order. However, the linker stage works on all the object files which were generated by the assembler. The linker resolves symbols and does a lot of magic like creating sections and segments, which is dependent on the target platform and binary format. Programmers aren't required to know this in general, but it surely helps in some cases.
Again, this was definitely more than you asked for, but I hope the nitty-gritty details help you to see the bigger picture.
#3
9
The typical solution is to use .h files for declarations only and .cpp files for implementation. If you need to reuse the implementation, you include the corresponding .h file into the .cpp file where the necessary class/function/whatever is used and link against an already compiled .cpp file (either an .obj file, usually used within one project, or a .lib file, usually used for reusing from multiple projects). This way you don't need to recompile everything if only the implementation changes.
#4
6
Think of cpp files as a black box and the .h files as the guides on how to use those black boxes.
The cpp files can be compiled ahead of time. This doesn't work if you #include them, as the compiler needs to actually "include" the code into your program each time it compiles it. If you just include the header, the compiler can use the header file to determine how to use the precompiled cpp file.
Although this won't make much of a difference for your first project, if you start writing large cpp programs, people are going to hate you because compile times are going to explode.
Also have a read of this: Header File Include Patterns
#5
6
Header files usually contain declarations of functions / classes, while .cpp files contain the actual implementations. At compile time, each .cpp file gets compiled into an object file (usually extension .o), and the linker combines the various object files into the final executable. The linking process is generally much faster than the compilation.
Benefits of this separation: If you are recompiling one of the .cpp files in your project, you don't have to recompile all the others. You just create the new object file for that particular .cpp file. The compiler doesn't have to look at the other .cpp files. However, if you want to call functions in your current .cpp file that were implemented in the other .cpp files, you have to tell the compiler what arguments they take; that is the purpose of including the header files.
Disadvantages: When compiling a given .cpp file, the compiler cannot 'see' what is inside the other .cpp files. So it doesn't know how the functions there are implemented, and as a result cannot optimize as aggressively. But I think you don't need to concern yourself with that just yet (:
#6
5
The basic idea is that headers are only included and cpp files are only compiled. This becomes more useful once you have many cpp files, when recompiling the whole application after modifying only one of them would be too slow, or when the functions in the files start depending on each other. So, you should separate class declarations into your header files, leave implementation in cpp files and write a Makefile (or something else, depending on what tools you are using) to compile the cpp files and link the resulting object files into a program.
#7
3
If you #include a cpp file in several other files in your program, the compiler will try to compile the cpp file multiple times, and will generate an error as there will be multiple implementations of the same methods.
Compilation will take longer (which becomes a problem on large projects), if you make edits in #included cpp files, which then force recompilation of any files #including them.
Just put your declarations into header files and include those (as they don't actually generate code per se), and the linker will hook up the declarations with the corresponding cpp code (which then only gets compiled once).
#8
2
While it is certainly possible to do as you did, the standard practice is to put shared declarations into header files (.h), and definitions of functions and variables - implementation - into source files (.cpp).
As a convention, this helps make it clear where everything is, and makes a clear distinction between interface and implementation of your modules. It also means that you never have to check to see if a .cpp file is included in another, before adding something to it that could break if it was defined in several different units.
#9
2
re-usability, architecture and data encapsulation
here's an example:
Say you create a cpp file which contains a simple set of string routines, all in a class mystring. You place the class declaration for this in mystring.h and compile mystring.cpp to a .obj file.
Now in your main program (e.g. main.cpp) you include the header and link with mystring.obj. To use mystring in your program you don't care about the details of how mystring is implemented, since the header says what it can do.
Now if a buddy wants to use your mystring class, you give him mystring.h and mystring.obj. He also doesn't necessarily need to know how it works, as long as it works.
Later, if you have more such .obj files, you can combine them into a .lib file and link to that instead.
You can also decide to change the mystring.cpp file and implement it more effectively. This will not affect your main.cpp or your buddy's program.
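A minimal sketch of what such a mystring might look like, shown as one listing with comments marking the two files. The member functions `length` and `starts_with`, and the choice to store the text in a `std::string`, are assumptions made for illustration; the answer does not say what the class contains.

```cpp
#include <cstddef>
#include <string>

// ---- mystring.h: what you hand to your buddy along with mystring.obj ----
class mystring
{
public:
    explicit mystring(const char* text);
    std::size_t length() const;
    bool starts_with(char c) const;

private:
    std::string data_;   // storage detail; using std::string here is an assumption
};

// ---- mystring.cpp: compiled separately; callers never need to read it ----
// (in a real project this file would start with: #include "mystring.h")
mystring::mystring(const char* text) : data_(text) {}

std::size_t mystring::length() const
{
    return data_.size();
}

bool mystring::starts_with(char c) const
{
    return !data_.empty() && data_[0] == c;
}
```

Rewriting the bodies in mystring.cpp leaves main.cpp untouched and only requires a relink. Changing the class declaration itself, including its private members, would still force every file that includes mystring.h to be recompiled.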
#10
2
If it works for you then there is nothing wrong with it -- except that it will ruffle the feathers of people who think that there is only one way to do things.
Many of the answers given here address optimizations for large-scale software projects. These are good things to know about, but there is no point in optimizing a small project as if it were a large project -- that is what is known as "premature optimization". Depending on your development environment, there may be significant extra complexity involved in setting up a build configuration to support multiple source files per program.
If, over time, your project evolves and you find that the build process is taking too long, then you can refactor your code to use multiple source files for faster incremental builds.
Several of the answers discuss separating interface from implementation. However, this is not an inherent feature of include files, and it is quite common to #include "header" files that directly incorporate their implementation (even the C++ Standard Library does this to a significant degree).
The only thing truly "unconventional" about what you have done was naming your included files ".cpp" instead of ".h" or ".hpp".
#11
1
When you compile and link a program, the compiler first compiles the individual cpp files and then the linker links (connects) them. The headers will never get compiled unless they are included in a cpp file first.
Typically headers are declarations and cpp are implementation files. In the headers you define an interface for a class or function but you leave out how you actually implement the details. This way you don't have to recompile every cpp file if you make a change in one.
#12
1
I suggest you go through Large-Scale C++ Software Design by John Lakos. In college, we usually write small projects where we do not come across such problems. The book highlights the importance of separating interfaces from implementations.
Header files usually hold interfaces, which are not supposed to change very often. Similarly, a look into patterns like the Virtual Constructor idiom will help you grasp the concept further.
I am still learning like you :)
#13
1
It's like writing a book, you want to print out finished chapters only once
Say you are writing a book. If you put the chapters in separate files then you only need to print out a chapter if you have changed it. Working on one chapter doesn't change any of the others.
But including the cpp files is, from the compiler's point of view, like editing all of the chapters of the book in one file. Then if you change it you have to print all the pages of the entire book in order to get your revised chapter printed. There is no "print selected pages" option in object code generation.
Back to software: I have Linux and Ruby src lying around. A rough measure of lines of code...
     Linux       Ruby
   100,000    100,000   core functionality (just kernel/*, ruby top level dir)
10,000,000    200,000   everything
Any one of those four categories has a lot of code, hence the need for modularity. This kind of code base is surprisingly typical of real-world systems.
#1
126
To the best of my knowledge, the C++ standard knows no difference between header files and source files. As far as the language is concerned, any text file with legal code is the same as any other. However, although not illegal, including source files into your program will pretty much eliminate any advantages you would've got from separating your source files in the first place.
据我所知,c++标准对头文件和源文件没有区别。就语言而言,任何带合法代码的文本文件都与其他文件相同。但是,虽然不违法,但是将源文件包含到程序中会消除您从一开始分离源文件所获得的任何优势。
Essentially, what #include
does is tell the preprocessor to take the entire file you've specified, and copy it into your active file before the compiler gets its hands on it. So when you include all the source files in your project together, there is fundamentally no difference between what you've done, and just making one huge source file without any separation at all.
本质上,#include所做的是告诉预处理器获取您指定的整个文件,并在编译器得到它之前将其复制到您的活动文件中。因此,当您将项目中的所有源文件都包含在一起时,您所做的与只创建一个大型源文件而不进行任何分离是没有本质区别的。
"Oh, that's no big deal. If it runs, it's fine," I hear you cry. And in a sense, you'd be correct. But right now you're dealing with a tiny tiny little program, and a nice and relatively unencumbered CPU to compile it for you. You won't always be so lucky.
“哦,这没什么大不了的。”如果它运行,它是好的,“我听到你哭泣。在某种意义上,你是对的。但是现在您正在处理一个很小的程序,以及一个漂亮的、相对不受限制的CPU来编译它。你不会总是那么幸运的。
If you ever delve into the realms of serious computer programming, you'll be seeing projects with line counts that can reach millions, rather than dozens. That's a lot of lines. And if you try to compile one of these on a modern desktop computer, it can take a matter of hours instead of seconds.
如果你深入研究严肃的计算机编程领域,你会发现项目的行数可以达到数百万,而不是几十。有很多线。如果你试着在现代桌面电脑上编译一个这样的程序,它只需要几个小时而不是几秒钟。
"Oh no! That sounds horrible! However can I prevent this dire fate?!" Unfortunately, there's not much you can do about that. If it takes hours to compile, it takes hours to compile. But that only really matters the first time -- once you've compiled it once, there's no reason to compile it again.
“哦,不!这听起来可怕!但是我能阻止这种可怕的命运吗?不幸的是,你对此无能为力。如果编译需要几个小时,那么编译需要几个小时。但这只是第一次才真正重要——一旦编译了一次,就没有理由再编译一次了。
Unless you change something.
除非你改变一些事情。
Now, if you had two million lines of code merged together into one giant behemoth, and need to do a simple bug fix such as, say, x = y + 1
, that means you have to compile all two million lines again in order to test this. And if you find out that you meant to do a x = y - 1
instead, then again, two million lines of compile are waiting for you. That's many hours of time wasted that could be better spent doing anything else.
现在,如果你有200万行代码合并成一个巨大的庞然大物,并且需要做一个简单的bug修复,比如说,x = y + 1,这意味着你需要重新编译所有的200万行代码来测试这个。如果你发现你的意思是做一个x = y - 1,那么,200万行编译就等着你了。这是浪费了很多时间,可以用来做其他事情。
"But I hate being unproductive! If only there was some way to compile distinct parts of my codebase individually, and somehow link them together afterwards!" An excellent idea, in theory. But what if your program needs to know what's going on in a different file? It's impossible to completely separate your codebase unless you want to run a bunch of tiny tiny .exe files instead.
“可是我讨厌效率低下!”如果有什么方法可以单独编译我的代码库的不同部分,然后以某种方式将它们连接在一起就好了!在理论上,这是一个很好的想法。但是,如果你的程序需要知道在不同的文件中发生了什么呢?除非你想运行一堆小的.exe文件,否则不可能完全分离你的代码库。
"But surely it must be possible! Programming sounds like pure torture otherwise! What if I found some way to separate interface from implementation? Say by taking just enough information from these distinct code segments to identify them to the rest of the program, and putting them in some sort of header file instead? And that way, I can use the #include
preprocessor directive to bring in only the information necessary to compile!"
“但这肯定是可能的!”编程听起来纯粹是折磨!如果我找到了一种将接口与实现分离的方法呢?比方说,从这些不同的代码段中获取足够的信息,将它们标识给程序的其他部分,然后将它们放到某种头文件中?这样,我就可以使用#include预处理器指令只引入编译所需的信息!
Hmm. You might be on to something there. Let me know how that works out for you.
嗯。你可能会发现一些东西。让我知道你是怎么做的。
#2
35
This is probably a more detailed answer than you wanted, but I think a decent explanation is justified.
这可能是一个比你想要的更详细的答案,但我认为一个合理的解释是合理的。
In C and C++, one source file is defined as one translation unit. By convention, header files hold function declarations, type definitions and class definitions. The actual function implementations reside in translation units, i.e .cpp files.
在C和c++中,一个源文件定义为一个翻译单元。按照约定,头文件包含函数声明、类型定义和类定义。实际的函数实现位于转换单元i中。e . cpp文件。
The idea behind this is that functions and class/struct member functions are compiled and assembled once, then other functions can call that code from one place without making duplicates. Your function prototypes are declared as "extern" implicitly.
这背后的想法是,函数和类/struct成员函数一次编译和组装,然后其他函数可以从一个地方调用该代码,而不需要重复。函数原型被隐式声明为“extern”。
/* Function prototype, usually found in headers. */
/* Implicitly 'extern', i.e the symbols is visible everywhere, not just locally.*/
int add(int, int);
/* function body, or function definition. */
int add(int a, int b)
{
return a + b;
}
If you want a function to be local for a translation unit, you define it as 'static'. What does this mean? It means that if you include source files with extern functions, you will get redefinition errors, because the compiler comes across the same implementation more than once. So, you want all your translation units to see the function prototype but not the function body.
如果你想要一个翻译单元的函数是本地的,你可以把它定义为“静态的”。这是什么意思?这意味着如果您包含带有extern函数的源文件,您将获得重新定义错误,因为编译器不止一次地遇到相同的实现。所以,你想让所有的翻译单位看到函数原型,而不是函数体。
So how does it all get mashed together at the end? That is the linker's job. A linker reads all the object files which is generated by the assembler stage and resolves symbols. As I said earlier, a symbol is just a name. For example, the name of a variable or a function. When translation units which call functions or declare types do not know the implementation for those functions or types, those symbols are said to be unresolved. The linker resolves the unresolved symbol by connecting the translation unit which holds the undefined symbol together with the one which contains the implementation. Phew. This is true for all externally visible symbols, whether they are implemented in your code, or provided by an additional library. A library is really just an archive with reusable code.
那么,这一切是如何在最后融合在一起的呢?那是林克的工作。链接器读取汇编程序阶段生成的所有对象文件并解析符号。正如我之前所说的,符号只是一个名字。例如,变量或函数的名称。当调用函数或声明类型的转换单元不知道这些函数或类型的实现时,这些符号被称为未解决。链接器通过连接包含未定义符号的翻译单元和包含实现的转换单元来解决未解决的符号。唷。这适用于所有外部可见的符号,无论它们是在代码中实现的,还是由附加库提供的。一个库实际上只是一个具有可重用代码的存档。
There are two notable exceptions. First, if you have a small function, you can make it inline. This means that the generated machine code does not generate an extern function call, but is literally concatenated in-place. Since they usually are small, the size overhead does not matter. You can imagine them to be static in the way they work. So it is safe to implement inline functions in headers. Function implementations inside a class or struct definition are also often inlined automatically by the compiler.
有两个明显的例外。首先,如果你有一个小函数,你可以使它内联。这意味着生成的机器代码不会生成一个extern函数调用,而是在适当的地方进行连接。由于它们通常很小,所以大小开销并不重要。你可以想象它们的工作方式是静态的。因此,在header中实现内联函数是安全的。类或结构定义中的函数实现通常也由编译器自动内联。
The other exception is templates. Since the compiler needs to see the whole template type definition when instantiating them, it is not possible to decouple the implementation from the definition as with standalone functions or normal classes. Well, perhaps this is possible now, but getting widespread compiler support for the "export" keyword took a long, long time. So without support for 'export', translation units get their own local copies of instantiated templated types and functions, similar to how inline functions work. With support for 'export', this is not the case.
另一个例外是模板。由于编译器在实例化它们时需要看到整个模板类型定义,所以不可能将实现与独立函数或普通类的定义分离开来。好吧,也许现在这是可能的,但是得到广泛的编译器对“export”关键字的支持花了很长时间。因此,在不支持“导出”的情况下,翻译单元会得到它们自己的实例化模板类型和函数的本地副本,类似于内联函数的工作方式。在支持“出口”的情况下,情况并非如此。
For the two exceptions, some people find it "nicer" to put the implementations of inline functions, templated functions and templated types in .cpp files, and then #include the .cpp file. Whether this is a header or a source file doesn't really matter; the preprocessor does not care and is just a convention.
对于这两个例外,有些人认为将内联函数的实现、模板化函数和模板类型放在.cpp文件中是“更好的”,然后#include .cpp文件。这是头文件还是源文件并不重要;预处理器并不关心,它只是一个约定。
A quick summary of the whole process from C++ code (several files) and to a final executable:
从c++代码(几个文件)到最终可执行文件的整个过程的快速总结:
- The preprocessor is run, which parses all the directives which starts with a '#'. The #include directive concatenates the included file with inferior, for example. It also does macro-replacement and token-pasting.
- 预处理器运行,它解析以#开头的所有指令。例如,#include指令将包含的文件与劣质文件连接在一起。它还进行宏观替换和标记粘贴。
- The actual compiler runs on the intermediate text file after the preprocessor stage, and emits assembler code.
- 实际的编译器在预处理阶段之后在中间文本文件上运行,并发出汇编代码。
- The assembler runs on the assembly file and emits machine code, this is usually called an object file and follows the binary executable format of the operative system in question. For example, Windows uses the PE (portable executable format), while Linux uses the Unix System V ELF format, with GNU extensions. At this stage, symbols are still marked as undefined.
- 汇编程序在程序集文件上运行并发出机器代码,这通常称为对象文件,并遵循操作系统的二进制可执行格式。例如,Windows使用PE(可移植的可执行格式),而Linux使用Unix系统V ELF格式,使用GNU扩展。在这个阶段,符号仍然被标记为未定义。
- Finally, the linker is run. All the previous stages were run on each translation unit in order. However, the linker stage works on all the generated object files which were generated by the assembler. The linker resolves symbols and does a lot of magic like creating sections and segments, which is dependent on the target platform and binary format. Programmers aren't required to know this in general, but it surely helps in some cases.
- 最后,链接器运行。之前的所有阶段都按顺序在每个翻译单元上运行。然而,链接器阶段在汇编程序生成的所有生成的对象文件上工作。链接器解析符号并做很多魔术,比如创建节和段,这依赖于目标平台和二进制格式。程序员一般不需要知道这一点,但在某些情况下肯定会有帮助。
Again, this was definetely more than you asked for, but I hope the nitty-gritty details helps you to see the bigger picture.
同样,这肯定比你所要求的要多,但我希望细节可以帮助你看到更大的图景。
#3
9
The typical solution is to use .h files for declarations only and .cpp files for implementation. If you need to reuse the implementation, you include the corresponding .h file in the .cpp file where the necessary class/function/whatever is used, and link against the already compiled .cpp file (either an .obj file, usually used within one project, or a .lib file, usually used for reuse across multiple projects). This way you don't need to recompile everything if only the implementation changes.
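A small sketch of that layout, with invented names (geometry.h, rectangle_area, floor_area). It is shown as a single listing so that it compiles as-is; in a real project each marked section would be its own file, and the commented-out #include lines would be live. The build commands assume GCC, where the object and library files are .o and .a rather than .obj and .lib.

```cpp
// ---- geometry.h : declarations only (hypothetical names) ----
#ifndef GEOMETRY_H
#define GEOMETRY_H
double rectangle_area(double width, double height);
#endif

// ---- geometry.cpp : the implementation, compiled once ----
// #include "geometry.h"
double rectangle_area(double width, double height) {
    return width * height;
}

// ---- user.cpp : a client that only ever sees the header ----
// #include "geometry.h"
double floor_area() {
    return rectangle_area(4.0, 2.5);
}

// Possible build steps with GCC:
//   g++ -c geometry.cpp -o geometry.o
//   ar rcs libgeometry.a geometry.o     // optional: bundle into a static library
//   g++ -c user.cpp -o user.o
//   g++ main.o user.o geometry.o -o app // main.o: whatever defines main()
```

If only the body of rectangle_area changes, only geometry.o has to be rebuilt before relinking.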
#4
6
Think of cpp files as a black box and the .h files as the guides on how to use those black boxes.
The cpp files can be compiled ahead of time. This doesn't work if you #include them, because the compiler then has to actually "include" their code in your program every time it compiles it. If you include only the header, the compiler can use the header to determine how to use the precompiled cpp file.
Although this won't make much of a difference for your first project, if you start writing large cpp programs, people are going to hate you because compile times are going to explode.
Also have a read of this: Header File Include Patterns
#5
6
Header files usually contain declarations of functions / classes, while .cpp files contain the actual implementations. At compile time, each .cpp file gets compiled into an object file (usually extension .o), and the linker combines the various object files into the final executable. The linking process is generally much faster than the compilation.
Benefits of this separation: If you are recompiling one of the .cpp files in your project, you don't have to recompile all the others. You just create the new object file for that particular .cpp file. The compiler doesn't have to look at the other .cpp files. However, if you want to call functions in your current .cpp file that were implemented in the other .cpp files, you have to tell the compiler what arguments they take; that is the purpose of including the header files.
Disadvantages: When compiling a given .cpp file, the compiler cannot 'see' what is inside the other .cpp files. So it doesn't know how the functions there are implemented, and as a result cannot optimize as aggressively. But I think you don't need to concern yourself with that just yet (:
#6
5
The basic idea is that headers are only included and cpp files are only compiled. This becomes more useful once you have many cpp files, when recompiling the whole application after modifying only one of them would be too slow, or when the functions in the files start depending on each other. So you should separate class declarations into header files, leave the implementation in cpp files, and write a Makefile (or something else, depending on what tools you are using) to compile the cpp files and link the resulting object files into a program.
#7
3
If you #include a cpp file in several other files in your program, its code is compiled once for each of them, and the build fails (typically at link time) because there are multiple definitions of the same functions.
Compilation will also take longer (which becomes a problem on large projects): any edit to an #included cpp file forces recompilation of every file that #includes it.
Just put your declarations into header files and include those (as they don't actually generate code per se), and the linker will hook up the declarations with the corresponding cpp code (which then only gets compiled once).
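As a rough illustration of the duplicate-definition problem and the header-based fix, consider the sketch below. All names (counter.h, next_id, a.cpp, b.cpp) are hypothetical, and the files are shown as one listing so it compiles as-is; the commented-out #include lines show what each real file would contain.

```cpp
// ---- counter.h (hypothetical) ----
#ifndef COUNTER_H
#define COUNTER_H
int next_id();   // a declaration: safe to repeat in many translation units
#endif

// ---- counter.cpp : the single definition, compiled once ----
// #include "counter.h"
int next_id() {
    static int id = 0;   // one shared counter for the whole program
    return ++id;
}

// ---- a.cpp ----
// #include "counter.h"     // fine: only the declaration is pulled in
// #include "counter.cpp"   // would give a.o its own copy of next_id()
int first_from_a() { return next_id(); }

// ---- b.cpp ----
// #include "counter.h"     // fine
// #include "counter.cpp"   // a second copy: the linker would report
//                          // a duplicate definition of next_id()
int first_from_b() { return next_id(); }
```

With the header approach both callers share the one next_id() in counter.o, so the ids keep counting up across files.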
#8
2
While it is certainly possible to do as you did, the standard practice is to put shared declarations into header files (.h), and definitions of functions and variables - implementation - into source files (.cpp).
As a convention, this helps make it clear where everything is, and makes a clear distinction between interface and implementation of your modules. It also means that you never have to check to see if a .cpp file is included in another, before adding something to it that could break if it was defined in several different units.
#9
2
Re-usability, architecture and data encapsulation.
Here's an example:
Say you create a cpp file that contains some simple string routines, all in a class mystring. You place the class declaration in mystring.h and compile mystring.cpp to a .obj file.
Now in your main program (e.g. main.cpp) you include the header and link with mystring.obj. To use mystring in your program you don't care about the details of how it is implemented, since the header says what it can do.
Now if a buddy wants to use your mystring class, you give him mystring.h and mystring.obj. He also doesn't need to know how it works, as long as it works.
Later, if you have more such .obj files, you can combine them into a .lib file and link to that instead.
You can also decide to change mystring.cpp and implement it more efficiently; this will not affect your main.cpp or your buddy's program.
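Something along these lines, as a sketch only: the member functions (length, upper, c_str) and the std::string storage are assumptions made for the example, not part of the answer above. The three files are shown as one listing so that it compiles as-is.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// ---- mystring.h : tells users what the class can do ----
class mystring {
public:
    explicit mystring(const char* text);
    std::size_t length() const;
    mystring upper() const;        // returns an upper-cased copy
    const char* c_str() const;
private:
    std::string data_;             // assumed storage, chosen for brevity
};

// ---- mystring.cpp : how it does it; compiled to mystring.obj / mystring.o ----
mystring::mystring(const char* text) : data_(text) {}

std::size_t mystring::length() const { return data_.size(); }

mystring mystring::upper() const {
    mystring result(*this);
    for (char& ch : result.data_) {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    return result;
}

const char* mystring::c_str() const { return data_.c_str(); }

// ---- main.cpp side : uses only what the header promises ----
std::size_t shout_length() {
    return mystring("hello").upper().length();
}
```

Rewriting the loop inside upper() changes only mystring.cpp, so main.cpp just needs to be relinked, not recompiled.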
#10
2
If it works for you then there is nothing wrong with it -- except that it will ruffle the feathers of people who think that there is only one way to do things.
Many of the answers given here address optimizations for large-scale software projects. These are good things to know about, but there is no point in optimizing a small project as if it were a large project -- that is what is known as "premature optimization". Depending on your development environment, there may be significant extra complexity involved in setting up a build configuration to support multiple source files per program.
If, over time, your project evolves and you find that the build process is taking too long, then you can refactor your code to use multiple source files for faster incremental builds.
Several of the answers discuss separating interface from implementation. However, this is not an inherent feature of include files, and it is quite common to #include "header" files that directly incorporate their implementation (even the C++ Standard Library does this to a significant degree).
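For instance, a header that carries its own implementation might look like the sketch below. The file and function names (clamp_util.h, clamp_to, percent) are invented for illustration.

```cpp
// ---- clamp_util.h (hypothetical header-only utility) ----
#ifndef CLAMP_UTIL_H
#define CLAMP_UTIL_H

// Templates are normally defined in headers, because the compiler needs the
// full body wherever they are instantiated.
template <typename T>
T clamp_to(T value, T low, T high) {
    return value < low ? low : (high < value ? high : value);
}

// `inline` lets this definition appear in every translation unit that
// includes the header without causing a multiple-definition link error.
inline int percent(int part, int whole) {
    return whole == 0 ? 0 : (100 * part) / whole;
}

#endif
```

The trade-off is the one discussed in the other answers: every file that includes the header recompiles these bodies whenever they change.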
The only thing truly "unconventional" about what you have done was naming your included files ".cpp" instead of ".h" or ".hpp".
#11
1
When you compile and link a program, the compiler first compiles the individual cpp files, and then the linker links (connects) the resulting object files. Headers never get compiled on their own, only as part of a cpp file that includes them.
Typically headers hold declarations and cpp files hold the implementation. In a header you define the interface of a class or function but leave out how it is actually implemented. This way you don't have to recompile every cpp file when you change the implementation in one of them.
#12
1
I suggest you read Large-Scale C++ Software Design by John Lakos. In college we usually write small projects where we do not come across such problems. The book highlights the importance of separating interfaces from implementations.
Header files usually hold interfaces, which are not supposed to change very often. Similarly, a look at patterns such as the Virtual Constructor idiom will help you grasp the concept further.
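For reference, the Virtual Constructor idiom is usually written as a virtual clone() function. The sketch below is one common way to do it; the class names (Shape, Circle) and the duplicate helper are made up for the example.

```cpp
#include <memory>
#include <string>

// Interface: the part that would live in a header and rarely change.
class Shape {
public:
    virtual ~Shape() = default;
    // Acts as a "virtual copy constructor": copies the dynamic type.
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

// Implementation: could sit in its own .cpp file.
class Circle : public Shape {
public:
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Circle(*this));
    }
    std::string name() const override { return "circle"; }
};

// Client code copies an object through the interface alone, without
// knowing which concrete class it is dealing with.
std::unique_ptr<Shape> duplicate(const Shape& original) {
    return original.clone();
}
```

The point for this thread is that duplicate() depends only on the Shape interface, so Circle can change freely behind it.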
I am still learning like you :)
#13
1
It's like writing a book: you want to print out finished chapters only once.
Say you are writing a book. If you put the chapters in separate files then you only need to print out a chapter if you have changed it. Working on one chapter doesn't change any of the others.
But including the cpp files is, from the compiler's point of view, like editing all of the chapters of the book in one file. Then if you change it you have to print all the pages of the entire book in order to get your revised chapter printed. There is no "print selected pages" option in object code generation.
Back to software: I have Linux and Ruby src lying around. A rough measure of lines of code...
                      Linux         Ruby
core functionality    100,000       100,000    (just kernel/*, ruby top level dir)
everything            10,000,000    200,000
Any one of those four categories has a lot of code, hence the need for modularity. This kind of code base is surprisingly typical of real-world systems.