I am building a lot of auto-generated code, including one particularly large file (~15K lines), using a mingw32 cross compiler on Linux. Most files compile extremely quickly, but this one large file takes an unexpectedly long time (~15 minutes) to compile.
I have tried manipulating various optimization flags to see if they had any effect, without any luck. What I really need is some way of determining what g++ is doing that is taking so long. Are there any (relatively simple) ways to have g++ generate output about different phases of compilation, to help me narrow down what the hang-up might be?
Sadly, I do not have the ability to rebuild this cross-compiler, so adding debugging information to the compiler and stepping through it is not a possibility.
What's in the file:
- a bunch of includes
- a bunch of string comparisons
- a bunch of if-then checks and constructor invocations
The file is a factory for producing a ton of different specific subclasses of a certain parent class. Most of the includes, however, are nothing terribly fancy.
The results of -ftime-report, as suggested by Neil Butterworth, indicate that the "life analysis" phase is taking 921 seconds, which takes up most of the 15 minutes.
It appears that this takes place during data flow analysis. The file itself is a bunch of conditional string comparisons, constructing an object by class name provided as a string.
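For readers who have not seen this pattern, here is a minimal sketch of the shape described above. The class names (`Widget`, `Button`, `Slider`) and the function `createByName` are hypothetical stand-ins, not taken from the original generated file, and the sketch assumes C++11 or later. The real file presumably had thousands of branches in a single function, which is the kind of input that can make per-function data flow passes very slow.

```cpp
#include <memory>
#include <string>

// Hypothetical parent class and two subclasses, for illustration only.
struct Widget {
    virtual ~Widget() = default;
    virtual std::string kind() const = 0;
};
struct Button : Widget {
    std::string kind() const override { return "Button"; }
};
struct Slider : Widget {
    std::string kind() const override { return "Slider"; }
};

// One big function: every supported class adds another string comparison
// and another constructor call to the same function body.
std::unique_ptr<Widget> createByName(const std::string& name) {
    if (name == "Button") {
        return std::unique_ptr<Widget>(new Button());
    } else if (name == "Slider") {
        return std::unique_ptr<Widget>(new Slider());
    }
    // ... the generated file would continue like this for every class ...
    return nullptr;  // unknown class name
}
```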
We think changing this to a lookup in a map from names to function pointers might improve things a bit, so we're going to try that.
Indeed, generating a bunch of factory functions (one per object type) and creating a map from the string name of the object to a pointer to its factory function reduced compile time from the original 15 minutes to about 25 seconds, which will save everyone tons of time on their builds.
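A rough sketch of that restructuring, under the same caveats: `Widget`, `Button`, `Slider`, `makeButton`, `makeSlider`, `registry` and `createByName` are hypothetical names, the code assumes C++11 or later, and the actual generated code may well have populated its map differently. The point is only that each constructor call now lives in its own tiny function, so no single function body grows with the number of classes.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical parent class and two subclasses, for illustration only.
struct Widget {
    virtual ~Widget() = default;
    virtual std::string kind() const = 0;
};
struct Button : Widget {
    std::string kind() const override { return "Button"; }
};
struct Slider : Widget {
    std::string kind() const override { return "Slider"; }
};

// One small factory function per class.
std::unique_ptr<Widget> makeButton() { return std::unique_ptr<Widget>(new Button()); }
std::unique_ptr<Widget> makeSlider() { return std::unique_ptr<Widget>(new Slider()); }

// Pointer to a factory function taking no arguments.
typedef std::unique_ptr<Widget> (*FactoryFn)();

// Map from class name to factory function, built once on first use.
const std::map<std::string, FactoryFn>& registry() {
    static const std::map<std::string, FactoryFn> table = {
        {"Button", &makeButton},
        {"Slider", &makeSlider},
    };
    return table;
}

// Look the name up and call the matching factory; null if the name is unknown.
std::unique_ptr<Widget> createByName(const std::string& name) {
    std::map<std::string, FactoryFn>::const_iterator it = registry().find(name);
    if (it == registry().end()) {
        return nullptr;
    }
    return it->second();
}
```

As a side benefit, the lookup is logarithmic in the number of classes instead of a linear run of string comparisons, though the compile-time improvement was the goal here.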
Thanks again to Neil Butterworth for the tip about -ftime-report.
8 Solutions
#1
25
It won't give all the details you want, but try running with the -v (verbose) and -ftime-report flags. The latter produces a summary of how much time the compiler spent in each phase.
#2
3
It most probably pulls in tons of includes. I believe -MD will list all the files included by a given .cpp file, including includes of includes and so forth (it writes them to a .d dependency file as a side effect of compiling).
#3
2
In general, what slows g++ down is templates. Boost, for example, loves to use them. That means nice code and great performance, but slow compilation.
On the other hand, 15 minutes seems extremely long. After some quick googling, it seems that this is a common problem with mingw.
#4
2
I'd use #if 0 / #endif to eliminate large portions of the source file from compilation. Repeat with different blocks of code until you pinpoint which block(s) are slow. For starters, you can see if your #includes are the problem by using #if 0 / #endif to exclude everything but the #includes.
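A small sketch of what one round of that bisection might look like. The function names (`block_a`, `block_b`, `block_c`, `enabled_total`) are made up for illustration; in practice each block would be a large chunk of the generated file, and you would time the compile after each change.

```cpp
// Block A: kept in the build for this timing run.
int block_a() { return 1; }

#if 0  // Block B: excluded for this run; change to "#if 1" to restore it.
int block_b() { return 2; }
#endif

// Block C: kept in the build for this timing run.
int block_c() { return 3; }

// Only code that is still enabled may be referenced from enabled code.
int enabled_total() { return block_a() + block_c(); }
```

If the compile becomes fast with block B disabled, the slow part is somewhere in B, and you can split B further. Anything that calls into a disabled block has to be disabled along with it, otherwise the file will not compile.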
#5
1
Another approach is to add "progress marker" pragmas to your code to trap the portion of the code that is taking a long time. The Visual Studio compiler provides #pragma message(), although there is no standard pragma for doing this.
Put one marker at the beginning of the code and another at the end. The end marker could be an #error, since you don't care about the remainder of the source file. Move the markers accordingly to trap the section of code taking the longest time.
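Here is a hedged sketch of the marker idea. It assumes a compiler that understands `#pragma message` (Visual Studio does, and so do recent GCC releases, but an older cross compiler may not). The macro `STOP_AFTER_SECTION_ONE` and the function names are hypothetical. Note that the markers are printed while the file is being parsed, so they mostly help when the time is going into the front end rather than into later optimization passes.

```cpp
#pragma message("marker 1: start of the section under test")

int section_one() { return 1; }

#pragma message("marker 2: end of section one")

// Define STOP_AFTER_SECTION_ONE on the command line to abort here on purpose;
// the rest of the file is then never processed.
#ifdef STOP_AFTER_SECTION_ONE
#error "Stopping here on purpose: the remainder of the file is not needed for this run."
#endif

int section_two() { return 2; }
```

Watching how long the compiler sits between the two messages gives a rough idea of whether the section between them is the expensive one.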
Just a thought...
#6
0
Related to @Goz and @Josh_Kelley, you can get gcc/g++ to spit out the preprocessed source (with #includes inline) using -E. That's one way to determine just how large your source is.
And if the compiler itself is the problem, you may be able to strace the compile command that's taking a long time to see whether there's a particular file access or a specific internal action that's taking a long time.
#7
0
What the compiler sees is the output of the preprocessor, so the size of the individual source file is not a good measure. You have to consider the source and all the files it includes, the files they include, and so on. Instantiating a template for multiple types generates code for each separate type used, so that could end up being a lot of code, for example if you have made extensive use of STL containers for many classes.
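To make the template point concrete, here is a tiny illustration (the names `count_items` and `use_all` are invented for this example). The single template below is instantiated three times, once per element type, and each `std::vector<T>` used is itself a separate instantiation of the library's container code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One template in the source...
template <typename T>
std::size_t count_items(const std::vector<T>& items) {
    return items.size();
}

// ...but three separate instantiations for the compiler to generate:
// count_items<int>, count_items<double> and count_items<std::string>.
std::size_t use_all() {
    std::vector<int> a(1);
    std::vector<double> b(2);
    std::vector<std::string> c(3);
    return count_items(a) + count_items(b) + count_items(c);
}
```

With hundreds of classes each stored in their own containers, this multiplication can dominate the amount of code the compiler actually has to process.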
15K lines in one source file is rather a lot, but even if it is split up, all that code still needs to be compiled; however, using an incremental build may mean that it does not all need compiling every time. There really is no need for a file that large; it's just poor practice/design. I start thinking about better modularisation when a file gets to 500 lines (although I am not dogmatic about it).
#8
0
One thing to watch during the compile is how much memory your computer has free. If the compiler allocates so much memory that the computer starts swapping, compile time will go way, way up.
If you see that happen, an easy solution is to install more RAM... or just split the file into multiple parts that can be compiled separately.