Header file inclusion static analysis tools?

Time: 2021-11-28 15:08:18

A colleague recently revealed to me that a single source file of ours includes over 3,400 headers during compile time. We have over 1,000 translation units that get compiled in a build, resulting in a huge performance penalty over headers that surely aren't all used.

Are there any static analysis tools that would be able to shed light on the trees in such a forest, specifically giving us the ability to decide which ones we should work on paring out?

UPDATE

Found some interesting information on the cost of including a header file (and the types of include guards to optimize its inclusion) here, originating from this question.

8 answers

#1


24  

The output of gcc -w -H <file> might be useful (if you parse it and add some counts). The -w is there to suppress all warnings, which might be awkward to deal with.

From the gcc docs:

-H

Print the name of each header file used, in addition to other normal activities. Each name is indented to show how deep in the #include stack it is. Precompiled header files are also printed, even if they are found to be invalid; an invalid precompiled header file is printed with ...x and a valid one with ...!.

The output looks like this:

. /usr/include/unistd.h
.. /usr/include/features.h
... /usr/include/bits/predefs.h
... /usr/include/sys/cdefs.h
.... /usr/include/bits/wordsize.h
... /usr/include/gnu/stubs.h
.... /usr/include/bits/wordsize.h
.... /usr/include/gnu/stubs-64.h
.. /usr/include/bits/posix_opt.h
.. /usr/include/bits/environments.h
... /usr/include/bits/wordsize.h
.. /usr/include/bits/types.h
... /usr/include/bits/wordsize.h
... /usr/include/bits/typesizes.h
.. /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.. /usr/include/bits/confname.h
.. /usr/include/getopt.h
. /usr/include/stdio.h
.. /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.. /usr/include/libio.h
... /usr/include/_G_config.h
.... /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.... /usr/include/wchar.h
... /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stdarg.h
.. /usr/include/bits/stdio_lim.h
.. /usr/include/bits/sys_errlist.h
Multiple include guards may be useful for:
/usr/include/bits/confname.h
/usr/include/bits/environments.h
/usr/include/bits/predefs.h
/usr/include/bits/stdio_lim.h
/usr/include/bits/sys_errlist.h
/usr/include/bits/typesizes.h
/usr/include/gnu/stubs-64.h
/usr/include/gnu/stubs.h
/usr/include/wchar.h
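
The "parse it and add some counts" step could be sketched roughly as follows. This is only an illustration in Python: the function name summarize_h_output and the shortened sample text are assumptions made for the example, and it assumes the plain "dots, space, path" line format shown above.

```python
import re
from collections import Counter

# One line of `gcc -H` output: dots (nesting depth), a space, then the path.
HEADER_LINE = re.compile(r"^(\.+) (.+)$")

def summarize_h_output(text):
    """Return (inclusion count per header, maximum nesting depth)."""
    counts = Counter()
    max_depth = 0
    for line in text.splitlines():
        match = HEADER_LINE.match(line)
        if match is None:
            # Skips the "Multiple include guards..." trailer and other output.
            continue
        depth, path = len(match.group(1)), match.group(2)
        counts[path] += 1
        max_depth = max(max_depth, depth)
    return counts, max_depth

# Shortened, hypothetical sample in the same shape as the listing above.
sample = """\
. /usr/include/unistd.h
.. /usr/include/features.h
... /usr/include/sys/cdefs.h
.... /usr/include/bits/wordsize.h
.. /usr/include/bits/types.h
... /usr/include/bits/wordsize.h
Multiple include guards may be useful for:
/usr/include/bits/typesizes.h
"""

counts, max_depth = summarize_h_output(sample)
for path, n in counts.most_common(3):
    print(n, path)
print("distinct headers:", len(counts), "max depth:", max_depth)
```

gcc writes the -H listing to stderr, so capture that stream (for example with 2> includes.txt) before feeding it to a script like this.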

#2


3  

If you are using gcc/g++, the -M or -MM option will output a line with the information you seek. (The former will include system headers while the latter will not. There are other variants; see the manual.)

$ gcc -M -c foo.c
foo.o: foo.c /usr/include/stdint.h /usr/include/features.h \
  /usr/include/sys/cdefs.h /usr/include/bits/wordsize.h \
  /usr/include/gnu/stubs.h /usr/include/gnu/stubs-64.h \
  /usr/include/bits/wchar.h

You would need to remove the foo.o: foo.c at the beginning, but the rest is a list of all headers that the file depends on, so it would not be too hard to write a script to gather these and summarize them.

Of course this suggestion is only useful on Unix and only if nobody else has a better idea. :-)
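
A script along those lines might start like this. It is a rough sketch, not a robust parser: headers_from_depfile is a made-up name, and it assumes a single make-style rule whose paths contain no spaces or colons.

```python
def headers_from_depfile(text):
    """Extract header paths from one make-style rule printed by `gcc -M`."""
    # Join the backslash-continued lines into one logical line.
    joined = text.replace("\\\n", " ")
    # Everything after the first colon is the prerequisite list.
    _target, _, prerequisites = joined.partition(":")
    paths = prerequisites.split()
    # The first prerequisite is the source file itself; the rest are headers.
    return paths[1:]

# The rule from the example above.
sample = (
    "foo.o: foo.c /usr/include/stdint.h /usr/include/features.h \\\n"
    "  /usr/include/sys/cdefs.h /usr/include/bits/wordsize.h \\\n"
    "  /usr/include/gnu/stubs.h /usr/include/gnu/stubs-64.h \\\n"
    "  /usr/include/bits/wchar.h\n"
)

headers = headers_from_depfile(sample)
print(len(headers), "headers")
for path in headers:
    print(path)
```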

#3


3  

A few things:

  • Use "preprocess only" mode to look at your preprocessor output: the gcc -E option. Other compilers have an equivalent.

  • Use precompiled headers.

  • gcc has the -v (verbose) and -H options; -H displays the full include tree. MSVC has the /showIncludes option, found under the Advanced C++ property page.

Also, Displaying the #include hierarchy for a C++ file in Visual Studio
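
For the MSVC route, the /showIncludes listing can be turned into (depth, path) pairs with a few lines of Python. This is a sketch under assumptions: it expects the English "Note: including file:" prefix (other language versions of the compiler print a localized prefix), it treats the number of spaces after the prefix as the nesting depth, and the paths in the sample are invented.

```python
PREFIX = "Note: including file:"

def parse_show_includes(text):
    """Return (depth, path) pairs from MSVC /showIncludes output."""
    entries = []
    for line in text.splitlines():
        if not line.startswith(PREFIX):
            continue  # ordinary compiler output, not an include note
        rest = line[len(PREFIX):]
        path = rest.lstrip(" ")
        # Number of spaces after the prefix is taken as the nesting depth.
        depth = len(rest) - len(path)
        entries.append((depth, path))
    return entries

# Hypothetical sample output; the paths are made up for illustration.
sample = r"""foo.cpp
Note: including file: C:\inc\stdio.h
Note: including file:  C:\inc\corecrt.h
Note: including file: C:\proj\util.h
"""

entries = parse_show_includes(sample)
for depth, path in entries:
    print("  " * (depth - 1) + path)
```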

#4


1  

GCC has a -M flag that will output a list of dependencies for a given source file. You could use that information to figure out which of your files have the most dependencies, which files are most depended on, etc.

Check out the man page for more information. There are several variants of -M.
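
Once the dependency lists are collected per source file, ranking them is straightforward. The sketch below assumes you have already built a dictionary mapping each source file to its headers (for instance from -M or -MM output); rank_dependencies and the file names are hypothetical.

```python
from collections import Counter

def rank_dependencies(deps_by_source):
    """deps_by_source maps each source file to the headers it pulls in."""
    # Distinct headers per translation unit, heaviest first.
    heaviest = sorted(
        ((len(set(headers)), source) for source, headers in deps_by_source.items()),
        reverse=True,
    )
    # Number of translation units each header appears in, most widespread first.
    usage = Counter()
    for headers in deps_by_source.values():
        usage.update(set(headers))
    return heaviest, usage.most_common()

# Made-up project data for illustration.
deps = {
    "a.cpp": ["util.h", "config.h", "widget.h"],
    "b.cpp": ["util.h", "config.h"],
    "c.cpp": ["util.h"],
}

heaviest, most_used = rank_dependencies(deps)
print("most dependencies:", heaviest[0])
print("most depended on:", most_used[0])
```

Headers near the top of the second list are the ones whose own includes are worth trimming first, since every cut there is multiplied across many translation units.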

#5


1  

"Large-Scale C++ Software Design" by John Lakos came with tools that extract the compile-time dependencies among source files.

Unfortunately, their repository on Addison-Wesley's site is gone (along with AW's site itself), but I found a tarball here: http://prdownloads.sourceforge.net/introspector/LSC-rpkg-0.1.tgz?download

I found it useful several jobs ago, and it has the virtue of being free.

BTW, if you haven't read Lakos's book, it sounds like your project would benefit. (The current edition is a bit dated, but I hear that Lakos has another book coming out in 2012.)

#6


1  

Personally I don't know if there is a tool that will say "Remove this file". It's really a complex matter that depends on a lot of things. Looking at a tree of include statements is surely going to drive you nuts.... It would drive me crazy, as well as ruin my eyes. There are better ways to do things to reduce your compile times.

  1. De-inline your class methods.
  2. After de-inlining them, re-examine your include statements and attempt to remove them. It usually helps to delete them all and start over.
  3. Prefer forward declarations as much as possible. If you de-inline the methods in your header files, you can do this a lot.
  4. Break up large header files into smaller files. If a class in a file is used more often than most, put it in a header file all by itself.
  5. 1,000 translation units is actually not very many. We have between 10 and 20 thousand. :)
  6. Get Incredibuild if your compile times are still too long.

#7


0  

I've heard there are some tools that do this, but I don't use them.

I created a tool, https://sourceforge.net/p/headerfinder, which may be useful. Unfortunately it is a "home-made" tool with the following issues:

  • Developed in VB.NET.
  • The source code needs to be compiled.
  • Very slow and consumes a lot of memory.
  • No help available.

#8


0  

GCC has a flag (-save-temps) that saves intermediate files. These include .ii files (.i for C sources), which are the output of the preprocessor, before compilation proper. You can write a script to parse them and determine the weight/cost/size of what is included, as well as the dependency tree.

I wrote a Python script to do just this (publicly available here: https://gitlab.com/p_b_omta/gcc-include-analyzer).
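
To give an idea of what such parsing involves (this is not the linked script, only a minimal sketch), the preprocessed file contains linemarkers of the form # <line> "<file>" <flags>, and the lines between markers can be attributed to the most recently named file. The function name lines_per_file and the sample text are assumptions for the example.

```python
import re
from collections import Counter

# GCC linemarker: '# <line number> "<file name>" <optional flags>'
LINEMARKER = re.compile(r'^# (\d+) "([^"]*)"((?: \d)*)\s*$')

def lines_per_file(preprocessed_text):
    """Count non-blank preprocessed lines attributed to each file."""
    counts = Counter()
    current = None
    for line in preprocessed_text.splitlines():
        marker = LINEMARKER.match(line)
        if marker:
            # Following lines belong to this file until the next marker.
            current = marker.group(2)
            continue
        if current is not None and line.strip():
            counts[current] += 1
    return counts

# Tiny hand-written sample in the shape of a .ii file.
sample = '''\
# 1 "foo.cpp"
# 1 "util.h" 1
int helper(int);
int other(int);
# 2 "foo.cpp" 2

int main() { return helper(0); }
'''

weights = lines_per_file(sample)
for path, n in weights.most_common():
    print(n, path)
```

Line count is only a crude proxy for cost, and pseudo-files such as <built-in> will show up in real output, so treat the numbers as a starting point for investigation.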
