How can I know which parts of the code are never used?

Time: 2021-10-15 06:33:20

I have legacy C++ code that I'm supposed to remove unused code from. The problem is that the code base is large.

How can I find out which code is never called/never used?

18 answers

#1


190  

There are two varieties of unused code:

  • the local kind: within a function, some paths or variables are unused (or used in no meaningful way, such as written but never read)
  • the global kind: functions that are never called, global objects that are never accessed

For the first kind, a good compiler can help:

  • -Wunused (GCC, Clang) should warn about unused variables; Clang's unused-variable analysis has even been extended to warn about variables that are never read (even though they are written).
  • -Wunreachable-code (older GCC, removed in 2010) should warn about local blocks that are never reached (this happens with early returns or conditions that always evaluate to true)
  • there is no option I know of to warn about unused catch blocks, because the compiler generally cannot prove that no exception will be thrown.
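As a rough illustration of the local kind, the sketch below contains the patterns these warnings are meant to catch. The file name and all identifiers are made up for this example, and the exact set of warnings depends on the compiler and its version.

```cpp
// unused_demo.cpp (hypothetical file name)
// Compile only, with warnings enabled:
//   g++ -Wall -Wextra -c unused_demo.cpp

// Static function that is defined but never called in this file:
// a candidate for -Wunused-function.
static int never_called(int x) {
    return x * 2;
}

int compute(int a, int b) {
    int unused_local = 42;  // declared, never used: -Wunused-variable
    int written_only;       // assigned below but never read
    written_only = a + b;   // -Wunused-but-set-variable, where supported
    return a * b;
}
```

None of these warnings say anything about whether compute() itself is ever called; that is the global kind of unused code.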

The second kind is much more difficult. Statically, it requires whole-program analysis, and even though link-time optimization may actually remove dead code, in practice the program has been transformed so much by the time it is performed that it is nearly impossible to convey meaningful information to the user.

There are therefore two approaches:

  • The theoretical one is to use a static analyzer: a piece of software that examines the whole code at once in great detail and finds all the flow paths. In practice I don't know of any that would work here.
  • The pragmatic one is to use a heuristic: use a code coverage tool (in the GNU toolchain it's gcov; note that specific flags must be passed during compilation for it to work properly). You run the code coverage tool with a good set of varied inputs (your unit tests or non-regression tests); the dead code is necessarily within the unreached code, so you can start from there.
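A minimal sketch of the coverage approach, assuming GCC and gcov are installed. The file and function names are hypothetical, and a separate test_driver.cpp (not shown) is assumed to call classify(); --coverage is GCC's shorthand for the instrumentation flags.

```cpp
// coverage_demo.cpp (hypothetical file name)
//
// Compile with instrumentation, link, run the tests, then ask gcov for an
// annotated listing of this file:
//   g++ --coverage -O0 -c coverage_demo.cpp test_driver.cpp
//   g++ --coverage coverage_demo.o test_driver.o -o demo
//   ./demo
//   gcov coverage_demo.cpp
#include <string>

// If the test inputs never include a negative number, the first return
// is never executed and is reported as unexecuted in the gcov listing.
std::string classify(int n) {
    if (n < 0) {
        return "negative";
    }
    if (n == 0) {
        return "zero";
    }
    return "positive";
}
```

Lines marked ##### in the resulting .gcov file were never executed. They are candidates for dead code, not proof of it: a line may simply be missing a test.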

If you are extremely interested in the subject, and have the time and inclination to actually build a tool yourself, I would suggest using the Clang libraries to do so.

  1. Use the Clang library to get an AST (abstract syntax tree)
  2. Perform a mark-and-sweep analysis from the entry points onward

Because Clang will parse the code for you and perform overload resolution, you won't have to deal with the C++ language rules, and you'll be able to concentrate on the problem at hand.

However, this kind of technique cannot identify unused virtual overrides, since they could be called by third-party code you cannot reason about.
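The mark-and-sweep step can be sketched independently of Clang. The example below works on a toy call graph held in a map; extracting that graph from the AST is the part Clang would provide and is not shown. The CallGraph type and both function names are assumptions made for this sketch, and every function is assumed to have its own key in the map.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Caller name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Mark phase: collect every function reachable from the entry points.
std::set<std::string> reachable(const CallGraph& graph,
                                const std::vector<std::string>& entry_points) {
    std::set<std::string> marked;
    std::queue<std::string> work;
    for (const auto& entry : entry_points) {
        if (marked.insert(entry).second) work.push(entry);
    }
    while (!work.empty()) {
        const std::string current = work.front();
        work.pop();
        const auto it = graph.find(current);
        if (it == graph.end()) continue;  // external function, no known callees
        for (const auto& callee : it->second) {
            if (marked.insert(callee).second) work.push(callee);
        }
    }
    return marked;
}

// Sweep phase: every function in the graph that was never marked.
std::vector<std::string> unreachable(const CallGraph& graph,
                                     const std::vector<std::string>& entry_points) {
    const std::set<std::string> marked = reachable(graph, entry_points);
    std::vector<std::string> dead;
    for (const auto& node : graph) {
        if (marked.count(node.first) == 0) dead.push_back(node.first);
    }
    return dead;
}
```

Note that a function called only by dead functions is itself reported as dead, which a simple "has zero callers" check would miss.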

#2


31  

For the case of unused whole functions (and unused global variables), GCC can actually do most of the work for you, provided that you're using GCC and GNU ld.

When compiling the source, use -ffunction-sections and -fdata-sections, then when linking use -Wl,--gc-sections,--print-gc-sections. The linker will now list all the functions that could be removed because they were never called and all the globals that were never referenced.

(Of course, you can also skip the --print-gc-sections part and let the linker remove the functions silently, but keep them in the source.)

Note: this will only find unused complete functions; it won't do anything about dead code within functions. Functions called from dead code in live functions will also be kept around.
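To try this on something small, a translation unit like the following can be used. The file and function names are invented for illustration, and a separate main.cpp (not shown) is assumed to call only used_helper().

```cpp
// sections_demo.cpp (hypothetical file name)
//
// Put each function and global in its own section, then let the linker
// discard and report the unreferenced ones:
//   g++ -ffunction-sections -fdata-sections -c sections_demo.cpp main.cpp
//   g++ -Wl,--gc-sections,--print-gc-sections sections_demo.o main.o -o demo
//
// main.cpp (not shown) is assumed to call only used_helper().

int unused_global = 7;  // never referenced: its data section can be discarded

int used_helper(int x) {
    return x + 1;
}

// Never called from anywhere: with -ffunction-sections it gets its own
// section, which the linker is free to discard and report.
int unused_function(int x) {
    return x * 100;
}
```

The linker reports section names, which contain mangled C++ symbol names, so piping the output through a demangler such as c++filt may make it easier to read.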

Some C++-specific features will also cause problems, in particular:

  • Virtual functions. Without knowing which subclasses exist and which are actually instantiated at run time, you can't know which virtual functions need to exist in the final program. The linker doesn't have enough information about that, so it will have to keep all of them around.
  • Globals with constructors, and their constructors. In general, the linker can't know that the constructor for a global doesn't have side effects, so it must run it. Obviously this means the global itself also needs to be kept.

In both cases, anything used by a virtual function or a global-variable constructor also has to be kept around.

An additional caveat is that if you're building a shared library, the default settings in GCC will export every function in the shared library, causing it to be "used" as far as the linker is concerned. To fix that, you need to set the default to hiding symbols instead of exporting them (using e.g. -fvisibility=hidden), and then explicitly select the functions that you need to export.

#3


25  

Well, if you are using g++, you can use the flag -Wunused

According to the documentation:

Warn whenever a variable is unused aside from its declaration, whenever a function is declared static but never defined, whenever a label is declared but not used, and whenever a statement computes a result that is explicitly not used.

http://docs.freebsd.org/info/gcc/gcc.info.Warning_Options.html

Edit: Here is another useful flag, -Wunreachable-code. According to the documentation:

This option is intended to warn when the compiler detects that at least a whole line of source code will never be executed, because some condition is never satisfied or because it is after a procedure that never returns.

Update: I found a similar topic: Dead code detection in legacy C/C++ project

#4


18  

I think you are looking for a code coverage tool. A code coverage tool will analyze your code as it is running, and it will let you know which lines of code were executed and how many times, as well as which ones were not.

You could try giving this open source code coverage tool a chance: TestCocoon - code coverage tool for C/C++ and C#.

#5


15  

The real answer here is: You can never really know for sure.

At least, for nontrivial cases, you can't be sure you've gotten all of it. Consider the following from Wikipedia's article on unreachable code:

double x = sqrt(2);
if (x > 5)
{
  doStuff();
}

As Wikipedia correctly notes, a clever compiler may be able to catch something like this. But consider a modification:

int y;
cin >> y;
double x = sqrt((double)y);

if (x != 0 && x < 1)
{
  doStuff();
}

Will the compiler catch this? Maybe. But to do that, it will need to do more than run sqrt against a constant scalar value. It will have to figure out that (double)y will always be an integer (easy), and then understand the mathematical range of sqrt for the set of integers (hard). A very sophisticated compiler might be able to do this for the sqrt function, or for every function in math.h, or for any fixed-input function whose domain it can figure out. This gets very, very complex, and the complexity is basically limitless. You can keep adding layers of sophistication to your compiler, but there will always be a way to sneak in some code that is unreachable for any given set of inputs.

And then there are the input sets that simply never get entered: input that would make no sense in real life, or that gets blocked by validation logic elsewhere. There's no way for the compiler to know about those.

The end result of this is that while the software tools others have mentioned are extremely useful, you're never going to know for sure that you caught everything unless you go through the code manually afterward. Even then, you'll never be certain that you didn't miss anything.

The only real solution, IMHO, is to be as vigilant as possible, use the automation at your disposal, refactor where you can, and constantly look for ways to improve your code. Of course, it's a good idea to do that anyway.

#6


11  

I haven't used it myself, but cppcheck claims to find unused functions. It probably won't solve the complete problem, but it might be a start.

#7


9  

You could try using PC-lint/FlexeLint from Gimpel Software. It claims to

find unused macros, typedef's, classes, members, declarations, etc. across the entire project

I've used it for static analysis and found it very good, but I have to admit that I have not used it specifically to find dead code.

#8


4  

My normal approach to finding unused stuff is

  1. make sure the build system handles dependency tracking correctly
  2. set up a second monitor, with a full-screen terminal window, running repeated builds and showing the first screenful of output. watch "make 2>&1" tends to do the trick on Unix.
  3. run a find-and-replace operation on the entire source tree, adding "//? " at the beginning of every line
  4. fix the first error flagged by the compiler, by removing the "//?" in the corresponding lines.
  5. Repeat until there are no errors left.

This is a somewhat lengthy process, but it does give good results.

#9


4  

Mark as many public functions and variables as possible as private or protected without causing compilation errors, and while doing this, try to refactor the code as well. By making functions private, and to some extent protected, you reduce your search area, since private functions can only be called from the same class (unless there are stupid macros or other tricks to circumvent access restrictions, and if that's the case I'd recommend you find a new job). It is much easier to determine that you don't need a private function, since only the class you're currently working on can call it. This method is easier if your code base has small classes and is loosely coupled. If your code base does not have small classes or is very tightly coupled, I suggest cleaning those up first.

Next, mark all the remaining public functions and make a call graph to figure out the relationships between the classes. From this tree, try to figure out which parts of the branches look like they can be trimmed.

The advantage of this method is that you can do it on a per-module basis, so it is easy to keep your unit tests passing without long periods of time with a broken code base.

#10


3  

If you are on Linux, you may want to look into callgrind, a C/C++ program analysis tool that is part of the valgrind suite, which also contains tools that check for memory leaks and other memory errors (which you should be using as well). It analyzes a running instance of your program and produces data about its call graph and about the performance costs of nodes on the call graph. It is usually used for performance analysis, but it also produces a call graph for your application, so you can see which functions are called, as well as their callers.

This is obviously complementary to the static methods mentioned elsewhere on the page, and it will only be helpful for eliminating wholly unused classes, methods, and functions. It will not help find dead code inside methods which are actually called.

#11


3  

I really haven't used any tool that does such a thing... But, as far as I've seen in all the answers, no one has said that this problem is uncomputable.

What do I mean by this? That this problem cannot be solved by any algorithm on a computer, ever. This theorem (that such an algorithm doesn't exist) is a corollary of Turing's Halting Problem.

All the tools you will use are not algorithms but heuristics (i.e. not exact algorithms). They will not give you exactly all the code that's not used.
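One way to see why no tool can be exact: in the sketch below (all names invented here), whether cleanup() is reached for every starting value depends on whether the loop always terminates. That is the Collatz conjecture, which is still an open problem. A tool that decided reachability exactly in the general case would have to settle questions like this one.

```cpp
#include <cstdint>

// Counts the steps of the Collatz iteration starting from n (n >= 1).
// Whether this loop terminates for every n is an open question.
// For very large n the 3n+1 step can overflow 64 bits; ignored in this sketch.
std::uint64_t collatz_steps(std::uint64_t n) {
    std::uint64_t steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Hypothetical function whose liveness we would like to know.
int cleanup() {
    return 0;
}

int process(std::uint64_t n) {
    collatz_steps(n);  // may, in principle, never return
    return cleanup();  // reached for this n only if the loop above ends
}
```

A static path from process() to cleanup() clearly exists, so a call-graph tool will report cleanup() as used; whether it actually runs for every input is a different and much harder question.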

#12


2  

One way is to use a debugger together with the compiler feature of eliminating unused machine code during compilation.

Once some machine code is eliminated, the debugger won't let you put a breakpoint on the corresponding line of source code. So you put breakpoints everywhere, start the program, and inspect the breakpoints: those in the "no code loaded for this source" state correspond to eliminated code. Either that code is never called or it has been inlined, and you have to perform some minimal analysis to find out which of the two happened.

At least that's how it works in Visual Studio, and I guess other toolsets can do that too.

That's a lot of work, but I guess it is faster than manually analyzing all the code.

#13


1  

It depends on the platform you use to create your application.

For example, if you use Visual Studio, you could use a tool like .NET ANTS Profiler, which is able to parse and profile your code. This way, you should quickly know which parts of your code are actually used. Eclipse also has equivalent plugins.

Otherwise, if you need to know which functions of your application are actually used by your end users, and if you can release your application easily, you can use a log file for an audit.

For each main function, you can trace its usage, and after a few days or weeks just get that log file and have a look at it.
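A minimal sketch of such an audit log, assuming a single-threaded program. The names UsageLog, TRACE_USAGE, export_report and legacy_import are invented for this example; a real deployment would write to a file and handle concurrency.

```cpp
#include <map>
#include <ostream>
#include <string>

// Counts how often each instrumented function was entered.
class UsageLog {
public:
    static UsageLog& instance() {
        static UsageLog log;
        return log;
    }

    void record(const std::string& function_name) {
        ++counts_[function_name];
    }

    unsigned long count(const std::string& function_name) const {
        const auto it = counts_.find(function_name);
        return it == counts_.end() ? 0UL : it->second;
    }

    // Writes "name count" pairs, e.g. to a log file at program exit.
    void dump(std::ostream& out) const {
        for (const auto& entry : counts_) {
            out << entry.first << ' ' << entry.second << '\n';
        }
    }

private:
    std::map<std::string, unsigned long> counts_;
};

// Place at the top of each function whose usage should be audited.
#define TRACE_USAGE() UsageLog::instance().record(__func__)

int export_report() {
    TRACE_USAGE();
    return 1;
}

int legacy_import() {
    TRACE_USAGE();
    return 2;
}
```

After the program has run in production for a while, functions that never appear in the dump are candidates for removal, with the usual caveat that rarely used features may simply not have been exercised yet.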

#14


1  

CppDepend is a commercial tool which can detect unused types, methods and fields, and do much more. It is available for Windows and Linux (but currently has no 64-bit support), and comes with a 2-week trial.

Disclaimer: I don't work there, but I own a license for this tool (as well as NDepend, which is a more powerful alternative for .NET code).

For those who are curious, here is an example built-in (customizable) rule for detecting dead methods, written in CQLinq:

// <Name>Potentially dead Methods</Name>
warnif count > 0
// Filter procedure for methods that shouldn't be considered as dead
let canMethodBeConsideredAsDeadProc = new Func<IMethod, bool>(
    m => !m.IsPublic &&       // Public methods might be used by client applications of your Projects.
         !m.IsEntryPoint &&            // Main() method is not used by-design.
         !m.IsClassConstructor &&      
         !m.IsVirtual &&               // Only check for non virtual method that are not seen as used in IL.
         !(m.IsConstructor &&          // Don't take account of protected ctor that might be call by a derived ctors.
           m.IsProtected) &&
         !m.IsGeneratedByCompiler
)

// Get methods unused
let methodsUnused = 
   from m in JustMyCode.Methods where 
   m.NbMethodsCallingMe == 0 && 
   canMethodBeConsideredAsDeadProc(m)
   select m

// Dead methods = methods used only by unused methods (recursive)
let deadMethodsMetric = methodsUnused.FillIterative(
   methods => // Unique loop, just to let a chance to build the hashset.
              from o in new[] { new object() }
              // Use a hashset to make Intersect calls much faster!
              let hashset = methods.ToHashSet()
              from m in codeBase.Application.Methods.UsedByAny(methods).Except(methods)
              where canMethodBeConsideredAsDeadProc(m) &&
                    // Select methods called only by methods already considered as dead
                    hashset.Intersect(m.MethodsCallingMe).Count() == m.NbMethodsCallingMe
              select m)

from m in JustMyCode.Methods.Intersect(deadMethodsMetric.DefinitionDomain)
select new { m, m.MethodsCallingMe, depth = deadMethodsMetric[m] }

#15


0  

I don't think it can be done automatically.

Even with code coverage tools, you need to provide sufficient input data to run.

Maybe a very complex and high-priced static analysis tool, such as Coverity's or the LLVM compiler's, could help.

But I'm not sure, and I would prefer manual code review.

UPDATED

Well... removing only unused variables and unused functions is not hard, though.

UPDATED

After reading the other answers and comments, I'm more strongly convinced that it can't be done.

You have to know the code to get a meaningful code coverage measure, and if you know it that well, manual editing will be faster than preparing, running, and reviewing coverage results.

#16


0  

I had a friend ask me this very question today, and I looked around at some promising Clang developments, e.g. ASTMatchers and the Static Analyzer, that might have sufficient visibility into the goings-on during compilation to determine the dead code sections, but then I found this:

https://blog.flameeyes.eu/2008/01/today-how-to-identify-unused-exported-functions-and-variables

It's pretty much a complete description of how to use a few GCC flags that are seemingly designed for the purpose of identifying unreferenced symbols!

#17


0  

The general problem of whether some function will be called is undecidable. You cannot know in advance, in a general way, whether some function will be called, just as you cannot know whether a Turing machine will ever stop. You can find out whether there is some path (statically) from main() to the function you have written, but that doesn't guarantee it will ever be called.

#18


-3  

Well, if you are using g++, you can use the flag -Wunused

According to the documentation:

Warn whenever a variable is unused aside from its declaration, whenever a function is declared static but never defined, whenever a label is declared but not used, and whenever a statement computes a result that is explicitly not used.

http://docs.freebsd.org/info/gcc/gcc.info.Warning_Options.html

Edit: Here is another useful flag, -Wunreachable-code. According to the documentation:

This option is intended to warn when the compiler detects that at least a whole line of source code will never be executed, because some condition is never satisfied or because it is after a procedure that never returns.

#1


190  

There are two varieties of unused code:

有两种类型的未使用代码:

  • the local one, that is, in some functions some paths or variables are unused (or used but in no meaningful way, like written but never read)
  • 局部函数,也就是说,在某些函数中,有些路径或变量是未使用的(或使用的,但没有任何意义,比如写的,但从不读的)
  • the global one: functions that are never called, global objects that are never accessed
  • 全局对象:从未调用的函数,从未访问过的全局对象

For the first kind, a good compiler can help:

对于第一种,一个好的编译器可以帮助:

  • -Wunused (GCC, Clang) should warn about unused variables, Clang unused analyzer has even been incremented to warn about variables that are never read (even though used).
  • - wused (GCC, Clang)应该警告未使用的变量,甚至增加了Clang未使用的分析器来警告从未读取(即使使用过)的变量。
  • -Wunreachable-code (older GCC, removed in 2010) should warn about local blocks that are never accessed (it happens with early returns or conditions that always evaluate to true)
  • - wunreach -code(较老的GCC在2010年被删除)应该对从未被访问的本地块发出警告(它发生在早期的返回或者总是评估为true的条件中)
  • there is no option I know of to warn about unused catch blocks, because the compiler generally cannot prove that no exception will be thrown.
  • 我知道没有选项可以警告未使用的catch块,因为编译器通常不能证明不会抛出异常。

For the second kind, it's much more difficult. Statically it requires whole program analysis, and even though link time optimization may actually remove dead code, in practice the program has been so much transformed at the time it is performed that it is near impossible to convey meaningful information to the user.

对于第二种,则要困难得多。静态地,它需要整个程序分析,即使链接时间优化实际上可以去除死代码,在实践中,程序在执行的时候已经发生了很大的变化,几乎不可能向用户传递有意义的信息。

There are therefore two approaches:

因此有两种方法:

  • The theoretic one is to use a static analyzer. A piece of software that will examine the whole code at once in great detail and find all the flow paths. In practice I don't know any that would work here.
  • 理论的一种是使用静态分析仪。一种软件,它将详细地检查整个代码并找到所有的流路径。在实践中,我不知道有什么能在这里发挥作用。
  • The pragmatic one is to use an heuristic: use a code coverage tool (in the GNU chain it's gcov. Note that specific flags should be passed during compilation for it to work properly). You run the code coverage tool with a good set of varied inputs (your unit-tests or non-regression tests), the dead code is necessarily within the unreached code... and so you can start from here.
  • 实用的方法是使用启发式:使用代码覆盖工具(在GNU链中是gcov)。请注意,在编译过程中应该传递特定的标志以使其正常工作)。您运行的代码覆盖工具具有一系列不同的输入(单元测试或非回归测试),死代码一定在未到达的代码中……你可以从这里开始。

If you are extremely interested in the subject, and have the time and inclination to actually work out a tool by yourself, I would suggest using the Clang libraries to build such a tool.

如果您对这个主题非常感兴趣,并且有时间和意愿亲自设计一个工具,我建议使用Clang库来构建这样的工具。

  1. Use the Clang library to get an AST (abstract syntax tree)
  2. 使用Clang库获取AST(抽象语法树)
  3. Perform a mark-and-sweep analysis from the entry points onward
  4. 从入口点开始执行标记-扫描分析

Because Clang will parse the code for you, and perform overload resolution, you won't have to deal with the C++ languages rules, and you'll be able to concentrate on the problem at hand.

因为Clang将为您解析代码,并执行重载解析,所以您不必处理c++语言规则,您将能够集中精力处理手边的问题。

However this kind of technique cannot identify the virtual overrides that are unused, since they could be called by third-party code you cannot reason about.

然而,这种技术不能识别未使用的虚拟覆盖,因为它们可以被第三方代码调用,而您不能对此进行推理。

#2


31  

For the case of unused whole functions (and unused global variables), GCC can actually do most of the work for you provided that you're using GCC and GNU ld.

对于未使用的完整函数(以及未使用的全局变量),如果您使用GCC和GNU ld, GCC实际上可以为您完成大部分工作。

When compiling the source, use -ffunction-sections and -fdata-sections, then when linking use -Wl,--gc-sections,--print-gc-sections. The linker will now list all the functions that could be removed because they were never called and all the globals that were never referenced.

在编译源代码时,使用- ffunc- sections和-fdata sections,然后在链接时使用-Wl,- gc-sections, print-gc-sections。链接器现在将列出所有可能被删除的函数,因为它们从未被调用,以及所有未被引用的全局变量。

(Of course, you can also skip the --print-gc-sections part and let the linker remove the functions silently, but keep them in the source.)

(当然,您也可以跳过- printc -section部分,并让链接器静默地删除函数,但保留它们在源代码中。)

Note: this will only find unused complete functions, it won't do anything about dead code within functions. Functions called from dead code in live functions will also be kept around.

注意:这只会发现未使用的完整函数,它不会对函数中的死代码做任何处理。在活动函数中从死代码中调用的函数也将被保留。

Some C++-specific features will also cause problems, in particular:

一些特定于c++的特性也会导致问题,特别是:

  • Virtual functions. Without knowing which subclasses exist and which are actually instantiated at run time, you can't know which virtual functions you need to exist in the final program. The linker doesn't have enough information about that so it will have to keep all of them around.
  • 虚函数。如果不知道存在哪些子类,以及在运行时实例化了哪些子类,就无法知道在最终的程序中需要存在哪些虚拟函数。链接器没有足够的信息,所以它必须保持所有的链接。
  • Globals with constructors, and their constructors. In general, the linker can't know that the constructor for a global doesn't have side effects, so it must run it. Obviously this means the global itself also needs to be kept.
  • 带有构造函数的全局变量,以及它们的构造函数。通常,链接器不知道全局变量的构造函数没有副作用,所以必须运行它。显然,这意味着全球本身也需要保持下去。

In both cases, anything used by a virtual function or a global-variable constructor also has to be kept around.

在这两种情况下,虚拟函数或全局变量构造函数使用的任何东西都必须保留。

An additional caveat is that if you're building a shared library, the default settings in GCC will export every function in the shared library, causing it to be "used" as far as the linker is concerned. To fix that you need to set the default to hiding symbols instead of exporting (using e.g. -fvisibility=hidden), and then explicitly select the exported functions that you need to export.

另外需要注意的是,如果您正在构建一个共享库,那么GCC中的默认设置将导出共享库中的每个函数,使其在链接器中“被使用”。为了解决这个问题,您需要将默认设置为隐藏符号,而不是导出(使用例如-fvisibility=hidden),然后显式地选择需要导出的函数。

#3


25  

Well if you using g++ you can use this flag -Wunused

如果你用g+你可以用这个标记- wused

According documentation:

根据文档:

Warn whenever a variable is unused aside from its declaration, whenever a function is declared static but never defined, whenever a label is declared but not used, and whenever a statement computes a result that is explicitly not used.

当一个变量未被使用时(除了它的声明之外)发出警告,当一个函数被声明为静态但从未被定义,当一个标签被声明但没有被使用,当一个语句计算一个显式未被使用的结果时发出警告。

http://docs.freebsd.org/info/gcc/gcc.info.Warning_Options.html

http://docs.freebsd.org/info/gcc/gcc.info.Warning_Options.html

Edit: Here is other useful flag -Wunreachable-code According documentation:

编辑:这里还有其他有用的标志- wunreach -code根据文档:

This option is intended to warn when the compiler detects that at least a whole line of source code will never be executed, because some condition is never satisfied or because it is after a procedure that never returns.

当编译器检测到至少一行源代码永远不会被执行时,这个选项是用来发出警告的,因为某些条件永远不会被满足,或者因为它在一个过程之后永远不会返回。

Update: I found similar topic Dead code detection in legacy C/C++ project

更新:我在遗留的C/ c++项目中发现了类似的主题死代码检测

#4


18  

I think you are looking for a code coverage tool. A code coverage tool will analyze your code as it is running, and it will let you know which lines of code were executed and how many times, as well as which ones were not.

我认为您正在寻找一个代码覆盖工具。代码覆盖工具将分析正在运行的代码,并让您知道执行了哪些代码行,执行了多少次,以及没有执行的代码行。

You could try giving this open source code coverage tool a chance: TestCocoon - code coverage tool for C/C++ and C#.

您可以尝试给这个开放源代码覆盖工具一个机会:TestCocoon -用于C/ c++和c#的代码覆盖工具。

#5


15  

The real answer here is: You can never really know for sure.

真正的答案是:你永远不可能确切知道。

At least, for nontrivial cases, you can't be sure you've gotten all of it. Consider the following from Wikipedia's article on unreachable code:

至少,对于非平凡的情况,你不能确定你已经得到了所有。请参考*关于不可访问代码的文章:

double x = sqrt(2);
if (x > 5)
{
  doStuff();
}

As Wikipedia correctly notes, a clever compiler may be able to catch something like this. But consider a modification:

正如*正确指出的,一个聪明的编译器可以捕捉到这样的东西。但考虑修改:

int y;
cin >> y;
double x = sqrt((double)y);

if (x != 0 && x < 1)
{
  doStuff();
}

Will the compiler catch this? Maybe. But to do that, it will need to do more than run sqrt against a constant scalar value. It will have to figure out that (double)y will always be an integer (easy), and then understand the mathematical range of sqrt for the set of integers (hard). A very sophisticated compiler might be able to do this for the sqrt function, or for every function in math.h, or for any fixed-input function whose domain it can figure out. This gets very, very complex, and the complexity is basically limitless. You can keep adding layers of sophistication to your compiler, but there will always be a way to sneak in some code that will be unreachable for any given set of inputs.

编译器会捕捉到这个吗?也许吧。但是要做到这一点,它需要做的不仅仅是对常数标量值运行sqrt。它必须计算出(双)y总是一个整数(很简单),然后理解整数集(硬)的平方根的数学范围。一个非常复杂的编译器可能可以对sqrt函数或数学中的每个函数都这样做。h,或者任何固定输入函数的定义域。这变得非常非常复杂,而且复杂性基本上是无限的。您可以继续向您的编译器添加复杂的层,但是总有一种方法可以插入一些对于任何给定输入集都不可访问的代码。

And then there are the input sets that simply never get entered. Input that would make no sense in real life, or get blocked by validation logic elsewhere. There's no way for the compiler to know about those.

然后还有一些输入集,它们根本不可能被输入。输入在现实生活中是没有意义的,或者被其他地方的验证逻辑阻塞。编译器没有办法知道这些。

The end result of this is that while the software tools others have mentioned are extremely useful, you're never going to know for sure that you caught everything unless you go through the code manually afterward. Even then, you'll never be certain that you didn't miss anything.

这样做的最终结果是,尽管其他人提到的软件工具非常有用,但您永远无法确定是否捕获了所有信息,除非您之后手动检查代码。即便如此,你也永远不会确定自己没有漏掉什么。

The only real solution, IMHO, is to be as vigilant as possible, use the automation at your disposal, refactor where you can, and constantly look for ways to improve your code. Of course, it's a good idea to do that anyway.

唯一真正的解决方案,IMHO,是尽可能地保持警惕,使用自动化,尽可能地重构,并不断寻找改进代码的方法。当然,无论如何这样做是个好主意。

#6


11  

I haven't used it myself, but cppcheck, claims to find unused functions. It probably won't solve the complete problem but it might be a start.

我自己没有使用它,但cppcheck声称找到了未使用的函数。它可能不能解决全部问题,但它可能是一个开始。

#7


9  

You could try using PC-lint/FlexeLint from Gimpel Software. It claims to

find unused macros, typedef's, classes, members, declarations, etc. across the entire project

I've used it for static analysis and found it very good but I have to admit that I have not used it to specifically find dead code.

#8


4  

My normal approach to finding unused stuff is:

  1. Make sure the build system handles dependency tracking correctly.
  2. Set up a second monitor with a full-screen terminal window, running repeated builds and showing the first screenful of output. watch "make 2>&1" tends to do the trick on Unix.
  3. Run a find-and-replace operation on the entire source tree, adding "//? " at the beginning of every line.
  4. Fix the first error flagged by the compiler by removing the "//? " in the corresponding lines.
  5. Repeat until there are no errors left.

This is a somewhat lengthy process, but it does give good results.

#9


4  

Mark as many public functions and variables as you can as private or protected without causing compilation errors; while doing this, try to refactor the code as well. By making functions private (and, to some extent, protected), you reduce your search area, since private functions can only be called from the same class (unless there are stupid macros or other tricks to circumvent access restrictions, and if that's the case I'd recommend you find a new job). It is much easier to determine that you don't need a private function, since only the class you're currently working on can call it. This method is easier if your code base has small classes and is loosely coupled. If your code base does not have small classes or has very tight coupling, I suggest cleaning those up first.

The next step is to mark all the remaining public functions and make a call graph to figure out the relationships between the classes. From this tree, try to figure out which branches look like they can be trimmed.

The advantage of this method is that you can do it on a per-module basis, so it is easy to keep your unit tests passing without long periods during which the code base is broken.
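
As an illustration of the first step, consider a hypothetical ReportBuilder class (every name here is invented for this sketch) after its helpers have been moved from public to private and the project still builds:

```cpp
#include <cassert>
#include <string>

class ReportBuilder {
public:
    // The only member that code outside the class still needs.
    std::string build(const std::string& title) const {
        assert(!title.empty());
        return header(title) + "\n" + body();
    }

private:
    // Formerly public. The build still succeeded after moving them here,
    // so no code outside ReportBuilder calls them.
    std::string header(const std::string& title) const { return "== " + title + " =="; }
    std::string body() const { return "(empty)"; }

    // Also moved without breaking the build, and nothing inside the class
    // calls it either: reading this one class is enough to see it is dead.
    std::string legacyFooter() const { return "-- end --"; }
};
```

The design point is that the search for callers of legacyFooter() shrinks from the whole code base to a single class definition.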

#10


3  

If you are on Linux, you may want to look into callgrind, a C/C++ program analysis tool that is part of the valgrind suite, which also contains tools that check for memory leaks and other memory errors (which you should be using as well). It analyzes a running instance of your program, and produces data about its call graph, and about the performance costs of nodes on the call graph. It is usually used for performance analysis, but it also produces a call graph for your applications, so you can see what functions are called, as well as their callers.

This is obviously complementary to the static methods mentioned elsewhere on the page, and it will only be helpful for eliminating wholly unused classes, methods, and functions. It will not help find dead code inside methods that are actually called.

#11


3  

I really haven't used any tool that does such a thing... But, as far as I've seen in all the answers, no one has ever said that this problem is uncomputable.

What do I mean by this? I mean that this problem cannot be solved by any algorithm on any computer. This theorem (that such an algorithm doesn't exist) is a corollary of the undecidability of Turing's halting problem.

All the tools you will use are not algorithms but heuristics (i.e., not exact algorithms). They will not give you exactly the set of code that is unused.
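
A small sketch may help show why no tool can be exact. The functions below are hypothetical and exist only for illustration. With the limit removed from searchOddPerfect, reportOddPerfect() is dead code exactly if no odd perfect number exists, which is an open question in number theory. Replace the search with an arbitrary computation and the question becomes the halting problem.

```cpp
#include <cassert>
#include <cstdint>

int oddPerfectReports = 0;

// Is this function dead code? Only if no odd perfect number exists.
void reportOddPerfect(std::uint64_t) { ++oddPerfectReports; }

// True if n equals the sum of its proper divisors, e.g. 6 = 1 + 2 + 3.
bool isPerfect(std::uint64_t n) {
    assert(n >= 1);
    std::uint64_t sum = 0;
    for (std::uint64_t d = 1; d < n; ++d) {
        if (n % d == 0) {
            sum += d;
        }
    }
    return sum == n;
}

// Checks the odd numbers below `limit`. The limit is here only so that the
// sketch terminates; the argument above concerns the unbounded search.
void searchOddPerfect(std::uint64_t limit) {
    for (std::uint64_t n = 1; n < limit; n += 2) {
        if (isPerfect(n)) {
            reportOddPerfect(n);
        }
    }
}
```

A static analyzer will see a call to reportOddPerfect() and keep it; a coverage run will never see it execute. Neither observation settles whether it can be deleted.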

#12


2  

One way is to use a debugger together with the compiler feature that eliminates unused machine code during compilation.

Once some machine code is eliminated, the debugger won't let you put a breakpoint on the corresponding line of source code. So you put breakpoints everywhere, start the program, and inspect the breakpoints: those in the "no code loaded for this source" state correspond to eliminated code. Either that code is never called or it has been inlined, and you have to perform some minimal analysis to find out which of the two happened.

At least that's how it works in Visual Studio and I guess other toolsets also can do that.

That's lots of work, but I guess faster than manually analyzing all code.

#13


1  

It depends on the platform you use to create your application.

For example, if you use Visual Studio, you could use a tool like .NET ANTS Profiler, which is able to parse and profile your code. This way, you should quickly know which parts of your code are actually used. Eclipse also has equivalent plugins.

Otherwise, if you need to know what function of your application is actually used by your end user, and if you can release your application easily, you can use a log file for an audit.

For each main function, you can trace its usage, and after a few days or a week just get that log file and have a look at it.
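
A minimal sketch of such an audit follows. UsageLog, usageLog(), TRACE_USAGE, exportPdf, and exportCsv are all hypothetical names chosen for the example; a real deployment would call dump() on a file stream at shutdown or at intervals.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <string>

// Hypothetical in-process usage counter, keyed by feature name.
class UsageLog {
public:
    void hit(const std::string& feature) {
        assert(!feature.empty());
        ++counts_[feature];
    }

    int count(const std::string& feature) const {
        auto it = counts_.find(feature);
        return it == counts_.end() ? 0 : it->second;
    }

    // Writes one "name count" line per feature that was hit at least once.
    void dump(std::ostream& out) const {
        for (const auto& entry : counts_) {
            out << entry.first << ' ' << entry.second << '\n';
        }
    }

private:
    std::map<std::string, int> counts_;
};

// Single shared log for the whole program.
UsageLog& usageLog() {
    static UsageLog log;
    return log;
}

// Place at the top of each entry point to audit; __func__ is standard C++11.
#define TRACE_USAGE() usageLog().hit(__func__)

void exportPdf() { TRACE_USAGE(); /* real work would go here */ }
void exportCsv() { TRACE_USAGE(); /* real work would go here */ }
```

Features that never show up in the collected logs are candidates for removal, not proof of dead code: a feature used once a year will look unused after a week.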

#14


1  

CppDepend is a commercial tool which can detect unused types, methods and fields, and do much more. It is available for Windows and Linux (but currently has no 64-bit support), and comes with a 2-week trial.

Disclaimer: I don't work there, but I own a license for this tool (as well as NDepend, which is a more powerful alternative for .NET code).

For those who are curious, here is an example built-in (customizable) rule for detecting dead methods, written in CQLinq:

// <Name>Potentially dead Methods</Name>
warnif count > 0
// Filter procedure for methods that shouldn't be considered as dead
let canMethodBeConsideredAsDeadProc = new Func<IMethod, bool>(
    m => !m.IsPublic &&       // Public methods might be used by client applications of your Projects.
         !m.IsEntryPoint &&            // Main() method is not used by-design.
         !m.IsClassConstructor &&
         !m.IsVirtual &&               // Only check for non-virtual methods that are not seen as used in IL.
         !(m.IsConstructor &&          // Don't take into account protected ctors that might be called by derived ctors.
           m.IsProtected) &&
         !m.IsGeneratedByCompiler
)

// Get methods unused
let methodsUnused = 
   from m in JustMyCode.Methods where 
   m.NbMethodsCallingMe == 0 && 
   canMethodBeConsideredAsDeadProc(m)
   select m

// Dead methods = methods used only by unused methods (recursive)
let deadMethodsMetric = methodsUnused.FillIterative(
   methods => // Unique loop, just to give a chance to build the hashset.
              from o in new[] { new object() }
              // Use a hashset to make Intersect calls much faster!
              let hashset = methods.ToHashSet()
              from m in codeBase.Application.Methods.UsedByAny(methods).Except(methods)
              where canMethodBeConsideredAsDeadProc(m) &&
                    // Select methods called only by methods already considered as dead
                    hashset.Intersect(m.MethodsCallingMe).Count() == m.NbMethodsCallingMe
              select m)

from m in JustMyCode.Methods.Intersect(deadMethodsMetric.DefinitionDomain)
select new { m, m.MethodsCallingMe, depth = deadMethodsMetric[m] }
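
For readers who don't use CQLinq, the iterative idea behind the rule (a method is dead if it has no callers, or if all of its callers are already dead) can be sketched in C++. This is an approximation of the rule's logic, not a port of it: CallerMap and findDead are names made up for the sketch, and the `roots` set stands in for the filter on public, virtual, and entry-point methods.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// callers[f] is the set of functions that call f.
// Assumption of this sketch: every function appears as a key.
using CallerMap = std::map<std::string, std::set<std::string>>;

// Returns the functions that are only ever called by dead functions.
// `roots` are functions that must be kept regardless (entry points, public API).
std::set<std::string> findDead(const CallerMap& callers,
                               const std::set<std::string>& roots) {
    std::set<std::string> dead;
    bool changed = true;
    while (changed) {  // iterate until a pass adds nothing new
        changed = false;
        for (const auto& entry : callers) {
            const std::string& fn = entry.first;
            if (roots.count(fn) || dead.count(fn)) {
                continue;
            }
            bool allCallersDead = true;
            for (const std::string& caller : entry.second) {
                assert(callers.count(caller) == 1);
                // A function calling itself does not keep itself alive.
                if (caller != fn && !dead.count(caller)) {
                    allCallersDead = false;
                    break;
                }
            }
            if (allCallersDead) {
                dead.insert(fn);
                changed = true;
            }
        }
    }
    return dead;
}
```

Note one limitation of this sketch: two functions that call only each other are never reported, because neither ever has all of its callers marked dead.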

#15


0  

I don't think it can be done automatically.

Even with code coverage tools, you need to provide sufficient input data to run.

Maybe a very sophisticated and high-priced static analysis tool, such as Coverity's or the one in the LLVM compiler, could help.

But I'm not sure and I would prefer manual code review.

UPDATED

Well, removing only unused variables and unused functions is not hard, though.

UPDATED

After reading the other answers and comments, I'm more strongly convinced that it can't be done.

You have to know the code to get a meaningful code coverage measure, and if you know it that well, manual editing will be faster than preparing, running, and reviewing coverage results.

#16


0  

I had a friend ask me this very question today, and I looked around at some promising Clang developments, e.g. ASTMatchers and the Static Analyzer that might have sufficient visibility in the goings-on during compiling to determine the dead code sections, but then I found this:

https://blog.flameeyes.eu/2008/01/today-how-to-identify-unused-exported-functions-and-variables

It's pretty much a complete description of how to use a few GCC flags that are seemingly designed for the purpose of identifying unreferenced symbols!

#17


0  

The general problem of whether some function will ever be called is undecidable, not merely NP-complete. You cannot know in advance, in a general way, whether some function will be called, just as you cannot know whether a Turing machine will ever halt. You can find out whether there is some path (statically) that goes from main() to the function you have written, but that doesn't guarantee it will ever be called.
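
The static half of that statement is a plain graph search. A minimal sketch, assuming the call graph has already been extracted by some other tool (CallGraph and reachableFrom are names invented here):

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// caller -> list of functions it may call, as found in the source text.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Breadth-first search: every function with a static call path from `entry`.
std::set<std::string> reachableFrom(const CallGraph& graph, const std::string& entry) {
    assert(graph.count(entry) == 1);  // the entry point must be a known function
    std::set<std::string> seen{entry};
    std::queue<std::string> pending;
    pending.push(entry);
    while (!pending.empty()) {
        std::string fn = pending.front();
        pending.pop();
        auto it = graph.find(fn);
        if (it == graph.end()) {
            continue;  // leaf function, or one defined outside the analyzed code
        }
        for (const std::string& callee : it->second) {
            if (seen.insert(callee).second) {  // true only the first time
                pending.push(callee);
            }
        }
    }
    return seen;
}
```

Functions missing from the result have no static path from main() and are safe candidates, apart from calls through function pointers or dynamic loading that the graph does not capture. Functions in the result are only possibly called, which is the undecidable part.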

#18


-3  

Well, if you are using g++, you can use the flag -Wunused.

According to the documentation:

Warn whenever a variable is unused aside from its declaration, whenever a function is declared static but never defined, whenever a label is declared but not used, and whenever a statement computes a result that is explicitly not used.

http://docs.freebsd.org/info/gcc/gcc.info.Warning_Options.html

Edit: Here is another useful flag, -Wunreachable-code. According to the documentation:

This option is intended to warn when the compiler detects that at least a whole line of source code will never be executed, because some condition is never satisfied or because it is after a procedure that never returns.
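
A short sample of the kind of code these flags are aimed at. The functions are invented for illustration, and the comments name the sub-options of -Wunused that I would expect to fire. As the first answer notes, newer GCC versions no longer act on -Wunreachable-code, so the last diagnostic is more likely to come from Clang.

```cpp
#include <cassert>

// Static and never called: -Wunused-function.
static int legacyScale(int v) { return v * 3; }

int average(const int* values, int count) {
    assert(count > 0);
    int scratch;             // declared but never used: -Wunused-variable
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    total / count;           // result computed and discarded: -Wunused-value
    return total / count;
    total = 0;               // placed after the return, so never executed
}
```

All of these are warnings rather than errors, so the code still compiles and runs; the value of the flags is in the list of locations they produce.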