Using C++ strings may cause a memory leak

Time: 2022-06-06 09:00:57

Consider the following C++ program:

#include <cstdlib> // for exit(3)
#include <string>
#include <iostream>
using namespace std;

void die()
{
    exit(0);
}

int main()
{
    string s("Hello, World!");
    cout << s << endl;
    die();
}

Running this through valgrind shows this (some output trimmed for brevity):

==1643== HEAP SUMMARY:
==1643==     in use at exit: 26 bytes in 1 blocks
==1643==   total heap usage: 1 allocs, 0 frees, 26 bytes allocated
==1643==
==1643== LEAK SUMMARY:
==1643==    definitely lost: 0 bytes in 0 blocks
==1643==    indirectly lost: 0 bytes in 0 blocks
==1643==      possibly lost: 26 bytes in 1 blocks
==1643==    still reachable: 0 bytes in 0 blocks
==1643==         suppressed: 0 bytes in 0 blocks

As you can see, there's a possibility that 26 bytes allocated on the heap were lost. I know that the std::string class has a 12-byte struct (at least on my 32-bit x86 arch and GNU compiler 4.2.4), and "Hello, World!" with a null terminator has 14 bytes. If I understand it correctly, the 12-byte structure contains a pointer to the character string, the allocated size, and the reference count (someone correct me if I'm wrong here).

Now my questions: How are C++ strings stored with regard to the stack/heap? Does a stack object exist for a std::string (or other STL containers) when declared?

P.S. I've read somewhere that valgrind may report a false positive of a memory leak in some C++ programs that use STL containers (and "almost-containers" such as std::string). I'm not too worried about this leak, but it does pique my curiosity regarding STL containers and memory management.

5 Answers

#1


7  

The others are correct: you are leaking because you are calling exit. To be clear, the leak isn't the string object allocated on the stack; it is the memory the string allocates on the heap. For example:

struct Foo { };

int main()
{
    Foo f;
    die();
}

will not cause valgrind to report a leak.

The leak is probable (valgrind's "possibly lost") instead of definite because the only pointer left to the heap block is an interior pointer, not a pointer to its start. basic_string is responsible for this. From the header on my machine:

   *  A string looks like this:
   *
   *  @code
   *                                        [_Rep]
   *                                        _M_length
   *   [basic_string<char_type>]            _M_capacity
   *   _M_dataplus                          _M_refcount
   *   _M_p ---------------->               unnamed array of char_type
   *  @endcode
   *
   *  Where the _M_p points to the first character in the string, and
   *  you cast it to a pointer-to-_Rep and subtract 1 to get a
   *  pointer to the header.

The key is that _M_p doesn't point to the start of the memory allocated on the heap; it points to the first character in the string. Here is a simple example:

struct Foo
{
    Foo()
    {
        // Allocate 4 ints.
        m_data = new int[4];
        // Move the pointer.
        ++m_data;
        // Null the pointer
        //m_data = 0;
    }
    ~Foo()
    {
        // Put the pointer back, then delete it.
        --m_data;
        delete [] m_data;
    }
    int* m_data;
};

int main()
{
    Foo f;
    die();
}

This will report a probable leak in valgrind. If you comment out the lines where I move m_data, valgrind will report 'still reachable'. If you uncomment the line where I set m_data to 0, you'll get a definite leak.

The valgrind documentation has more information on probable leaks and interior pointers.

#2


11  

Calling exit "terminates the program without leaving the current block and hence without destroying any objects with automatic storage duration".

In other words, leak or not, you shouldn't really care. When you call exit, you're saying "close this program, I no longer care about anything in it." So stop caring. :)

Obviously it's going to leak resources because you never let the destructor of the string run, absolutely regardless of how it manages those resources.

#3


4  

Of course this "leaks": by exiting before s's stack frame is left, you don't give s's destructor a chance to execute.

As for your question regarding std::string storage: different implementations do different things. Some reserve about 12 bytes inside the string object itself (on the stack, for a local string), which are used if the string is 12 bytes or shorter; longer strings go to the heap. Other implementations always go to the heap. Some are reference counted with copy-on-write semantics, some are not. See Scott Meyers' Effective STL, Item 15.

#4


1  

The GCC STL has a private memory pool for containers and strings. You can turn this off; see the valgrind FAQ:

http://valgrind.org/docs/manual/faq.html#faq.reports

#5


1  

I would avoid using exit(); I see no real reason to use that call. I'm not sure whether it causes the process to stop instantly without cleaning up memory first, although valgrind does still appear to run.
