Get the size of a std::string's string in bytes

Time: 2020-12-30 21:43:26

I would like to get the bytes a std::string's string occupies in memory, not the number of characters. The string contains a multibyte string. Would std::string::size() do this for me?

EDIT: Also, does size() also include the terminating NULL?

6 answers

#1


21  

std::string operates on bytes, not on Unicode characters, so std::string::size() will indeed return the size of the data in bytes (without the overhead that std::string needs to store the data, of course).

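No, size() does not include a terminating null. std::string tracks its length separately and stores only the data you tell it to store, so a null character is counted only if you explicitly put one into the string. (Since C++11 the buffer returned by c_str() and data() is null-terminated, but that terminator is not part of size().)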
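A minimal sketch of both points, assuming the text is UTF-8 (the bytes are written as escapes so the result does not depend on the source file's encoding). The helper names `payload_bytes`, `utf8_e_acute` and `with_explicit_null` are made up for illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes of character data held by the string,
// not counting the terminator or std::string's own bookkeeping.
std::size_t payload_bytes(const std::string& s) {
    return s.size() * sizeof(std::string::value_type);
}

// One user-visible character ("e" with an acute accent) that is
// two bytes in UTF-8: 0xC3 0xA9.
std::string utf8_e_acute() {
    return std::string("\xC3\xA9");
}

// A string that explicitly stores a trailing null character as data.
// The length argument is what makes the '\0' part of the contents.
std::string with_explicit_null() {
    return std::string("abc\0", 4);
}
```

With these, `payload_bytes(utf8_e_acute())` reports the two encoded bytes, and `payload_bytes(with_explicit_null())` reports four because the null was stored on purpose.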

#2


6  

You could be pedantic about it:

std::string x("X");

std::cout << x.size() * sizeof(std::string::value_type);

But std::string::value_type is char and sizeof(char) is defined as 1.

This only becomes important if you typedef the string type (because it may change in the future or because of compiler options).

// Some header file:
typedef   std::basic_string<T_CHAR>  T_string;

// Source a million miles away
T_string   x("X");

std::cout << x.size() * sizeof(T_string::value_type);
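For a version that compiles on its own, here is a sketch that assumes the header picked `wchar_t` for `T_CHAR`; that choice, and the helper name `data_bytes`, are assumptions for illustration only:

```cpp
#include <cstddef>
#include <string>

// Assumption: the project selects its character type in one place.
typedef wchar_t T_CHAR;
typedef std::basic_string<T_CHAR> T_string;

// Bytes of character data, whatever T_CHAR turns out to be.
std::size_t data_bytes(const T_string& s) {
    return s.size() * sizeof(T_string::value_type);
}
```

Because the multiplication goes through `T_string::value_type`, the result stays right if `T_CHAR` is later switched to `char` or another character type.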

#3


5  

std::string::size() is indeed the size in bytes.

#4


4  

To get the amount of memory in use by the string you would have to sum the capacity() with the overhead used for management. Note that it is capacity() and not size(). The capacity determines the number of characters (charT) allocated, while size() tells you how many of them are actually in use.

In particular, std::string implementations don't usually *shrink_to_fit* the contents, so if you create a string and then remove elements from the end, the size() will be decremented, but in most cases (this is implementation defined) capacity() will not.

Some implementations might not allocate the exact amount of memory required, but rather obtain blocks of given sizes to reduce memory fragmentation. In an implementation that used power-of-two-sized blocks for strings, a string of size 17 could allocate as many as 32 characters.
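The difference between size() and capacity() can be observed with a small sketch like the one below. The `Usage` struct and the function names are hypothetical, and the actual capacity values depend on the standard library in use:

```cpp
#include <cstddef>
#include <string>

// Hypothetical record of what the string reports about itself.
struct Usage {
    std::size_t size;      // elements actually in use
    std::size_t capacity;  // elements the current buffer can hold
};

// Build a 100-character string, then drop all but the first 10.
// size() follows the contents; capacity() usually stays where it was.
Usage grow_then_trim() {
    std::string s(100, 'x');
    s.resize(10);
    Usage u = { s.size(), s.capacity() };
    return u;
}

// Bytes reserved for characters. This ignores the terminator and the
// std::string object itself, so it is only a lower bound on the footprint.
std::size_t reserved_char_bytes(const std::string& s) {
    return s.capacity() * sizeof(std::string::value_type);
}
```

On common implementations `grow_then_trim()` reports a size of 10 with a capacity still around 100, but only `capacity >= size` is guaranteed.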

#5


2  

Yes, size() will give you the number of chars in the string. One character in a multibyte encoding can take up multiple chars.
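If the character count is what you need instead, it has to be computed from the encoding. A rough sketch for UTF-8, assuming the input is already valid UTF-8 (nothing is validated here, and `utf8_code_points` is a made-up name):

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes, which
// always have the bit pattern 10xxxxxx.
// Assumption: the input is valid UTF-8; malformed input is not detected.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the euro sign, stored as the three bytes `"\xE2\x82\xAC"`, size() is 3 while this function counts one code point.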

#6


0  

There is inherent conflict in the question as written: std::string is defined as std::basic_string<char,...> -- that is, its element type is char (1-byte), but later you stated "the string contains a multibyte string" ("multibyte" == wchar_t?).

The size() member function does not count a trailing null. Its value represents the number of characters of the string's element type, not bytes; for std::string the element is a 1-byte char, so the two coincide.

Assuming you intended to say your multibyte string is std::wstring (alias for std::basic_string<wchar_t,...>), the memory footprint for the std::wstring's characters, including the null terminator, is:

std::wstring myString;
 ...
size_t bytesCount = (myString.size() + 1) * sizeof(wchar_t);

It's instructive to consider how one would write a reusable template function that would work for ANY potential instantiation of std::basic_string<> like this**:

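// Return number of bytes occupied by inString's characters,
// optionally counting the null terminator that c_str() appends.
// (Elem rather than _Elem: names starting with an underscore and a
// capital letter are reserved for the implementation.)
template <typename Elem>
inline size_t stringBytes(const std::basic_string<Elem>& inString, bool bCountNull)
{
   return (inString.size() + (bCountNull ? 1 : 0)) * sizeof(Elem);
}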

** For simplicity, ignores the traits and allocator types rarely specified explicitly for std::basic_string<> (they have defaults).
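If the traits and allocator should not be ignored, one possible variant deduces all three template parameters. This is only a sketch, and the name `string_bytes` is invented here to keep it apart from the function above:

```cpp
#include <cstddef>
#include <string>

// Bytes occupied by the characters of any std::basic_string
// instantiation, optionally counting one null terminator.
// Traits and Alloc are deduced, so custom ones are accepted too.
template <typename CharT, typename Traits, typename Alloc>
std::size_t string_bytes(const std::basic_string<CharT, Traits, Alloc>& s,
                         bool count_null) {
    return (s.size() + (count_null ? 1 : 0)) * sizeof(CharT);
}
```

Called as `string_bytes(std::wstring(L"abc"), true)`, it gives four times `sizeof(wchar_t)`, a value that differs between platforms because `wchar_t` does.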
