I would like to get the number of bytes that a std::string's contents occupy in memory, not the number of characters. The string contains a multibyte string. Would std::string::size() do this for me?
EDIT: Also, does size() include the terminating null character?
6 Answers
#1
21
std::string operates on bytes, not on Unicode characters, so std::string::size() will indeed return the size of the data in bytes (without the overhead that std::string needs to store the data, of course).
No. std::string stores only the data you tell it to store; it tracks its length separately, so it does not rely on a trailing null character to find the end. The terminator is therefore not included in the size, unless you explicitly create a string whose contents end with a null character.
#2
6
You could be pedantic about it:
std::string x("X");
std::cout << x.size() * sizeof(std::string::value_type);
But std::string::value_type is char and sizeof(char) is defined as 1.
This only becomes important if you typedef the string type (because it may change in the future or because of compiler options).
// Some header file:
typedef std::basic_string<T_CHAR> T_string;
// Source a million miles away
T_string x("X");
std::cout << x.size() * sizeof(T_string::value_type);
#3
5
std::string::size() is indeed the size in bytes.
#4
4
To get the amount of memory in use by the string, you would have to add the capacity() to the overhead used for management. Note that it is capacity() and not size(). The capacity determines the number of characters (charT) allocated, while size() tells you how many of them are actually in use.
In particular, std::string implementations don't usually shrink_to_fit the contents, so if you create a string and then remove elements from the end, size() will be decremented, but in most cases (this depends on the implementation) capacity() will not.
Some implementations might not allocate the exact amount of memory required, but rather obtain blocks of given sizes to reduce memory fragmentation. In an implementation that used power-of-two-sized blocks for strings, a string of size 17 could be allocating room for as many as 32 characters.
#5
2
Yes, size() will give you the number of chars in the string. One character in a multibyte encoding can take up multiple chars.
#6
0
There is an inherent conflict in the question as written: std::string is defined as std::basic_string<char,...>, that is, its element type is char (1 byte), but later you stated "the string contains a multibyte string" ("multibyte" == wchar_t?).
The size() member function does not count a trailing null. Its value is the number of elements of the string's character type, which equals the number of bytes only when that type is char.
Assuming you intended to say your multibyte string is a std::wstring (an alias for std::basic_string<wchar_t,...>), the memory footprint of the std::wstring's characters, including the null terminator, is:
std::wstring myString;
...
size_t bytesCount = (myString.size() + 1) * sizeof(wchar_t);
It's instructive to consider how one would write a reusable template function that would work for ANY potential instantiation of std::basic_string<> like this**:
#include <cstddef>
#include <string>
// Return the number of bytes occupied by inString's characters,
// optionally counting the null terminator exposed by inString.c_str().
template <typename Elem>
inline std::size_t stringBytes(const std::basic_string<Elem>& inString, bool bCountNull)
{
    return (inString.size() + (bCountNull ? 1 : 0)) * sizeof(Elem);
}
** For simplicity, this ignores the traits and allocator types, which are rarely specified explicitly for std::basic_string<> (they have defaults).