How to convert a UTF-8 encoded std::string to a UTF-16 std::string

Date: 2021-11-16 20:14:01

How can I convert a UTF-8 encoded std::string to a UTF-16 std::string? Is it possible?

And no, I can't use std::wstring in my case.

Windows, MSVC-11.0.

2 answers

#1


4  

How about trying something like this:

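// #include <codecvt>, <locale>, <string>
std::string s = u8"Your string";  // valid up to C++17; from C++20 u8 literals are char8_t

// std::codecvt_utf8_utf16 has a public destructor, which std::wstring_convert
// requires; std::codecvt itself has a protected one.
std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> convert;

std::u16string u16 = convert.from_bytes(s);
std::string utf8 = convert.to_bytes(u16);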

Also check this for UTF to UTF conversion.

From the docs:

The specialization codecvt<char16_t, char, mbstate_t> converts between the UTF-16 and UTF-8 encoding schemes, and the specialization codecvt<char32_t, char, mbstate_t> converts between the UTF-32 and UTF-8 encoding schemes.
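
The question asks for the UTF-16 data in a std::string, while the conversion above yields a std::u16string. One way to bridge that gap is to copy the code units into a std::string as raw bytes. The following is only a minimal sketch: the helper name to_utf16le_bytes is hypothetical, and it assumes little-endian byte order without a BOM (UTF-16LE), which is the usual choice on Windows.

```cpp
#include <string>

// Hypothetical helper (not part of the original answer): serialize UTF-16
// code units into a byte string, low byte first (UTF-16LE, no BOM).
std::string to_utf16le_bytes(const std::u16string& units) {
    std::string bytes;
    bytes.reserve(units.size() * 2);
    for (char16_t unit : units) {
        bytes.push_back(static_cast<char>(unit & 0xFF));         // low byte
        bytes.push_back(static_cast<char>((unit >> 8) & 0xFF));  // high byte
    }
    return bytes;
}
```

The result will usually contain embedded zero bytes, so it should be handled through data() and size() rather than through anything that treats it as a NUL-terminated C string.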

#2


0  

I've run into dozens of problems like this while trying to do this and similar things with Visual Studio, and just gave up. There is a known linker issue when doing conversions with e.g. std::wstring_convert and std::codecvt.

Please see here: Convert C++ std::string to UTF-16-LE encoded string

What I did to resolve my problem was copy in the code from a kind poster, which uses the iconv library. Then all I had to do was call convert(my_str, strlen(my_str), &used_bytes), where my_str was a char[], strlen(my_str) was its length, and size_t used_bytes = strlen(my_str)*3; I just gave it enough bytes to work with. In that function you can change the call to iconv_t foo = iconv_open("UTF-16", "UTF-8"), and look into the setlocale() call and the creation of the enc string passed to iconv_open(), all of which sits there in all its glory in the function at the link above.
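
For illustration, here is a rough sketch of what such an iconv-based conversion can look like. It is not the linked poster's convert() function: the name utf8_to_utf16le is hypothetical, it assumes an iconv whose second parameter is char** (as in glibc; some implementations declare const char**), and it requests "UTF-16LE" rather than "UTF-16" so that no byte order mark is written. Since no UTF-8 sequence expands to more than two UTF-16 bytes per input byte, a buffer of twice the input size is enough.

```cpp
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: UTF-8 bytes in, UTF-16LE bytes out, both in std::string.
std::string utf8_to_utf16le(const std::string& utf8) {
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");  // arguments are (to, from)
    if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");

    // Worst case is 2 output bytes per input byte.
    std::string out(utf8.size() * 2, '\0');
    char* in_ptr = const_cast<char*>(utf8.data());  // assumes the char** signature
    size_t in_left = utf8.size();
    char* out_ptr = &out[0];
    size_t out_left = out.size();

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) throw std::runtime_error("iconv failed on the input");

    out.resize(out.size() - out_left);  // keep only the bytes actually written
    return out;
}
```

A call such as utf8_to_utf16le(my_str) then returns the UTF-16LE bytes, and its size() plays the role of used_bytes.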

The gotcha is compiling and using iconv: it more or less expects Cygwin or similar on Windows, but you can use it with Visual Studio. There is a pure Win32 libiconv at https://github.com/win-iconv/win-iconv which might suit your needs.

I would say give iconv a try, and see how it goes in a short test program. Good luck!
