C programming, Unicode and the Linux terminal

Time: 2021-09-23 11:43:21

So what I'm trying to do is write Japanese characters to my terminal screen using C and wide characters.

My questions are: what is wrong with what I'm doing, so that I can fix it? What other caveats should I expect when using wide characters? And do you have any other comments about what I'm trying to do?

The bad code:

#include <stdio.h>
#include <wchar.h>

int main( ) {
    wprintf(L"%c\n", L"\x3074");
}

This doesn't work, but I want to know why.


The problem only gets worse when I try to use a wchar_t to hold a value:

wchar_t pi_0 = 0x3074;      // prints a "t" when used with wprintf
wchar_t pi_1 = "\x3074";    // gives compile time warning
wchar_t pi_2 = L"\x3074";   // gives compile time warning

I'd like to make this work too, as I plan on having data structures that hold strings of these characters.

Thanks!

2 solutions

#1



In C, the type of "\x3074" is char[] and the type of L"\x3074" is wchar_t[] (the elements are const only in C++). Both are arrays, not single characters.

If you need a wchar_t, use single quotes:

L'\x3074'

Also, %c prints a char; to print a wchar_t you need %lc.

#2



There are at least two problems in the code.

  • the first one has been pointed out by Kenny: the format doesn't match the argument
  • the second one is that you are missing a call to setlocale()

(There is also the assumption that the wide character set is Unicode -- I seem to remember it is always the case for Linux, but it isn't universal).

In a correctly configured terminal,

#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main( ) {
    setlocale(LC_ALL, "");
    wprintf(L"%ls\n", L"\x0152\x3074");
    return 0;
}

should work. If it doesn't, I would start by checking the results of setlocale() and wprintf().

(I've added U+0152, the OE ligature, so that I can check the behavior; I'm not using a font that has U+3074.)
