How can I convert Unicode characters to ASCII codes in Delphi 7?

Time: 2022-08-09 20:18:42

Yes, we're talking about ASCII codes. My apologies, I'm not the Delphi dev here.

7 answers

#1


6  

For Delphi 7, I'd get the free Unicode Library by Mike Lischke, the author of Virtual Treeview.

The library includes a lot of conversion functions to go to and from Unicode, so you can use the ones that make the most sense in your application.

Or you can upgrade to Delphi 2009, which has built-in encoding routines and its own library of conversion functions.

#2


3  

Let's get a few things straight. Character sets (charsets) and character encodings are two related but different concepts. A character set is an abstract list of characters, each associated with an integer character code. A character encoding is an algorithm that describes how those characters are represented in bytes.

ASCII acts as both a character set and an encoding. It uses 7 bits to express 128 characters (94 printable, or 95 counting the space). Unicode, on the other hand, is a character set with 1,114,112 code points. There are several encodings for Unicode strings; the most notable are UTF-8, UTF-16 (UTF-16LE and UTF-16BE), and UTF-32. In other words, a single Unicode character can be represented in different ways depending on the encoding.

How can I convert unicode characters to ascii codes in delphi 7?

I think the question could be interpreted in two ways.

  1. I have a Unicode string in some encoding that only includes printable ASCII characters. How can I convert the string into a byte array in ASCII encoding?

  2. I have a Unicode string in some encoding that also includes non-ASCII characters, such as Chinese characters. How can I encode the string into ASCII without losing information, and later decode it back to the original Unicode string?

If you mean the first, you can load the Unicode string into a WideString, as Osman suggests, and cast it:

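var
  original: WideString;
  s: AnsiString;
begin
  // Converts via the system's default ANSI code page; characters that
  // code page cannot represent are replaced (typically with '?').
  s := AnsiString(original);
end;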
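Since the cast silently replaces anything the code page cannot represent, it can help to check first that the string really is pure ASCII. Here is a minimal sketch of that check; the names IsPureAscii and TryWideToAscii are made up for this example and are not part of Delphi's RTL.

```pascal
program AsciiCheckDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: True when every UTF-16 code unit of ws is in 0..127.
function IsPureAscii(const ws: WideString): Boolean;
var
  i: Integer;
begin
  Result := True;
  for i := 1 to Length(ws) do
    if Ord(ws[i]) > 127 then
    begin
      Result := False;
      Exit;
    end;
end;

// Hypothetical helper: converts only when no information can be lost.
// Returns False and an empty string if ws contains non-ASCII characters.
function TryWideToAscii(const ws: WideString; out s: AnsiString): Boolean;
var
  i: Integer;
begin
  s := '';
  Result := IsPureAscii(ws);
  if not Result then
    Exit;
  SetLength(s, Length(ws));
  for i := 1 to Length(ws) do
    s[i] := AnsiChar(Ord(ws[i]));  // code unit value equals the ASCII code
end;

var
  ws: WideString;
  s: AnsiString;
begin
  ws := 'Delphi 7';
  if TryWideToAscii(ws, s) then
    Writeln('ASCII: ', s)
  else
    Writeln('String contains non-ASCII characters');
end.
```

The byte values in s are then the ASCII codes, independent of the system code page.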

If you mean the second, you would need a generic encoding algorithm like Base64 encoding. You can use DCPBase64.pas included in David Barton's DCPcrypt v2 Beta 3.

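If you don't want to pull in a Base64 unit, any reversible ASCII-only escaping works for the second case too. The sketch below is not Base64 and not DCPcrypt's API; it is a simple alternative that writes every non-ASCII UTF-16 code unit (and the backslash itself) as \uXXXX. The function names are hypothetical.

```pascal
program EscapeDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: escapes non-ASCII code units and '\' as \uXXXX.
function EscapeToAscii(const ws: WideString): AnsiString;
var
  i, code: Integer;
begin
  Result := '';
  for i := 1 to Length(ws) do
  begin
    code := Ord(ws[i]);
    if (code > 127) or (code = Ord('\')) then
      Result := Result + '\u' + IntToHex(code, 4)
    else
      Result := Result + AnsiChar(code);
  end;
end;

// Hypothetical helper: reverses EscapeToAscii.
// Assumes the input was produced by EscapeToAscii (no validation).
function UnescapeFromAscii(const s: AnsiString): WideString;
var
  i: Integer;
begin
  Result := '';
  i := 1;
  while i <= Length(s) do
  begin
    if (s[i] = '\') and (i + 5 <= Length(s)) and (s[i + 1] = 'u') then
    begin
      Result := Result + WideChar(StrToInt('$' + Copy(s, i + 2, 4)));
      Inc(i, 6);
    end
    else
    begin
      Result := Result + WideChar(Ord(s[i]));
      Inc(i);
    end;
  end;
end;

var
  original, restored: WideString;
  encoded: AnsiString;
begin
  original := 'A';
  original := original + WideChar($4E2D);
  original := original + WideChar($6587);
  encoded := EscapeToAscii(original);
  Writeln(encoded);  // prints: A\u4E2D\u6587
  restored := UnescapeFromAscii(encoded);
  if restored = original then
    Writeln('round trip ok');
end.
```

Base64 output is more compact for text that is mostly non-ASCII; this escaping keeps plain ASCII text readable.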

#3


1  

It depends on your definition of conversion. If the string only contains the 128 lowest characters, which are identical in ASCII and Unicode, you can use an explicit cast. But this creates garbage if the string contains higher characters.

If you want mappings like ë -> e and û -> u, you can write your own code. But be aware that there are always characters that can't be converted.

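A rough sketch of what such hand-written mapping code could look like. The function name FoldToAscii and the tiny mapping table are only illustrative; a real table would need many more entries, and anything not listed falls back to a placeholder character.

```pascal
program FoldDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: maps a few accented Latin letters to their base
// letter; any other non-ASCII code unit becomes Fallback.
function FoldToAscii(const ws: WideString; Fallback: AnsiChar): AnsiString;
var
  i, code: Integer;
begin
  SetLength(Result, Length(ws));
  for i := 1 to Length(ws) do
  begin
    code := Ord(ws[i]);
    case code of
      0..127:       Result[i] := AnsiChar(code);
      $00E8..$00EB: Result[i] := 'e';  // e with grave, acute, circumflex, diaeresis
      $00F9..$00FC: Result[i] := 'u';  // u with grave, acute, circumflex, diaeresis
    else
      Result[i] := Fallback;           // no mapping known
    end;
  end;
end;

var
  ws: WideString;
begin
  ws := 's';
  ws := ws + WideChar($00FB);  // u with circumflex
  ws := ws + 'r';
  Writeln(FoldToAscii(ws, '?'));  // prints: sur
end.
```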

#4


1  

"ASCII" is the name of a specific mapping of characters to numbers, but some people say "ASCII code" when they don't really mean ASCII at all; they just want the numeric value of a character, whatever mapping is in effect at the time. Does that description apply to you?

If so, then you can use the Ord standard function to get the Unicode code-point value of whatever Unicode character you have.

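var
  wc: WideChar;
  ws: WideString;
  x: Word;
begin
  x := Ord(wc);     // numeric value of a single WideChar
  x := Ord(ws[1]);  // numeric value of the first character (ws must not be empty)
end;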

If you really meant ASCII, though, then you'll have to be more specific about what sort of conversion you have in mind.

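To see the numeric values for a whole string, you can loop over it with Ord. A small sketch follows; CodeUnitList is a made-up name. Note that a WideChar is a UTF-16 code unit, so for characters outside the Basic Multilingual Plane you would see the two surrogate values rather than a single code point.

```pascal
program OrdDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: lists each UTF-16 code unit of ws as U+XXXX.
function CodeUnitList(const ws: WideString): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(ws) do
  begin
    if i > 1 then
      Result := Result + ' ';
    Result := Result + 'U+' + IntToHex(Ord(ws[i]), 4);
  end;
end;

var
  ws: WideString;
begin
  ws := 'A';
  ws := ws + WideChar($0639);  // Arabic letter ain
  Writeln(CodeUnitList(ws));   // prints: U+0041 U+0639
end.
```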

#5


1  

As an example, the letter A is represented in Unicode as U+0041 and in ANSI as the single byte $41. So converting that would be pretty simple, but you must find out how the Unicode character is encoded. The most common encodings are UTF-16 and UTF-8. UTF-16 is basically two bytes per character, but even that is an oversimplification, as characters outside the Basic Multilingual Plane take four bytes. UTF-8 sounds as if it means one byte per character, but a character can take up to four bytes. To further complicate matters, UTF-16 can be little endian or big endian (U+0041 is stored as the bytes 41 00 or 00 41, respectively).

Where your question makes no sense is if you wanted to, for example, convert the Arabic letter ain (U+0639) to ANSI on an English locale. You can't, because the ANSI code page of that locale has no such character.
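The size differences between the encodings are easy to check from Delphi 7 itself, assuming the UTF8Encode function from the System unit. The helper name Utf8ByteCount is made up for this sketch, and only BMP characters are used.

```pascal
program Utf8LengthDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: number of bytes ws occupies when encoded as UTF-8.
function Utf8ByteCount(const ws: WideString): Integer;
begin
  Result := Length(UTF8Encode(ws));
end;

var
  ws: WideString;
begin
  ws := 'A';
  Writeln(Utf8ByteCount(ws));                // 1 byte in UTF-8
  Writeln(Length(ws) * SizeOf(WideChar));    // 2 bytes in UTF-16

  ws := WideChar($00EB);                     // e with diaeresis
  Writeln(Utf8ByteCount(ws));                // 2 bytes in UTF-8

  ws := WideChar($4E2D);                     // a Chinese character
  Writeln(Utf8ByteCount(ws));                // 3 bytes in UTF-8
end.
```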

#6


1  

See related questions on converting from Unicode to ASCII:

In general, a character set with hundreds of thousands of entries cannot be converted to a character set of 128 entries without either losing information or using an encoding scheme.

#7


1  

You can use the function at http://swissdelphicenter.ch/en/showcode.php?id=1692
It converts a Unicode string to an ANSI string using a specified code page.
If you want to convert using the default system code page (defined in the regional options as the code page for non-Unicode programs), you can simply do the following:

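var
  ws: WideString;
  s: string;
begin
  // In Delphi 7, string is AnsiString, so this cast converts through
  // the default system code page.
  s := string(ws);
end;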
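For a specific code page, a conversion routine can be built on the Win32 WideCharToMultiByte call. The sketch below is not the code from the linked page; WideToAnsiCP is a hypothetical name, and the code page numbers used (20127 for US-ASCII, 1252 for Western European) are assumed to be installed on the system. Characters the target code page cannot represent come out as a default or best-fit replacement, so check the result if that matters.

```pascal
program CodePageDemo;
{$APPTYPE CONSOLE}

uses
  Windows;

// Hypothetical helper: converts ws to the given Windows code page.
// Returns an empty string if the conversion fails.
function WideToAnsiCP(const ws: WideString; CodePage: Cardinal): AnsiString;
var
  len: Integer;
begin
  Result := '';
  if ws = '' then
    Exit;
  // First call: ask how many bytes the converted text needs.
  len := WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    nil, 0, nil, nil);
  if len <= 0 then
    Exit;
  SetLength(Result, len);
  // Second call: perform the conversion into the allocated buffer.
  WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    PAnsiChar(Result), len, nil, nil);
end;

var
  ws: WideString;
begin
  ws := 'caf';
  ws := ws + WideChar($00E9);         // e with acute
  Writeln(WideToAnsiCP(ws, 20127));   // US-ASCII: the accented letter is replaced
  Writeln(Length(WideToAnsiCP(ws, 1252)));  // Windows-1252: one byte per character here
end.
```

Passing CP_ACP (0) as the code page should give the same result as the plain cast.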
