JavaScript CSV export: UTF-8 encoding problem

Date: 2023-01-05 20:06:17

I need to export a JavaScript array to a CSV file and download it. That part works, but the characters 'ı,ü,ö,ğ,ş' show up as 'Ä± Ã¼ Ã¶ ÄŸ ÅŸ' in the CSV file. I have tried many of the solutions recommended on this site, but none of them worked for me.

I have added my code snippet. Can anyone solve this problem?

var csvString = 'ı,ü,ö,ğ,ş';

var a = window.document.createElement('a');
a.setAttribute('href', 'data:text/csv; charset=utf-8,' + encodeURIComponent(csvString));
a.setAttribute('download', 'example.csv');
a.click();

1 answer

#1


4  

This depends on which program opens the example.csv file. A text editor will read the file as UTF-8, and the characters will not be malformed. Excel, however, assumes ANSI rather than UTF-8 as the default encoding for CSV. Unless Excel is forced to use UTF-8 instead of ANSI, the characters will be malformed.
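
To see why each letter turns into two characters, it can help to look at the UTF-8 bytes. The following is only a minimal sketch, assuming an environment that provides `TextEncoder` (modern browsers, Node.js); `utf8Hex` is a hypothetical helper name and not part of the original answer.

```javascript
// Hypothetical helper: returns the UTF-8 bytes of a string as hex pairs.
function utf8Hex(text) {
  var bytes = new TextEncoder().encode(text);
  return Array.from(bytes, function (b) {
    return b.toString(16).padStart(2, '0');
  }).join(' ');
}

console.log(utf8Hex('ğ')); // c4 9f
console.log(utf8Hex('ş')); // c5 9f
```

A reader that assumes a single-byte ANSI code page such as Windows-1252 shows each of those bytes as its own character, which is where 'ÄŸ' and 'ÅŸ' come from.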

Excel can be forced to use UTF-8 for CSV by putting a BOM (Byte Order Mark) at the very start of the file. The BOM for UTF-8 is the byte sequence 0xEF,0xBB,0xBF. So one could think that simply prepending "\xEF\xBB\xBF" to the string would be the solution. But surely that would be too simple, wouldn't it? ;-) The problem is that JavaScript treats those three values as characters (U+00EF, U+00BB, U+00BF) rather than as raw bytes, and each of them is then encoded as two UTF-8 bytes. The "solution" is to use the "universal BOM" "\uFEFF", as mentioned in Special Characters (JavaScript); encoded as UTF-8, this single character becomes exactly 0xEF,0xBB,0xBF.
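
The difference between the two string literals can be checked directly with `encodeURIComponent`, which percent-encodes the UTF-8 bytes of its argument. This is just an illustrative sketch; the variable names `bomEncoded` and `naiveEncoded` are made up for this example.

```javascript
// "\uFEFF" is a single character; its UTF-8 encoding is exactly the BOM bytes.
var bomEncoded = encodeURIComponent("\uFEFF");
console.log(bomEncoded); // %EF%BB%BF

// "\xEF\xBB\xBF" is three characters (U+00EF, U+00BB, U+00BF),
// each of which becomes two UTF-8 bytes, so no BOM is produced.
var naiveEncoded = encodeURIComponent("\xEF\xBB\xBF");
console.log(naiveEncoded); // %C3%AF%C2%BB%C2%BF
```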

Example:

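var csvString = 'ı,ü,ö,ğ,ş';
// encodeURIComponent turns U+FEFF into the UTF-8 BOM bytes EF BB BF.
var universalBOM = "\uFEFF";
var a = window.document.createElement('a');
a.setAttribute('href', 'data:text/csv;charset=utf-8,' + encodeURIComponent(universalBOM + csvString));
a.setAttribute('download', 'example.csv');
// Some browsers (older Firefox, for example) only honor click() on links attached to the document.
window.document.body.appendChild(a);
a.click();
window.document.body.removeChild(a);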

See also Adding UTF-8 BOM to string/Blob.
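
For the Blob route, a rough sketch could look like the following. It assumes a browser that supports `Blob`, `URL.createObjectURL` and `TextEncoder`; the function names `csvBytesWithBom` and `downloadCsv` are hypothetical and do not come from the linked answer.

```javascript
// Hypothetical pure helper: UTF-8 bytes of the CSV text, with the BOM first.
function csvBytesWithBom(csvText) {
  return new TextEncoder().encode("\uFEFF" + csvText);
}

// Hypothetical browser-only helper: offers the bytes as a file download.
function downloadCsv(csvText, fileName) {
  var blob = new Blob([csvBytesWithBom(csvText)], { type: 'text/csv;charset=utf-8' });
  var url = URL.createObjectURL(blob);
  var a = document.createElement('a');
  a.href = url;
  a.download = fileName;
  document.body.appendChild(a);
  a.click();
  document.body.removeChild(a);
  // Release the object URL once the click has been dispatched.
  setTimeout(function () { URL.revokeObjectURL(url); }, 0);
}
```

In a browser this would be called as `downloadCsv('ı,ü,ö,ğ,ş', 'example.csv');`. Compared with a data: URI, a Blob avoids percent-encoding the whole file into the link.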

With this, the encoding will be correct. Nevertheless, it only works properly if the comma is the default list separator in your Windows locale settings. If, for example, the semicolon is the default list separator instead, all content will end up in the first column without being split at the commas. In that case you have to use the semicolon as the delimiter in the CSV as well. But this is another problem, and it leads to the conclusion not to use CSV at all but to use libraries that can create Excel files (*.xls or *.xlsx) directly.
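
If you stay with CSV, one way to cope with the separator issue is to make the delimiter a parameter when building the text. This is only a sketch under the assumption that you know which list separator the target machine uses; `toCsv` is a hypothetical helper, and the quoting follows the common convention of doubling embedded double quotes.

```javascript
// Hypothetical helper: joins rows into CSV text using the given delimiter.
// Fields containing the delimiter, a double quote, or a line break are
// wrapped in double quotes, with embedded quotes doubled.
function toCsv(rows, delimiter) {
  return rows.map(function (row) {
    return row.map(function (value) {
      var field = String(value);
      var needsQuotes = field.indexOf(delimiter) !== -1 || /["\r\n]/.test(field);
      return needsQuotes ? '"' + field.replace(/"/g, '""') + '"' : field;
    }).join(delimiter);
  }).join('\r\n');
}

var rows = [['ı', 'ü', 'ö'], ['ğ;ş', 'x', 'y']];
var semicolonCsv = toCsv(rows, ';');
console.log(semicolonCsv);
```

The resulting string can then be prefixed with "\uFEFF" and downloaded in the same way as in the example above.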
