PHP to MySQL to CSV to Excel UTF-8

Time: 2022-10-24 23:30:08

I know this has been discussed several times, but this problem is still driving me crazy. I have a form whose action is submit.php. At first I didn't change anything about the charsets and didn't send any UTF-8 header information. As a result, I could read all the ä, ö, ü, etc. correctly inside the database. But exporting them to .csv and importing them into Excel as UTF-8 (I also tested all the other charsets) produces garbled characters.

Here is what I tried:

PHP:

header("Content-Type: text/html; charset=utf-8");
$mysqli->set_charset("utf8");

MySQL: I dropped my database and created a new one:

create database db CHARACTER SET utf8 COLLATE utf8_general_ci;
create table ...

I changed my my.cnf and restarted the MySQL server:

[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci

[mysql]
default-character-set=utf8

If I connect to my database from the shell, the character set variables look like this:

| character_set_client     | utf8                             |
| character_set_connection | utf8                             |
| character_set_database   | utf8                             |
| character_set_filesystem | binary                           |
| character_set_results    | utf8                             |
| character_set_server     | utf8                             |
| character_set_system     | utf8                             |
| character_sets_dir       | /usr/local/mysql/share/charsets/ |

A PHP test:

var_dump($mysqli->get_charset());

This gives me:

Current character set: utf8
object(stdClass)#3 (8) {
  ["charset"]=> string(4) "utf8"
  ["collation"]=> string(15) "utf8_general_ci"
  ["dir"]=> string(0) ""
  ["min_length"]=> int(1)
  ["max_length"]=> int(3)
  ["number"]=> int(33)
  ["state"]=> int(1)
  ["comment"]=> string(13) "UTF-8 Unicode"
}

Now I export with:

mysql -uroot -ppw db < require.sql > /tmp/test.csv

require.sql is simply:

select * from table;

And again I'm unable to import it as a CSV into Excel, no matter whether I choose UTF-8 or anything else. It always gives me garbage characters.

Hopefully someone has a hint about what might have gone wrong here.

Cheers

E: TextMate displays the file correctly, so it seems that the conversion actually worked and this is an Excel issue? I'm using Microsoft Office 2011.

E2: I also tried the same thing with latin1 and hit the same issue: I cannot import special characters into Excel without breaking them. Any hint or workaround?

E3: I found a workaround that works with Excel's import feature, but not when double-clicking the .csv:

    iconv -f utf8 -t ISO-8859-1 test.csv > test_ISO.csv

Now I'm able to import the CSV into Excel using Windows (ANSI). It's still annoying to have to use this feature instead of double-clicking. Also, I really don't get why UTF-8 isn't working, not even with the import feature, a BOM added, and the complete database in UTF-8.
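
To illustrate why the characters break, here is a minimal Python sketch (an illustration only, not part of the setup described above). It assumes the reading program interprets the file as ISO-8859-1, which is one plausible explanation for the garbage, not a confirmed diagnosis of Excel's behaviour.

```python
# Illustration only: what happens when UTF-8 bytes are read as ISO-8859-1.
text = "ä"

utf8_bytes = text.encode("utf-8")          # two bytes: 0xC3 0xA4
latin1_bytes = text.encode("iso-8859-1")   # one byte: 0xE4

# A reader that assumes ISO-8859-1 turns the two UTF-8 bytes into two characters.
mojibake = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes, latin1_bytes)
```

Here `mojibake` holds two characters instead of one "ä", which matches the kind of broken output described above. Converting with iconv sidesteps the problem because the file then really contains the single byte the reader expects.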

Comma separation turned out to be a mess as well.

1. CONCAT_WS works only partly, because it adds a useless concat_ws(..) header line to the .csv file. Also, "file test.csv" doesn't report the file as "comma separated". This means that even though everything is separated by commas, Excel won't recognize it on double click.
2. sed/awk: I found some code snippets, but all of them split the table very badly. For example, the column street with the value "streetname number" became 'streetname','number', which made two columns out of one and broke the table.
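
The splitting problem in point 2 is the kind of thing a real CSV writer avoids, because it only separates on actual field boundaries and quotes fields that contain the delimiter. A minimal sketch with Python's standard csv module follows; the helper names `to_csv_line` and `from_csv_line` and the sample row are made up for illustration.

```python
import csv
import io

def to_csv_line(row, delimiter=","):
    """Serialize one row; fields containing the delimiter or quotes get quoted."""
    buf = io.StringIO(newline="")
    csv.writer(buf, delimiter=delimiter).writerow(row)
    return buf.getvalue()

def from_csv_line(line, delimiter=","):
    """Parse one CSV line back into a list of fields."""
    return next(csv.reader(io.StringIO(line, newline=""), delimiter=delimiter))

# Hypothetical sample data: one field with a space, one with a comma.
row = ["1", "streetname number", "Doe, Jane"]
line = to_csv_line(row)
```

The field with the space stays one field, and the field with the comma is wrapped in double quotes, so reading the line back gives the original three columns.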

So it seems to me that Excel can only open a .csv with a double click if a) it is encoded in ISO-8859-1 (and only under Windows, because the default Mac charset is Macintosh) and b) the file has the "comma separated" attribute. This means that if I create a .csv through Excel itself, the output of

file test1.csv

would be

test1.csv: ISO-8859 text, with CRLF line terminators

while a file whose charset was changed with iconv and whose commas were added by a regex would look like:

test1.csv: ISO-8859 text
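
The visible difference between the two outputs is the CRLF line terminators. Whether that is what Excel actually cares about is not established here, but checking for it and normalizing is cheap. A small Python sketch, with the hypothetical helpers `describe` and `to_crlf`:

```python
import codecs

def describe(data: bytes) -> dict:
    """Report two properties of the raw bytes: a UTF-8 BOM and CRLF line ends."""
    return {
        "utf8_bom": data.startswith(codecs.BOM_UTF8),
        "crlf": b"\r\n" in data,
    }

def to_crlf(data: bytes) -> bytes:
    """Normalize line endings to CRLF without doubling existing CRLF pairs."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

Running `describe` on the bytes of both files shows quickly whether they differ in anything besides the line endings.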

Pretty weird behaviour. Maybe someone has a working solution.

2 Answers

#1


This is how I save data taken from UTF-8 MySQL tables. You need to write a BOM first. Example:

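<?php
// Open the target file in binary mode so no newline translation occurs.
$fp = fopen(dirname(__FILE__).'/'.$filename, 'wb');
// Write the UTF-8 byte order mark first so the reader can detect the encoding.
fputs($fp, "\xEF\xBB\xBF");
// Write one row; each array element becomes one CSV field.
fputcsv($fp, array($utfstr_1, $utfstr_2));
fclose($fp);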
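
For comparison, the same idea can be sketched in Python, where the `utf-8-sig` codec writes the BOM automatically. This is only an illustration of the technique; the function name `write_csv_with_bom` and its parameters are hypothetical.

```python
import csv

def write_csv_with_bom(path, rows):
    """Write rows as CSV; the utf-8-sig codec emits the BOM bytes EF BB BF first."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as fp:
        csv.writer(fp).writerows(rows)
```

Calling `write_csv_with_bom("out.csv", [["ä", "ö"]])` produces a file that starts with the three BOM bytes, followed by the UTF-8 encoded row.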

Make sure that you also tell MySQL that you are going to use UTF-8:

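// Note: the mysql_* functions were removed in PHP 7; with mysqli, $mysqli->set_charset("utf8") serves the same purpose.
mysql_query("SET CHARACTER SET utf8");
mysql_query("SET NAMES utf8");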

You need to execute this before selecting any data.

It probably won't hurt to set the locale as well: setlocale(LC_ALL, "en_US.UTF-8");

Hope it helps.

#2


Thanks everyone for the help. I finally managed to get a working, double-clickable CSV file which opens with its columns separated and displays the letters correctly. For those who are interested in a good workflow, here we go:

1.) My database uses UTF-8 throughout.

2.) I write the form data into my database via PHP. I'm using mysqli and the following header:

header("Content-Type: text/html; charset=ISO-8859");

I know this makes everything look crappy inside the database. Feel free to use utf8 to make it look correct, but it doesn't matter in my case.

3.) I wrote a script, executed by a cron daemon, which:

a) removes the .csv files which were created previously

rm -f path/to/csv  # I have 3 files due to some renaming, see below

b) creates the new CSV using mysql (this is still UTF-8)

mysql -hSERVERIP -uUSER -pPASS DBNAME -e "select * from DBTABLE;" > PATH/TO/output.csv

Now you have a tab-separated .csv, and (if you exported from PHP in UTF-8) it will display correctly in OpenOffice etc., but not in Excel. Even importing it as UTF-8 doesn't work.

c) makes the file semicolon-separated (the Excel standard; double-clicking a comma-separated file won't work, at least not with the European version of Excel). I used a small Python script, semicolon.py:

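import sys
import csv

# Read the tab-separated mysql output from stdin.
tabin = csv.reader(sys.stdin, dialect=csv.excel_tab)
# Write semicolon-separated rows to stdout; fields containing ";" are quoted.
commaout = csv.writer(sys.stdout, delimiter=";")
for row in tabin:
  commaout.writerow(row)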

d) calls the Python script from inside my cron sh file:

/usr/bin/python PATH/TO/semicolon.py < output.csv > output_semi.csv

Make sure you use the full path for every file if you run the script from cron.

e) changes the charset from UTF-8 to ISO-8859-1 (Windows ANSI, the Excel standard) with iconv:

iconv -f utf8 -t ISO-8859-1 output_semi.csv > output_final.csv

And that's it. The .csv opens on double click in Mac/Windows Excel 2010 (tested).
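
Steps c) to e) could also be collapsed into a single Python function, which saves the intermediate files. The sketch below is an alternative under assumptions, not the script actually used: the name `tab_utf8_to_semicolon_latin1` is made up, and characters that do not exist in ISO-8859-1 are replaced with "?" instead of aborting the conversion.

```python
import csv
import io

def tab_utf8_to_semicolon_latin1(data: bytes) -> bytes:
    """Convert tab-separated UTF-8 bytes to semicolon-separated ISO-8859-1 bytes."""
    text = data.decode("utf-8")
    # Parse the tab-separated input (step c, reading side).
    reader = csv.reader(io.StringIO(text, newline=""), dialect=csv.excel_tab)
    out = io.StringIO(newline="")
    # Write semicolon-separated rows; the csv module ends each row with CRLF.
    writer = csv.writer(out, delimiter=";")
    for row in reader:
        writer.writerow(row)
    # Re-encode (step e); unmappable characters become "?".
    return out.getvalue().encode("iso-8859-1", errors="replace")
```

In a cron script this would be called on the bytes of output.csv, with the result written in binary mode to output_final.csv.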

Maybe this helps someone with similar problems. It drove me crazy.

Edit: On some servers you don't need iconv, because the output from the database is already ISO-8859. You should check your CSV after executing the mysql command:

file output.csv

Use iconv only if the charset isn't ISO-8859-1.
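
If the check should happen automatically inside the script, a rough heuristic is to test whether the bytes are valid UTF-8 and contain any non-ASCII bytes. The sketch below is an assumption-laden stand-in for reading the output of `file` by eye; the function name `needs_conversion` is hypothetical, and the heuristic can be fooled by unusual single-byte text that happens to be valid UTF-8.

```python
def needs_conversion(data: bytes) -> bool:
    """Return True if data looks like UTF-8 with non-ASCII characters."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so it is presumably already a single-byte charset.
        return False
    # Pure ASCII is identical in UTF-8 and ISO-8859-1, so nothing to convert.
    return any(byte >= 0x80 for byte in data)
```

Only when this returns True would the iconv step (or an equivalent re-encoding) be run.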
