I know this has been discussed several times, but I'm still going crazy dealing with this problem. I have a form with a submit.php action. At first I didn't change anything about the charsets and didn't send any UTF-8 header information. The result was that I could read all the ä, ö, ü etc. correctly inside the database. Now exporting them to .csv and importing them into Excel as UTF-8 (I also tested all the other charsets) results in broken characters.
Here is what I tried:
PHP:
header("Content-Type: text/html; charset=utf-8");
$mysqli->set_charset("utf8");
MySQL: I dropped my database and created a new one:
create database db CHARACTER SET utf8 COLLATE utf8_general_ci;
create table ...
I changed my my.cnf and restarted the MySQL server:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
[mysql]
default-character-set=utf8
If I connect to my db via bash I receive the following output:
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/charsets/ |
A PHP test:
var_dump($mysqli->get_charset());
Giving me:
Current character set: utf8 object(stdClass)#3 (8) { ["charset"]=> string(4) "utf8" ["collation"]=> string(15) "utf8_general_ci" ["dir"]=> string(0) "" ["min_length"]=> int(1) ["max_length"]=> int(3) ["number"]=> int(33) ["state"]=> int(1) ["comment"]=> string(13) "UTF-8 Unicode" }
Now I use:
mysql -uroot -ppw db < require.sql > /tmp/test.csv
require.sql is simply a
select * from table;
And again I'm unable to import it as a CSV into Excel, no matter whether I choose UTF-8 or anything else. It always gives me garbled characters.
Hopefully someone has a hint about what might have gone wrong here.
Cheers
E: TextMate gives me correct output, so it seems that the conversion actually worked and it's an Excel issue? Using Microsoft Office 2011.
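A minimal sketch of what is probably going on here, assuming the exported file really is valid UTF-8 (as the TextMate result suggests): the bytes on disk are fine, and the garbled characters come from a reader decoding them with a different charset. The sample string is only an illustration, and which charset Excel actually assumes is not something this sketch can confirm.

```python
# Illustration only: the same UTF-8 bytes read with different decoders.
text = "ä, ö, ü"
raw = text.encode("utf-8")              # what a UTF-8 export puts on disk

as_utf8 = raw.decode("utf-8")           # what a UTF-8 aware editor shows
as_latin1 = raw.decode("latin-1")       # what a Latin-1 reader would show
as_macroman = raw.decode("mac_roman")   # what a Mac Roman reader would show

print(as_utf8)
print(as_latin1)
print(as_macroman)
```

Each umlaut occupies two bytes in UTF-8, so a single-byte charset turns every umlaut into two unrelated characters.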
E2: I also tried the same thing with latin1. Same issue: I cannot import special characters into Excel without breaking them. Any hint or workaround?
E3: I found a workaround which works with the Excel import feature, but not with double-clicking the .csv:
iconv -f utf8 -t ISO-8859-1 test.csv > test_ISO.csv
Now I'm able to import the CSV into Excel using Windows (ANSI). It's still annoying to have to use the import feature instead of double-clicking. Also, I really don't get why UTF-8 isn't working, not even with the import feature, a BOM added, and the complete database in UTF-8.
Comma separation turned out to be a mess as well.
1. CONCAT_WS works only partly, because it adds a stupid concat_ws(..) header to the .csv file. Also, "file test.csv" doesn't report "comma separated". This means that even though everything is separated by commas, Excel won't notice it on double-click.
2. sed/awk: I found some code snippets, but all of them split the table very badly. E.g. the column street with "streetname number" became 'streetname','number', which made two columns out of one and broke the table.
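The sed/awk problem described above is what happens when any whitespace is treated as a column boundary. As a hedged sketch: the mysql client's redirected output is tab-separated, so splitting on tabs only keeps a value like "streetname number" in one field. The sample row below is made up for illustration.

```python
import csv
import io

# Hypothetical sample of tab-separated output (header row plus one record).
sample = "name\tstreet\tcity\nMüller\tHauptstraße 12\tKöln\n"

# Split on tabs only, so the space inside the street value is left alone.
rows = list(csv.reader(io.StringIO(sample), dialect=csv.excel_tab))

# Re-emit the rows with commas; csv.writer quotes fields when needed.
out = io.StringIO()
writer = csv.writer(out, delimiter=",", lineterminator="\r\n")
writer.writerows(rows)
result = out.getvalue()
print(result)
```

Using a CSV library instead of a regex also takes care of quoting if a value itself contains the delimiter.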
So it seems to me that Excel can only open a .csv on double-click if the file a) is encoded in ISO-8859-1 (and only under Windows, because the standard Mac charset is Macintosh) and b) has the attribute "comma separated". This means that if I create a .csv through Excel itself, the output of
file test1.csv
would be
test1.csv: ISO-8859 text, with CRLF line terminators
while a file with the charset changed by iconv and commas added via RegEx would look like:
test1.csv: ISO-8859 text
Pretty weird behaviour. Maybe someone has a working solution.
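The two `file` outputs above differ only in the reported line terminators. The sketch below is a rough way to look at the same byte-level properties from Python. `describe` is a hypothetical helper and not how `file` works internally, and the sample data is invented.

```python
def describe(raw: bytes) -> dict:
    """Report a few byte-level properties of CSV data (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "has_utf8_bom": raw.startswith(b"\xef\xbb\xbf"),
        "valid_utf8": valid_utf8,
        "ascii_only": all(b < 0x80 for b in raw),
        "crlf": b"\r\n" in raw,
    }

# Invented samples: one with CRLF line endings, one with bare LF.
excel_like = "Straße;Nr\r\nHauptstraße;12\r\n".encode("latin-1")
converted = "Straße;Nr\nHauptstraße;12\n".encode("latin-1")

print(describe(excel_like))
print(describe(converted))
```

Whether the missing CRLF is what stops Excel from splitting columns on double-click is an open question here. The sketch only makes the difference visible.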
2 Solutions
#1
0
This is how I save data taken from UTF-8 MySQL tables. You need to add a BOM first. Example:
<?php
// Open the target file for binary writing.
$fp = fopen(dirname(__FILE__).'/'.$filename, 'wb');
// Write the UTF-8 byte order mark before any data.
fputs($fp, "\xEF\xBB\xBF");
fputcsv($fp, array($utfstr_1, $utfstr_2));
fclose($fp);
Make sure that you also tell MySQL you're going to use UTF-8:
mysql_query("SET CHARACTER SET utf8");
mysql_query("SET NAMES utf8");
You need to execute this before selecting any data.
It probably also wouldn't hurt to set the locale: setlocale(LC_ALL, "en_US.UTF-8");
Hope it helps.
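For comparison, here is a minimal Python sketch of the same BOM-first idea, using the standard `utf-8-sig` codec, which writes the three BOM bytes at the start of the file. The output path and rows are hypothetical sample data. Whether a given Excel version honors the BOM is a separate matter, and the question reports that it did not help with Office 2011.

```python
import csv
import os
import tempfile

# Hypothetical output path; the rows are sample data, not from the question.
path = os.path.join(tempfile.gettempdir(), "bom_example.csv")
rows = [["name", "city"], ["Jörg", "Zürich"]]

# "utf-8-sig" writes the BOM (EF BB BF) once at the start of the file.
with open(path, "w", encoding="utf-8-sig", newline="") as fp:
    csv.writer(fp).writerows(rows)

# Read the file back as bytes to confirm the BOM is there.
with open(path, "rb") as fp:
    raw = fp.read()
print(raw[:3])
```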
#2
0
Thanks everyone for the help. I finally managed to get a working, double-clickable CSV file which opens with the columns separated and displays the letters correctly. For those who are interested in a good workflow, here we go:
1.) My database is completely using UTF8.
2.) I export a form into my database via PHP. I'm using mysqli and, as header information:
header("Content-Type: text/html; charset=ISO-8859");
I know this makes everything look crappy inside the database. Feel free to use utf8 to make it look correct, but it doesn't matter in my case.
3.) I wrote a script, executed by a cron daemon, which
a) removes the .csv files which were created previously
rm -f path/to/csv ##I have 3 due to some renaming see below
b) creates the new CSV using mysql (this is still UTF8)
mysql -hSERVERIP -uUSER -pPASS DBNAME -e "select * from DBTABLE;" > PATH/TO/output.csv
Now you have a tab-separated .csv and (if you exported from PHP in UTF8) it will display correctly in OpenOffice etc., but not in Excel. Even an import as UTF8 isn't working.
c) makes the file SEMICOLON separated (Excel standard; double-clicking a comma-separated file won't work, at least not with the European version of Excel). I used a small Python script, semicolon.py:
import sys
import csv

# Read tab-separated rows from stdin, write semicolon-separated rows to stdout.
tabin = csv.reader(sys.stdin, dialect=csv.excel_tab)
commaout = csv.writer(sys.stdout, delimiter=";")
for row in tabin:
    commaout.writerow(row)
d) Now I had to call the script inside my cron sh file:
/usr/bin/python PATH/TO/semicolon.py < output.csv > output_semi.csv
Make sure you use the full path for every file if you run the script from cron.
e) Change the charset from UTF8 to ISO-8859-1 (Windows ANSI Excel standard) with iconv:
iconv -f utf8 -t ISO-8859-1 output_semi.csv > output_final.csv
And that's it. The CSV opens on double-click in Mac/Windows Excel 2010 (tested).
Maybe this helps someone with similar problems. It drove me crazy.
Edit: On some servers you don't need iconv because the output from the database is already ISO8859. You should check your CSV after executing the mysql command:
file output.csv
Use iconv only if the charset isn't iso8859-1.
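Steps c) and e) can also be folded into one Python script, sketched below under a few assumptions: the input is tab-separated UTF-8 as produced in step b), and characters that ISO-8859-1 cannot represent are replaced with "?" (my choice, not part of the workflow above). The function name and the sample data are hypothetical.

```python
import csv
import os
import tempfile

def tsv_utf8_to_excel_csv(src: str, dst: str) -> None:
    """Convert tab-separated UTF-8 to semicolon-separated ISO-8859-1."""
    with open(src, "r", encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="iso-8859-1", errors="replace", newline="") as fout:
        reader = csv.reader(fin, dialect=csv.excel_tab)
        writer = csv.writer(fout, delimiter=";")  # default line ending is CRLF
        writer.writerows(reader)

# Demo with hypothetical sample data standing in for the mysql output.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "output.csv")
dst = os.path.join(tmp, "output_final.csv")
with open(src, "w", encoding="utf-8", newline="") as f:
    f.write("name\tstreet\nMüller\tHauptstraße 12\n")

tsv_utf8_to_excel_csv(src, dst)

with open(dst, "rb") as f:
    result = f.read()
print(result)
```

Doing the re-encoding in the same pass removes the intermediate output_semi.csv file, at the cost of no longer being able to skip the conversion on servers that already emit ISO-8859-1.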