How do I use encode UTF-8 in Ruby?

I am trying to extract a word from the first line of a file:

LOCATION,Feij�,AC,a,b,c

this way:

2.0.0-p247 :005 > File.foreach(file).first

=> "LOCATION,Feij\xF3,AC,a,b,c\r\n"

but when I try to use split:

2.0.0-p247 :008 > File.foreach(file).first.split(",")

ArgumentError: invalid byte sequence in UTF-8
    from (irb):8:in `split'
    from (irb):8
    from /home/bleh/.rvm/rubies/ruby-2.0.0-p247/bin/irb:13:in `<main>'

What I expected is: Feijó

I have already tried many combinations of .encode and .force_encoding.

Any ideas?

1 solution

#1

The character ó is \xF3 in the ISO-8859-1 encoding, so this is probably the encoding of the file (it could also be CP-1252).

You can specify the encoding as an arg to File::foreach, and you can also ask Ruby to re-encode it to UTF-8 for you:

File.foreach(file, :encoding => 'iso-8859-1:utf-8').first.split(",")
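To make the difference between relabelling and transcoding concrete, here is a minimal sketch. It assumes the file really is ISO-8859-1; the temporary file and the variable names (`sample`, `fields`, `fixed`) are made up for the illustration and are not from the question.

```ruby
require "tempfile"

# Hypothetical sample file: same content as in the question,
# written as raw bytes so that 0xF3 stays a single ISO-8859-1 byte.
sample = Tempfile.new("locations")
sample.binmode
sample.write("LOCATION,Feij\xF3,AC,a,b,c\r\n".b)
sample.close

# Option 1: declare the external encoding and let Ruby transcode
# to UTF-8 while reading (external:internal).
line = File.foreach(sample.path, encoding: "iso-8859-1:utf-8").first
fields = line.chomp.split(",")

# Option 2: the bytes are already in memory. force_encoding only
# relabels them (no bytes change); encode then converts to UTF-8.
raw = File.binread(sample.path)
fixed = raw.force_encoding("ISO-8859-1").encode("UTF-8")
word = fixed.split(",")[1]

puts fields[1]
puts word
```

Calling .encode("UTF-8") directly on a string that is already labelled UTF-8 does not repair it, because Ruby has no way of knowing what the source encoding was. That is likely why the .encode attempts in the question had no effect: the string needs the correct source label first.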
