This question already has an answer here:
- Java read file got a leading BOM [ï»¿] (6 answers)
If I run this code, the first line of the output starts with ï»¿, and then the other lines follow:
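import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

try {
    BufferedReader br = new BufferedReader(new FileReader(
            "myFile.txt"));
    String line;
    // Parentheses around the assignment are required: compare the line, not the boolean
    while ((line = br.readLine()) != null) {
        System.out.println(line);
    }
    br.close();
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}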
How can I avoid it?
2 answers
#1
14
You are getting the characters ï»¿ on the first line because the file begins with the UTF-8 byte order mark (BOM), the bytes EF BB BF, and those bytes are being decoded with a single-byte encoding instead of UTF-8. If a text file begins with a BOM, it was likely generated by a Windows program such as Notepad.
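To make the cause concrete, here is a minimal sketch that decodes the three BOM bytes in two ways. The class name BomDecodingDemo and the helper decode are made up for illustration, and ISO-8859-1 stands in for whatever single-byte default encoding your system happens to use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomDecodingDemo {
    // The three bytes of the UTF-8 byte order mark.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Hypothetical helper: decodes the BOM bytes with the given charset.
    static String decode(Charset charset) {
        return new String(UTF8_BOM, charset);
    }

    public static void main(String[] args) {
        // Misread as a single-byte encoding: three separate characters (ï, », ¿).
        String asLatin1 = decode(StandardCharsets.ISO_8859_1);
        // Read as UTF-8: one character, U+FEFF.
        String asUtf8 = decode(StandardCharsets.UTF_8);

        System.out.println(asLatin1.length());                      // 3
        System.out.println(asUtf8.length());                        // 1
        System.out.println(Integer.toHexString(asUtf8.charAt(0)));  // feff
    }
}
```

Note that decoding as UTF-8 does not remove the BOM; it only turns it into a single U+FEFF character, which still has to be skipped.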
To solve your problem, read the file explicitly as UTF-8 instead of relying on the platform's default character encoding (windows-1252, US-ASCII, etc.), which is what FileReader uses:
BufferedReader in = new BufferedReader(
        new InputStreamReader(
                new FileInputStream("myFile.txt"),
                "UTF-8"));
Decoded as UTF-8, the byte sequence EF BB BF (which shows up as ï»¿ when misread in a single-byte encoding) becomes a single character, U+FEFF. This character is optional: a legal UTF-8 file may or may not begin with it. So we skip the first character only if it is U+FEFF:
in.mark(1);
if (in.read() != 0xFEFF) {
    in.reset();
}
And now you can continue with the rest of your code.
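Putting the pieces together, a complete version might look like the sketch below. The class name BomSkippingReader and the method readLines are invented for this example, and it assumes the file really is UTF-8. It uses StandardCharsets.UTF_8 (Java 7+) in place of the "UTF-8" string, and try-with-resources so the reader is closed even if reading fails.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class BomSkippingReader {
    // Hypothetical helper: reads all lines of a UTF-8 file,
    // dropping a leading U+FEFF (the decoded BOM) if one is present.
    static List<String> readLines(String fileName) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(
                        new FileInputStream(fileName),
                        StandardCharsets.UTF_8))) {
            // Remember the start position, peek at one character,
            // and rewind unless that character was the BOM.
            in.mark(1);
            if (in.read() != 0xFEFF) {
                in.reset();
            }
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        for (String line : readLines("myFile.txt")) {
            System.out.println(line);
        }
    }
}
```

The same readLines call works for files with or without a BOM, since the first character is consumed only when it is U+FEFF.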
#2
1
The problem could be the encoding used. Try this:
BufferedReader in = new BufferedReader(new InputStreamReader(
new FileInputStream("yourfile"), "UTF-8"));