Getting strange output when I expect an English string

The following program reads text from a file named tes.txt and separates the plain English strings from an Urdu string that is identical throughout the file. The Urdu string acts as a stamp after every English word. The file looks like this (the Urdu string follows each English string):

سٹیمپ ختم ہو جاتی ہے

suhail

سٹیمپ ختم ہو جاتی ہے  

gupta

سٹیمپ ختم ہو جاتی ہے

ghazal
سٹیمپ ختم ہو جاتی ہے

On Windows, I compile the following program:

import java.io.*;

class checker {
public static void main(String args[]) {
try {
     File f = new File("C:/Users/user/Desktop/tes.txt");
     FileReader reader = new FileReader(f);
     char buffer[] = new char[1024];
     String text = "";
     while( reader.read(buffer) > 0 ) {
        text += buffer.toString();
     }

     String splits[] = text.split("سٹیمپ ختم ہو جاتی ہے");

     for(int i=0;i<splits.length;i++) {
        System.out.println(splits[i]);
     }  
} catch(Exception exc) {
   exc.printStackTrace();
  }
}
}

I compile it with javac -encoding UTF-8 checker.java. But when I run the program, the output is [C@19b49e6. Why is this? It also prints only one string from the array. I checked the length of the buffer array as well, and it comes out to be one. Why one? (The file contains more than one string that should come into the buffer after being separated by the regex.) Where have I made a mistake?

3 Answers

#1

Your mistake is to assume that an array's toString gives you a textual representation of its elements. It does not. You want java.util.Arrays.toString(array) for that.
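
To make the difference concrete, here is a minimal sketch comparing the three conversions. The class name ArrayToStringDemo and the helper decode are made up for illustration and are not part of the question's code.

```java
import java.util.Arrays;

class ArrayToStringDemo {
    // Hypothetical helper: turns the first `count` chars of the buffer into text.
    static String decode(char[] buffer, int count) {
        return new String(buffer, 0, count);
    }

    public static void main(String[] args) {
        char[] buffer = {'a', 'b', 'c'};

        // Inherited Object.toString(): the type tag "[C" plus a hash code,
        // which is where output such as [C@19b49e6 comes from.
        System.out.println(buffer.toString());

        // A readable list of the elements: [a, b, c]
        System.out.println(Arrays.toString(buffer));

        // The characters themselves as one String: abc
        System.out.println(new String(buffer));

        // Only the first two characters: ab
        System.out.println(decode(buffer, 2));
    }
}
```

Note that Arrays.toString is mainly useful for inspecting an array; to get the characters back as text, the String constructor is the closer fit.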

Also, let's say the file has 5 characters in it: you read 5 characters into your buffer of 1024 characters and add all 1024 to your String. That's 1019 null characters. I would suggest using BufferedReader.readLine() instead to read a file into a String, or even Guava's Files.toString(File file, Charset charset) - http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/io/Files.html#toString(java.io.File,%20java.nio.charset.Charset)
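
The readLine() suggestion could look roughly like the sketch below. It assumes the file is saved as UTF-8, which the question does not state, and the class and method names (ReadWholeFile, readAll) are invented for the example. It also appends "\n" after every line, so the original line endings are not preserved exactly.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ReadWholeFile {
    // Reads the whole file line by line, decoding it as UTF-8 (an assumption
    // about how tes.txt was saved).
    static String readAll(Path path) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.print(readAll(Paths.get(args[0])));
        }
    }
}
```

Passing the charset explicitly matters here because the javac -encoding flag only affects how the source file is read at compile time, not how a reader decodes tes.txt at run time.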

However, to fix what you have, you just need to store the number of chars read and use only that many characters from the array. If that's not clear enough, let me know and I'll write a code sample.

#2

The char buffer[] is not being added to the string properly; change that line to:

     text += new String(buffer);

*Sorry for my previous answer, I'm kinda sleepy.

#3

You are not reading the file content properly. Here is a better way to read it in:

 String text = "";
 int readcount=0;
 while((readcount =  reader.read(buffer)) != -1 ) {
    text += new String(buffer, 0, readcount);
 }

 String[] splits = text.split("سٹیمپ ختم ہو جاتی ہے");
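
For a version that can be run on its own, the same loop can be wrapped as in the sketch below. The names StampSplitter, readAll and words are hypothetical, a StringReader stands in for the file, and the trimming and skipping of empty pieces is an addition that the snippet above does not do.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class StampSplitter {
    static final String STAMP = "سٹیمپ ختم ہو جاتی ہے";

    // Reads everything from the reader, appending only the chars
    // that each read() call actually filled in.
    static String readAll(Reader reader) throws IOException {
        char[] buffer = new char[1024];
        StringBuilder text = new StringBuilder();
        int readCount;
        while ((readCount = reader.read(buffer)) != -1) {
            text.append(buffer, 0, readCount);
        }
        return text.toString();
    }

    // Splits on the stamp, then drops surrounding whitespace and empty pieces.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        // Pattern.quote treats the stamp as literal text rather than a regex.
        for (String part : text.split(Pattern.quote(STAMP))) {
            String word = part.trim();
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String sample = STAMP + "\n\nsuhail\n\n" + STAMP + "\n\ngupta\n";
        System.out.println(words(readAll(new StringReader(sample)))); // [suhail, gupta]
    }
}
```

Using a StringBuilder instead of repeated += avoids copying the accumulated text on every iteration, which starts to matter for larger files.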
