How do I make Scanner read escape characters correctly?

Time: 2022-05-22 22:29:13

I'm reading from a file whose content looks like this, all on one line:

Hello World!\nI've been trying to get this to work for a while now.\nFrustrating.\n

My Scanner reads that line from the file and puts it in a String:

Scanner input = new Scanner(new File(fileName));
String str = input.nextLine();
System.out.print(str);

Now, I want the output to be:

Hello World!
I've been trying to get this to work for a while now.
Frustrating.

But instead I'm getting the exact same thing as the input. That is, each \n is included in the output and everything is on one line instead of separate lines.

I thought Scanner would interpret the escape sequence, but instead it copies it into the String as if it were \\n.

3 Solutions

#1


3  

If \n is literally written in the file, you can't use nextLine() to split on it, because the file contains no real \n (end of line); it contains a backslash followed by the letter n (two characters, "\\n" in a Java string literal).

Instead, try a delimiter:

    Scanner sc = new Scanner(new File("/home/alain/Bureau/ttt.txt"));
    sc.useDelimiter("\\\\n");
    while(sc.hasNext()){
        System.out.println(sc.next());
    }

Output:

Hello World!
I've been trying to get this to work for a while now.
Frustrating.

EDIT:

If you want to read the file and replace each \n in the text with an actual EOL, you can simply use:

Scanner sc = new Scanner(new File("/home/alain/Bureau/ttt.txt"));

//loop over real EOL
while(sc.hasNextLine()){

     //Replace the `\n` in the line with real EOL.
     System.out.println(sc.nextLine().replace("\\n", System.getProperty("line.separator")));
}
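
The replacement step above can be tried without a file. The following is only a sketch: the class name `LiteralNewlineReplace`, the helper `expand`, and the in-memory sample string are assumptions made for illustration, and `System.lineSeparator()` is used as a shorthand for `System.getProperty("line.separator")`.

```java
import java.util.Scanner;

// Hypothetical demo class, not part of the original answer.
class LiteralNewlineReplace {

    // Replaces every literal backslash + 'n' pair with the platform line separator.
    // String.replace treats its arguments as plain text, not as a regular expression,
    // so "\\n" (the two characters \ and n) needs no extra escaping.
    static String expand(String line) {
        return line.replace("\\n", System.lineSeparator());
    }

    public static void main(String[] args) {
        // Stand-in for the file content: one physical line containing literal \n pairs.
        String fileContent = "Hello World!\\nFrustrating.\\n";
        try (Scanner sc = new Scanner(fileContent)) {
            // Loop over real lines, expanding the literal \n pairs inside each one.
            while (sc.hasNextLine()) {
                System.out.print(expand(sc.nextLine()));
            }
        }
    }
}
```

Because `replace` is literal, two backslashes are enough here, whereas `useDelimiter` takes a regular expression and needs four.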

#2


4  

No, Scanner won't do that for you. You'll have to do the translation yourself.

(Note that if you use something like sc.useDelimiter("\\\\n"), as others have suggested, you're breaking the functionality of the ordinary next() method, which no longer splits on whitespace, and nextLine() may not function as expected.)
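
To make the side effect on next() concrete, here is a small sketch. The class name `DelimiterSideEffect` and the helper `firstToken` are hypothetical, and the input is an in-memory string rather than a file.

```java
import java.util.Scanner;

// Hypothetical demo class showing how a custom delimiter changes next().
class DelimiterSideEffect {

    // Returns the first token of the text, with or without the custom delimiter.
    static String firstToken(String text, boolean customDelimiter) {
        try (Scanner sc = new Scanner(text)) {
            if (customDelimiter) {
                // Regex for a literal backslash followed by 'n'.
                sc.useDelimiter("\\\\n");
            }
            return sc.next();
        }
    }

    public static void main(String[] args) {
        String text = "Hello World!\\nFrustrating.\\n";
        System.out.println(firstToken(text, false)); // prints "Hello"
        System.out.println(firstToken(text, true));  // prints "Hello World!"
    }
}
```

With the default delimiter, next() stops at the first space; with the custom one, whitespace becomes part of the token.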

Here's a sketch of how I would solve it:

Change

Scanner input = new Scanner(new FileReader(fileName));

to

Scanner input = new Scanner(new JavaEscapeReader(new FileReader(fileName)));
                            ^^^^^^^^^^^^^^^^^^^^^                        ^

where JavaEscapeReader would extend FilterReader like this:

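import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

class JavaEscapeReader extends FilterReader {

    JavaEscapeReader(Reader in) {
        super(in);
    }

    // Reads one character, translating a backslash escape into the character it denotes.
    @Override
    public int read() throws IOException {
        int ch = super.read();
        switch (ch) {
        case '\\':
            switch (super.read()) {
            case '\\': return '\\';
            case 'n': return '\n';
            case 't': return '\t';
            case 'f': return '\f';
            // ...
            default:
                // Also reached when the stream ends right after a backslash.
                throw new IOException("Invalid char sequence.");
            }
        default:
            return ch;
        }
    }

    // Fills cbuf starting at index off, one translated character at a time.
    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0)
            return 0;
        int i = 0, ch;
        while (i < len && -1 != (ch = read()))
            cbuf[off + i++] = (char) ch;
        return i == 0 ? -1 : i;
    }
}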

Given an input file with the content

Line1\nLine2
Line3\nLine4

the program

Scanner sc = new Scanner(new JavaEscapeReader(new FileReader("filename.txt")));
while (sc.hasNextLine())
    System.out.println(sc.nextLine());

prints

Line1
Line2
Line3
Line4

Another option is to use StringEscapeUtils.unescapeJava (from Apache Commons Lang, and later Apache Commons Text) to post-process the strings after reading them.
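
If adding a library is not an option, the same post-processing idea can be sketched by hand. This is not the Commons implementation: the class `SimpleUnescape` and its `unescape` method are hypothetical, they handle only the four escapes shown in JavaEscapeReader, and (as an assumption of this sketch) unknown escapes are kept unchanged instead of raising an error.

```java
// Hypothetical helper, a hand-written stand-in for a library unescape call.
class SimpleUnescape {

    // Translates \n, \t, \f and \\ written literally in the text into real characters.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Copy ordinary characters, and a trailing backslash with nothing after it.
            if (c != '\\' || i + 1 == s.length()) {
                out.append(c);
                continue;
            }
            char next = s.charAt(++i);
            switch (next) {
                case 'n':  out.append('\n'); break;
                case 't':  out.append('\t'); break;
                case 'f':  out.append('\f'); break;
                case '\\': out.append('\\'); break;
                // Unknown escape: keep both characters as they were.
                default:   out.append('\\').append(next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the three sentences on separate lines.
        System.out.print(unescape("Hello World!\\nFrustrating.\\n"));
    }
}
```

Each string returned by nextLine() can be passed through `unescape` before printing.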

#3


2  

You can use Scanner.useDelimiter to set your own delimiter. In your case, use a doubly escaped \\n, because the argument is a regular expression written inside a Java string literal:

s.useDelimiter("\\\\n");

Example:

Scanner s = new Scanner("Hello World!\\nI've been trying to get this to " +
                        "work for a while now.\\nFrustrating.\\n");
s.useDelimiter("\\\\n");

System.out.println(s.next());
System.out.println(s.next());
System.out.println(s.next());

Outputs:

Hello World!
I've been trying to get this to work for a while now.
Frustrating.
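
The four backslashes are needed because the delimiter is a regular expression. As a possible alternative, `Pattern.quote` can build the literal pattern so that only the Java string escaping remains. The sketch below assumes a hypothetical class `QuotedDelimiter` with a helper `split`, and loops with hasNext() instead of calling next() a fixed number of times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class QuotedDelimiter {

    // Splits the text on every literal backslash + 'n' pair.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner s = new Scanner(text)) {
            // Pattern.quote turns the two characters \ and n into a literal regex,
            // so no regex-level escaping is needed.
            s.useDelimiter(Pattern.quote("\\n"));
            while (s.hasNext()) {
                tokens.add(s.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [Hello World!, Frustrating.]
        System.out.println(split("Hello World!\\nFrustrating.\\n"));
    }
}
```

The hasNext() loop also avoids a NoSuchElementException when the input holds fewer tokens than expected.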
