How to use "." as the delimiter for String.split() in Java

Date: 2022-09-10 22:30:45

What I am trying to do is read a .java file, pick out all of the identifiers, and store them in a list. My problem is with the .split() method. If you run this code as is, you get an ArrayIndexOutOfBoundsException, but if you change the delimiter from "." to anything else, the code works. I need the lines split on ".", though, so is there another way to accomplish this?

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.logging.Level;
import java.util.logging.Logger;


public class MyHash {
    private static String[] reserved = new String[100];
    private static List list = new LinkedList();
    private static List list2 = new LinkedList();

    public static void main (String args[]){
        Hashtable hashtable  = new Hashtable(997);
        makeReserved();
        readFile();
        String line;
        ListIterator itr = list.listIterator();
        int listIndex = 0;
        while (listIndex < list.size()) {

            if (itr.hasNext()){
                line = itr.next().toString();
                //PROBLEM IS HERE!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                String[] words = line.split(".");  //CHANGE THIS AND IT WILL WORK
                System.out.println(words[0]);      //TESTING TO SEE IF IT WORKED
            }
            listIndex++;
        }
    }

    public static void readFile() {
        String text;
        String[] words;
        BufferedReader in = null;
        try {
            in = new BufferedReader(new FileReader("MyHash.java")); //NAME OF INPUT FILE


        } catch (FileNotFoundException ex) {
            Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE, null, ex);
        }
        try {
            while ((text = in.readLine()) != null){
                text = text.trim();
                words = text.split("\\s+");
                for (int i = 0; i < words.length; i++){
                    list.add(words[i]);
                }
                for (int j = 0; j < reserved.length; j++){
                    if (list.contains(reserved[j])){
                        list.remove(reserved[j]);
                    }
                }


            }

        } catch (IOException ex) {
            Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE, null, ex);
        }
        try {
            in.close();
        } catch (IOException ex) {
            Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    public static int keyIt (int x) {
        int key = x % 997;
        return key;
    }

    public static int horner (String word){
        int length = word.length();
        char[] letters = new char[length];

        for (int i = 0; i < length; i++){
            letters[i]=word.charAt(i);
        }

        char[] alphabet = new char[26];
        String abc = "abcdefghijklmnopqrstuvwxyz";

        for (int i = 0; i < 26; i++){
            alphabet[i]=abc.charAt(i);
        }

        int[] numbers = new int[length];
        int place = 0;
        for (int i = 0; i < length; i++){
            for (int j = 0; j < 26; j++){
                if (alphabet[j]==letters[i]){
                    numbers[place]=j+1;
                    place++;

                }
            }
        }

        int hornered = numbers[0] * 32;

        for (int i = 1; i < numbers.length; i++){

            hornered += numbers[i];
            if (i == numbers.length -1){
                return hornered;
            }
            hornered = hornered % 997;
            hornered *= 32;
        }
        return hornered;
    }

    public static String[] makeReserved (){
        reserved[0] = "abstract";
        reserved[1] = "assert";
        reserved[2] = "boolean";
        reserved[3] = "break";
        reserved[4] = "byte";
        reserved[5] = "case";
        reserved[6] = "catch";
        reserved[7] = "char";
        reserved[8] = "class";
        reserved[9] = "const";
        reserved[10] = "continue";
        reserved[11] = "default";
        reserved[12] = "do";
        reserved[13] = "double";
        reserved[14] = "else";
        reserved[15] = "enum";
        reserved[16] = "extends";
        reserved[17] = "false";
        reserved[18] = "final";
        reserved[19] = "finally";
        reserved[20] = "float";
        reserved[21] = "for";
        reserved[22] = "goto";
        reserved[23] = "if";
        reserved[24] = "implements";
        reserved[25] = "import";
        reserved[26] = "instanceof";
        reserved[27] = "int";
        reserved[28] = "interface";
        reserved[29] = "long";
        reserved[30] = "native";
        reserved[31] = "new";
        reserved[32] = "null";
        reserved[33] = "package";
        reserved[34] = "private";
        reserved[35] = "protected";
        reserved[36] = "public";
        reserved[37] = "return";
        reserved[38] = "short";
        reserved[39] = "static";
        reserved[40] = "strictfp";
        reserved[41] = "super";
        reserved[42] = "switch";
        reserved[43] = "synchronized";
        reserved[44] = "this";
        reserved[45] = "throw";
        reserved[46] = "throws";
        reserved[47] = "transient";
        reserved[48] = "true";
        reserved[49] = "try";
        reserved[50] = "void";
        reserved[51] = "volatile";
        reserved[52] = "while";
        reserved[53] = "=";
        reserved[54] = "==";
        reserved[55] = "!=";
        reserved[56] = "+";
        reserved[57] = "-";
        reserved[58] = "*";
        reserved[59] = "/";
        reserved[60] = "{";
        reserved[61] = "}";

        return reserved;
    }
}

8 answers

#1


155  

String.split takes a regex, and '.' has a special meaning in regexes: it matches any single character.

You (probably) want something like:

String[] words = line.split("\\.");

Some folks seem to be having trouble getting this to work, so here is some runnable code you can use to verify the correct behaviour.

import java.util.Arrays;

public class TestSplit {
  public static void main(String[] args) {
    String line = "aa.bb.cc.dd";
    String[] words = line.split("\\.");
    System.out.println(Arrays.toString(words));
    // Output is "[aa, bb, cc, dd]"
  }
}

#2


38  

When splitting on a literal string delimiter, the safest way is to use the Pattern.quote() method:

String[] words = line.split(Pattern.quote("."));

As described in other answers, splitting with "\\." is correct, but quote() does the escaping for you.
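As a minimal sketch of that idea, the helper below (splitLiteral is a hypothetical name, not part of the JDK) wraps the delimiter in Pattern.quote() so that any regex metacharacters in it are taken literally. The sample inputs are made up for illustration.

import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteSplitDemo {
    // Hypothetical helper: splits on the delimiter taken literally,
    // whatever regex metacharacters it happens to contain.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("java.util.List", "."))); // [java, util, List]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));          // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("1.*2.*3", ".*")));       // [1, 2, 3]
    }
}

This is mostly useful when the delimiter is not a constant, for example when it comes from user input or configuration.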

#3


5  

The argument to split is a regular expression. The period is a regular expression metacharacter that matches any character, so every character in the line is treated as a delimiter and thrown away. Only empty strings remain between them, and split discards trailing empty strings, so the result is an empty array. That is why words[0] fails.

If you escape the period (by putting an escaped backslash before it), you can match literal periods: line.split("\\.")
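The behaviour described above can be checked with a small sketch. The class and the countPieces helper are hypothetical names for illustration; the two-argument split(regex, limit) overload with a negative limit keeps the trailing empty strings that the one-argument form discards.

import java.util.Arrays;

public class DotSplitDemo {
    // Hypothetical helper: number of pieces split() returns for a given limit.
    // A limit of 0 behaves like the one-argument split(regex).
    static int countPieces(String line, String regex, int limit) {
        return line.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String line = "a.b";
        // Unescaped dot: all three characters are delimiters, every piece is
        // empty, and the trailing empty pieces are dropped.
        System.out.println(countPieces(line, ".", 0));            // 0
        // A negative limit keeps the empty pieces, so they become visible.
        System.out.println(countPieces(line, ".", -1));           // 4
        // Escaped dot: only the literal period is a delimiter.
        System.out.println(Arrays.toString(line.split("\\."))); // [a, b]
    }
}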

#4


4  

Have you tried escaping the dot? Like this:

String[] words = line.split("\\.");

#5


2  

The argument to split is a regular expression. "." matches any character, so every character in the line becomes a delimiter.

#6


2  

This is definitely not the best way to do it, but I got it done with something like the following.

String imageName = "my_image.png";
String replace = imageName.replace('.','~');
String[] split = replace.split("~");

System.out.println("Image name : " + split[0]);
System.out.println("Image extension : " + split[1]);

Output:

Image name : my_image
Image extension : png

#7


1  

If performance is an issue, consider using StringTokenizer instead of split. StringTokenizer does no regex matching and can be much faster than split, even though it is a "legacy" class (but not deprecated).
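For reference, a minimal sketch of the StringTokenizer approach follows. The class and the tokens helper are hypothetical names for illustration. Note two differences from split: the second constructor argument is a set of delimiter characters rather than a regex, and empty tokens between adjacent delimiters are skipped.

import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Hypothetical helper: collects the tokens of a line into a list.
    // Each character of 'delimiters' is treated as a separate delimiter.
    static List<String> tokens(String line, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, delimiters);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("System.out.println", ".")); // [System, out, println]
        // Adjacent delimiters produce no empty token.
        System.out.println(tokens("a..b", "."));               // [a, b]
    }
}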

#8


0  

You might be interested in the StringTokenizer class. However, the Java docs advise using the split method instead, as StringTokenizer is a legacy class.
