Illegal character in a regular expression in Java

I've been searching the Internet and, after a big headache, cannot find out why this regular expression is wrong:

"\"\w*&&[\p{Punct}]\"["+sepChar+"]\"\w*&&[\p{Punct}]\""

I'm trying to read a master data file with the following pattern (quotes included):

"TEXTVALUE":"TEXTVALUE":"TEXTVALUE"

and split each line with the regular expression above.

So, for example:

"Hello:John":"Hello:World":"Hello:Mark"

will be split into:

{"Hello:John", "Hello:World", "Hello:Mark"}

2 solutions

#1

The backslash is the escape character in Java string literals, so you need two backslashes (\\) to put a single backslash into the regex.
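
As a quick illustration of that rule, here is a minimal sketch. The class name EscapeDemo, the constant WORD, and the sample inputs are made up for this example. It only shows that the Java literal "\\w+" reaches the regex engine as \w+; it says nothing about whether the full pattern from the question fits the file format.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // The Java literal "\\w+" holds three characters: backslash, w, plus
    static final String WORD = "\\w+";

    static boolean isWord(String s) {
        // Pattern.matches requires the whole input to match the regex
        return Pattern.matches(WORD, s);
    }

    public static void main(String[] args) {
        System.out.println(WORD.length());        // prints 3
        System.out.println(isWord("TEXTVALUE"));  // prints true
        System.out.println(isWord("TEXT VALUE")); // prints false (space is not a word character)
    }
}
```

Writing "\w+" with a single backslash does not compile, because \w is not a valid escape sequence in a Java string literal.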

Try:

"\"\\w*&&[\\p{Punct}]\"["+sepChar+"]\"\\w*&&[\\p{Punct}]\""
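
For comparison, the split described in the question can also be sketched with a different technique: lookarounds instead of the pattern above. This is only one possible approach, not the answerer's method. The class QuotedSplitDemo and the helper splitQuoted are hypothetical names, and the sketch assumes the separator always sits directly between two double quotes.

```java
import java.util.regex.Pattern;

class QuotedSplitDemo {
    // Hypothetical helper: splits on sepChar only where it has a double quote
    // immediately before it and immediately after it.
    static String[] splitQuoted(String line, String sepChar) {
        // Pattern.quote keeps the separator literal even if it is a regex metacharacter
        String regex = "(?<=\")" + Pattern.quote(sepChar) + "(?=\")";
        return line.split(regex);
    }

    public static void main(String[] args) {
        String line = "\"Hello:John\":\"Hello:World\":\"Hello:Mark\"";
        for (String part : splitQuoted(line, ":")) {
            System.out.println(part); // each part keeps its surrounding quotes
        }
    }
}
```

Because the lookarounds are zero-width, the quotes stay attached to each value, and the colons inside the values are left alone.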

#2

Ok.

Thanks to @kevin-bowersox for the help.

It seems that Oracle has done a great job improving Java with version 7. With this code:

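// needs java.io.File, java.io.FileReader and java.io.BufferedReader
File file = new File(someFile);
// BufferedReader wraps a Reader, so the File has to go through a FileReader
BufferedReader br = new BufferedReader(new FileReader(file));
String line = null;
while ((line = br.readLine()) != null) {
  // todo
}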

If your file has been formatted with a constant pattern, for example:

"TEXTVALUE":"TEXTVALUE":"TEXTVALUE"

each line is read as:

"TEXTVALUE-->TEXTVALUE-->TEXTVALUE"

where '-->' stands for a tab character ('\t').

So, at the end, my solution is:

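// needs java.io.BufferedReader, java.io.FileReader, java.io.IOException and java.util.ArrayList
public ArrayList<String[]> getSplittedTextFromFile(String filePath) throws IOException {
  ArrayList<String[]> ret = null;
  if (!filePath.isEmpty()) {
    // try-with-resources (Java 7+) closes the reader when the block exits
    try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
      String line = null;
      while ((line = br.readLine()) != null) {
        // split the line on tab characters
        String[] aSplit = line.split("\\t");
        if (ret == null)
          ret = new ArrayList<>();
        ret.add(aSplit);
      }
    }
  }
  // stays null when the path is empty or the file has no lines
  return ret;
}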
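
The same tab-splitting idea can be tried without a file on disk. The sketch below is an assumption-laden stand-in, not the code above: the class TabSplitDemo, the helper splitTabbedLines, and the sample data are invented for illustration, and the input comes from an in-memory string instead of the master data file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TabSplitDemo {
    // Hypothetical helper: reads the text line by line and splits each line on tabs.
    static List<String[]> splitTabbedLines(String text) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                // "\\t" is the regex \t, which matches a single tab character
                rows.add(line.split("\\t"));
            }
        } catch (IOException e) {
            // rethrown unchecked so callers of this demo need no throws clause
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample content: two lines, three tab-separated values each
        String data = "A1\tB1\tC1\nA2\tB2\tC2\n";
        for (String[] row : splitTabbedLines(data)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Unlike the method above, this version returns an empty list instead of null when there is nothing to read, which spares callers a null check.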
