Hi, I need to write a regular expression in Java that finds all instances of attributes like
wsp:rsidP="005816D6" wsp:rsidR="005816D6" wsp:rsidRDefault="005816D6"
in an XML string and strips them out.
So I need to remove every attribute that starts with wsp:rsid and ends with a double quote (").
My thoughts so far:
String str = xmlstring.replaceAll("wsp:rsid/w", "");
String str = xmlstring.replaceAll("wsp:rsid[]\\"", "");
4 solutions
#1
2
xmlstring.replaceAll( "wsp:rsid\\w*?=\".*?\"", "" );
This works in my tests...
public void testReplaceAll() throws Exception {
    String regex = "wsp:rsid\\w*?=\".*?\"";
    assertEquals( "", "wsp:rsidP=\"005816D6\"".replaceAll( regex, "" ) );
    assertEquals( "", "wsp:rsidR=\"005816D6\"".replaceAll( regex, "" ) );
    assertEquals( "", "wsp:rsidRDefault=\"005816D6\"".replaceAll( regex, "" ) );
    assertEquals( "a=\"1\" >", "a=\"1\" wsp:rsidP=\"005816D6\">".replaceAll( regex, "" ) );
    // Only the attributes are removed; the spaces around them stay behind.
    assertEquals(
        "bob   kuhar",
        "bob wsp:rsidP=\"005816D6\" wsp:rsidRDefault=\"005816D6\" kuhar".replaceAll( regex, "" ) );
    assertEquals(
        " keepme=\"yes\" ",
        "wsp:rsidP=\"005816D6\" keepme=\"yes\" wsp:rsidR=\"005816D6\"".replaceAll( regex, "" ) );
    assertEquals(
        "<node a=\"l\"  b=\"m\"  c=\"r\">",
        "<node a=\"l\" wsp:rsidP=\"0\" b=\"m\" wsp:rsidR=\"0\" c=\"r\">".replaceAll( regex, "" ) );
    // Sadly doesn't handle the embedded \" case...
    // assertEquals( "", "wsp:rsidR=\"hello\\\"world\"".replaceAll( regex, "" ) );
}
#2
1
Try:
xmlstring.replaceAll("\\bwsp:rsid\\w*=\"[^\"]+(\\\\\"[^\"]*)*\"", "");
Also, your regexes are wrong. I suggest you go and plough through http://regular-expressions.info ;)
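To make the difference between the first two answers concrete, here is a minimal sketch that runs both patterns against a value containing a backslash-escaped quote. The class name EscapedQuoteDemo and the constant names are made up for illustration. Note that well-formed XML escapes quotes as &quot; rather than with a backslash, so this case may never occur in real input, and that the `[^\"]+` in the second pattern requires a non-empty value.

```java
public class EscapedQuoteDemo {
    // Pattern from answer #1: the lazy .*? stops at the first double quote it sees.
    static final String LAZY = "wsp:rsid\\w*?=\".*?\"";
    // Pattern from answer #2: allows backslash-escaped quotes inside the value.
    static final String ESCAPED = "\\bwsp:rsid\\w*=\"[^\"]+(\\\\\"[^\"]*)*\"";

    public static void main(String[] args) {
        // The raw text is: wsp:rsidR="hello\"world"
        String input = "wsp:rsidR=\"hello\\\"world\"";
        System.out.println("[" + input.replaceAll(LAZY, "") + "]");    // [world"]
        System.out.println("[" + input.replaceAll(ESCAPED, "") + "]"); // []

        // An empty value is left untouched by the second pattern because of [^"]+.
        String empty = "wsp:rsidR=\"\"";
        System.out.println("[" + empty.replaceAll(ESCAPED, "") + "]"); // [wsp:rsidR=""]
    }
}
```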
#3
0
Here are two functions: clean does the replacement, and extract pulls out the data (in case you want it; I wasn't sure).
Please excuse the style; I wanted you to be able to cut and paste the functions.
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Answer {
    public static HashMap<String, String> extract(String s){
        Pattern pattern = Pattern.compile("wsp:rsid(.+?)=\"(.+?)\"");
        Matcher matcher = pattern.matcher(s);
        HashMap<String, String> hm = new HashMap<String, String>();
        //The first group is the string between the wsp:rsid and the =
        //The second is the value
        while (matcher.find()){
            hm.put(matcher.group(1), matcher.group(2));
        }
        return hm;
    }

    public static String clean(String s){
        Pattern pattern = Pattern.compile("wsp:rsid(.+?)=\"(.+?)\"");
        Matcher matcher = pattern.matcher(s);
        return matcher.replaceAll("");
    }

    public static void main(String[] args) {
        System.out.print(clean("sadfasdfchri wsp:rsidP=\"005816D6\" foo=\"bar\" wsp:rsidR=\"005816D6\" wsp:rsidRDefault=\"005816D6\""));
        HashMap<String, String> m = extract("sadfasdfchri wsp:rsidP=\"005816D6\" foo=\"bar\" wsp:rsidR=\"005816D6\" wsp:rsidRDefault=\"005816D6\"");
        System.out.println("");
        //ripped off of http://*.com/questions/1066589/java-iterate-through-hashmap
        for (String key : m.keySet()) {
            System.out.println("Key: " + key + ", Value: " + m.get(key));
        }
    }
}
returns:
sadfasdfchri  foo="bar"
Key: RDefault, Value: 005816D6
Key: P, Value: 005816D6
Key: R, Value: 005816D6
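A possible variation on extract, offered as a sketch rather than as a correction: restricting the name to `\\w*` and the value to `[^\"]*` keeps a match from running past the attribute it started in, and a LinkedHashMap returns the keys in the order they appear in the input. The class name RsidExtractor is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RsidExtractor {
    // Group 1 is the suffix after "wsp:rsid" (P, R, RDefault, ...); group 2 is the value.
    private static final Pattern RSID_ATTR =
            Pattern.compile("\\bwsp:rsid(\\w*)=\"([^\"]*)\"");

    public static Map<String, String> extract(String s) {
        // LinkedHashMap keeps insertion order, so keys come back in document order.
        Map<String, String> found = new LinkedHashMap<>();
        Matcher matcher = RSID_ATTR.matcher(s);
        while (matcher.find()) {
            found.put(matcher.group(1), matcher.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "sadfasdfchri wsp:rsidP=\"005816D6\" foo=\"bar\" "
                + "wsp:rsidR=\"005816D6\" wsp:rsidRDefault=\"005816D6\"";
        // Iterating over entrySet avoids a second lookup per key.
        for (Map.Entry<String, String> e : extract(input).entrySet()) {
            System.out.println("Key: " + e.getKey() + ", Value: " + e.getValue());
        }
    }
}
```

With the sample input above, the keys are printed in the order P, R, RDefault.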
#4
0
Unlike all other answers, this answer actually works!
xmlstring.replaceAll("\\bwsp:rsid\\w*?=\"[^\"]*\"", "");
Here's a test that fails with all other answers:
public static void main(String[] args) {
    String xmlstring = "<tag wsp:rsidR=\"005816D6\" foo=\"bar\" wsp:rsidRDefault=\"005816D6\">hello</tag>";
    System.out.println(xmlstring);
    System.out.println(xmlstring.replaceAll("\\bwsp:rsid\\w*?=\"[^\"]*\"", ""));
}
Output:
<tag wsp:rsidR="005816D6" foo="bar" wsp:rsidRDefault="005816D6">hello</tag>
<tag  foo="bar" >hello</tag>
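The pattern above removes only the attribute itself, so the whitespace that separated it from its neighbours stays in the tag. As a sketch of one way to tidy that up, the variant below also consumes the whitespace in front of each attribute. The class name RsidStripper and the method strip are hypothetical, and the sketch assumes the attributes are always double-quoted and preceded by whitespace (XML also permits single-quoted values, which this does not handle).

```java
import java.util.regex.Pattern;

public class RsidStripper {
    // \s+ consumes the whitespace before the attribute, so no gap is left behind.
    // [^"]* matches the value up to the closing double quote.
    private static final Pattern RSID_ATTR =
            Pattern.compile("\\s+wsp:rsid\\w*=\"[^\"]*\"");

    public static String strip(String xml) {
        return RSID_ATTR.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String xml = "<tag wsp:rsidR=\"005816D6\" foo=\"bar\" wsp:rsidRDefault=\"005816D6\">hello</tag>";
        System.out.println(strip(xml)); // <tag foo="bar">hello</tag>
    }
}
```

Compiling the pattern once into a static field also avoids recompiling it on every call, which String.replaceAll does internally.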