How to find a substring using a regular expression

Date: 2021-11-03 19:35:01
UDF2<String, String, Boolean> contains = new UDF2<String, String, Boolean>() {
    private static final long serialVersionUID = -5239951370238629896L;

    @Override
    public Boolean call(String t1, String t2) throws Exception {
        Pattern p1 = Pattern.compile(t1);
        Pattern p2 = Pattern.compile(t2);
        return p1.toString().contains(p2.toString());
    }
};
spark.udf().register("contains", contains, DataTypes.BooleanType);

The code above looks for a key inside another string and returns true if it is found, but it also returns true when t2 is only part of a word in t1.

Actual output:

t1: Hello world
t2: Hello
t2: wo
t2: rl
t2: Hello world

t1 matches all of these, but I only want it to match the keys Hello or world.

I tried this:

Pattern p1 = Pattern.compile("^"+t1+"$");
Pattern p2 = Pattern.compile("^"+t2+"$");
return  p1.toString().contains(p2.toString());

But this works only if t2 contains Hello world. I want it to return true if either Hello or world is present. Can you please help me write the regular expression?

2 solutions

#1



Your question isn't very clear, but basically you don't need a regular expression to check whether one string is a substring of another. You can just use:

boolean isSubstring = t1.contains(t2);

If t2 is indeed a regular expression rather than a plain string, you need to create a Pattern object from it (as you did), then create a Matcher on the string you wish to check, and then test it with the Matcher.find() method:

Pattern p = Pattern.compile(t2);
Matcher m = p.matcher(t1);
boolean isSubstring = m.find();
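
To match only whole keys such as Hello or world, and not fragments like wo or rl, one option is to wrap the key in word boundaries before calling Matcher.find(). The following is a minimal sketch, not the answerer's code: the class name WholeWordFind and the helper containsWholeWord are hypothetical, and it assumes the key is literal text (hence Pattern.quote) that begins and ends with a word character, since \b behaves differently next to punctuation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeWordFind {
    // Hypothetical helper: true when `key` occurs in `text` as a whole word.
    // Pattern.quote makes the key literal; \b anchors it at word boundaries.
    static boolean containsWholeWord(String text, String key) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(key) + "\\b");
        Matcher m = p.matcher(text);
        return m.find();
    }

    public static void main(String[] args) {
        String t1 = "Hello world";
        System.out.println(containsWholeWord(t1, "Hello")); // true
        System.out.println(containsWholeWord(t1, "world")); // true
        System.out.println(containsWholeWord(t1, "wo"));    // false
        System.out.println(containsWholeWord(t1, "rl"));    // false
    }
}
```

The same method body could be placed inside the call method of the UDF from the question. The match is case-sensitive unless Pattern.CASE_INSENSITIVE is passed to Pattern.compile.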

#2



You don't need to use a regex; you can just use the String::contains method. Here is a simple example:

String line = "Hellow My best world of java";
String str = "Hello world";
// Collapse runs of whitespace, then split into individual words
String[] spl = str.replaceAll("\\s+", " ").split(" ");
boolean check = true;
for (String s : spl) {
    // Stop at the first word that is missing from the line
    if (!line.contains(s)) {
        check = false;
        break;
    }
}
System.out.println(check ? "Contains all" : "Does not contain all");

The idea is:

  1. Split your words on spaces.

  2. Loop through the results.

  3. Check whether your string contains all of the results; if one is missing, break out of the loop and return false.
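
The steps above require every word to be present, and String.contains also accepts partial words (line.contains("Hello") is true for "Hellow"). Since the question asks for true when any one whole key is present, a variant might look like the sketch below. The class name AnyKeyMatch and the method containsAnyKey are hypothetical, and the sketch assumes that words are separated by whitespace and compared case-sensitively.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class AnyKeyMatch {
    // Hypothetical helper: true when at least one whitespace-separated word
    // of `keys` appears as a whole word in `line`.
    static boolean containsAnyKey(String line, String keys) {
        // Split the line into whole words so partial matches are excluded
        Set<String> words = new HashSet<>(Arrays.asList(line.trim().split("\\s+")));
        for (String key : keys.trim().split("\\s+")) {
            // Skip the empty token produced when `keys` is blank
            if (!key.isEmpty() && words.contains(key)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "world" is a whole word in the line; "Hello" is only part of "Hellow"
        System.out.println(containsAnyKey("Hellow My best world of java", "Hello world")); // true
        System.out.println(containsAnyKey("Hellow My best", "Hello world"));               // false
    }
}
```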
