Using a regular expression to determine whether comma-separated values are valid

Date: 2021-03-10 12:49:36

I have a string of comma-separated alphanumeric values. A value is considered valid if it is exactly 2 characters long and contains at least 1 alphabetical character. If all the values are valid, I would like to "capture" the entire string, including the commas. If a value is missing (back-to-back commas), the entire string is invalid. I can only use a regex for this. Whitespace is ignored, and the programming language is Java.

Examples

  • "3F, 4B, AA, A4B" - not captured because 'A4B' is length 3
  • "3F, 4B, 55, A4" - not captured because '55' does not have at least 1 alphabetical character
  • "3F, 4B,," - not captured because a value is missing between the 2nd and 3rd commas
  • "3F, 4B, AA, A" - not captured because 'A' is length 1
  • "3F, 4B, AA," - captured (trailing comma allowed)

4 solutions

#1


2  

I would just brute-force this one using the following expression:

((\d[A-Z]|[A-Z]{2}|[A-Z]\d),\s)*(\d[A-Z]|[A-Z]{2}|[A-Z]\d),?$

Here's a breakdown:

In your case each value is 2 characters long, and there are 3 specific cases in which it is valid:

\d[A-Z]|[A-Z]{2}|[A-Z]\d
    - \d[A-Z] - a digit followed by an uppercase A-Z character
    - [A-Z]{2} - 2 uppercase A-Z characters
    - [A-Z]\d - an uppercase A-Z character followed by a digit
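
As a quick check of those three cases, here is a minimal sketch that tests the single-value pattern on its own. The class name `ValueCases` and the helper `isValidValue` are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Pattern;

class ValueCases {
    // The three alternatives from the breakdown: digit+letter, letter+letter, letter+digit.
    static final Pattern VALUE = Pattern.compile("\\d[A-Z]|[A-Z]{2}|[A-Z]\\d");

    // Hypothetical helper: true only if the whole token is exactly one valid value.
    static boolean isValidValue(String token) {
        return VALUE.matcher(token).matches();
    }

    public static void main(String[] args) {
        // Three valid tokens followed by three invalid ones.
        for (String token : new String[] {"3F", "AA", "A4", "55", "A", "A4B"}) {
            System.out.println(token + " -> " + isValidValue(token));
        }
    }
}
```

The first three tokens should print `true` and the last three `false`, since `55` has no letter and `A` and `A4B` are the wrong length.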

Then, using that as a base, I made an expression that says this set of cases must be followed by a comma and a whitespace character, 0 or more times:

(                               - start group
    (\d[A-Z]|[A-Z]{2}|[A-Z]\d)  - group as explained above
    ,\s                         - followed by comma and space
)*                              - entire group 0 or more times

Then I followed that with the same expression, plus a few additional modifiers:

(                               - start group
     \d[A-Z]|[A-Z]{2}|[A-Z]\d   - group as explained above
)                               - end group
,?                              - 0 or 1 trailing comma
$                               - match end of input
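
To make the composition explicit, the two halves can be assembled from a single value sub-pattern. This is only a sketch of the same expression; the names `BuildExpression`, `VALUE` and `EXPRESSION` are hypothetical and not part of the answer.

```java
class BuildExpression {
    // One valid value, as described in the breakdown.
    static final String VALUE = "(\\d[A-Z]|[A-Z]{2}|[A-Z]\\d)";

    // Zero or more "value, comma, whitespace" groups, then a final value,
    // an optional trailing comma, and the end of the input.
    static final String EXPRESSION = "(" + VALUE + ",\\s)*" + VALUE + ",?$";

    public static void main(String[] args) {
        // Prints the assembled regex so it can be compared with the one in the answer.
        System.out.println(EXPRESSION);
    }
}
```

Building it this way keeps the value rule in one place, so changing it (for example to allow lowercase letters) does not require editing both halves.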

There's probably a more elegant way of writing this expression, but this way seems pretty straightforward. Here are some Java examples of it in use.

String expression = "((\\d[A-Z]|[A-Z]{2}|[A-Z]\\d),\\s)*(\\d[A-Z]|[A-Z]{2}|[A-Z]\\d),?$";

System.out.println("3F, 4B, AA, A4B".matches(expression)); // false
System.out.println("3F, 4B, 55, A4".matches(expression)); // false
System.out.println("3F, 4B, 5A, A4".matches(expression)); // true
System.out.println("3F, 4B,,".matches(expression)); // false
System.out.println("3F, 4B, AA, A".matches(expression)); // false
System.out.println("3F, 4B, AA,".matches(expression)); // true
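
Since the question asks to "capture" the whole string, here is one possible way to do that with `Pattern` and `Matcher`, assuming the same expression. The class `CaptureWhole` and the helper `capture` are hypothetical names used only for this sketch.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureWhole {
    static final Pattern LIST = Pattern.compile(
            "((\\d[A-Z]|[A-Z]{2}|[A-Z]\\d),\\s)*(\\d[A-Z]|[A-Z]{2}|[A-Z]\\d),?$");

    // Hypothetical helper: returns the whole input if it is valid, otherwise empty.
    static Optional<String> capture(String input) {
        Matcher m = LIST.matcher(input);
        // matches() only succeeds if the pattern covers the entire input,
        // so group() (group 0) is the full string, commas included.
        return m.matches() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(capture("3F, 4B, AA,")); // Optional[3F, 4B, AA,]
        System.out.println(capture("3F, 4B,,"));    // Optional.empty
    }
}
```

Note that this expression expects exactly one whitespace character after each comma and only uppercase letters, so an input such as "3F,4B" is rejected even though the question says whitespace is ignored.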

There are a lot of really good websites that let you test a regex in your browser and get feedback immediately. This is a great way to build and test a regex, and many of them even include an explanation panel that describes the expression you wrote.

Although many of these websites do not offer a Java environment for evaluating expressions, most languages have the same, or very nearly the same, specification for regular expressions. To build this expression I tested it in JavaScript, then ran it in Java to make sure it worked. Here's a link to the saved expression so you can test it yourself: https://regex101.com/r/uP4oY2/1
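
One difference worth keeping in mind when moving a pattern from a browser tester to Java: `String.matches` requires the whole input to match, whereas a tester that searches for a match anywhere in the text (which, as far as I know, is how most of them behave by default) is closer to `Matcher.find`. Because the expression above has no leading `^`, the two can disagree. The sketch below illustrates this; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    static final Pattern LIST = Pattern.compile(
            "((\\d[A-Z]|[A-Z]{2}|[A-Z]\\d),\\s)*(\\d[A-Z]|[A-Z]{2}|[A-Z]\\d),?$");

    // Full-input match, the behaviour of String.matches in the examples.
    static boolean matchesWhole(String input) {
        return LIST.matcher(input).matches();
    }

    // Substring search: succeeds if any part of the input matches.
    static boolean findsAnywhere(String input) {
        return LIST.matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "3F, 4B, 55, A4";
        System.out.println(matchesWhole(input));  // false: "55" is not a valid value
        System.out.println(findsAnywhere(input)); // true: the trailing "A4" matches on its own
    }
}
```

If the pattern is going to be used with `find`, or tested in a tool that searches, adding `^` at the start should bring the two behaviours in line.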

#2


2  

First, you could simplify the valid format to [Alpha+Digit][Alpha] OR [Alpha][Alpha+Digit]:

String regex = "[a-zA-Z][a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z]";

Then you want to allow any amount of whitespace around it:

String regex = "\\s*([a-zA-Z][a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z])\\s*";

And you want it to be followed by a comma, unless it's the end of the string:

String regex = "\\s*([a-zA-Z][a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z])\\s*(,|$)";

And this pattern can repeat any number of times (one or more):

String regex = "(\\s*([a-zA-Z][a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z])\\s*(,|$))+";
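
To see how this final pattern behaves, here is a small sketch that runs it against the examples from the question. The class `WhitespaceTolerant` and the method `isValid` are hypothetical names; the pattern is the one built up in the steps above.

```java
import java.util.regex.Pattern;

class WhitespaceTolerant {
    static final Pattern LIST = Pattern.compile(
            "(\\s*([a-zA-Z][a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z])\\s*(,|$))+");

    // Hypothetical helper: true if the whole input is a valid list.
    static boolean isValid(String input) {
        return LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("3F, 4B, AA, A4B")); // false
        System.out.println(isValid("3F, 4B, 55, A4"));  // false
        System.out.println(isValid("3F, 4B,,"));        // false
        System.out.println(isValid("3F, 4B, AA, A"));   // false
        System.out.println(isValid("3F, 4B, AA,"));     // true
        System.out.println(isValid("3f , 4b"));         // true
    }
}
```

Unlike a pattern that requires exactly one space after each comma, this one accepts any amount of whitespace (including none) around each value, and it accepts lowercase letters as well.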

#3


1  

OK, so the idea is to have three groups connected with "or":

(Alpha Digit or Digit Alpha or Alpha Alpha)

Then we allow whitespace on either side:

whitespace zero or more (Alpha Digit or Digit Alpha or Alpha Alpha) whitespace zero or more

And last, we repeat this 4 times, with commas in between.
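
The description above does not include an actual expression, so here is one possible reading of it as a sketch. Everything in it is an assumption on my part: the names `ThreeGroups`, `VALUE`, `PADDED` and `isValid`, the choice to accept both upper- and lowercase letters, the optional trailing comma (taken from the question), and the use of `*` for any number of values instead of exactly four, because the question's captured example has only three values.

```java
import java.util.regex.Pattern;

class ThreeGroups {
    // Alpha Digit or Digit Alpha or Alpha Alpha (assumed: either letter case).
    static final String VALUE = "([A-Za-z]\\d|\\d[A-Za-z]|[A-Za-z]{2})";

    // Zero or more whitespace characters on both sides of a value.
    static final String PADDED = "\\s*" + VALUE + "\\s*";

    // Assumption: any number of comma-separated values, then an optional trailing comma.
    static final Pattern LIST = Pattern.compile(PADDED + "(," + PADDED + ")*,?");

    // Hypothetical helper: true if the whole input is a valid list.
    static boolean isValid(String input) {
        return LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("3F, 4B, AA,"));     // true
        System.out.println(isValid("3F, 4B,,"));        // false
        System.out.println(isValid("3F, 4B, AA, A4B")); // false
    }
}
```

To require exactly four values, as the answer describes, the `*` could be replaced with `{3}` (one leading value plus three comma-prefixed ones).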

#4


1  

You could try the following regex:

^((\s+)??(\d[a-z]|[a-z]\d|[a-z]{2}),?)+?$

This regex can be used in Java as follows:

boolean foundMatch = text.matches("(?ismd)^((\\s+)??(\\d[a-z]|[a-z]\\d|[a-z]{2}),?)+?$");

Test cases:

3F, 4B, AA, C5              // true
3F, 4B, AA, C5,             // true
3F, 4B, AA, C5,,            // false
3F, 4B, A, C5               // false
3F, 4B, AA, C5, 45, A4B     // false
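
The test cases above can be reproduced with a short harness like the one below. The class `LazyPattern` and the method `isValid` are hypothetical; the regex string is the one from this answer. The last input is an extra case I added, not one of the answer's tests.

```java
class LazyPattern {
    static final String REGEX = "(?ismd)^((\\s+)??(\\d[a-z]|[a-z]\\d|[a-z]{2}),?)+?$";

    // Hypothetical helper wrapping String.matches with the answer's regex.
    static boolean isValid(String text) {
        return text.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(isValid("3F, 4B, AA, C5"));          // true
        System.out.println(isValid("3F, 4B, AA, C5,"));         // true
        System.out.println(isValid("3F, 4B, AA, C5,,"));        // false
        System.out.println(isValid("3F, 4B, A, C5"));           // false
        System.out.println(isValid("3F, 4B, AA, C5, 45, A4B")); // false
        System.out.println(isValid("3F4B"));                    // true (extra case)
    }
}
```

One thing this check surfaces: because the comma is optional inside the repeated group, two values written with nothing between them, such as "3F4B", also pass. That may not be what the question intends, since it would treat a 4-character token as two valid values.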
