Regex for [a-zA-Z0-9\-] that allows dashes in the middle but not at the start or end

Time: 2020-12-18 21:16:55

Update:

This question was an epic failure, but here's the working solution. It's based on Gumbo's answer (Gumbo's was close to working so I chose it as the accepted answer):

Solution:

r'(?=[a-zA-Z0-9\-]{4,25}$)^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$'
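
As a quick sanity check, here is a minimal sketch that runs the solution pattern against the allowed and not allowed values listed in the question. The helper name is_valid is only an illustration and not part of the original post.

```python
import re

# Pattern copied verbatim from the solution above.
PATTERN = re.compile(r'(?=[a-zA-Z0-9\-]{4,25}$)^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$')

def is_valid(value):
    """Return True if value fits the pattern (hypothetical helper name)."""
    return PATTERN.match(value) is not None

# Sample values taken from the question.
allowed = ["spam123-spam-eggs-eggs1", "spam123-eggs123", "spam", "1234", "eggs123"]
not_allowed = ["eggs1-", "-spam123", "spam--spam"]

for value in allowed + not_allowed:
    print(value, is_valid(value))
```

One thing to keep in mind: in Python, $ also matches just before a trailing newline, so "spam\n" passes this check. If the input can end with a newline, replacing $ with \Z is one way to be stricter.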

Original Question (albeit, after 3 edits)

I'm using Python and I'm not trying to extract the value, but rather test to make sure it fits the pattern.

Allowed values:

spam123-spam-eggs-eggs1
spam123-eggs123
spam
1234
eggs123

Not allowed values:

eggs1-
-spam123
spam--spam

I just can't have a dash at the start or the end. There is a question on here that works in the opposite direction by extracting the string value after the fact, but I simply need to test the value so that I can disallow it. The value must also be at least 4 and at most 25 characters long, and no two dashes may be adjacent.

Here's what I've come up with after some experimentation with lookbehind, etc:

# Nothing here

4 solutions

#1


15  

Try this regular expression:

^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$

This regular expression only allows hyphens to separate sequences of one or more characters from [a-zA-Z0-9].


Edit: Following up on your comment: the expression (…)* allows the part inside the group to be repeated zero or more times. That means

a(bc)*

is the same as

a|abc|abcbc|abcbcbc|abcbcbcbc|…
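
To see the repetition in action, here is a small sketch in Python (the language used in the question). The sample strings are made up for illustration.

```python
import re

# a(bc)* : an "a" followed by zero or more repetitions of "bc".
pattern = re.compile(r"a(bc)*")

matching = ["a", "abc", "abcbc", "abcbcbc"]
non_matching = ["ab", "abcb", "bc", ""]

for text in matching + non_matching:
    # fullmatch requires the whole string to fit the pattern.
    print(repr(text), pattern.fullmatch(text) is not None)
```

The group repeats only as a whole, so a partial repetition such as "ab" or "abcb" is rejected.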

Edit: Now that you have changed the requirements: since you probably don't want to restrict the length of each hyphen-separated part, you will need a look-ahead assertion to take the overall length into account:

(?=[a-zA-Z0-9-]{4,25}$)^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$
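
The sketch below compares the expression with and without the look-ahead, to show that the look-ahead only adds the overall length check (4 to 25 characters) and leaves the hyphen rules untouched. The variable names and sample strings are illustrative assumptions.

```python
import re

# Same structure check, first without and then with the length look-ahead.
without_length = re.compile(r"^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$")
with_length = re.compile(r"(?=[a-zA-Z0-9-]{4,25}$)^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$")

samples = ["abc", "a-b", "a-bc", "spam-eggs", "a" * 26]

for text in samples:
    # Columns: input, structure only, structure plus length.
    print(text, bool(without_length.match(text)), bool(with_length.match(text)))
```

The look-ahead consumes no characters, so after it succeeds the rest of the expression still starts matching at the beginning of the string.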

#2


4  

The current regex is simple and fairly readable. Rather than making it long and complicated, have you considered applying the other constraints with normal Python string processing tools?

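import re

def fits_pattern(string):
    # Length and dash placement are checked with plain string methods.
    if (4 <= len(string) <= 25 and
        "--" not in string and
        not string.startswith("-") and
        not string.endswith("-")):

        # fullmatch with + checks every character; re.match on a bare
        # character class would only test the first character.
        return re.fullmatch(r"[a-zA-Z0-9\-]+", string)
    else:
        return None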

#3


2  

It should be something like this:

^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$

You are telling it to look for only one character, either a-z, A-Z, 0-9 or -; that is what [] does.

So if you write [abc], you will match only "a", "b" or "c", not "abc".
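
A short Python sketch of that point, with made-up variable names: a bare character class stands for exactly one character, and it needs a quantifier such as + to cover a longer string.

```python
import re

single = re.compile(r"[abc]")     # exactly one of a, b, c
repeated = re.compile(r"[abc]+")  # one or more of a, b, c

print(single.fullmatch("a"))      # a match object
print(single.fullmatch("abc"))    # None: the class covers one character only
print(repeated.fullmatch("abc"))  # a match object

# re.match only anchors at the start, so a bare class accepts any string
# whose first character is in the class, whatever follows.
print(single.match("a-!"))        # a match object for "a"
```

This is likely why a pattern consisting only of [a-zA-Z0-9\-] seems to accept almost anything when used with re.match.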

Have fun.

#4


0  

If you simply don't want a dash at the beginning or end, try ^[^-].*?[^-]$
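
For comparison, here is a rough sketch of what this simpler pattern accepts. It only guards the first and last character, so it does not cover the later requirements (allowed character set, no double dashes, length limits). The sample strings are illustrative.

```python
import re

# Only the first and last character are constrained to be non-dashes.
pattern = re.compile(r"^[^-].*?[^-]$")

for text in ["spam-eggs", "-spam", "spam-", "spam--spam", "sp am!", "a"]:
    print(repr(text), pattern.match(text) is not None)
```

"spam--spam" and "sp am!" both pass because the middle of the string is unconstrained, while a single character such as "a" fails because the pattern needs at least two characters.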

Edit: Bah, you keep changing it.
