Can I build a Perl regex from a set of hash keys?

Time: 2022-05-29 23:21:09

(related to previous question: Do I need to reset a Perl hash index?)

I have a hash coming in from a file which is defined as follows:

%project_keys = (
    cd     => "continuous_delivery",
    cm     => "customer_management",
    dem    => "demand",
    dis    => "dis",
    do     => "devops",
    sel    => "selection",
    seo    => "seo"
);

I need to check whether a review title has the correct format, and if so, link to a separate URL.

For instance, if a review title is

"cm1234 - Do some CM work"

then I want to link to the following URL:

http://projects/customer_management/setter/1234

Currently, I'm using the following (hard-coded) regex:

if ($title =~ /(cd|cm|dem|dis|do|sel|seo)(\d+)\s.*/) {
    my $url = 'http://projects/'.$project_keys{$1}.'/setter/'.$2
}

but obviously I'd like to build the regex from the hash keys themselves (the hash example above will change fairly frequently). I thought about simply naively concatenating the keys as follows:

# Build the regex
my $regex = '';
foreach my $key ( keys %project_keys ) {
    $regex += $key + '|';
}
$regex = substr($regex, 0, -1); # Chop off the last pipe
$regex = '('.$regex.')(\d+)\s.*';
if ($title =~ /$regex/) {
    my $url = 'http://projects/'.$project_keys{$1}.'/setter/'.$2
}

but a) it's not working as I would wish, and b) I assume there's a much better Perl way to do this. Or is there?

1 solution

#1


6  

Your main problem comes from trying to use + to join strings. In Perl, + is numeric addition; the string concatenation operator is . (a single dot). That said, a loop that builds a string by concatenation can often be written more simply with join.
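
As a rough illustration of the difference, the sketch below uses a shortened copy of the question's hash and sorts the keys only so the output is predictable; both are choices made for this example. It shows what + does to strings, the original loop repaired with .=, and the join equivalent.

```perl
use strict;
use warnings;

# Shortened sample of the question's hash (assumption: three keys are enough here).
my %project_keys = (
    cd => "continuous_delivery",
    cm => "customer_management",
    do => "devops",
);

# '+' is numeric addition: non-numeric strings count as 0.
my $sum;
{
    no warnings 'numeric';    # silence "Argument isn't numeric" for this demo
    $sum = 'cd' + '|';
}

# '.' concatenates and '.=' appends in place.
my $looped = '';
foreach my $key (sort keys %project_keys) {
    $looped .= $key . '|';
}
$looped = substr($looped, 0, -1);    # drop the trailing pipe

# join builds the same string with no trailing pipe to clean up.
my $joined = join '|', sort keys %project_keys;

print "$sum\n";       # 0
print "$looped\n";    # cd|cm|do
print "$joined\n";    # cd|cm|do
```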

I would suggest:

my $project_match = join '|', map quotemeta, keys %project_keys;

if ($title =~ /($project_match)(\d+)\s/) {
   my $url = 'http://projects/'.$project_keys{$1}.'/setter/'.$2;
   # Something with $url
}

quotemeta is a function that escapes any regex metacharacters that occur in a string. There aren't any in your example, but it's good practice to use it always and avoid unexpected bugs.
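
To see why the escaping matters, here is a small sketch with a made-up key, 'c.d', that contains a regex metacharacter. That key and the sample title are assumptions for illustration; neither appears in the original hash.

```perl
use strict;
use warnings;

# Hypothetical key containing '.', which matches any character in a regex.
my %project_keys = ('c.d' => "continuous_delivery");

my $raw    = join '|', keys %project_keys;                   # c.d
my $quoted = join '|', map quotemeta, keys %project_keys;    # c\.d

# This title does not start with the literal key 'c.d'.
my $title = "cxd1234 - Not a real project prefix";

my $raw_hit    = $title =~ /($raw)(\d+)\s/    ? 1 : 0;
my $quoted_hit = $title =~ /($quoted)(\d+)\s/ ? 1 : 0;

print "unescaped matched: $raw_hit\n";       # 1 (false positive)
print "escaped matched:   $quoted_hit\n";    # 0
```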

I left out the trailing .* in your pattern, because there's no need to say "and then some stuff, or maybe no stuff" if you don't actually do anything with the stuff. The pattern doesn't need to match the entire string, unless you anchor it to the beginning and end of the string.
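
If the project key should only be accepted at the very start of the title, anchoring is one way to express that. The sketch below is an assumption about what you might want, not part of the original code: url_for_title is a hypothetical helper name, and the hash is shortened to two keys.

```perl
use strict;
use warnings;

my %project_keys = (
    cd => "continuous_delivery",
    cm => "customer_management",
);
my $project_match = join '|', map quotemeta, keys %project_keys;

# Hypothetical helper: returns the URL for a title, or undef if the title
# does not start with a known key followed by digits and whitespace.
sub url_for_title {
    my ($title) = @_;
    # '^' pins the key to the start of the title; no trailing '.*' is needed.
    return unless $title =~ /^($project_match)(\d+)\s/;
    return 'http://projects/' . $project_keys{$1} . '/setter/' . $2;
}

my $hit  = url_for_title("cm1234 - Do some CM work");
my $miss = url_for_title("abcm1234 - stray prefix");

print "$hit\n";    # http://projects/customer_management/setter/1234
print defined $miss ? "matched\n" : "no match\n";    # no match
```

Without the ^, the second title would match as well, because "cm1234 " appears inside it.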
