Given a URL, the following regular expression is able to insert/substitute words at certain points in the URL.
Code:
#!/usr/bin/perl
use strict;
use warnings;
#use diagnostics;
my @insert_words = qw/HELLO GOODBYE/;
my $word = 0;
my $match;
while (<DATA>) {
    chomp;
    foreach my $word (@insert_words)
    {
        my $repeat = 1;
        while ((my $match=$_) =~ s|(?<![/])(?:[/](?![/])[^/]*){$repeat}[^/]*\K|$word|)
        {
            print "$match\n";
            $repeat++;
        }
        print "\n";
    }
}
__DATA__
http://www.*.com/dog/cat/rabbit/
http://www.superuser.co.uk/dog/cat/rabbit/hamster/
10.15.16.17/dog/cat/rabbit/
The output given (for the first example URL in __DATA__ with the word HELLO):
http://www.*.com/dogHELLO/cat/rabbit/
http://www.*.com/dog/catHELLO/rabbit/
http://www.*.com/dog/cat/rabbitHELLO/
http://www.*.com/dog/cat/rabbit/HELLO
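For reference, the append-only behaviour comes from `\K`: it drops everything matched so far from the part that gets replaced, so the substitution only inserts the word at the current position. A minimal sketch of that single step, assuming a hypothetical helper name `insert_after_segment` (not part of the script above):

```perl
use strict;
use warnings;

# Hypothetical helper: insert $word after the $n-th path segment of $url.
# Returns nothing if the URL has fewer than $n segments.
sub insert_after_segment {
    my ($url, $word, $n) = @_;
    # \K keeps the text matched so far, so only the empty remainder is replaced.
    (my $copy = $url) =~ s{(?<!/)(?:/(?!/)[^/]*){$n}\K}{$word}
        or return;
    return $copy;
}

print insert_after_segment('http://www.*.com/dog/cat/rabbit/', 'HELLO', $_), "\n"
    for 1 .. 4;
```

Because the pattern can only stop at the end of a segment, prepending or replacing needs either a different pattern or a split-based approach.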
Where I am now stuck:
I would now like to alter the regular expression so that the output looks like this:
http://www.*.com/dogHELLO/cat/rabbit/
http://www.*.com/dog/catHELLO/rabbit/
http://www.*.com/dog/cat/rabbitHELLO/
http://www.*.com/dog/cat/rabbit/HELLO
# above is what it already does at the moment
# below is what I also want it to be able to do
http://www.*.com/HELLOdog/cat/rabbit/ #<-puts the word at the start of the string
http://www.*.com/dog/HELLOcat/rabbit/
http://www.*.com/dog/cat/HELLOrabbit/
http://www.*.com/dog/cat/rabbit/HELLO
http://www.*.com/HELLO/cat/rabbit/ #<- now also replaces the string with the word
http://www.*.com/dog/HELLO/rabbit/
http://www.*.com/dog/cat/HELLO/
http://www.*.com/dog/cat/rabbit/HELLO
But I am having trouble getting it to do all of this automatically within a single regular expression.
Any help with this would be highly appreciated. Many thanks.
3 solutions
#1
One solution:
use strict;
use warnings;
use URI qw( );
my @insert_words = qw( HELLO );
while (<DATA>) {
    chomp;
    my $url = URI->new($_);
    my $path = $url->path();
    for (@insert_words) {
        # Use package vars to communicate with /(?{})/ blocks.
        local our $insert_word = $_;
        local our @paths;
        $path =~ m{
            ^(.*/)([^/]*)((?:/.*)?)\z
            (?{
                push @paths, "$1$insert_word$2$3";
                if (length($2)) {
                    push @paths, "$1$insert_word$3";
                    push @paths, "$1$2$insert_word$3";
                }
            })
            (?!)
        }x;
        for (@paths) {
            $url->path($_);
            print "$url\n";
        }
    }
}
__DATA__
http://www.*.com/dog/cat/rabbit/
http://www.superuser.co.uk/dog/cat/rabbit/hamster/
http://10.15.16.17/dog/cat/rabbit/
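The pattern above relies on a backtracking idiom: the `(?{ ... })` block runs each time the engine reaches it, and the `(?!)` that follows always fails, which forces the engine to try the next way of matching. A stripped-down sketch of the same idiom, using a hypothetical helper `all_splits` that lists every prefix/suffix split of a string:

```perl
use strict;
use warnings;

# Hypothetical helper: list every way to split $str into a prefix and a suffix.
sub all_splits {
    my ($str) = @_;
    # Package variable, so the code block inside the regex can reach it.
    local our @found;
    # The code block records the current candidate; (?!) then fails on purpose,
    # so the engine backtracks and offers the next candidate.
    $str =~ m{^(.*)(.*)\z(?{ push @found, "$1|$2" })(?!)}s;
    my @result = @found;
    return @result;
}

print "$_\n" for all_splits('abc');
```

The match as a whole never succeeds, so the result is collected purely through the side effect in the code block.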
#2
Without crazy regexes:
use strict;
use warnings;
use URI qw( );
my @insert_words = qw( HELLO );
while (<DATA>) {
    chomp;
    my $url = URI->new($_);
    my $path = $url->path();
    for my $insert_word (@insert_words) {
        my @parts = $path =~ m{/([^/]*)}g;
        my @paths;
        for my $part_idx (0..$#parts) {
            my $orig_part = $parts[$part_idx];
            local $parts[$part_idx];
            {
                $parts[$part_idx] = $insert_word . $orig_part;
                push @paths, join '', map "/$_", @parts;
            }
            if (length($orig_part)) {
                {
                    $parts[$part_idx] = $insert_word;
                    push @paths, join '', map "/$_", @parts;
                }
                {
                    $parts[$part_idx] = $orig_part . $insert_word;
                    push @paths, join '', map "/$_", @parts;
                }
            }
        }
        for (@paths) {
            $url->path($_);
            print "$url\n";
        }
    }
}
__DATA__
http://www.*.com/dog/cat/rabbit/
http://www.superuser.co.uk/dog/cat/rabbit/hamster/
http://10.15.16.17/dog/cat/rabbit/
#3
One more solution:
#!/usr/bin/perl
use strict;
use warnings;
my @insert_words = qw/HELLO GOODBYE/;
while (<DATA>) {
    chomp;
    /(?<![\/])(?:[\/](?![\/])[^\/]*)/p;
    my $begin_part = ${^PREMATCH};
    my $tail = ${^MATCH} . ${^POSTMATCH};
    my @tail_chunks = split /\//, $tail;
    foreach my $word (@insert_words) {
        for my $index (1..$#tail_chunks) {
            my @new_tail = @tail_chunks;
            $new_tail[$index] = $word . $tail_chunks[$index];
            my $str = $begin_part . join "/", @new_tail;
            print $str, "\n";
            $new_tail[$index] = $tail_chunks[$index] . $word;
            $str = $begin_part . join "/", @new_tail;
            print $str, "\n";
        }
        print "\n";
    }
}
__DATA__
http://www.*.com/dog/cat/rabbit/
http://www.superuser.co.uk/dog/cat/rabbit/hamster/
10.15.16.17/dog/cat/rabbit/
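A note on the split-based approach: `split` without a limit drops trailing empty fields, so the trailing slash is lost on output, and the replace form from the question is not produced. A possible variant is sketched below, assuming a hypothetical helper `variants` and a simplified pattern for separating the host from the path (a guess at what is sufficient for the sample URLs, not a general URL parser):

```perl
use strict;
use warnings;

# Hypothetical helper: return the prepend, replace and append variants
# for each path segment of $url.
sub variants {
    my ($url, $word) = @_;
    # Assumption: optional "scheme://", then host, then a path starting with "/".
    my ($head, $path) = $url =~ m{^((?:[^/]+//)?[^/]+)(/.*)\z} or return;
    # Limit -1 keeps the trailing empty field, so a trailing slash survives the join.
    my @parts = split m{/}, $path, -1;
    my @out;
    for my $i (1 .. $#parts) {
        # An empty segment (after a trailing slash) only gets the bare word.
        my @forms = length($parts[$i])
            ? ("$word$parts[$i]", $word, "$parts[$i]$word")
            : ($word);
        for my $form (@forms) {
            my @copy = @parts;
            $copy[$i] = $form;
            push @out, $head . join('/', @copy);
        }
    }
    return @out;
}

print "$_\n" for variants('http://www.*.com/dog/cat/rabbit/', 'HELLO');
```

The variants come out grouped by segment (prepend, replace, append) rather than grouped by operation as in the question, so they would need reordering if the exact listing order matters.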