I need to extract certain abbreviations from a file, such as ABS, TVS, and PERL; that is, any abbreviation written in uppercase letters. I'd prefer to do this with a regular expression. Any help is appreciated.
4 solutions
#1
It would have been nice to hear what part you were particularly having trouble with.
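my %abbr;
open my $inputfh, '<', 'filename'
    or die "open error: $!\n";
# Count every whole-word run of two or more capital letters, line by line
while ( my $line = readline($inputfh) ) {
    while ( $line =~ /\b([A-Z]{2,})\b/g ) {
        $abbr{$1}++;
    }
}
# Report each abbreviation with its number of occurrences
for my $abbr ( sort keys %abbr ) {
    print "Found $abbr $abbr{$abbr} time(s)\n";
}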
#2
Reading text to be searched from standard input and writing all abbreviations found to standard output, separated by spaces:
my $text;
# Slurp all text
{ local $/ = undef; $text = <>; }
# Extract all sequences of 2 or more uppercase characters
my @abbrevs = $text =~ /\b([[:upper:]]{2,})\b/g;
# Output separated by spaces
print join(" ", @abbrevs), "\n";
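Note the use of the POSIX character class [:upper:], which matches any uppercase character, not just the English letters A-Z. This holds when the input has been decoded into Perl character strings (or a suitable locale is in effect); on raw bytes it may match only ASCII capitals.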
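To make the difference between [A-Z] and [:upper:] concrete, here is a minimal sketch. The sample text and the helper name find_abbrevs are made up for illustration; the Greek abbreviation is written with \x{...} escapes so the string is a character string regardless of how the script file is encoded.

```perl
use strict;
use warnings;

# Hypothetical sample: one ASCII abbreviation and one Greek one (Delta Epsilon Eta).
my $text = "NASA and \x{394}\x{395}\x{397} both published reports";

my $ascii_re   = qr/\b([A-Z]{2,})\b/;         # English capitals only
my $unicode_re = qr/\b([[:upper:]]{2,})\b/;   # any uppercase character

# Hypothetical helper: return every capture of $re found in $text.
sub find_abbrevs {
    my ( $text, $re ) = @_;
    my @found = $text =~ /$re/g;
    return @found;
}

binmode( STDOUT, ':encoding(UTF-8)' );        # so the Greek letters print cleanly
print "A-Z:       ", join( " ", find_abbrevs( $text, $ascii_re ) ),   "\n";
print "[:upper:]: ", join( " ", find_abbrevs( $text, $unicode_re ) ), "\n";
```

With this input the A-Z pattern should report only NASA, while the [:upper:] pattern also picks up the Greek abbreviation. When reading real files, the same effect depends on opening them with an encoding layer such as '<:encoding(UTF-8)'.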
#3
Untested:
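my %abbr;
open( my $input, "<", "filename" )
    || die "open: $!";
# <$input> must be written without inner spaces; "< $input >" is parsed as a glob
for ( <$input> ) {
    # Strip each run of two or more capitals from the line and count it
    while ( s/([A-Z][A-Z]+)// ) {
        $abbr{$1}++;
    }
}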
Modified it to look for at least two consecutive capital letters.
#4
#!/usr/bin/perl
use strict;
use warnings;
my %abbrs = ();
while (<>) {
    my @words = split ' ', $_;
    foreach my $word (@words) {
        $word =~ /([A-Z]{2,})/ && $abbrs{$1}++;
    }
}
# %abbrs now contains all abbreviations
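One difference between the answers is worth spelling out: #1 and #2 wrap the pattern in \b word boundaries, while #3 and #4 do not. The following is only a sketch of how that changes the results; the sample sentence is invented to contain the edge cases.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen to show where the two patterns disagree.
my $sample = "The McDONALD report cites NASA and USB3 ports";

# Bounded pattern (as in answers #1 and #2): the run must be a whole word.
my @bounded = $sample =~ /\b([A-Z]{2,})\b/g;

# Unbounded pattern (as in answers #3 and #4): any run of capitals, even inside a word.
my @unbounded = $sample =~ /([A-Z]{2,})/g;

print "with \\b:    @bounded\n";
print "without \\b: @unbounded\n";
```

Here the bounded pattern should yield only NASA, whereas the unbounded one also extracts DONALD and USB from inside longer words. Which behaviour is wanted depends on whether fragments such as USB in USB3 count as abbreviations.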