How can I sort a file's contents into different groups based on regular expressions?

I have a flat file that lists the packages present on the system. For each package, I want to find out whether it is:

a batch component (by convention, names begin with batch),

a service (names end with serv),

a messaging daemon (names end with d),

a web component (names end with web),

or something that does not fall into any category (meaning it is not named per convention).

I have written this bash script to do that:

grep 'serv$' pack_list.txt > serv_list.txt
grep 'd$' pack_list.txt > daemon_list.txt
grep '^batch' pack_list.txt > batch_list.txt
grep 'web$' pack_list.txt > web_list.txt
grep -v 'serv$' pack_list.txt | grep -v 'd$' | grep -v '^batch' | grep -v 'web$' > uncat_list.txt

While it satisfies my current requirement and does not take much time, I cannot help but wonder whether some other language would be a better choice for this kind of operation.

---EDIT--

Example input would be:

fileserv
batch_file_processor
userweb
processord

Each would go into a different file.

To clarify what I am looking for: I am looking for a language in which this kind of processing has better syntactic support than:

A command like grep for each regex.

A series of if conditions, as one would write in Python or Perl.

Something along the lines of:

switch line.match($1):
    case (pattern1):
          ...
    case (pattern2):
          ...

Any suggestions?

1 Answer

#1

A single Awk process can do this much better, matching each line against your patterns and redirecting it to the appropriate file:

awk '{
  if ($0 ~ /serv$/) { print > "serv_list.txt" }
  else if ($0 ~ /d$/) { print > "daemon_list.txt" }
  # ... and so on for the remaining patterns
  else { print > "uncat_list.txt" }
}' pack_list.txt
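
For reference, here is a minimal sketch of the chain written out in full, using the four naming conventions from the question. The scratch directory and the extra name misc_tool are assumptions added purely for illustration; they are not part of the original setup.

```shell
#!/usr/bin/env bash
# Sketch only: work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

# Sample input from the question, plus one hypothetical name (misc_tool)
# that follows none of the naming conventions.
printf '%s\n' fileserv batch_file_processor userweb processord misc_tool > pack_list.txt

# One pass over the file; the first matching branch decides the output file.
awk '{
  if      ($0 ~ /serv$/)  { print > "serv_list.txt" }
  else if ($0 ~ /d$/)     { print > "daemon_list.txt" }
  else if ($0 ~ /^batch/) { print > "batch_list.txt" }
  else if ($0 ~ /web$/)   { print > "web_list.txt" }
  else                    { print > "uncat_list.txt" }
}' pack_list.txt
```

Because the branches are chained with else if, a name that matches more than one convention (for instance, one that both starts with batch and ends with d) lands only in the first matching file, whereas the separate grep calls in the question would write it to every matching list.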
