Sed, Awk, or Perl: substituting at the start of a sentence

Date: 2021-03-25 15:29:15

I need to make a substitution using sed or another program. I have the patterns <ehh> <mmm> <mhh> repeated at the beginning of sentences, and I need to replace them with nothing.

I am trying this:

echo "$line" | sed 's/&lt;[a-zA-z]+&gt;//g'

But I get the same result; nothing changes. Can anyone help?

Thank you!

3 answers

#1


For me, for the test file

&lt;ahh&gt; test
&lt;mmm&gt;test 1

the following

sed 's/^&lt;[a-zA-Z]\+&gt;//g' testfile

produces

 test
test 1

which seems to be what you want. Note that in basic regular expressions you write \+ (a GNU sed extension), whereas in extended regular expressions you write a plain + and must pass the -r switch (or -E) to sed. The unescaped + in your command was therefore treated as a literal plus sign, which is why nothing matched.
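To make the basic/extended distinction concrete, here is a minimal sketch that runs both forms on the same two-line input. It assumes GNU sed (for \+ and -E) and assumes the tags appear in the data as the literal text &lt;...&gt;; the variable names input, bre_out, and ere_out are only for illustration.

```shell
# Sample input in the same shape as the test file (tags are literal &lt;...&gt; text).
input='&lt;ahh&gt; test
&lt;mmm&gt;test 1'

# Basic regular expressions: \+ means "one or more" (GNU sed extension).
bre_out=$(printf '%s\n' "$input" | sed 's/^&lt;[a-zA-Z]\+&gt;//')

# Extended regular expressions: plain + together with -E (GNU sed also accepts -r).
ere_out=$(printf '%s\n' "$input" | sed -E 's/^&lt;[a-zA-Z]+&gt;//')

# Both variables should hold the same two lines with the leading tag removed.
printf '%s\n' "$bre_out"
printf '%s\n' "$ere_out"
```

Both commands should give identical results; which one to use is mostly a matter of which syntax you find easier to read.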

NB: I added a ^ to the pattern, since you said the tags appear at the beginning of the line.

#2


echo '&lt;ehh&gt; &lt;mmm&gt; &lt;mhh&gt;blabla bla' | \
sed 's/^\([[:space:]]*\&lt;[a-zA-Z]\{3\}\&gt;\)\{1,\}//'
  • This removes every leading occurrence of your pattern (including leading whitespace).

  • I escape & to be safe, because sed gives this character a special meaning (in the replacement part); it works without the escape on my AIX.

  • I don't use g: g removes several occurrences of the full pattern, but there is only one beginning of line (^), so I repeat the group with a counter instead: \(\)\{1,\}
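The difference between the g flag and a repeated group can be seen in a small sketch. It uses the extended-syntax equivalent of the command above (sed -E, so no backslashes on the group and counter), which is an adaptation rather than the answer's exact command; the variable names are illustrative.

```shell
line='&lt;ehh&gt; &lt;mmm&gt; &lt;mhh&gt;blabla bla'

# Anchored pattern plus g: ^ can match only once per line, so only the first tag is removed.
first_only=$(printf '%s\n' "$line" | sed -E 's/^[[:space:]]*&lt;[a-zA-Z]{3}&gt;//g')

# Repeated group: a single match covers every leading tag and the whitespace before each one.
all_leading=$(printf '%s\n' "$line" | sed -E 's/^([[:space:]]*&lt;[a-zA-Z]{3}&gt;)+//')

printf '%s\n' "$first_only"
printf '%s\n' "$all_leading"
```

With the anchored g version the second and third tags remain in the line; with the repeated group only "blabla bla" is left.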

#3


If the goal is to get the last parameter from lines like this:

&lt;ahh&gt; test
&lt;mmm&gt;test 1

You can do:

awk -F\; '/^&lt;[[:alpha:]]+&gt/ {print $NF}' <<< "$line"
 test
test 1

It searches for lines matching the pattern ^&lt;[[:alpha:]]+&gt and prints the last field of each such line, using ; as the field separator.
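One caveat with splitting on ; is that any semicolon in the remaining text also splits the line. The sketch below contrasts the field-splitting approach with a possible alternative based on awk's sub() function; the sample line containing a ; is a made-up input, and [A-Za-z] is used in place of [[:alpha:]] only so that the sketch also runs on older awk builds without POSIX character classes.

```shell
# Hypothetical input whose text part itself contains a semicolon.
line='&lt;ahh&gt; one; two'

# Field-splitting approach: keeps only what follows the last ';'.
last_field=$(printf '%s\n' "$line" | awk -F';' '/^&lt;[A-Za-z]+&gt/ {print $NF}')

# sub() approach: deletes the leading tags and leaves the rest of the line intact.
stripped=$(printf '%s\n' "$line" | awk '{ sub(/^([ \t]*&lt;[A-Za-z]+&gt;)+/, ""); print }')

printf '%s\n' "$last_field"
printf '%s\n' "$stripped"
```

If the text after the tags never contains a semicolon, both approaches should give the same result, and the -F version is shorter.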
