How to recursively replace characters with sed?

Time: 2021-11-29 08:37:12

Is it possible to replace occurrences of a character sequence recursively without iterating over the same sequence again?

Running sed as in the following scenarios gives me the output shown under each command.

$ echo XX | sed -e 's/XX/XoX/g'
XoX  
$ echo XXX | sed -e 's/XX/XoX/g'
XoXX  
$ echo XXXX | sed -e 's/XX/XoX/g'
XoXXoX  

However, I expect the output to behave as follows.

Input:

XX
XXX
XXXX

Expected output:

XoX
XoXoX
XoXoXoX

Is it possible to achieve the expected behavior with sed alone?

4 Solutions

#1


22  

You can do:

> echo XXXX | sed -e ':loop' -e 's/XX/XoX/g' -e 't loop'
XoXoXoX

With:

  • -e ':loop' : Create a "loop" label
  • -e 't loop' : Jump to the "loop" label if the previous substitution was successful
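
As a quick check of the loop described above, here is a minimal sketch that runs it over all three inputs from the question. The function name insert_o is made up for illustration; the sed invocation itself is the one from this answer.

```shell
# insert_o is a hypothetical helper name; the sed command is the one shown above.
insert_o() {
  # Repeat the substitution on the current line until no "XX" is left.
  sed -e ':loop' -e 's/XX/XoX/g' -e 't loop'
}

printf '%s\n' XX XXX XXXX | insert_o
# XoX
# XoXoX
# XoXoXoX
```

Each pass splits the pairs it can see, and 't loop' sends sed back for another pass only while the last pass changed something, so the loop stops by itself once the line is clean.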

#2


9  

In this particular case look-ahead or look-behind would be useful. I think GNU sed doesn't support these. With perl:

perl -ne 's/X(?=X)/Xo/g; print;'

You could also use lookbehind and lookahead like:

s/(?<=X)(?=X)/o/g

Where:

(?<=X) is a positive lookbehind, a zero-length assertion that makes sure we have an X before the current position
(?=X) is a positive lookahead, a zero-length assertion that makes sure we have an X after the current position

Using in a perl one-liner:

perl -pe 's/(?<=X)(?=X)/o/g' inputfile

Where:

-p causes Perl to assume a loop around the program with an implicit print of the current line

#3


5  

The looping answer is the general way to do what you are asking.

However, in the case of your data, assuming you are using GNU sed, you can simply do:

sed 's/\B/o/g'

\b and \B are GNU regex extensions:

  • \b matches word boundaries, i.e. the transition from a "word" character to a "non-word" character, or vice versa
  • \B matches the opposite of \b, i.e. every position that is not a word boundary, such as the gaps "inside" words. This allows us to insert characters inside a word but not at its edges, as required. Note that \B matches inside any word, not only between X's, and also between two adjacent non-word characters, so this shortcut relies on the input looking like your data.

Alternatively, if you don't have GNU sed, you can still achieve your goal without looping:

sed 's/X/&o/g;s/o$//'

This simply replaces every X with Xo, and then removes the final o from the line.
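
As a rough illustration of where this shortcut works and where it does not, the sketch below runs it on the inputs from the question and then on a mixed line. The input XXaX is an invented example, not part of the original data.

```shell
# Lines made only of X's: every X gets an "o", then the trailing "o" is dropped.
printf '%s\n' XX XXX XXXX | sed 's/X/&o/g;s/o$//'
# XoX
# XoXoX
# XoXoXoX

# Caveat (hypothetical input): an X that is not followed by another X also gets an "o".
echo 'XXaX' | sed 's/X/&o/g;s/o$//'
# XoXoaX
```

On the mixed line the loop-based answer would give XoXaX, so this form is only a safe substitute when each line is a single run of X's.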

#4


4  

I looked for a flag that would make this happen.
Even if such behavior existed, it would be highly resource-consuming.

However, in this particular use case, it is enough to apply the expression twice to get the required result, i.e. with two identical sed expressions.

echo XX | sed -e 's/XX/XoX/g' -e 's/XX/XoX/g'     # outputs XoX
echo XXX | sed -e 's/XX/XoX/g' -e 's/XX/XoX/g'    # outputs XoXoX
echo XXXX | sed -e 's/XX/XoX/g' -e 's/XX/XoX/g'   # outputs XoXoXoX
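
To see why two passes are enough here, it can help to look at the intermediate result on a longer line. This is only a sketch; the seven-X input is an invented example.

```shell
# First pass only: non-overlapping pairs are split, leaving runs of at most two X's.
echo XXXXXXX | sed -e 's/XX/XoX/g'
# XoXXoXXoXX

# Both passes: the second expression splits the pairs the first one left behind.
echo XXXXXXX | sed -e 's/XX/XoX/g' -e 's/XX/XoX/g'
# XoXoXoXoXoXoX
```

After the first pass every replacement ends in a single X, which can be followed by at most one more X, so no run longer than two remains and the second pass cannot create new pairs.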
