How to split a string by a regular expression in a bash script

Time: 2021-10-20 00:08:06

I have a string like this:

msg='123abc456def'

Now I need to split msg and get the following result:

['123', 'abc', '456', 'def']

In Python, I can do it like this:

import re
pattern = re.compile(r'(\d+)')
# split() returns a leading empty string when msg starts with digits; [1:] drops it
res = pattern.split(msg)[1:]

How can I get the same result in a bash script?
I tried the following, but it doesn't work:

IFS='[0-9]'    # how to define IFS with regex?
echo ${msg[@]}

3 answers

#1


2  

Extract the substrings with grep and store the output in an array using command substitution:

$ msg='123abc456def'

$ out=( $(grep -Eo '[[:digit:]]+|[^[:digit:]]+' <<<"$msg") )

$ echo "${out[0]}"
123

$ echo "${out[1]}"
abc

$ echo "${out[@]}"
123 abc 456 def
  • The regex (ERE) pattern [[:digit:]]+|[^[:digit:]]+ matches one or more digits ([[:digit:]]+) OR (|) one or more non-digits ([^[:digit:]]+).
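
The unquoted command substitution above relies on word splitting, so it would also split on any whitespace inside msg and expand glob characters. A minimal sketch of an alternative, assuming bash 4 or newer (where the mapfile builtin is available), reads one array element per line of grep's output instead:

```shell
msg='123abc456def'

# mapfile -t stores each line of grep's output as one array element,
# so the result does not depend on IFS word splitting or globbing.
mapfile -t out < <(grep -Eo '[[:digit:]]+|[^[:digit:]]+' <<<"$msg")

printf '%s\n' "${out[@]}"
echo "count: ${#out[@]}"
```

For the sample string this should give the same four elements as the command-substitution version.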

#2


2  

Since you already know how to solve this in Python, you can run the code from the question directly from bash:

MSG=123abc456def;
python -c "import re; print('\n'.join(re.split(r'(\\d+)', '${MSG}')[1:]))"

Python is not as universally available as, say, grep or awk, but does that really matter to you?
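
Interpolating the shell variable into the Python source breaks if the string contains a quote character. A hedged variant, assuming a python3 executable is on the PATH, passes the string as a command-line argument and collects the result in a bash array (the array name res is only illustrative):

```shell
msg='123abc456def'

# The Python source is single-quoted, so bash passes \n and \d through untouched;
# the string to split arrives as sys.argv[1] rather than being pasted into the code.
mapfile -t res < <(
    python3 -c 'import re, sys; print("\n".join(re.split(r"(\d+)", sys.argv[1])[1:]))' "$msg"
)

printf '%s\n' "${res[@]}"
```

As in the question, the [1:] slice assumes the string starts with digits; for a string that starts with letters it would drop the first piece.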

#3


1  

I would match instead of splitting. Here I used grep, but you can also use the same regex in pure bash.

$ msg='123abc456def'
$ grep -oE '[0-9]+|[^0-9]+' <<<"$msg"
123
abc
456
def
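
The pure-bash variant mentioned above could look roughly like the following sketch. It is not from the original answer: the variable names parts, rest and re are illustrative, and it anchors the same regex at the start of the string so that each iteration peels off one run of digits or non-digits with the [[ =~ ]] operator.

```shell
msg='123abc456def'
parts=()
rest=$msg
# Keep the regex in a variable; quoting it on the right of =~ would make it a literal.
re='^([0-9]+|[^0-9]+)'

# Match the leading run, store it, then cut it off the front of the remainder.
while [[ -n $rest && $rest =~ $re ]]; do
    parts+=("${BASH_REMATCH[1]}")
    rest=${rest:${#BASH_REMATCH[1]}}
done

printf '%s\n' "${parts[@]}"
```

This avoids spawning an external process, at the cost of a loop that is slower than grep on long inputs.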
