Regex on a string of numbers

Time: 2021-04-06 21:39:45

Let's say I have a string of 4 numbers separated by spaces:

1 4 16 28
# stored in $ids

To get the 4 numbers into four separate variables, I was using:

id1=$(echo $ids | sed -r "s/([0-9]+ ){0}([0-9]+) ?.*/\2/g")
id2=$(echo $ids | sed -r "s/([0-9]+ ){1}([0-9]+) ?.*/\2/g")
id3=$(echo $ids | sed -r "s/([0-9]+ ){2}([0-9]+) ?.*/\2/g")
id4=$(echo $ids | sed -r "s/([0-9]+ ){3}([0-9]+) ?.*/\2/g")

But the problem is that if the string has fewer than 4 numbers, the earlier numbers are repeated.

For example, if the string was

1

then what we get is

id1=1
id2=1
id3=1
id4=1

What I want is that, if there are fewer numbers, the extra variables should have the value 0.

Is there a way to do that?

Note: I can only use sed.

3 solutions

#1


1  

Try this regex instead. The first two substitutions leave an empty string if there is nothing to return, and the last one turns that into 0:

$ echo 1 2 3 4 5 6|sed -r "s/^([0-9]+ *){0,3}//g;s/([0-9]+).*/\1/g;s/^$/0/g"
4
#                 change this latter value ^ 

$ echo 1 2 3|sed -r "s/^([0-9]+ *){0,3}//g;s/([0-9]+).*/\1/g;s/^$/0/g"
0
$
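
To fill all four variables with this approach, the number in `{0,3}` has to change for each position. Here is a minimal sketch, assuming GNU sed with `-E` (the same as `-r` above). The helper name `nth_id` is made up for illustration and is not part of the answer:

```shell
# nth_id is a hypothetical helper: print the N-th number of a
# space-separated string, or 0 if there are fewer than N numbers.
nth_id() {
  skip=$(( $1 - 1 ))   # how many leading numbers to drop
  printf '%s\n' "$2" |
    sed -E "s/^([0-9]+ *){0,${skip}}//; s/^([0-9]+).*/\1/; s/^$/0/"
}

ids='1 4'
id1=$(nth_id 1 "$ids")
id2=$(nth_id 2 "$ids")
id3=$(nth_id 3 "$ids")
id4=$(nth_id 4 "$ids")
echo "$id1 $id2 $id3 $id4"
```

With `ids='1 4'` this should print `1 4 0 0`. It still runs sed once per variable, as in the question.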

#2


1  

If you can use awk, you could try something like this:

$ awk 'function m(a) { return a ~ /^[0-9]+$/ ? a : 0 } 
  { print m($1), m($2), m($3), m($4) }'
1 4 16 281 # input
1 4 16 281
1          # input
1 0 0 0

The function tests whether the field consists entirely of one or more digits, and returns 0 if it doesn't.

The best way to get the output into separate variables depends on what you plan to do with them later and on your shell version.

Using process substitution (available in bash, ksh, and zsh, but not in plain POSIX sh), you can do this:

str='1 4 16 281'
read -r id1 id2 id3 id4 < <(echo "$str" | awk '
  function m(a) { return a ~ /^[0-9]+$/ ? a : 0 } 
    { print m($1), m($2), m($3), m($4) }')

Less efficient, but perhaps easier to understand, would be to simplify the script and run it 4 times, once per column:

id1=$(echo "$str" | awk -v col=1 '{ print $col ~ /^[0-9]+$/ ? $col : 0 }')

#3


0  

You don't need sed. This will work in any POSIX-compatible shell:

read -r id1 id2 id3 id4 rest <<EOF
$ids 0 0 0 0
EOF

This will work for any number of values in ids. The line in the here document will contain at least 4 values, regardless of the value of ids, and any extra fields will be assigned to rest and can be ignored.
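
As a quick check of the here-document approach, here is a sketch with a two-value input (the sample value of `ids` is an assumption for illustration). Note that nothing here validates that the fields are numeric; it only pads the missing ones:

```shell
ids='1 4'   # sample input, fewer than 4 numbers

# The four trailing zeros pad the line so that every variable gets a value;
# whatever is left over lands in rest.
read -r id1 id2 id3 id4 rest <<EOF
$ids 0 0 0 0
EOF

echo "$id1 $id2 $id3 $id4"
```

With this input the script should print `1 4 0 0`, and `rest` holds the two unused padding zeros.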
