How to use sed with a regex to separate strings and numbers in a bash script

I want to separate the text from the numbers in a file so that I can extract one specific number in a bash script. For example, given:

Branches executed:75.38% of 1190

I want to get only the number

75.38

I tried the code below:

$new_value=value | sed -r 's/.*_([0-9]*)\..*/\1/g'

but it was incorrect and failed.

How should this be done? Thank you in advance for your help.

3 solutions

#1

You can use the following regex to extract the first number in a line:

^[^0-9]*\([0-9.]*\).*$

Usage:

% echo 'Branches executed:75.38% of 1190' | sed 's/^[^0-9]*\([0-9.]*\).*$/\1/'
75.38
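
To capture the result in a variable, as the script in the question attempts, the pipeline can be wrapped in command substitution. Note that a bash assignment has no leading $ on the variable name. The sketch below is only an illustration; the helper name first_number is hypothetical and not part of the answer above.

```shell
# Hypothetical helper: print the first run of digits and dots in the string given as $1.
first_number() {
    printf '%s\n' "$1" | sed 's/^[^0-9]*\([0-9.]*\).*$/\1/'
}

# Assignment: no "$" on the left-hand side, and $(...) captures the command's output.
new_value=$(first_number 'Branches executed:75.38% of 1190')
echo "$new_value"
```

Because the group is written as [0-9.]*, it would also accept input such as 1.2.3, so this is best suited to lines whose first number is known to be well formed.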

#2

Give this a try:

value=$(sed "s/^Branches executed:\([0-9][.0-9]*[0-9]*\)%.*$/\1/" afile)

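This assumes that the line appears only once in afile and that afile contains nothing else: sed prints lines that do not match the pattern unchanged, so they would end up in value too.

The result is stored in the variable value.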
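
If the file may contain other lines, one possible variation is to run sed with -n and add the p flag to the substitution, so that only the line where the substitution succeeded is printed. The sample file contents below are an assumption made for illustration; only the "Branches executed" line comes from the question.

```shell
# Assumed sample input; a temporary file stands in for afile.
afile=$(mktemp)
printf '%s\n' \
    'Lines executed:80.00% of 2000' \
    'Branches executed:75.38% of 1190' \
    'Calls executed:60.10% of 500' > "$afile"

# -n turns off automatic printing; the trailing p prints only substituted lines.
value=$(sed -n 's/^Branches executed:\([0-9][.0-9]*[0-9]*\)%.*$/\1/p' "$afile")
echo "$value"

rm -f "$afile"
```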

#3

There are several things here that we could improve. One is that, in sed's default basic regex syntax, you need to escape the grouping parentheses: \(...\). With -r, as in your command, they are written unescaped.

Another is that it would help to have a full specification of the input strings, as well as a small script we can experiment with.

Anyway, this is my first attempt. Update: I added a little more bash around the regex so that it is easier to experiment with:

value='Branches executed:75.38% of 1190'
new_value=$(echo "$value" | sed -e 's/[^0-9]*\([0-9]*\.[0-9]*\).*/\1/g')
echo "$new_value"

Update 2: as john pointed out, this matches only numbers that contain a decimal point. We can fix that with an optional group: \(\.[0-9]\+\)\?. Here is how the optional group works:

\(...\) is a group.
\(...\)\? is a group that appears zero or one times (mind the \?, which, like \+, is a GNU sed extension to basic regex syntax).
\.[0-9]\+ is the pattern for a dot followed by one or more digits.

Putting it all together:

value='Branches executed:75.38% of 1190'
new_value=$(echo "$value" | sed -e 's/[^0-9]*\([0-9]\+\(\.[0-9]\+\)\?\).*/\1/g')
echo "$new_value"
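
Since \+ and \? are GNU extensions in basic regex syntax, the same pattern can also be written with extended regular expressions, where grouping parentheses, + and ? need no backslashes. The following is a sketch under the assumption that the local sed accepts -E (GNU and BSD sed both do); the function name extract_number is hypothetical.

```shell
# Hypothetical helper: print the first integer or decimal number in the string given as $1.
extract_number() {
    printf '%s\n' "$1" | sed -E 's/[^0-9]*([0-9]+(\.[0-9]+)?).*/\1/'
}

extract_number 'Branches executed:75.38% of 1190'   # decimal input
extract_number 'Total: 1190 branches'               # integer input, no decimal point
```

The second call covers the case raised in Update 2: a number without a decimal point is still extracted, because the fractional part is optional.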
