Using Linux commands to get the lines that have negative values in the 10th column

I need to get the lines that have negative values in the 10th column. To do this I would like to use command-line tools, such as grep or something else.

My file looks like this:

CUFF.258    CUFF.258    -   X:3346-3649 q1  q2  OK  1801.26 49.1276 -5.19633    3.04579 0.00232068  0.0343639   yes
CUFF.270    CUFF.270    -   X:785379-802854 q1  q2  OK  3452.95 15.4353 -7.80545    4.11536 3.86579e-05 0.00141746  yes
CUFF.291    CUFF.291    -   X:2035520-2038972   q1  q2  OK  40.6787 914.414 4.4905  -3.23369    0.00122202  0.0216311   yes
CUFF.303    CUFF.303    -   X:2608113-2614358   q1  q2  OK  263.583 18.2568 -3.85175    3.81319 0.000137187 0.00419976  yes
CUFF.304    CUFF.304    -   X:2813802-2818416   q1  q2  OK  0   352.966 1.79769e+308    1.79769e+308    0.000135079 0.00419976  yes
CUFF.315    CUFF.315    -   X:3286518-3342976   q1  q2  OK  475.812 19.775  -4.58864    3.38001 0.00072482  0.0144964   yes
CUFF.328    CUFF.328    -   X:4216658-4257029   q1  q2  OK  26.3907 664.784 4.65479 -3.98494    6.7498e-05  0.00221167  yes
CUFF.339    CUFF.339    -   X:4820540-4832077   q1  q2  OK  4993.62 130.117 -5.2622 4.48626 7.24836e-06 0.000384913 yes
CUFF.341    CUFF.341    -   X:4979865-5145183   q1  q2  OK  10.9841 109.543 3.31801 -3.00298    0.00267352  0.0381224   yes
CUFF.350    CUFF.350    -   X:5521697-5542510   q1  q2  OK  15.4241 263.2   4.0929  -3.32719    0.000877259 0.0167875   yes

I tried to do this using regular expressions with grep, but that approach wasn't correct because it returned false positives. I used grep -e '-.\.' to match negative values in general, and that also returned lines where the negative value is in a different column. What's the proper way to do this?

2 solutions

#1

I'd use awk:

awk '$10 < 0' yourfile
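
As a slightly more defensive sketch of the same idea, the variant below assumes the file is tab-separated (the question does not say which delimiter is used) and uses a two-line sample abridged from the data above. The variable names sample and negatives are made up for illustration. Adding 0 to the field forces awk to compare it as a number, and the NF check skips short lines.

```shell
# Build a two-line, tab-separated sample (abridged from the question's data).
sample=$(mktemp)
printf 'CUFF.258\tCUFF.258\t-\tX:3346-3649\tq1\tq2\tOK\t1801.26\t49.1276\t-5.19633\t3.04579\n' > "$sample"
printf 'CUFF.291\tCUFF.291\t-\tX:2035520-2038972\tq1\tq2\tOK\t40.6787\t914.414\t4.4905\t-3.23369\n' >> "$sample"

# Keep rows with at least 10 fields whose 10th field is numerically negative.
# -F'\t' splits on tabs only (an assumption); $10 + 0 forces a numeric comparison.
negatives=$(awk -F'\t' 'NF >= 10 && $10 + 0 < 0' "$sample")
echo "$negatives"

rm -f "$sample"
```

Only the CUFF.258 row is kept here: CUFF.291 has its negative value in column 11, which is exactly the kind of line the grep attempt picked up by mistake.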

#2

I think this regex finds what you need: negative numbers in the tenth column, if the columns are space-separated.

/^(?:[^\s]+\s+){9}(\-[0-9\.]+)/m

Basically, it matches nine repetitions of a non-whitespace run followed by whitespace (the first nine columns), then exactly one hyphen (the negative sign) and one or more digits or decimal points. You can, of course, be more precise if necessary.

Edit: If you need to use this from the command line with grep, surround the regex with single quotes. You can drop the / delimiters and the multiline (m) flag, since grep already matches line by line:

~$ grep -P '^(?:[^\s]+\s+){9}(\-[0-9\.]+)' somefile.txt

Note that I've included the -P option here as this is a "Perl-style" regex.
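
If -P is not available (it is a GNU grep option), the same pattern can probably be expressed as a POSIX extended regex with grep -E. This is a minimal sketch, assuming whitespace-separated columns with no empty fields; the sample file and the variable names sample and matches are made up for illustration.

```shell
# Build a two-line, tab-separated sample (abridged from the question's data).
sample=$(mktemp)
printf 'CUFF.258\tCUFF.258\t-\tX:3346-3649\tq1\tq2\tOK\t1801.26\t49.1276\t-5.19633\t3.04579\n' > "$sample"
printf 'CUFF.291\tCUFF.291\t-\tX:2035520-2038972\tq1\tq2\tOK\t40.6787\t914.414\t4.4905\t-3.23369\n' >> "$sample"

# ERE version of the pattern: nine "field + whitespace" groups anchored at the
# start of the line, then a minus sign followed by digits and/or dots.
matches=$(grep -E '^([^[:space:]]+[[:space:]]+){9}-[0-9.]+' "$sample")
echo "$matches"

rm -f "$sample"
```

Like the original pattern, this only checks that the tenth column starts with a minus sign followed by digits or dots; it does not validate that the field is a well-formed number.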
