I need to get the lines that have negative values in the 10th column. To do this I would like to use command-line tools, such as grep or something else.
My file looks like this:
CUFF.258 CUFF.258 - X:3346-3649 q1 q2 OK 1801.26 49.1276 -5.19633 3.04579 0.00232068 0.0343639 yes
CUFF.270 CUFF.270 - X:785379-802854 q1 q2 OK 3452.95 15.4353 -7.80545 4.11536 3.86579e-05 0.00141746 yes
CUFF.291 CUFF.291 - X:2035520-2038972 q1 q2 OK 40.6787 914.414 4.4905 -3.23369 0.00122202 0.0216311 yes
CUFF.303 CUFF.303 - X:2608113-2614358 q1 q2 OK 263.583 18.2568 -3.85175 3.81319 0.000137187 0.00419976 yes
CUFF.304 CUFF.304 - X:2813802-2818416 q1 q2 OK 0 352.966 1.79769e+308 1.79769e+308 0.000135079 0.00419976 yes
CUFF.315 CUFF.315 - X:3286518-3342976 q1 q2 OK 475.812 19.775 -4.58864 3.38001 0.00072482 0.0144964 yes
CUFF.328 CUFF.328 - X:4216658-4257029 q1 q2 OK 26.3907 664.784 4.65479 -3.98494 6.7498e-05 0.00221167 yes
CUFF.339 CUFF.339 - X:4820540-4832077 q1 q2 OK 4993.62 130.117 -5.2622 4.48626 7.24836e-06 0.000384913 yes
CUFF.341 CUFF.341 - X:4979865-5145183 q1 q2 OK 10.9841 109.543 3.31801 -3.00298 0.00267352 0.0381224 yes
CUFF.350 CUFF.350 - X:5521697-5542510 q1 q2 OK 15.4241 263.2 4.0929 -3.32719 0.000877259 0.0167875 yes
I tried to do this using regular expressions with grep, but that wasn't correct because I got some false-positive lines. I used grep -e '-.\.' to get negative values in general, and that also gave me lines where the negative value is in another column. What's the proper way to do this?
2 solutions
#1
3
I'd use awk:
awk '$10 < 0' yourfile
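To see how this behaves, here is a minimal sketch run on three lines taken from the question. It assumes the columns are separated by whitespace (awk's default field splitting); the temporary file and the variable names are only for illustration, and `$10 + 0` is an optional variation that forces a numeric comparison.

```shell
# Hypothetical three-line sample copied from the question's data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CUFF.258 CUFF.258 - X:3346-3649 q1 q2 OK 1801.26 49.1276 -5.19633 3.04579 0.00232068 0.0343639 yes
CUFF.291 CUFF.291 - X:2035520-2038972 q1 q2 OK 40.6787 914.414 4.4905 -3.23369 0.00122202 0.0216311 yes
CUFF.304 CUFF.304 - X:2813802-2818416 q1 q2 OK 0 352.966 1.79769e+308 1.79769e+308 0.000135079 0.00419976 yes
EOF

# awk splits each line on runs of whitespace; $10 is the tenth field.
# A pattern with no action prints every line for which the pattern is true.
negatives=$(awk '$10 + 0 < 0' "$sample")
printf '%s\n' "$negatives"

rm -f "$sample"
```

On this sample only the CUFF.258 line is printed. The CUFF.291 line has a negative value in column 11, not column 10, so it is skipped.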
#2
1
I think this regex finds what you need: negative numbers in the tenth column, if the columns are space-separated.
/^(?:[^\s]+\s+){9}(\-[0-9\.]+)/m
Basically, it matches nine repetitions of a run of non-space characters followed by whitespace, then a hyphen (the negative sign) and one or more digits or decimal points. You can, of course, be more precise if necessary.
Edit: To use this from the command line with grep, surround the regex with single quotes and drop the / delimiters and the multiline (m) option. grep already matches line by line, so ^ anchors at the start of each line:
~$ grep -P '^(?:[^\s]+\s+){9}(\-[0-9\.]+)' somefile.txt
Note that I've included the -P option here because this is a "Perl-style" regex.
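The -P option is not available in every grep implementation. As a hedged alternative, the same idea can be sketched with a POSIX extended regex and grep -E. This is a variation on the pattern above, not the original answer's command; the sample file and variable names are hypothetical.

```shell
# Hypothetical three-line sample copied from the question's data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CUFF.258 CUFF.258 - X:3346-3649 q1 q2 OK 1801.26 49.1276 -5.19633 3.04579 0.00232068 0.0343639 yes
CUFF.291 CUFF.291 - X:2035520-2038972 q1 q2 OK 40.6787 914.414 4.4905 -3.23369 0.00122202 0.0216311 yes
CUFF.304 CUFF.304 - X:2813802-2818416 q1 q2 OK 0 352.966 1.79769e+308 1.79769e+308 0.000135079 0.00419976 yes
EOF

# Nine "field + whitespace" groups anchored at the start of the line,
# then a minus sign followed by a digit at the start of the tenth field.
matches=$(grep -E '^([^[:space:]]+[[:space:]]+){9}-[0-9]' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Because the pattern is anchored at the start of the line and counts exactly nine fields, the lone "-" in column 3 and the negative value in column 11 of the CUFF.291 line do not produce a match.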