How can I use Linux commands to display the first word of each line in my file?

Time: 2021-10-31 22:29:28

I have a file containing many lines, and I want to display only the first word of each line using Linux commands.

How can I do that?

6 answers

#1


26  

Try doing this using grep:

grep -Eo '^[^ ]+' file

#2


25  

You can use awk:

awk '{print $1}' your_file

This will "print" the first column ($1) in your_file.
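
As a small illustration of the awk answer, the sketch below runs the same command on three made-up sample lines. The sample text and the function name `sample_first_words` are hypothetical; the input is fed in through printf rather than read from a real file.

```shell
# Hypothetical sample input: three lines whose first words are
# "alpha", "beta" and "gamma".
sample_first_words() {
    printf '%s\n' 'alpha one two' 'beta three' 'gamma' |
        awk '{print $1}'   # $1 is the first whitespace-separated field
}

sample_first_words
```

Each input line produces one output line holding only its first field.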

#3


6  

Try doing this with coreutils cut:

cut -d' ' -f1 file

#4


3  

I see there are already answers. But you can also do this with sed:

sed 's/ .*//' fileName

#5


1  

The above solutions seem to fit your specific case. For a more general application of your question, consider that words are generally defined as being separated by whitespace, but not necessarily space characters specifically. Columns in your file may be tab-separated, for example, or even separated by a mixture of tabs and spaces.

The previous examples are all useful for finding space-separated words, while only the awk example also finds words separated by other whitespace characters (and in fact this turns out to be rather difficult to do uniformly across various sed/grep versions). You may also want to explicitly skip empty lines, by amending the awk statement thus:

awk '{if ($1 !="") print $1}' your_file
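
For tab-separated input, one possible adjustment to the grep and sed answers is to replace the literal space with the POSIX bracket expression `[[:space:]]`. This is only a sketch: it assumes a grep that supports `-o` (GNU and BSD grep do, but the option is not part of POSIX), and the sample lines and function names are made up. Lines that begin with whitespace are not handled here: grep prints nothing for them and sed prints an empty line.

```shell
# Hypothetical sample: one tab-separated line and one space-separated line.
sample_lines() {
    printf 'alpha\tone\n'
    printf 'beta two\n'
}

# grep: print only the leading run of non-whitespace characters.
first_word_grep() {
    grep -Eo '^[^[:space:]]+'
}

# sed: delete everything from the first whitespace character onward.
first_word_sed() {
    sed 's/[[:space:]].*//'
}

sample_lines | first_word_grep
sample_lines | first_word_sed
```

Both pipelines print the first word of each sample line, whether the separator is a tab or a space.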

If you are also concerned about the possibility of empty fields, i.e., lines that begin with whitespace, then a more robust solution would be in order. I'm not adept enough with awk to produce a one-liner for such cases, but a short python script that does the trick might look like:

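>>> import re
>>> with open('your_file') as f:
...     for line in f:
...         # Split on any single whitespace character; a line that starts
...         # with whitespace yields an empty first field and is skipped.
...         words = re.split(r'\s', line)
...         if words and words[0]:
...             print(words[0])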
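
For what it's worth, awk's default field splitting already ignores leading whitespace, so an awk one-liner may cover these cases too. A minimal sketch, using made-up sample input and hypothetical function names: the `NF` pattern skips lines that have no fields (empty or whitespace-only), and `$1` is the first non-blank word even when the line is indented. Note that this prints the first word of an indented line rather than skipping the line, which differs from the Python script above.

```shell
# Hypothetical sample: an indented line, an empty line, a whitespace-only
# line, and a tab-separated line.
sample_input() {
    printf '   alpha one\n\n \t \nbeta\ttwo\n'
}

# NF is the number of fields on the line; it is 0 (false) for blank lines,
# so the print action only runs for lines that contain at least one word.
first_words() {
    awk 'NF {print $1}'
}

sample_input | first_words
```

To run it against a real file, something like `awk 'NF {print $1}' your_file` should behave the same way.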

#6


0  

...or on Windows (if you have GnuWin32 grep):

grep -Eo "^[^ ]+" file
