I'm using Windows, and I would like to extract certain columns from a text file using a Perl, Python, batch etc. one-liner.
On Unix I could do this:
cut -d " " -f 1-3 <my file>
How can I do this on Windows?
5 Solutions
#1
10
Here is a Perl one-liner to print the first 3 whitespace-delimited columns of a file. This can be run on Windows (or Unix). Refer to perlrun.
perl -ane "print qq(@F[0..2]\n)" file.txt
#2
3
You can download the GNU utilities for Windows and use your usual cut/awk, etc. Or, natively, you can use VBScript:
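' Print the first three space-delimited columns of the file given as the first argument.
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objArgs = WScript.Arguments
strFile = objArgs(0)
Set objFile = objFS.OpenTextFile(strFile)
' AtEndOfStream (not AtEndOfLine) becomes true only once the whole file has been read.
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    sp = Split(strLine, " ")
    s = ""
    For i = 0 To 2
        s = s & " " & sp(i)
    Next
    ' Trim drops the leading space added by the first iteration.
    WScript.Echo Trim(s)
Loop
objFile.Close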
Save the above as mysplit.vbs and run it from the command line:
c:\test> cscript //nologo mysplit.vbs file
Or just use a simple batch file:
@echo off
for /f "tokens=1,2,3 delims= " %%a in (file) do (echo %%a %%b %%c)
If you want a Python one-liner:
c:\test> type file | python -c "import sys; print('\n'.join(' '.join(i.split()[:3]) for i in sys.stdin))"
#3
2
Here is a rather simple Python script:
# Print the first three space-delimited columns of each line.
with open("my file") as f:
    for line in f:
        parts = line.rstrip("\n").split(" ")
        print(" ".join(parts[0:3]))
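The script above splits on single spaces, which mirrors what cut -d " " does: every space separates two fields, so consecutive spaces yield empty fields. As a minimal sketch of that behaviour, the function below wraps the same idea; the name cut_fields and its parameters are made up for illustration and are not part of any library.

```python
def cut_fields(line, delim=" ", first=1, last=3):
    """Return fields first..last (1-based, inclusive) of line, joined by delim.

    Hypothetical helper for illustration only. As with cut, each single
    delimiter character separates two fields, so consecutive delimiters
    produce empty fields.
    """
    # Drop the trailing newline so it does not end up inside the last field.
    parts = line.rstrip("\n").split(delim)
    return delim.join(parts[first - 1:last])


# Small demonstration on in-memory sample lines (assumed data).
for sample in ["a b c d e\n", "a  b c\n"]:
    print(repr(cut_fields(sample)))
```

If columns may be separated by runs of spaces or tabs, calling split() with no argument is probably closer to what is wanted than split(" ").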
#4
1
The easiest way to do it would be to install Cygwin and use the Unix cut command.
#5
0
If you are dealing with a text file that has very long lines and you are only interested in the first 3 columns, then splitting a fixed number of times yourself can be a lot faster than using the -a option:
perl -ne "@F = split /\s/, $_, 4; print qq(@F[0..2]\n)" file.txt
rather than
perl -ane "print qq(@F[0..2]\n)" file.txt
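This is because the -a option splits the line at every run of whitespace, which can mean a lot of unnecessary splitting when only the first few columns are needed. Note that split /\s/ treats each whitespace character as a separator, while -a splits on runs of whitespace and ignores leading whitespace, so the two commands can give different results on lines with consecutive or leading spaces.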
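The same idea carries over to Python, where str.split accepts a maxsplit argument. The sketch below is only an illustration: first_columns is a hypothetical helper name, and no timing is claimed here. Perl's limit of 4 counts fields, whereas Python's maxsplit counts splits, so the corresponding value is 3.

```python
def first_columns(line, count=3):
    """Hypothetical helper: return the first `count` whitespace-separated columns."""
    # maxsplit=count stops after `count` splits; whatever remains of the line
    # is left unsplit in the final element, which the slice then discards.
    return line.split(None, count)[:count]


# Demonstration on an in-memory sample line (assumed data).
print(" ".join(first_columns("alpha beta gamma delta epsilon zeta\n")))
```

Passing None as the separator splits on runs of whitespace and ignores leading whitespace, much like the -a option.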