I want to skip the first 17 lines while reading a text file.
Let's say the file looks like:
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
good stuff
I just want the good stuff. What I'm doing is a lot more complicated, but this is the part I'm having trouble with.
8 solutions
#1
72
Use a slice, as shown below:
with open('yourfile.txt') as f:
    lines_after_17 = f.readlines()[17:]
If the file is too big to load into memory:
with open('yourfile.txt') as f:
    for _ in range(17):  # xrange in Python 2
        next(f)
    for line in f:
        pass  # do stuff
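A minimal sketch comparing the two variants above on a throwaway file. The file contents and the helper name skip_then_read are made up for illustration. It uses next(f, None) rather than next(f), on the assumption that a file with fewer than 17 lines should yield nothing instead of raising StopIteration.

```python
import os
import tempfile

def skip_then_read(path, n):
    """Return the lines of `path` after the first `n`, without loading the skipped ones."""
    with open(path) as f:
        for _ in range(n):
            # The None default keeps next() from raising StopIteration
            # when the file has fewer than n lines.
            next(f, None)
        return [line.rstrip('\n') for line in f]

# Build a sample file: 17 lines of "0" followed by one line of real data.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('0\n' * 17 + 'good stuff\n')
    sample_path = tmp.name

# Variant 1: read everything, then slice.
with open(sample_path) as f:
    sliced = [line.rstrip('\n') for line in f.readlines()[17:]]

# Variant 2: advance the file iterator, then read the rest.
lazy = skip_then_read(sample_path, 17)
too_short = skip_then_read(sample_path, 100)
os.remove(sample_path)

print(sliced, lazy, too_short)
```

Both variants return the same lines; the second never holds the skipped lines in memory.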
#2
17
Use itertools.islice, starting at index 17. It will automatically skip the first 17 lines.
import itertools
with open('file.txt') as f:
    for line in itertools.islice(f, 17, None):  # start=17, stop=None
        pass  # process lines
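To see what islice keeps, here is a small self-contained check. It uses io.StringIO as a stand-in for an open file; the sample contents are invented for illustration.

```python
import io
from itertools import islice

# Hypothetical stand-in for an open file: 17 junk lines, then two real ones.
fake_file = io.StringIO('0\n' * 17 + 'good stuff\nmore good stuff\n')

# islice(iterable, start, stop): start=17 drops the items at indices 0..16,
# and stop=None keeps going until the iterator is exhausted.
kept = [line.rstrip('\n') for line in islice(fake_file, 17, None)]
print(kept)
```

Because islice consumes the file lazily, only one line at a time is held in memory.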
#3
2
for line in dropwhile(isBadLine, lines):
    pass  # process as you see fit
Full demo:
from itertools import dropwhile
def isBadLine(line):
    # lines read from a file keep their trailing newline, so strip it first
    return line.strip() == '0'
with open(...) as f:
    for line in dropwhile(isBadLine, f):
        pass  # process as you see fit
Advantages: This extends easily to cases where your prefix lines are more complicated than "0" (but not interdependent).
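One property worth checking is that dropwhile only discards the leading run of matching lines; once a line fails the predicate, every later line is passed through, even if it would have matched. A rough sketch, with io.StringIO standing in for the file and an invented sample:

```python
import io
from itertools import dropwhile

def is_bad_line(line):
    # Lines read from a file keep their trailing newline, so strip before comparing.
    return line.strip() == '0'

# Hypothetical sample: three junk lines, real data, then a "0" that belongs to the data.
fake_file = io.StringIO('0\n0\n0\ngood stuff\n0\nmore good stuff\n')

kept = [line.rstrip('\n') for line in dropwhile(is_bad_line, fake_file)]
print(kept)
```

The "0" after "good stuff" survives, which is usually what you want for a header-skipping filter.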
#4
2
This solution skips every line before the one numbered linetostart. You also get the index (int) and the line (string) if you want to keep track of those. Note that the second argument of enumerate only sets the starting value of the counter and does not skip anything, so the loop has to skip the early lines itself. In your case, replace linetostart with 18, or assign 18 to the linetostart variable.
with open("file.txt", 'r') as f:
    for i, line in enumerate(f, 1):  # i is the 1-based line number
        if i < linetostart:
            continue  # skip lines before linetostart
        print(i, line)  # your code
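A quick check of how the start argument of enumerate behaves, since it is easy to misread: it changes the numbering, not which items are produced. The sample data below is invented, and io.StringIO stands in for the file.

```python
import io

# enumerate(iterable, start) only offsets the counter; the first item is still yielded.
first_pair = next(enumerate(io.StringIO('a\nb\n'), 18))

# To actually skip, number the lines from 1 and ignore the early ones.
fake_file = io.StringIO(''.join('line %d\n' % n for n in range(1, 21)))
linetostart = 18

seen = []
for i, line in enumerate(fake_file, 1):
    if i < linetostart:
        continue  # lines 1..17 are read but not processed
    seen.append((i, line.rstrip('\n')))

print(first_pair)
print(seen)
```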
#5
0
Here is a method to get the lines between two line numbers in a file:
import sys
def file_line(name, start=1, end=sys.maxsize):  # sys.maxint in Python 2
    lc = 0
    with open(name) as f:
        for line in f:
            lc += 1
            if start <= lc <= end:
                yield line
s = '/usr/share/dict/words'
l1 = list(file_line(s, 235880))
l2 = list(file_line(s, 1, 10))
print(l1)
print(l2)
Output:
['Zyrian\n', 'Zyryan\n', 'zythem\n', 'Zythia\n', 'zythum\n', 'Zyzomys\n', 'Zyzzogeton\n']
['A\n', 'a\n', 'aa\n', 'aal\n', 'aalii\n', 'aam\n', 'Aani\n', 'aardvark\n', 'aardwolf\n', 'Aaron\n']
Just call it with only the start argument to get everything from line n to EOF.
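The same 1-based, inclusive line range can also be expressed with itertools.islice, which stops reading as soon as the end line is reached. This is an alternative sketch, not the answer's own code; the name lines_between and the sample data are hypothetical.

```python
import io
from itertools import islice

def lines_between(f, start=1, end=None):
    """Yield lines numbered start..end (1-based, inclusive) from an open file."""
    # islice uses 0-based, end-exclusive indices, hence start - 1 and end unchanged.
    return islice(f, start - 1, end)

# Hypothetical stand-in for a word list with ten entries.
fake_file = io.StringIO(''.join('w%d\n' % n for n in range(1, 11)))

middle = [w.strip() for w in lines_between(fake_file, 3, 5)]
fake_file.seek(0)  # rewind before taking a second range
tail = [w.strip() for w in lines_between(fake_file, 9)]

print(middle)
print(tail)
```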
#6
0
If it's a table:
import pandas as pd
pd.read_table("path/to/file", sep="\t", index_col=0, skiprows=17)
#7
0
If you don't want to read the whole file into memory at once, you can use a few tricks:
With next(iterator) you can advance to the next line:
with open("filename.txt") as f:
    next(f)
    next(f)
    next(f)
    for line in f:
        print(line)
Of course, this is slightly ugly, so itertools has a better way of doing this:
from itertools import islice
with open("filename.txt") as f:
    # skip the first 17 lines, then never stop (None) until the end
    for line in islice(f, 17, None):
        print(line)
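If you would rather keep a plain "for line in f" loop afterwards, the repeated next(f) calls can be folded into a small helper. The sketch below follows the consume recipe from the itertools documentation; the sample data is invented and io.StringIO stands in for the file.

```python
import io
from itertools import islice

def consume(iterator, n):
    """Advance `iterator` by n items, discarding them."""
    # An islice with start == stop yields nothing, but asking it for an item
    # still advances the underlying iterator by n. The None default avoids
    # StopIteration on the (always empty) slice.
    next(islice(iterator, n, n), None)

fake_file = io.StringIO('0\n' * 17 + 'good stuff\n')
consume(fake_file, 17)

rest = [line.rstrip('\n') for line in fake_file]
print(rest)
```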
#8
-1
You can use a list comprehension to make it a one-liner (fl is the open file object):
[fl.readline() for i in range(17)]
More about list comprehensions in PEP 202 and in the Python documentation.