Python: Problems with islice to read N number of lines at a time
    from itertools import islice

    N = 16
    infile = open("my_very_large_text_file", "r")
    lines_gen = islice(infile, N)
    for lines in lines_gen:
        ...  # process my lines
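I am trying to use "from itertools import islice" to read a large number of lines at a time from a *.las file with the liblas module. (My goal is to read the file chunk by chunk.)
Following this question: Python how to read N number of lines at a time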
islice() can be used to get the next n items of an iterator. Thus,
list(islice(f, n)) will return a list of the next n lines of the file
f. Using this inside a loop will give you the file in chunks of n
lines. At the end of the file, the list might be shorter, and finally
the call will return an empty list.
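As a minimal sketch of the behaviour described in the quote, the loop below uses an in-memory `io.StringIO` object as a stand-in for a real text file (the sample lines and the chunk size of 2 are assumptions chosen for illustration):

```python
import io
from itertools import islice

# In-memory stand-in for a text file with five lines.
f = io.StringIO("line 1\nline 2\nline 3\nline 4\nline 5\n")
n = 2

chunks = []
while True:
    chunk = list(islice(f, n))  # next n lines, fewer at the end of the file
    if not chunk:               # an empty list means the file is exhausted
        break
    chunks.append(chunk)

print([len(c) for c in chunks])  # [2, 2, 1]
```

This works because a file object is its own iterator, so each `islice` call continues where the previous one stopped.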
I use the following code:
    from numpy import nonzero
    from liblas import file as lasfile
    from itertools import islice

    chunkSize = 1000000

    f = lasfile.File(inFile, None, 'r')  # open LAS
    while True:
        chunk = list(islice(f, chunkSize))
        if not chunk:
            break
        # do other stuff
But I have this problem:
    len(f)
    2866390
    chunk = list(islice(f, 1000000))
    len(chunk)
    1000000
    chunk = list(islice(f, 1000000))
    len(chunk)
    1000000
    chunk = list(islice(f, 1000000))
    len(chunk)
    866390
    chunk = list(islice(f, 1000000))
    len(chunk)
    1000000
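When the end of the file f is reached, islice starts reading the file again from the beginning.
Thanks for any suggestions and help. It is much appreciated.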
Related discussion
- Gah, so your lasfile.File type violates all the iterator conventions?
- I am having a really hard time with lasfile.File
Change the source code of file.py, which belongs to the liblas package. Currently __iter__ is defined as (src on github):
    def __iter__(self):
        """Iterator support (read mode only)

          >>> points = []
          >>> for i in f:
          ...   points.append(i)
          ...   print i # doctest: +ELLIPSIS
          <liblas.point.Point object at ...>
        """
        if self.mode == 0:
            self.at_end = False
            p = core.las.LASReader_GetNextPoint(self.handle)
            while p and not self.at_end:
                yield point.Point(handle=p, copy=True)
                p = core.las.LASReader_GetNextPoint(self.handle)
                if not p:
                    self.at_end = True
            else:
                self.close()
                self.open()
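You can see that when the end of the file is reached, the file is closed and opened again, so the next iteration starts again at the beginning of the file.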
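To illustrate the mechanism, here is a small sketch that reproduces the same pattern without liblas. `RestartingReader` is a hypothetical stand-in, not part of the liblas API: its `pos` attribute plays the role of the native reader position, and resetting it mimics `close()` followed by `open()`. In Python, the `else` clause of a `while` loop runs once the loop condition becomes false, which here happens exactly when the data runs out.

```python
from itertools import islice

class RestartingReader:
    """Hypothetical stand-in for lasfile.File (not the real liblas class)."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0  # plays the role of the native reader's position

    def _next_item(self):
        # Return the next item, or None at the end of the data.
        if self.pos >= len(self.items):
            return None
        item = self.items[self.pos]
        self.pos += 1
        return item

    def __iter__(self):
        p = self._next_item()
        while p is not None:
            yield p
            p = self._next_item()
        else:
            # Runs when the while condition becomes false (end of data):
            # mimics self.close(); self.open(), i.e. a rewind to the start.
            self.pos = 0

reader = RestartingReader(range(1, 6))  # five items
sizes = [len(list(islice(reader, 2))) for _ in range(4)]
print(sizes)  # [2, 2, 1, 2]
```

The fourth chunk is full again instead of empty, which matches the 1000000, 1000000, 866390, 1000000 sequence in the question.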
Try removing the final else block that follows the while loop, so the correct code for the method should be:
    def __iter__(self):
        """Iterator support (read mode only)

          >>> points = []
          >>> for i in f:
          ...   points.append(i)
          ...   print i # doctest: +ELLIPSIS
          <liblas.point.Point object at ...>
        """
        if self.mode == 0:
            self.at_end = False
            p = core.las.LASReader_GetNextPoint(self.handle)
            while p and not self.at_end:
                yield point.Point(handle=p, copy=True)
                p = core.las.LASReader_GetNextPoint(self.handle)
                if not p:
                    self.at_end = True
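If patching the installed package is not an option, a possible alternative is to call `iter()` on the file object only once and pass that single iterator to every `islice` call. A generator that has finished stays finished, so the rewind at the end of the data can no longer restart the loop. The sketch below shows the idea with a hypothetical stand-in class (`RestartingReader` and `chunked` are illustrative names, not liblas functions); it has not been tested against liblas itself and assumes that `lasfile.File.__iter__` behaves like the generator shown above.

```python
from itertools import islice

class RestartingReader:
    """Hypothetical stand-in that rewinds at end of data, like the unpatched __iter__."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

    def __iter__(self):
        while self.pos < len(self.items):
            self.pos += 1
            yield self.items[self.pos - 1]
        else:
            self.pos = 0  # mimics close() followed by open()

def chunked(source, size):
    """Yield lists of up to `size` items, calling source.__iter__ exactly once."""
    it = iter(source)  # one generator shared by all islice calls
    while True:
        chunk = list(islice(it, size))
        if not chunk:  # the exhausted generator keeps returning nothing
            return
        yield chunk

chunk_lengths = [len(c) for c in chunked(RestartingReader(range(5)), 2)]
print(chunk_lengths)  # [2, 2, 1]
```

With the real library the call would look like `for chunk in chunked(f, chunkSize): ...`, where `f` is the opened `lasfile.File` from the question.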