Python 3.x: print the lines after a specific header

Question:

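I have a problem I can't seem to solve; apologies if this is a duplicate, but I could never find a real answer. I am extracting specific information from a configuration file that presents its information in blocks of text, and I need to print only one specific block, without its header. For example, with the text format below, I only want to capture the information under header2, and nothing past header3: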

# The output could contain multiple headers and lines, or no lines under a header; this is an example of what could be present, but it is not definitive.

header1 
------- 
line1 
line2 
line3 # can be muiplies availables or known 

header2 
------- 
line1 
line2 
line3 # can be muiplies availables or known 

header3 
------- 

header4 
------- 
line1 
line2 
line3 # can be multiple linnes or none not known 

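Here is where I started, but I am stuck on the second loop, boolean, or other logic needed to print only the lines of one header block: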

Raw_file = "scrap.txt"
scrape = open(Raw_file, "r")

for fooline in scrape:
    if "header" in fooline:
        # print(fooline)  # prints all lines
        # TODO: print the lines under header 2 and stop before header 3
        pass

scrape.close()

Use a boolean that the header lines switch on and off to control printing:

RAW_FILE = "scrap.txt"
DESIRED = 'header2'

with open(RAW_FILE) as scrape:
    printing = False

    for line in scrape:
        if line.startswith(DESIRED):
            printing = True       # entered the block we want
        elif line.startswith('header'):
            printing = False      # any other header ends the block
        elif line.startswith('-------'):
            continue              # skip the underline beneath a header
        elif printing:
            print(line, end='')   # line already ends with '\n'

OUTPUT

> python3 test.py 
line1 
line2 
line3 # can be muiplies availables or known 

> 

Adjust as needed.
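
As one possible adjustment, the same flag logic can be wrapped in a function that returns the lines instead of printing them. This is only a sketch: the name `lines_under` and the in-memory `sample` are made up for illustration, and it additionally drops blank lines and trailing whitespace, which the code above does not do. Note that `startswith('header2')` would also match a header such as `header20`.

```python
def lines_under(header, lines):
    """Return the non-blank lines that follow `header`, up to the next header."""
    wanted = []
    printing = False
    for line in lines:
        if line.startswith(header):
            printing = True           # entered the block we want
        elif line.startswith('header'):
            printing = False          # some other header: stop collecting
        elif line.startswith('-------'):
            continue                  # skip the underline beneath a header
        elif printing and line.strip():
            wanted.append(line.rstrip())
    return wanted


# Hypothetical stand-in for the lines of scrap.txt.
sample = """header1
-------
line1

header2
-------
line1
line2

header3
-------
""".splitlines(keepends=True)

print(lines_under('header2', sample))
```

Because it accepts any iterable of lines, an open file object can be passed in place of `sample`.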


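This is excellent, thanks. If I also wanted to print one item from that line, how would I go about it? I tried splitting the line and printing line[0] to get '3' (sample line = "3 man enable none"), but had no luck; it keeps returning a None object. Maybe I am misunderstanding something. – onxx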
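
A minimal sketch of what the comment asks for, assuming the fields are separated by whitespace. The variable names are illustrative. Indexing the string itself (`line[0]`) gives its first character, while indexing the result of `split()` gives the first field. The cause of the None cannot be determined from the comment; one common source is using the return value of `print()`, which is always None.

```python
line = "3 man enable none\n"   # sample line quoted in the comment

fields = line.split()          # split on any run of whitespace
if fields:                     # a blank line would give an empty list
    first = fields[0]
    print(first)
```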

You can set a flag that starts and stops collection based on matching header2 and header3.

This assumes example.txt contains the complete data provided. For example:

f = "example.txt"

collect = 0   # 0 = before header2, 1 = collecting, 2 = reached header3
wanted = []

with open(f, "r") as scrape:
    for fooline in scrape:
        if "header2" in fooline:
            collect = 1
        if "header3" in fooline:
            collect = 2

        if collect == 1:
            wanted.append(fooline)
        elif collect == 2:
            break

Contents of wanted:

['header2\n', 
'-------\n', 
'line1\n', 
'line2\n', 
'line3 # can be muiplies availables or known\n', 
'\n'] 
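
If more than one block is needed, the same idea extends to collecting every block in a single pass. The sketch below is an assumption-laden illustration, not part of the answer above: `parse_blocks` and `sample` are hypothetical names, and it treats any line made up only of dashes as an underline and skips blank lines.

```python
def parse_blocks(lines):
    """Map each header to the list of non-blank lines beneath it."""
    blocks = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith('header'):
            current = line               # start a new block
            blocks[current] = []
        elif not line or set(line) == {'-'}:
            continue                     # blank line or dashed underline
        elif current is not None:
            blocks[current].append(line)
    return blocks


# Hypothetical stand-in for the lines of example.txt.
sample = [
    'header1\n', '-------\n', 'line1\n', '\n',
    'header2\n', '-------\n', 'line1\n', 'line2\n', '\n',
    'header3\n', '-------\n',
]

blocks = parse_blocks(sample)
print(blocks['header2'])
```

A header with no lines, such as header3 in the sample, maps to an empty list.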

Initially, set flag to False. Check whether the line starts with header2; if it does, set flag to True. If the line starts with header3, set flag to False.

While flag is set, print the line.

Raw_file = "scrap.txt"
flag = False

with open(Raw_file, "r") as scrape:
    for fooline in scrape:
        if fooline.startswith("header3"):
            flag = False  # or break
        if flag:
            print(fooline)  # fooline keeps its '\n', so print() adds a blank line
        if fooline.startswith("header2"):
            flag = True

Output:

------- 

line1 

line2 

line3 # can be muiplies availables or known 
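
The start and stop conditions can also be expressed with `itertools.dropwhile` and `itertools.takewhile` in place of an explicit flag. This is a different technique from the loop above, shown only as a sketch; `block_body` and `sample` are hypothetical names, and unlike the loop it drops the dashed underline and blank lines.

```python
from itertools import dropwhile, takewhile


def block_body(lines, start='header2', stop='header'):
    """Return the non-blank lines between `start` and the next header."""
    # Skip everything before the start header.
    rest = dropwhile(lambda l: not l.startswith(start), lines)
    next(rest, None)                     # discard the start header itself
    # Keep lines until the next header of any kind.
    body = takewhile(lambda l: not l.startswith(stop), rest)
    return [l.rstrip() for l in body
            if l.strip() and not l.startswith('---')]


# Hypothetical stand-in for the lines of scrap.txt.
sample = [
    'header1\n', '-------\n', 'line1\n', '\n',
    'header2\n', '-------\n', 'line1\n', 'line2\n', '\n',
    'header3\n', '-------\n',
]

print(block_body(sample))
```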

You might consider using a regular expression to break this into blocks.

If the file is of manageable size, just read it all at once and use a regex such as:

(^header\d+[\s\S]+?(?=^header|\Z)) 

to break it into blocks.

The Python code then looks like this (it gets any text between headers):

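import re

fn = "scrap.txt"  # assumed file name, taken from the question

with open(fn) as f:
    txt = f.read()

# Each match is one header plus everything up to the next header (or end of text).
for m in re.finditer(r'(^header\d+[\s\S]+?(?=^header|\Z))', txt, re.M):
    print(m.group(1))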

If the file is larger than you want to read in one gulp, you can use mmap with a regex and let the file be read in fairly large chunks.
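
A rough sketch of the mmap approach, under a few assumptions: it writes a small temporary file so that it can run on its own, the file uses `\n` line endings, and the file is not empty (an empty file cannot be memory-mapped). A mapped file is bytes-like, so the pattern has to be a bytes pattern and the match is decoded afterwards.

```python
import mmap
import os
import re
import tempfile

# Small stand-in file so the sketch is self-contained.
data = (b"header1\n-------\nline1\n\n"
        b"header2\n-------\nline1\nline2\n\n"
        b"header3\n-------\n")
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

# Bytes version of the single-header regex.
pattern = re.compile(rb'^header2[\s\S]+?(?=^header|\Z)', re.M)

with open(path, 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        m = pattern.search(mm)
        # Decode while the mapping is still open.
        block = m.group(0).decode() if m else None

os.remove(path)
print(block)
```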

If you are looking for only one header, it is even easier:

m=re.search(r'(^header2[\s\S]+?(?=^header|\Z))', txt, re.M) 
if m: 
    print(m.group(1)) 
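
Since the question asks for the block without its header, the pattern can be adapted so that a capture group holds only the body. The variant below is a sketch, not the regex from the answer: it assumes the header line is always followed by a line of dashes, and `txt` here is an inline stand-in for the file contents.

```python
import re

txt = """header1
-------
line1

header2
-------
line1
line2

header3
-------
"""

# Group 1 is everything after the header2 line and its dashed underline,
# up to the next header or the end of the text.
pattern = re.compile(
    r'^header2[^\n]*\n-+[^\n]*\n([\s\S]*?)(?=^header|\Z)', re.M)

m = pattern.search(txt)
body = m.group(1).splitlines() if m else []
body = [line for line in body if line.strip()]   # drop blank lines
print(body)
```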
