Python 3.x: print the lines after a specific header

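Question:

I have a problem I can't seem to solve. Apologies if this is a duplicate, but I never found a real answer. I am extracting specific information from a config file that presents its information in blocks of text, and I only need to print one specific block, without its header. For example (with the text format below), I would only want to capture the information below header2, and nothing past header3: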

# The output could contain multiple headers, each with several lines or none. This is an example of what could be present, not a fixed layout.

header1 
------- 
line1 
line2 
line3 # can be muiplies availables or known 

header2 
------- 
line1 
line2 
line3 # can be muiplies availables or known 

header3 
------- 

header4 
------- 
line1 
line2 
line3 # can be multiple linnes or none not known

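Here is what I started with, but I am stuck on the second loop, or on the boolean logic, needed to print only the lines of that header's block: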

Raw_file = "scrap.txt" 
scrape = open(Raw_file,"r") 


for fooline in scrape:

    if "Header" in fooline:
        # print(fooline) # prints all lines
        # print lines under header 2 and stop before header 3
        pass


scrape.close()

Answer

Use header-line detection to switch a boolean on and off, and let that boolean control printing:

RAW_FILE = "scrap.txt" 

DESIRED = 'header2' 

with open(RAW_FILE) as scrape:

    printing = False

    for line in scrape:

        if line.startswith(DESIRED):
            printing = True
        elif line.startswith('header'):
            printing = False
        elif line.startswith('-------'):
            continue
        elif printing:
            print(line, end='')

OUTPUT

> python3 test.py 
line1 
line2 
line3 # can be muiplies availables or known 

>

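Adjust as needed.

This is excellent, thanks. If I also wanted to print an item within that line, how would I go about it? I tried splitting the line and printing line[0] to get '3' (line sample = "3 man enable none"), but no luck: it keeps returning a None object. Maybe I am misunderstanding something. – onxx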
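For the follow-up about picking one item out of a line, here is a minimal sketch. It assumes the fields are separated by whitespace, as in the sample "3 man enable none"; the helper name first_field is hypothetical and not part of the answer above.

```python
def first_field(line):
    """Return the first whitespace-separated field of a line, or None if the line is blank."""
    parts = line.split()  # split() with no argument also discards the trailing newline
    return parts[0] if parts else None


sample = "3 man enable none"
print(first_field(sample))
```

Inside the printing loop this could replace print(line, end='') with print(first_field(line)). Blank lines have no fields, so the helper returns None for them and they may need to be skipped.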

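Answer

You can set a flag that starts and stops collection, based on matching the header2 and header3 lines.

With example.txt containing the full data provided, for example: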

f = "example.txt"

collect = 0  # 0 = before header2, 1 = collecting, 2 = reached header3
wanted = []

with open(f, "r") as scrape:
    for fooline in scrape:
        if "header2" in fooline:
            collect = 1
        if "header3" in fooline:
            collect = 2

        if collect == 1:
            wanted.append(fooline)
        elif collect == 2:
            break

Contents of wanted:

['header2\n', 
'-------\n', 
'line1\n', 
'line2\n', 
'line3 # can be muiplies availables or known\n', 
'\n']
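The collected list still contains the header, its dashed underline and a trailing blank line, while the question asked for the block without the header. A possible clean-up step is sketched below. The helper name body_lines is hypothetical, and it assumes the first element is always the header and that an underline consists only of dashes.

```python
def body_lines(block):
    """Drop the header, its dashed underline and blank lines from a collected block."""
    cleaned = []
    for raw in block[1:]:  # block[0] is the header line itself
        text = raw.strip()
        if not text or set(text) == {"-"}:
            continue  # skip blank lines and underlines made only of dashes
        cleaned.append(text)
    return cleaned


wanted = [
    'header2\n',
    '-------\n',
    'line1\n',
    'line2\n',
    'line3 # can be muiplies availables or known\n',
    '\n',
]
print(body_lines(wanted))
```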

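Answer

Initially, set flag to False. Check whether the line starts with header2; if it does, set flag to True. If the line starts with header3, set flag back to False.

While flag is set, print the line.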

Raw_file = "scrap.txt"
flag = False

with open(Raw_file, "r") as scrape:
    for fooline in scrape:
        if fooline.find("header3") == 0:
            flag = False  # or break
        if flag:
            print(fooline)
        if fooline.find("header2") == 0:
            flag = True

Output:

------- 

line1 

line2 

line3 # can be muiplies availables or known
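The output is double-spaced because each line read from a file keeps its trailing newline and print adds another one; the dashed underline is also printed. A variant that avoids both might look like the sketch below. The function name lines_between and its parameters are hypothetical, and it works on any iterable of lines, such as an open file.

```python
def lines_between(lines, start="header2", stop="header3"):
    """Collect the non-blank lines after `start`, up to but not including `stop`."""
    collected = []
    active = False
    for raw in lines:
        text = raw.rstrip()  # drop the trailing newline and trailing spaces
        if active and text.startswith(stop):
            break  # reached the next header of interest, nothing more to collect
        if active and text and not text.startswith("-------"):
            collected.append(text)
        if text.startswith(start):
            active = True
    return collected


sample = ["header1\n", "-------\n", "a\n", "\n",
          "header2\n", "-------\n", "line1\n", "line2\n", "\n",
          "header3\n", "-------\n"]
for text in lines_between(sample):
    print(text)
```

With a real file, passing the open file object as lines should behave the same way, since iterating a file yields one line at a time.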

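Answer

You might consider using a regular expression to break this into blocks.

If the file is of a manageable size, just read it all at once and use a regex such as: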

(^header\d+[\s\S]+?(?=^header|\Z))

to break it into blocks.

The Python code then looks like this (it captures whatever text lies between headers):

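import re

fn = "scrap.txt"  # assumed path; the original snippet left fn undefined

with open(fn) as f:
    txt = f.read()

# Each match is one block: a header line plus everything up to the next header.
for m in re.finditer(r'(^header\d+[\s\S]+?(?=^header|\Z))', txt, re.M):
    print(m.group(1))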

If the file is larger than you want to read in one gulp, you can use mmap with a regex, or read the file in fairly large chunks.
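A rough sketch of the mmap approach follows. The function name find_block is hypothetical. It assumes the file is not empty (mmap raises ValueError when asked to map an empty file) and that the text can be decoded with the default UTF-8. Because a memory-mapped file is a bytes-like object, the pattern has to be a bytes pattern.

```python
import mmap
import re


def find_block(path, header="header2"):
    """Return the block that starts at `header` as text, or None if it is absent."""
    pattern = re.compile(
        rb"^" + re.escape(header.encode()) + rb"\b[\s\S]+?(?=^header|\Z)",
        re.M,
    )
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            m = pattern.search(mm)
            # Decode before the map is closed, since the match refers to it.
            return m.group(0).decode() if m else None
```

Calling find_block("scrap.txt", "header2") would return the header2 block including its header line, in the same way as the in-memory regex.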

If you are looking for only a single header, it is even easier:

m = re.search(r'(^header2[\s\S]+?(?=^header|\Z))', txt, re.M)
if m: 
    print(m.group(1))
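The regex above returns the block with its header and dashed underline included. Since the question asked for the lines without the header, one option is to move the capture group so that it covers only the body. This is a sketch under the assumption that every header is followed by exactly one underline made of dashes; the function name block_body is hypothetical.

```python
import re


def block_body(txt, header="header2"):
    """Return the non-blank lines under `header`, or None if the header is missing."""
    pattern = re.compile(
        r"^" + re.escape(header) + r"[ \t]*\n-+[ \t]*\n([\s\S]*?)(?=^header|\Z)",
        re.M,
    )
    m = pattern.search(txt)
    if m is None:
        return None
    # group(1) holds only the text after the underline, up to the next header.
    return [ln.rstrip() for ln in m.group(1).splitlines() if ln.strip()]


txt = (
    "header1 \n------- \nline1 \n\n"
    "header2 \n------- \nline1 \nline2 \nline3 \n\n"
    "header3 \n------- \n\n"
    "header4 \n------- \nline1 \n"
)
print(block_body(txt, "header2"))
```

A header that exists but has no lines under it, such as header3 in the sample, yields an empty list, which keeps it distinguishable from a missing header.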
