Python 3.x: print the lines that follow a specific header
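Question:
I have a problem I can't seem to solve. Apologies if this is a duplicate, but I never found a real answer. I am extracting specific information from a configuration file, which displays the information in blocks of text. I only need to print one specific block, without its header. So, for example (with the text format below), I would only want to capture the information beneath header2, and nothing past header3: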
# The output could contain multiple headers and lines, or no lines per header. This is an example of what could be present, but it is not absolute.
header1
-------
line1
line2
line3 # can be muiplies availables or known
header2
-------
line1
line2
line3 # can be muiplies availables or known
header3
-------
header4
-------
line1
line2
line3 # can be multiple linnes or none not known
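Here is where I started, but I got stuck on the second loop, or on the boolean logic needed to print only the lines of one header block: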
Raw_file = "scrap.txt"
scrape = open(Raw_file,"r")
for fooline in scrape:
    if "Header" in fooline:
        #print(fooline) # prints all lines
        #print lines under header 2 and stop before header 3
scrape.close()
Answer
Use a boolean to control printing, and let the detected header lines switch it on and off:
RAW_FILE = "scrap.txt"
DESIRED = 'header2'
with open(RAW_FILE) as scrape:
    printing = False
    for line in scrape:
        if line.startswith(DESIRED):
            printing = True
        elif line.startswith('header'):
            printing = False
        elif line.startswith('-------'):
            continue
        elif printing:
            print(line, end='')
OUTPUT
> python3 test.py
line1
line2
line3 # can be muiplies availables or known
>
Adjust as needed.
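The same on/off idea can be wrapped in a function that returns the lines instead of printing them. This is only a sketch: the name `lines_under` and its parameters are made up for illustration, and it assumes every header starts with `header` and is followed by a dashed rule, as in the sample. It compares the whole header line rather than using `startswith`, so that `header2` does not also match something like `header20`.

```python
import io

def lines_under(stream, desired, header_prefix="header", rule_prefix="---"):
    """Return the lines that follow the `desired` header, up to the next header."""
    collected = []
    printing = False
    for line in stream:
        if line.startswith(header_prefix):
            # Any header line switches collection on or off.
            printing = line.strip() == desired
        elif line.startswith(rule_prefix):
            continue  # skip the dashed rule under each header
        elif printing:
            collected.append(line.rstrip("\n"))
    return collected

# io.StringIO stands in for the open file so the example is self-contained.
sample = io.StringIO(
    "header1\n-------\nline1\nheader2\n-------\nline1\nline2\nheader3\n-------\n"
)
print(lines_under(sample, "header2"))  # prints ['line1', 'line2']
```

With a real file, pass the object returned by `open("scrap.txt")` in place of the `StringIO`.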
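Answer
You can set a flag that starts and stops collection based on matching the text header2 and header3.
This assumes example.txt contains the complete data provided. For example: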
f = "example.txt"
scrape = open(f,"r")
collect = 0
wanted = []
for fooline in scrape:
    if "header2" in fooline:
        collect = 1
    if "header3" in fooline:
        collect = 2
    if collect == 1:
        wanted.append(fooline)
    elif collect == 2:
        break
scrape.close()
wanted
Output:
['header2\n',
'-------\n',
'line1\n',
'line2\n',
'line3 # can be muiplies availables or known\n',
'\n']
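Since the question says there may be any number of headers, some with no lines at all, one possible extension is to collect every block into a dictionary and then pick the one you want. This is a sketch under the same assumptions as the sample data (headers begin with `header`, each is followed by a dashed rule). The function name `parse_blocks` is hypothetical, and unlike the list above it drops the header, the rule, and blank lines.

```python
import io

def parse_blocks(stream, header_prefix="header", rule_prefix="---"):
    """Map each header to the list of lines found beneath it."""
    blocks = {}
    current = None  # header we are currently collecting under
    for raw in stream:
        line = raw.rstrip("\n")
        if line.startswith(header_prefix):
            current = line.strip()
            blocks[current] = []  # a header with no lines keeps an empty list
        elif line.startswith(rule_prefix) or not line.strip():
            continue  # skip dashed rules and blank lines
        elif current is not None:
            blocks[current].append(line)
    return blocks

sample = io.StringIO(
    "header1\n-------\nline1\nheader2\n-------\nline1\nline2\n\nheader3\n-------\n"
)
blocks = parse_blocks(sample)
print(blocks["header2"])  # prints ['line1', 'line2']
print(blocks["header3"])  # prints []
```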
Answer
Initially, set flag to False. Check whether the line starts with header2; if it does, set flag to True. If the line starts with header3, set flag back to False.
If flag is set, print the line.
Raw_file = "scrap.txt"
scrape = open(Raw_file,"r")
flag = False
for fooline in scrape:
    if fooline.find("header3") == 0: flag = False # or break
    if flag:
        print(fooline, end='')  # each line already ends with a newline
    if fooline.find("header2") == 0: flag = True
scrape.close()
Output:
-------
line1
line2
line3 # can be muiplies availables or known
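Answer
You might consider using a regular expression to break this into blocks.
If the file is of a manageable size, just read it all at once and use a regex such as: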
(^header\d+[\s\S]+?(?=^header|\Z))
to break it into blocks.
The Python code would then look like this (it gets whatever text lies between the headers):
import re
fn = "scrap.txt"  # assumed file name; the original snippet leaves fn undefined
with open(fn) as f:
    txt=f.read()
for m in re.finditer(r'(^header\d+[\s\S]+?(?=^header|\Z))', txt, re.M):
    print(m.group(1))
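If the file is larger than you want to read in one gulp, you can use mmap with a regex and read the file in fairly large blocks.
If you are looking for only a single header, it is even easier: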
m=re.search(r'(^header2[\s\S]+?(?=^header|\Z))', txt, re.M)
if m:
    print(m.group(1))
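The mmap suggestion above might look roughly like this. It is a sketch, not a tuned implementation: `block_from_large_file` is a hypothetical name, the pattern has to be a bytes pattern because a memory-mapped file exposes bytes, and the file is assumed to be non-empty (mapping an empty file raises `ValueError`). The `\b` after the header name is an addition so that `header2` does not also match `header20`.

```python
import mmap
import os
import re
import tempfile

def block_from_large_file(path, header=b"header2"):
    """Return the text of one header block, searching a memory-mapped file."""
    pattern = re.compile(
        rb"^" + re.escape(header) + rb"\b[\s\S]+?(?=^header|\Z)", re.M
    )
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS pages it in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            m = pattern.search(mm)
            # Decode inside the with-block, while the mapping is still open.
            return m.group(0).decode() if m else None

# Write a small stand-in file so the example runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("header1\n-------\nline1\nheader2\n-------\nline1\nline2\nheader3\n-------\n")
print(block_from_large_file(tmp.name), end="")
os.remove(tmp.name)
```

As with the regex above, the returned block still includes the header and its dashed rule.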
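This is excellent, thanks. If I also wanted to print one item from the line, how would I go about that? I tried splitting it and printing line[0] to get '3' (sample line = "3 man enable none"), but with no luck; it keeps returning a None object, so maybe I am misunderstanding something. – onxx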
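For the follow-up in the comment, splitting the sample line on whitespace and taking the first element does give '3'. The commenter's code is not shown, so the cause of the None is a guess. One common source is using the return value of a function that returns nothing, for example `print()` itself, which always returns None. A minimal sketch, with `first_field` as a hypothetical helper:

```python
def first_field(line):
    """Return the first whitespace-separated field of a line, or None if it is blank."""
    parts = line.split()
    return parts[0] if parts else None

sample = "3 man enable none"
print(first_field(sample))  # prints 3

# print() returns None, so storing its result is one way to end up with None.
result = print(sample.split()[0])  # prints 3
print(result)                      # prints None
```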