Python script to delete lines from a file that contain words from an array
Question:
I have the following script. It identifies the lines to delete from a file based on an array, but it does not delete them.
What should I change?
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup(filename):
    print "fixup ", filename
    fin = open(filename)
    fout = open(filename2 , "w")
    for line in fin.readlines():
        for item in offending:
            print "got one",line
            line = line.replace(item, "MUST DELETE")
            line=line.strip()
            fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
Answer
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup(filename):
    fin = open(filename)
    fout = open(filename2 , "w")
    for line in fin:
        # Skip the line if any offending string occurs in it
        if True in [item in line for item in offending]:
            continue
        fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
Edit: or even better:
    for line in fin:
        if not True in [item in line for item in offending]:
            fout.write(line)
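The list-comprehension test above can also be written with the built-in any(), which stops at the first match. The following is only a sketch, written for Python 3 and run on an in-memory list instead of a file; the helper name keep_clean_lines and the sample lines are made up for illustration and are not part of the original answer.

```python
def keep_clean_lines(lines, offending):
    """Return only the lines that contain none of the offending substrings."""
    return [line for line in lines
            if not any(item in line for item in offending)]


offending = ["Exception", "Integer", "RuntimeException"]

# Hypothetical sample input, standing in for the lines of PC_New.txt.
sample = [
    "all good here\n",
    "java.lang.RuntimeException: boom\n",
    "value is an Integer\n",
    "still fine\n",
]

kept = keep_clean_lines(sample, offending)
```

With a real file, the same test would sit inside the for line in fin loop in place of the True in [...] expression.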
Answer
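The basic strategy is to write a copy of the input file to the output file, but with changes. In your case the change is very simple: you just leave out the lines you don't want.
Once the new file has been written safely, you can delete the original file and use os.rename() to rename the temp file to the original file name. I like to write the temp file into the same directory as the original file, both to make sure I have permission to write to that directory and because I don't know whether os.rename() can move a file from one volume to another.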
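The temp-file strategy described above might look roughly like the sketch below. It is written for Python 3 and makes a few assumptions that are not in the original answer: the function name rewrite_without is hypothetical, the test is a plain substring check, and os.replace() (available from Python 3.3) is used in place of the delete-then-os.rename() sequence because it overwrites the destination in one step.

```python
import os
import tempfile


def rewrite_without(path, offending):
    """Rewrite `path` in place, dropping lines that contain an offending substring."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so both live on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fout, open(path) as fin:
            for line in fin:
                if not any(item in line for item in offending):
                    fout.write(line)
        # Both files are closed here; swap the temp file in for the original.
        os.replace(tmp_path, path)
    except Exception:
        # Remove the temp file if anything went wrong, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A call such as rewrite_without("C:\\Python25\\PC_New.txt", offending) would then modify the file in place, so it is worth trying on a copy first.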
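You don't need to say for line in fin.readlines(); it is enough to say for line in fin. When you use .readlines(), you are telling Python to read every line of the input file into memory all at once; when you just use fin, you read one line at a time.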
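To make the difference concrete, here is a small Python 3 sketch. It uses io.StringIO as a stand-in for an open file (an assumption made so the example runs without touching the disk); both approaches produce the same lines, but only readlines() builds the whole list before the loop starts.

```python
import io

text = "first\nsecond\nthird\n"

# readlines() builds the complete list of lines up front.
all_lines = io.StringIO(text).readlines()

# Iterating over the file object yields one line per loop step.
streamed = []
for line in io.StringIO(text):
    streamed.append(line)
```

For a small file the two are interchangeable; the iteration form matters mostly when the input is large.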
Here is your code, modified to make these changes.
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def line_offends(line, offending):
    # True if any whitespace-separated word of the line is in the list
    for word in line.split():
        if word in offending:
            return True
    return False

def fixup(filename):
    print "fixup ", filename
    fin = open(filename)
    fout = open(filename2 , "w")
    for line in fin:
        if line_offends(line, offending):
            continue
        fout.write(line)
    fin.close()
    fout.close()
    #os.rename() left as an exercise for the student

fixup(sourcefile)
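If line_offends() returns True, we execute continue and the loop moves on without running the rest of the body. This means the line never gets written. For this simple example, it would be just as good to do it this way: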
    for line in fin:
        if not line_offends(line, offending):
            fout.write(line)
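I wrote it with continue because there is often non-trivial work being done in the main loop, and you want to skip all of it if the test is true. IMHO it is better to have a simple "if this line is unwanted, continue" than to indent a whole bunch of stuff inside an if for what may be a very rare condition.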
Answer
You are not writing it to the output file. Also, I would use "in" to check for the strings in the line. See the modified script below (untested):
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup(filename):
    print "fixup ", filename
    fin = open(filename)
    fout = open(filename2 , "w")
    for line in fin.readlines():
        # "in" needs a string on its left side, so test each word separately
        if not any(item in line for item in offending):
            # There are no offending words in this line
            # write it to the output file
            fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
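One detail worth seeing side by side: on strings, "in" tests for a substring, whereas the line_offends() helper in the earlier answer compares whole whitespace-separated words, so the two can disagree on the same line. Note also that a list cannot be the left operand of "in" against a string; that raises TypeError, so each word has to be tested on its own. The sketch below is Python 3, and the helper names and the sample line are made up for illustration.

```python
offending = ["Exception", "Integer", "RuntimeException"]


def contains_substring(line, words):
    """True if any entry of `words` occurs anywhere inside `line`."""
    return any(word in line for word in words)


def contains_whole_word(line, words):
    """True if any whitespace-separated token of `line` equals an entry of `words`."""
    return any(token in words for token in line.split())


# Hypothetical log line: the offending word is glued to other characters.
line = "java.lang.RuntimeException: boom"

substring_hit = contains_substring(line, offending)    # "Exception" occurs inside the line
whole_word_hit = contains_whole_word(line, offending)  # no token equals a listed word exactly
```

Which test is appropriate depends on the input: the substring test catches words attached to punctuation, while the whole-word test avoids matching inside longer identifiers.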
Answer
'''This is a fairly simple implementation, but it should do what you are looking for'''
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup(filename):
    print "fixup ", filename
    fin = open(filename)
    fout = open(filename2 , "w")
    for line in fin.readlines():
        for item in offending:
            print "got one",line
            line = line.replace(item, "MUST DELETE")
            line=line.strip()
            fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
I don't think you ever print anything. – Daenyth 2010-06-15 06:08:47
@Daenyth - Modified that line. It prints each line three times in the o/p file – romesub 2010-06-15 06:11:37