How do I read a complex txt file made of data blocks and save it as a csv file in Python?

Problem description:

If I have a file structured like this:

++++++++++++++ 
Country 1 

**this sentence is not important. 
**date 25.09.2017, also not important 
******* 
Address 
**Office 

     Address A, 100 City. Country X 
**work time 09h00-16h00<br>9h00-14h00 
**www.example.com 
**[email protected]; 
**012/345 67 89 
**téléfax 123/456 67 89 
******* 
Address 
**Home Office 

     Address A, 200 City. Country X 
**[email protected]; 
**001/000 00 00 
**téléfax 111/111 11 11 
******* 
Address 
**Living address 

     Address 0, 123 City 
**[email protected] 
**000/000 00 00 
**téléfax 222/222 22 22 
++++++++++++++ 
Country 2 

**this sentence is not important. 
**date 25.09.2017, also not important 
******* 
Address 
**Office 

     AAA 11, 30 City 

     BBB 22, 30 City 
**work time 08h00-12h30 
**www.example.com 
**[email protected] 
**000/000 00 00 
**téléfax 111/11 11 11 
******* 

ETC 

I would like to put this data in a CSV file with these columns:

Country (Line right after ++++++++++++++), Address (Line right after *******), Office (after **), WorkTime (after **), Website (after **), Email (after **), Phone (after **), Fax (after **) 

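How can I do this in Python? The problem is that some listings are missing data, so I know some rows in the csv file will end up misaligned, but I don't mind doing some manual cleanup of the database afterwards. Another problem is that the country names differ, so I need to use ++++++++++++++ as the separator.

I was thinking of something like this: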

import csv 
with open('listofdata.txt', 'r') as FILE: 
    DATA = FILE.read() 

LIST = DATA.split('++++++++++++++') 

LIST2 = [] 
LIST3 = [] 
LIST4 = [] 

for ITEMS in LIST: 
    LIST2 = ITEMS.split('*******')  
    for items2 in LIST2: 
     LIST3 = items2.split('**') 
     LIST4.append(LIST3) 


with open('file.csv', 'w') as CSV: 
    for ITEMS in LIST4: 
     csv.write(ITEMS) 

But it doesn't work.

Error: `Traceback (most recent call last): File "test.py", line 22, in <module> csv.write(ITEMS) AttributeError: 'module' object has no attribute 'write'`

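In the last line, you wrote "csv" instead of the file object "CSV", which is why you get the error. "csv" is the imported module, and it has no write attribute. Even CSV.write(ITEMS) would fail, because write expects a string and ITEMS is a list.

I have added the steps for using the csv module with your code.

All that is left for you to do is the parsing.

Code: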

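import csv

# Read the whole file into one string.
with open('listofdata.txt', 'r') as FILE:
    DATA = FILE.read()

# One chunk per country.
LIST = DATA.split('++++++++++++++')

LIST4 = []

for ITEMS in LIST:
    # One block per address within the country chunk.
    LIST2 = ITEMS.split('*******')
    for items2 in LIST2:
        # One field per '**' marker; each block becomes one CSV row.
        LIST3 = items2.split('**')
        LIST4.append(LIST3)

# writerow is called on a csv.writer wrapping the file object,
# not on the csv module itself.
with open('file.csv', 'w') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    for ITEMS in LIST4:
        spamwriter.writerow(ITEMS)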

Output:

"" 

" 
Country 1 

","this sentence is not important. 
","date 25.09.2017, also not important 
" 

" 
Address 
","Office 

     Address A, 100 City. Country X 
","work time 09h00-16h00<br>9h00-14h00 
","www.example.com 
","[email protected]; 
","012/345 67 89 
","téléfax 123/456 67 89 
" 

" 
Address 
","Home Office 

     Address A, 200 City. Country X 
","[email protected]; 
","001/000 00 00 
","téléfax 111/111 11 11 
" 

" 
Address 
","Living address 

     Address 0, 123 City 
","[email protected] 
","000/000 00 00 
","téléfax 222/222 22 22 
" 

" 
Country 2 

","this sentence is not important. 
","date 25.09.2017, also not important 
" 

" 
Address 
","Office 

     AAA 11, 30 City 

     BBB 22, 30 City 
","work time 08h00-12h30 
","www.example.com 
","[email protected] 
","000/000 00 00 
","téléfax 111/11 11 11 
" 

" 
" 

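Thanks a lot, mate! Now I just need to figure out how to write that csv using the rules above: Country, Address, Office, WorkTime, Website, Email, Phone, Fax. – CsharpNoob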

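Use csv.writer when you save to the csv file. But first you have to write a parser for the structure of your listofdata.txt file; only then can you save the data to a csv file.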
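As a rough illustration of what such a parser might look like, here is a minimal sketch for Python 3. The function names parse_blocks and write_csv, the keyword rules used to recognise each field ("work time", "www.", "@", "téléfax", a leading digit for the phone number), and the choice to put the first line after ** in Office and the indented lines in Address are all assumptions made for this example, not something stated in the question. Fields that are missing from a block are simply left empty.

```python
import csv

COLUMNS = ["Country", "Address", "Office", "WorkTime",
           "Website", "Email", "Phone", "Fax"]


def parse_blocks(text):
    """Return one dict per address block; missing fields stay empty."""
    rows = []
    for chunk in text.split("++++++++++++++"):
        parts = chunk.split("*******")
        header = [ln.strip() for ln in parts[0].splitlines() if ln.strip()]
        if not header:
            continue  # nothing before the first country separator
        country = header[0]  # line right after ++++++++++++++
        for block in parts[1:]:
            fields = [f.strip() for f in block.split("**")]
            if len(fields) < 2:
                continue  # trailing text without any ** fields
            row = dict.fromkeys(COLUMNS, "")
            row["Country"] = country
            # fields[0] is the literal "Address" label; fields[1] holds
            # the office name followed by the indented address lines.
            lines = [ln.strip() for ln in fields[1].splitlines() if ln.strip()]
            row["Office"] = lines[0] if lines else ""
            row["Address"] = "; ".join(lines[1:])
            # Guess the remaining fields from their content (assumed rules).
            for field in fields[2:]:
                lower = field.lower()
                if lower.startswith("work time"):
                    row["WorkTime"] = field[len("work time"):].strip()
                elif lower.startswith("www."):
                    row["Website"] = field
                elif "@" in field:
                    row["Email"] = field.rstrip(";")
                elif lower.startswith("téléfax"):
                    row["Fax"] = field[len("téléfax"):].strip()
                elif field and field[0].isdigit():
                    row["Phone"] = field
            rows.append(row)
    return rows


def write_csv(rows, out):
    """Write the header and one line per parsed block to a file object."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow([row[name] for name in COLUMNS])


# Possible usage (file names taken from the question):
# with open("listofdata.txt", encoding="utf-8") as src:
#     rows = parse_blocks(src.read())
# with open("file.csv", "w", newline="", encoding="utf-8") as dst:
#     write_csv(rows, dst)
```

Because each field is recognised by its content rather than by its position, a block with no website or work time still lands its phone and fax numbers in the right columns. Any field that matches none of the rules is dropped, so the rules would need adjusting for the real data.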

Alternatively, you can use csv.DictWriter, but either way you have to write the parser first.
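To show why csv.DictWriter can be convenient when some listings lack data, here is a small Python 3 sketch. The rows below are hand-written stand-ins for whatever a parser would produce, and the helper name rows_to_csv is made up for this example. The restval argument fills in any column that a row does not provide, so incomplete rows stay aligned with the header.

```python
import csv
import io

COLUMNS = ["Country", "Address", "Office", "WorkTime",
           "Website", "Email", "Phone", "Fax"]


def rows_to_csv(rows, out):
    """Write dict rows to a file object, leaving missing columns empty."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS,
                            restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical parser output: the second row has no work time or website.
rows = [
    {"Country": "Country 1", "Office": "Office",
     "Address": "Address A, 100 City. Country X",
     "WorkTime": "09h00-16h00<br>9h00-14h00",
     "Website": "www.example.com",
     "Phone": "012/345 67 89", "Fax": "123/456 67 89"},
    {"Country": "Country 1", "Office": "Living address",
     "Address": "Address 0, 123 City",
     "Phone": "000/000 00 00", "Fax": "222/222 22 22"},
]

buffer = io.StringIO()  # stands in for an open csv file
rows_to_csv(rows, buffer)
print(buffer.getvalue())
```

With a real file, the io.StringIO buffer would be replaced by open('file.csv', 'w', newline=''), which is how the csv module documentation recommends opening output files in Python 3.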