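Processing CSV files with Python

Problem description:

I'm trying to output the differences between two CSV files, compared on two columns, and write them to a third CSV file. How can I make the code below compare on columns 0 and 3?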

import csv

f1 = open("ted.csv")
oldFile1 = csv.reader(f1, delimiter=',')
oldList1 = list(oldFile1)

f2 = open("ted2.csv")
newFile2 = csv.reader(f2, delimiter=',')
newList2 = list(newFile2)

f1.close()
f2.close()

output1 = set(tuple(row) for row in newList2 if row not in oldList1)
output2 = set(tuple(row) for row in oldList1 if row not in newList2)

with open('Michal_K.csv', 'w') as csvfile:
    wr = csv.writer(csvfile, delimiter=',')
    for line in output2.difference(output1):
        wr.writerow(line)

This is exactly what pandas was written for. Take a look at that library! – AZhao

Ah, I see, thanks. I'll take a look. –

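To compare: if you want the rows from ted.csv whose first and fourth column values do not both match any row in ted2.csv, build a set of those value pairs from the rows of ted2.csv and check each row from ted.csv against it before writing: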

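import csv

with open("ted.csv", newline='') as f1, open("ted2.csv", newline='') as f2, \
        open('foo.csv', 'w', newline='') as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    # Keys (first and fourth columns) of every row in ted2.csv.
    st = set((row[0], row[3]) for row in r2)
    wr = csv.writer(out)
    # Write only the ted.csv rows whose key does not appear in ted2.csv.
    for row in r1:
        if (row[0], row[3]) not in st:
            wr.writerow(row)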
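
To see what this key-based filter does without touching any files, here is a minimal sketch on in-memory data. The helper name rows_not_matching and the sample rows are made up for illustration; only csv.reader and io.StringIO come from the standard library.

```python
import csv
import io


def rows_not_matching(rows, other_rows, key_cols=(0, 3)):
    """Return the rows whose key columns match no row in other_rows.

    Hypothetical helper for illustration; key_cols defaults to the
    first and fourth columns, i.e. row[0] and row[3].
    """
    seen = {tuple(row[i] for i in key_cols) for row in other_rows}
    return [row for row in rows if tuple(row[i] for i in key_cols) not in seen]


# Invented sample data standing in for ted.csv and ted2.csv.
ted = io.StringIO("a,1,x,10\nb,2,y,20\nc,3,z,30\n")
ted2 = io.StringIO("a,9,q,10\nb,2,y,99\n")

result = rows_not_matching(csv.reader(ted), csv.reader(ted2))
print(result)
```

Only the first and fourth fields are compared, so the "a" row is dropped even though its middle columns differ, while the "b" row is kept because its fourth field differs.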

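If what you actually want is closer to your original code, a symmetric difference that gives the rows unique to either file, then build a set of the first and fourth columns from each file: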

import csv
from itertools import chain

with open("ted.csv", newline='') as f1, open("ted2.csv", newline='') as f2, \
        open('foo.csv', 'w', newline='') as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    # First pass: collect the (first, fourth) column keys from each file.
    st1 = set((row[0], row[3]) for row in r1)
    st2 = set((row[0], row[3]) for row in r2)
    # Rewind both files and create fresh readers for the second pass.
    f1.seek(0)
    f2.seek(0)
    r1, r2 = csv.reader(f1), csv.reader(f2)
    wr = csv.writer(out)
    # Rows whose key appears in only one of the two files.
    output1 = (row for row in r1 if (row[0], row[3]) not in st2)
    output2 = (row for row in r2 if (row[0], row[3]) not in st1)
    for row in chain(output1, output2):
        wr.writerow(row)
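
If both files fit comfortably in memory, one alternative is to read each into a list once instead of rewinding with seek(0). This is only a sketch under that assumption; symmetric_difference_rows and key are hypothetical names, and the sample data is invented.

```python
import csv
import io
from itertools import chain


def key(row):
    # Compare rows on the first and fourth columns only.
    return (row[0], row[3])


def symmetric_difference_rows(rows1, rows2):
    """Return rows whose key appears in exactly one of the two inputs."""
    rows1, rows2 = list(rows1), list(rows2)
    keys1 = {key(r) for r in rows1}
    keys2 = {key(r) for r in rows2}
    only1 = (r for r in rows1 if key(r) not in keys2)
    only2 = (r for r in rows2 if key(r) not in keys1)
    return list(chain(only1, only2))


# Invented sample data standing in for the two files.
old = io.StringIO("a,1,x,10\nb,2,y,20\n")
new = io.StringIO("b,7,w,20\nc,3,z,30\n")

sym = symmetric_difference_rows(csv.reader(old), csv.reader(new))
print(sym)
```

The "b" rows share the same first and fourth fields, so neither is reported; the "a" and "c" rows each appear in only one input.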

Thanks, I'm after the rows that don't have the same elements in row[0] and row[3]. I'll still try the second method to find the differences. –

The second method should give you the symmetric difference based on the first and fourth columns. –

The second method gave me a "list index out of range" error. –
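
An IndexError here would be consistent with at least one row having fewer than four fields, for example a blank line (csv.reader yields an empty list for those) or a short trailing line. That is a guess, since the data is not shown. One possible guard is to skip any row too short to contain every key column; keyed_rows is a hypothetical helper and the sample data is invented.

```python
import csv
import io

KEY_COLS = (0, 3)  # first and fourth columns


def keyed_rows(reader, key_cols=KEY_COLS):
    """Yield (key, row) pairs, skipping rows too short to hold every key column."""
    needed = max(key_cols) + 1
    for row in reader:
        if len(row) < needed:
            continue  # blank or truncated line: no fourth column to compare
        yield tuple(row[i] for i in key_cols), row


# Invented sample with a blank line and a two-field line mixed in.
sample = io.StringIO("a,1,x,10\n\nshort,row\nb,2,y,20\n")

pairs = list(keyed_rows(csv.reader(sample)))
keys = [k for k, _ in pairs]
print(keys)
```

Whether silently skipping such rows is acceptable depends on the data; logging or collecting them separately may be preferable if they indicate a malformed file.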