Compare two DataFrames and add the missing columns, Python 3.6

Question:

I have a DataFrame with 10 columns:

df1: col1, col2, col3, col4, col5, col6, col7, col8, col9, col10 

and another with 5 columns:

df2: col1, col2, col6, col9, col3 

I want to compare df2 with df1 and add to df2 the columns that are not present in it.

This is not a duplicate of "Compare Pandas dataframes and add column": I don't want to add any values from df1, I just want to add blank columns.

import pandas as pd

dfa = pd.DataFrame({'a':[1,2,3], 'b':[5,6,7]})
dfb = pd.DataFrame({'a':[7,7,7], 'c':[4,4,4], 'e':[0,0,0]})

>>> dfa
   a  b
0  1  5
1  2  6
2  3  7
>>> dfb
   a  c  e
0  7  4  0
1  7  4  0
2  7  4  0

Find the column(s) that are in dfb but not in dfa:

>>> col_diff = dfb.columns.difference(dfa.columns) 
>>> col_diff 
Index(['c', 'e'], dtype='object') 

Make a list of the new columns and add them:

>>> new = col_diff.tolist() 
>>> new 
['c', 'e'] 
>>> 
>>> for col in new: 
...  dfa[col] = None 

>>> dfa
   a  b     c     e
0  1  5  None  None
1  2  6  None  None
2  3  7  None  None
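As a possible alternative to the loop, the same blank columns can be added in one step with DataFrame.reindex. This is only a sketch of a different technique from the one shown above: reindex fills the new columns with NaN rather than None, and it returns a new DataFrame instead of modifying dfa in place. The name `widened` is made up for this illustration.

```python
import pandas as pd

dfa = pd.DataFrame({'a': [1, 2, 3], 'b': [5, 6, 7]})
dfb = pd.DataFrame({'a': [7, 7, 7], 'c': [4, 4, 4], 'e': [0, 0, 0]})

# Columns present in dfb but missing from dfa.
col_diff = dfb.columns.difference(dfa.columns)

# Keep dfa's columns in order and append the missing ones;
# reindex creates the new columns filled with NaN.
widened = dfa.reindex(columns=dfa.columns.tolist() + col_diff.tolist())

print(widened.columns.tolist())
```

If None is specifically needed instead of NaN, the loop or the assign approach is the safer choice.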

Using DataFrame.assign (starting from the same initial DataFrames):

>>> # try it when the df indices are different 
>>> dfc = dfb.set_index('a') 
>>> dfc
   c  e
a
7  4  0
7  4  0
7  4  0

>>> diff = dfc.columns.difference(dfa.columns) 
>>> new = diff.tolist() 
>>> new = {col:None for col in new} 
>>> dfa = dfa.assign(**new) 

>>> dfa
   a  b     c     e
0  1  5  None  None
1  2  6  None  None
2  3  7  None  None
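The assign approach could be wrapped in a small helper and applied to the df1/df2 case from the question. This is a minimal sketch: `add_missing_columns` is a hypothetical name, not a pandas function, the sample frames are shortened stand-ins for the real data, and it assumes the column names are strings (assign takes them as keyword arguments).

```python
import pandas as pd

def add_missing_columns(target, reference, fill=None):
    """Return a copy of target with a blank column for every
    column that exists only in reference (hypothetical helper)."""
    missing = reference.columns.difference(target.columns)
    # assign returns a new DataFrame; target itself is left unchanged.
    return target.assign(**{col: fill for col in missing})

# Shortened stand-ins for the question's df1 and df2.
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [7, 8], 'col3': [9, 10]})

result = add_missing_columns(df2, df1)
print(result.columns.tolist())
```

Only the column labels of df1 are used here, so no values are copied over from it.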

For this to work, the indexes must match. Assuming they do, you can try:

pd.concat([df1.drop(df2.columns, axis=1), df2], axis=1)
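To see what the concat line produces, here is a small sketch with made-up frames that share the default index. Note that this approach carries df1's values over for the columns df2 lacks, rather than leaving them blank, and that drop assumes every column of df2 also exists in df1.

```python
import pandas as pd

# Made-up sample data; both frames use the default 0..n-1 index.
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [7, 8], 'col3': [9, 10]})

# Drop from df1 the columns df2 already has, then place df2 alongside.
combined = pd.concat([df1.drop(df2.columns, axis=1), df2], axis=1)

print(combined)
```

The columns taken from df1 come first in the result, followed by df2's columns in their original order.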