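Compare 2 DataFrames and add the differing columns (Python 3.6)
Problem description:
df1 has 10 columns: col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
df2 has 5 columns: col1, col2, col6, col9, col3
I want to compare df2 with df1 and add to df2 the columns of df1 that are not present in it.
This is not a duplicate of "Compare Pandas dataframes and add column": I do not want to add any values from df1, I just want to add blank columns.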
Answer
>>> import pandas as pd
>>> dfa = pd.DataFrame({'a':[1,2,3], 'b':[5,6,7]})
>>> dfb = pd.DataFrame({'a':[7,7,7], 'c':[4,4,4], 'e':[0,0,0]})
>>> dfa
   a  b
0  1  5
1  2  6
2  3  7
>>> dfb
   a  c  e
0  7  4  0
1  7  4  0
2  7  4  0
Find the column(s) that dfb has and dfa lacks:
>>> col_diff = dfb.columns.difference(dfa.columns)
>>> col_diff
Index(['c', 'e'], dtype='object')
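One detail worth knowing: Index.difference sorts its result by default, so the new columns may not come back in the order they appear in the source frame. The sketch below uses the column names from the question on two hypothetical empty frames and compares it with a plain list comprehension that keeps the original order.

```python
import pandas as pd

# Hypothetical empty frames that only carry the column names from the question.
df1 = pd.DataFrame(columns=[f"col{i}" for i in range(1, 11)])
df2 = pd.DataFrame(columns=["col1", "col2", "col6", "col9", "col3"])

# Index.difference sorts its result by default, so "col10" sorts before "col4".
missing_sorted = df1.columns.difference(df2.columns).tolist()

# A list comprehension keeps the order in which the columns appear in df1.
missing_ordered = [col for col in df1.columns if col not in df2.columns]

print(missing_sorted)   # ['col10', 'col4', 'col5', 'col7', 'col8']
print(missing_ordered)  # ['col4', 'col5', 'col7', 'col8', 'col10']
```

Either list can be used to add the columns; the choice only affects the order in which they are added.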
Make a list of the new columns and add them:
>>> new = col_diff.tolist()
>>> new
['c', 'e']
>>>
>>> for col in new:
...     dfa[col] = None
...
>>> dfa
   a  b     c     e
0  1  5  None  None
1  2  6  None  None
2  3  7  None  None
>>>
Using DataFrame.assign (starting from the same initial DataFrames):
>>> # try it when the df indices are different
>>> dfc = dfb.set_index('a')
>>> dfc
   c  e
a
7  4  0
7  4  0
7  4  0
>>> diff = dfc.columns.difference(dfa.columns)
>>> new = diff.tolist()
>>> new = {col:None for col in new}
>>> dfa = dfa.assign(**new)
>>> dfa
   a  b     c     e
0  1  5  None  None
1  2  6  None  None
2  3  7  None  None
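As a possible alternative (not part of the answer above), DataFrame.reindex can add the missing columns in one call and fills them with NaN rather than None. This is only a sketch: add_missing_columns is a hypothetical helper name, and it returns a new frame instead of modifying the original.

```python
import pandas as pd

dfa = pd.DataFrame({'a': [1, 2, 3], 'b': [5, 6, 7]})
dfb = pd.DataFrame({'a': [7, 7, 7], 'c': [4, 4, 4], 'e': [0, 0, 0]})

def add_missing_columns(target, reference):
    """Return a copy of target with reference's extra columns added as NaN.

    Hypothetical helper for illustration; no values are copied from reference.
    """
    missing = [col for col in reference.columns if col not in target.columns]
    # reindex keeps the existing columns and creates the new ones filled with NaN.
    return target.reindex(columns=list(target.columns) + missing)

result = add_missing_columns(dfa, dfb)
print(result)
```

Because only the column labels of the reference frame are used, this works regardless of whether the two indexes match.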
Answer
For this to work, the indexes must match. Assuming they do, you can try:
pd.concat([df1.drop(df2.columns, axis=1), df2], axis=1)
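A small sketch of what this one-liner produces, using hypothetical two-row frames. It illustrates two things: the columns taken from df1 arrive with their values (they are not blank), and when the indexes do not match, concat aligns on the union of the index labels and fills the gaps with NaN.

```python
import pandas as pd

# Hypothetical sample data; df2's columns are a subset of df1's.
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
df2 = pd.DataFrame({'col1': [10, 20]})

# Matching indexes: rows line up one to one, df1's extra columns keep their values.
aligned = pd.concat([df1.drop(df2.columns, axis=1), df2], axis=1)
print(aligned)

# Mismatched indexes: the result has one row per label in either index,
# with NaN wherever a frame has no row for that label.
df2_shifted = pd.DataFrame({'col1': [10, 20]}, index=[1, 2])
misaligned = pd.concat([df1.drop(df2_shifted.columns, axis=1), df2_shifted], axis=1)
print(misaligned)
```

If blank columns are what is wanted, the values brought over from df1 would still need to be cleared afterwards.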