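Select specific rows in a DataFrame based on 2 columns in Python pandas
Problem description:
I loaded data from Excel into a pandas DataFrame. I now want to select only the rows whose ASSESSMENT ID is the maximum ASSESSMENT ID for each APPID, covering all UI SEQ IDs (the UI SEQ NUMBER column) of that APPID.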
APPID  APPNAME  ASSESSMENT ID  UI SEQ NUMBER  QUESTION  ANSWER TEXT .
1      appname  2493           11             Question  No .
1      appname  13808          11             Question  Ctry of domicile .
1      appname  13808          11             Question  Name .
1      appname  35316          11             Question  Ctry of domicile .
1      appname  35316          11             Question  Name .
1      appname  35316          11             Question  Nationality .
1      appname  2493           12             Question  Corp name .
1      appname  2493           12             Question  Cr Br Scr .
1      appname  2493           12             Question  Inc And Assests .
1      appname  2493           12             Question  Int, Ext Reg Reports .
1      appname  13808          12             Question  Corp name .
1      appname  35316          12             Question  Corp name .
1      appname  2493           13             Question  No .
1      appname  13808          13             Question  No .
1      appname  35316          13             Question  No .
1      appname  2493           14             Question  No .
1      appname  13808          14             Question  firms Pos .
1      appname  35316          14             Question  firms Pos .
The result would be:
APPID  APPNAME  ASSESSMENT ID  UI SEQ NUMBER  QUESTION  ANSWER TEXT .
1      appname  35316          11             Question  Ctry of domicile .
1      appname  35316          11             Question  Name .
1      appname  35316          11             Question  Nationality .
1      appname  35316          12             Question  Corp name .
1      appname  35316          13             Question  No .
1      appname  35316          14             Question  firms Pos .
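For readers who want to reproduce the example, the sample above can be rebuilt as a DataFrame roughly as follows. This is a minimal sketch: the names `rows` and `columns` are illustrative, and the trailing `.` shown in the listing is assumed to be a formatting artifact, so it is left out.

```python
import pandas as pd

# Column names taken from the question's header row.
columns = ["APPID", "APPNAME", "ASSESSMENT ID", "UI SEQ NUMBER", "QUESTION", "ANSWER TEXT"]

# (ASSESSMENT ID, UI SEQ NUMBER, ANSWER TEXT) for each sample row, in the order shown.
rows = [
    (2493, 11, "No"),
    (13808, 11, "Ctry of domicile"),
    (13808, 11, "Name"),
    (35316, 11, "Ctry of domicile"),
    (35316, 11, "Name"),
    (35316, 11, "Nationality"),
    (2493, 12, "Corp name"),
    (2493, 12, "Cr Br Scr"),
    (2493, 12, "Inc And Assests"),
    (2493, 12, "Int, Ext Reg Reports"),
    (13808, 12, "Corp name"),
    (35316, 12, "Corp name"),
    (2493, 13, "No"),
    (13808, 13, "No"),
    (35316, 13, "No"),
    (2493, 14, "No"),
    (13808, 14, "firms Pos"),
    (35316, 14, "firms Pos"),
]

# APPID, APPNAME and QUESTION are constant in the sample, so fill them in here.
df = pd.DataFrame(
    [(1, "appname", aid, seq, "Question", answer) for aid, seq, answer in rows],
    columns=columns,
)
print(df.shape)
```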
Answer:
I think you need boolean indexing, with apply used to create the mask:
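# group_keys=False keeps the original row index, so the mask aligns with df
# (on recent pandas versions the default prepends the group keys to the index)
mask = df.groupby(['APPID', 'UI SEQ NUMBER'], group_keys=False)['ASSESSMENT ID'].apply(lambda x: x == x.max())
df1 = df[mask]
print(df1)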
    APPID  APPNAME  ASSESSMENT ID  UI SEQ NUMBER  QUESTION  ANSWER TEXT.
3   1      appname  35316          11             Question  Ctry of domicile.
4   1      appname  35316          11             Question  Name.
5   1      appname  35316          11             Question  Nationality.
11  1      appname  35316          12             Question  Corp name.
14  1      appname  35316          13             Question  No.
17  1      appname  35316          14             Question  firms Pos.
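The same mask can also be built with `transform('max')`, which returns a Series aligned to the original rows, so no `apply` is needed. A minimal sketch on a reduced version of the sample (the name `group_max` is illustrative, not from the original answer):

```python
import pandas as pd

# Reduced version of the sample: two UI SEQ NUMBER groups for one APPID.
df = pd.DataFrame({
    "APPID": [1, 1, 1, 1, 1, 1],
    "ASSESSMENT ID": [2493, 13808, 35316, 35316, 2493, 35316],
    "UI SEQ NUMBER": [11, 11, 11, 11, 12, 12],
    "ANSWER TEXT": ["No", "Name", "Name", "Nationality", "Corp name", "Corp name"],
})

# Per-row maximum ASSESSMENT ID of the group each row belongs to.
group_max = df.groupby(["APPID", "UI SEQ NUMBER"])["ASSESSMENT ID"].transform("max")

# Keep every row that carries its group's maximum (ties included).
df1 = df[df["ASSESSMENT ID"] == group_max]
print(df1.index.tolist())  # [2, 3, 5]
```

If the maximum should be taken per APPID only, as the question's wording suggests, grouping by `"APPID"` alone would be the corresponding change; on the sample data both groupings select the same rows.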
Alternatively, if you don't need all the duplicate (tied) rows, use idxmax, which keeps one row per group:
df1 = df.loc[df.groupby(['APPID', 'UI SEQ NUMBER'])['ASSESSMENT ID'].idxmax()]
print(df1)
    APPID  APPNAME  ASSESSMENT ID  UI SEQ NUMBER  QUESTION  ANSWER TEXT.
3   1      appname  35316          11             Question  Ctry of domicile.
11  1      appname  35316          12             Question  Corp name.
14  1      appname  35316          13             Question  No.
17  1      appname  35316          14             Question  firms Pos.
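To see how the two approaches differ, the sketch below runs both on a reduced version of the sample in which one group has a tied maximum. `idxmax` returns the label of the first occurrence of the maximum, so only one row per group survives. The names `one_per_group` and `all_ties` are illustrative.

```python
import pandas as pd

# Reduced version of the sample; rows 2 and 3 tie for the maximum in group (1, 11).
df = pd.DataFrame({
    "APPID": [1, 1, 1, 1, 1, 1],
    "ASSESSMENT ID": [2493, 13808, 35316, 35316, 2493, 35316],
    "UI SEQ NUMBER": [11, 11, 11, 11, 12, 12],
    "ANSWER TEXT": ["No", "Name", "Name", "Nationality", "Corp name", "Corp name"],
})

grouped = df.groupby(["APPID", "UI SEQ NUMBER"])["ASSESSMENT ID"]

# idxmax: one index label per group, the first row holding the group's maximum.
one_per_group = df.loc[grouped.idxmax()]

# Boolean mask: every row holding the group's maximum, ties included.
all_ties = df[df["ASSESSMENT ID"] == grouped.transform("max")]

print(one_per_group.index.tolist())  # [2, 5]
print(all_ties.index.tolist())       # [2, 3, 5]
```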
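Please [don't post images of code (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael
Apologies for posting an image, but I found no other way to post the Excel data here without losing the formatting – vivek
Hmm, doesn't it work if you copy and paste it and add 4 spaces before each line? – jezrael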