Using df.apply() to apply a function with arguments to each row

Question:

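I've seen plenty of questions about using the pandas df.apply() function where the function being applied is super trivial (like .upper() or a simple multiplication). However, when I try to apply my own custom function, I keep getting various errors. I don't know where to start with this one.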

Here is my simple example.

My fake data:

import pandas as pd

inp = [{'c1': 10, 'c2': 1}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 0}]
df1 = pd.DataFrame(inp)
print(df1)

My fake function:

def fake_funk(row, upper, lower):
    if lower < row['c1'] < upper:
        return 1
    elif row['c2'] > upper:
        return 2
    else:
        return 0

Testing that it actually works:

for index, row in df1.iterrows(): 
    print(fake_funk(row,11,1)) 
1 
2 
0 

Now using apply():

df1.apply(lambda row,: fake_funk(row,11,1)) 

The error I get is quite long:

--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)() 

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:14010)() 

TypeError: an integer is required 

During handling of the above exception, another exception occurred: 

KeyError         Traceback (most recent call last) 
<ipython-input-116-a554e891e761> in <module>() 
----> 1 df1.apply(lambda row,: fake_funk(row,11,1)) 

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds) 
    4260       f, axis, 
    4261       reduce=reduce, 
-> 4262       ignore_failures=ignore_failures) 
    4263    else: 
    4264     return self._apply_broadcast(f, axis) 

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce) 
    4356    try: 
    4357     for i, v in enumerate(series_gen): 
-> 4358      results[i] = func(v) 
    4359      keys.append(v.name) 
    4360    except Exception as e: 

<ipython-input-116-a554e891e761> in <lambda>(row) 
----> 1 df1.apply(lambda row,: fake_funk(row,11,1)) 

<ipython-input-115-e95f3470fb25> in fake_funk(row, upper, lower) 
     1 def fake_funk(row, upper, lower): 
----> 2  if lower < row['c1'] < upper: 
     3   return(1) 
     4  elif row['c2'] > upper: 
     5   return(2) 

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key) 
    599   key = com._apply_if_callable(key, self) 
    600   try: 
--> 601    result = self.index.get_value(self, key) 
    602 
    603    if not is_scalar(result): 

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_value(self, series, key) 
    2475   try: 
    2476    return self._engine.get_value(s, k, 
-> 2477           tz=getattr(series.dtype, 'tz', None)) 
    2478   except KeyError as e1: 
    2479    if len(self) > 0 and self.inferred_type in ['integer', 'boolean']: 

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4404)() 

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4087)() 

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5210)() 

KeyError: ('c1', 'occurred at index c1') 

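By default, apply operates along axis 0, so your function is called once per column. Each column is a Series indexed by the row labels 0, 1, 2, which is why row['c1'] raises a KeyError. It looks like you need to operate along axis 1 so that each call receives a row. By the way, you don't need the lambda either. Just pass the extra arguments through the args parameter, and that should be enough.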

df1.apply(fake_funk, axis=1, args=(11, 1)) 

0    1
1    2
2    0
dtype: int64
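
To make the axis difference concrete, here is a minimal sketch that reuses the question's df1 and fake_funk. It uses df1['c1'] and df1.iloc[0] as stand-ins for the kind of object apply hands to the function under each axis, and it shows two other ways of passing upper and lower that should give the same result as args. The variable names (first_column, first_row, by_args, by_kwargs, by_lambda) are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([{'c1': 10, 'c2': 1}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 0}])

def fake_funk(row, upper, lower):
    if lower < row['c1'] < upper:
        return 1
    elif row['c2'] > upper:
        return 2
    else:
        return 0

# axis=0 (the default) passes columns; axis=1 passes rows.
first_column = df1['c1']   # same shape of object as an axis=0 call receives
first_row = df1.iloc[0]    # same shape of object as an axis=1 call receives
print(list(first_column.index))  # [0, 1, 2] -> no 'c1' label, hence the KeyError
print(list(first_row.index))     # ['c1', 'c2'] -> row['c1'] works

# Three equivalent ways to supply the extra arguments along axis=1.
by_args = df1.apply(fake_funk, axis=1, args=(11, 1))
by_kwargs = df1.apply(fake_funk, axis=1, upper=11, lower=1)
by_lambda = df1.apply(lambda row: fake_funk(row, 11, 1), axis=1)
print(by_args.tolist())  # [1, 2, 0]
```

Passing upper and lower as keywords can be easier to read than a positional args tuple, since the call site shows which value is which.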

Wow. I knew it was going to be something simple. Thanks. – AustinM