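Convert a column's values to enumerated dictionary keys
Question:
Is there a better way (in the sense of the least code) to convert a column to enumerated numeric values? The steps would go roughly like this:
- get the set of distinct items in the column
- build an enumerated dictionary of key/value pairs from that set
- swap the keys and the values
- use the resulting numbers in a new column instead of the original data.
Here is what I did today. I wonder whether someone can show a canonical way to do this, so that I can avoid writing the function get_color_val: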
import pandas as pd
cars = pd.DataFrame({"car_name": ["BMW","BMW","ACCURA","ACCURA","ACCURA","BMW","BMW","BMW"],"color":["RED","RED","RED","RED","GREEN","BLACK","BLUE","BLUE"]})
# Enumerate the distinct colors, then invert the dict so it maps color -> number
color_dict = dict(enumerate(set(cars["color"])))
color_dict = dict((y,x) for x,y in color_dict.items())
def get_color_val(row):
    # Look up the number assigned to this row's color
    my_key = row["color"]
    my_value = color_dict.get(my_key)
    return my_value
cars["color_val"] = cars.apply(get_color_val, axis=1)
cars = cars.drop(columns="color")
print(cars)
Result
Before------------
  car_name  color
0      BMW    RED
1      BMW    RED
2   ACCURA    RED
3   ACCURA    RED
4   ACCURA  GREEN
5      BMW  BLACK
6      BMW   BLUE
7      BMW   BLUE
After------------
  car_name  color_val
0      BMW          3
1      BMW          3
2   ACCURA          3
3   ACCURA          3
4   ACCURA          2
5      BMW          1
6      BMW          0
7      BMW          0
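For reference, the dictionary steps listed in the question can probably be written without the helper function by passing the dictionary to Series.map. This is only a sketch: sorting the distinct values is an assumption added here so that the numbering is reproducible (a plain set has no guaranteed order), which means the numbers differ from the "After" table above.

```python
import pandas as pd

cars = pd.DataFrame({
    "car_name": ["BMW", "BMW", "ACCURA", "ACCURA", "ACCURA", "BMW", "BMW", "BMW"],
    "color": ["RED", "RED", "RED", "RED", "GREEN", "BLACK", "BLUE", "BLUE"],
})

# Assumption: sort the distinct colors so each run assigns the same numbers
distinct_colors = sorted(cars["color"].unique())

# Map color -> number directly, with no separate key/value swap step
color_dict = {color: code for code, color in enumerate(distinct_colors)}

# Series.map looks every value up in the dictionary, replacing get_color_val
cars["color_val"] = cars["color"].map(color_dict)
cars = cars.drop(columns="color")
print(cars)
```

With sorted values the mapping is BLACK=0, BLUE=1, GREEN=2, RED=3.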
Answer
I would use pd.factorize() in this case:
In [8]: cars['color_val'] = pd.factorize(cars.color)[0]
In [9]: cars
Out[9]:
  car_name  color  color_val
0      BMW    RED          0
1      BMW    RED          0
2   ACCURA    RED          0
3   ACCURA    RED          0
4   ACCURA  GREEN          1
5      BMW  BLACK          2
6      BMW   BLUE          3
7      BMW   BLUE          3
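pd.factorize() returns two things: the integer codes used above and the distinct values in order of first appearance. The sketch below keeps both, in case the color-to-number dictionary from the question is still needed afterwards; the names codes and uniques are chosen here for illustration.

```python
import pandas as pd

cars = pd.DataFrame({
    "car_name": ["BMW", "BMW", "ACCURA", "ACCURA", "ACCURA", "BMW", "BMW", "BMW"],
    "color": ["RED", "RED", "RED", "RED", "GREEN", "BLACK", "BLUE", "BLUE"],
})

# factorize returns (codes, uniques); codes are numbered by first appearance
codes, uniques = pd.factorize(cars["color"])
cars["color_val"] = codes

# Rebuild the color -> number dictionary from the distinct values
color_dict = {color: code for code, color in enumerate(uniques)}
print(color_dict)  # {'RED': 0, 'GREEN': 1, 'BLACK': 2, 'BLUE': 3}

# Drop the original column to match the frame the question wanted
cars = cars.drop(columns="color")
print(cars)
```

If alphabetical numbering is preferred over first-appearance order, pd.factorize() also accepts sort=True.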
In one line?!?!?! Wow, thanks! – adhg
@adhg, yes, yet another reason to love pandas... ;) – MaxU