七周成为数据分析师(案例python脚本)
In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df=pd.read_csv("CDNOW_MASTER.CSV")
df.head()
Out[1]:
ID | DATE | NUM | TOTAL | |
---|---|---|---|---|
0 | 1 | 19970101 | 1 | 11.77 |
1 | 2 | 19970112 | 1 | 12.00 |
2 | 2 | 19970112 | 5 | 77.00 |
3 | 3 | 19970102 | 2 | 20.76 |
4 | 3 | 19970330 | 2 | 20.76 |
- ID:购买用户id
- DATE:用户购买日期
- NUM:用户购买数量
- TOTAL:用户购买总额
In [2]:
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 69659 entries, 0 to 69658 Data columns (total 4 columns): ID 69659 non-null int64 DATE 69659 non-null int64 NUM 69659 non-null int64 TOTAL 69659 non-null float64 dtypes: float64(1), int64(3) memory usage: 2.1 MB
In [3]:
df.describe()
Out[3]:
ID | DATE | NUM | TOTAL | |
---|---|---|---|---|
count | 69659.000000 | 6.965900e+04 | 69659.000000 | 69659.000000 |
mean | 11470.854592 | 1.997228e+07 | 2.410040 | 35.893648 |
std | 6819.904848 | 3.837735e+03 | 2.333924 | 36.281942 |
min | 1.000000 | 1.997010e+07 | 1.000000 | 0.000000 |
25% | 5506.000000 | 1.997022e+07 | 1.000000 | 14.490000 |
50% | 11410.000000 | 1.997042e+07 | 2.000000 | 25.980000 |
75% | 17273.000000 | 1.997111e+07 | 3.000000 | 43.700000 |
max | 23570.000000 | 1.998063e+07 | 99.000000 | 1286.010000 |
- 大部分订单只有1-3的销量,平均值为2.4,且存在一定极值干扰
- 用户消费金额比较稳定,平均数是36,中位数是25,存在一定极值干扰
In [4]:
df['DATE']=pd.to_datetime(df['DATE'],format='%Y%m%d')
df['month']=df.DATE.values.astype('datetime64[M]')
df
Out[4]:
ID | DATE | NUM | TOTAL | month | |
---|---|---|---|---|---|
0 | 1 | 1997-01-01 | 1 | 11.77 | 1997-01-01 |
1 | 2 | 1997-01-12 | 1 | 12.00 | 1997-01-01 |
2 | 2 | 1997-01-12 | 5 | 77.00 | 1997-01-01 |
3 | 3 | 1997-01-02 | 2 | 20.76 | 1997-01-01 |
4 | 3 | 1997-03-30 | 2 | 20.76 | 1997-03-01 |
5 | 3 | 1997-04-02 | 2 | 19.54 | 1997-04-01 |
6 | 3 | 1997-11-15 | 5 | 57.45 | 1997-11-01 |
7 | 3 | 1997-11-25 | 4 | 20.96 | 1997-11-01 |
8 | 3 | 1998-05-28 | 1 | 16.99 | 1998-05-01 |
9 | 4 | 1997-01-01 | 2 | 29.33 | 1997-01-01 |
10 | 4 | 1997-01-18 | 2 | 29.73 | 1997-01-01 |
11 | 4 | 1997-08-02 | 1 | 14.96 | 1997-08-01 |
12 | 4 | 1997-12-12 | 2 | 26.48 | 1997-12-01 |
13 | 5 | 1997-01-01 | 2 | 29.33 | 1997-01-01 |
14 | 5 | 1997-01-14 | 1 | 13.97 | 1997-01-01 |
15 | 5 | 1997-02-04 | 3 | 38.90 | 1997-02-01 |
16 | 5 | 1997-04-11 | 3 | 45.55 | 1997-04-01 |
17 | 5 | 1997-05-31 | 3 | 38.71 | 1997-05-01 |
18 | 5 | 1997-06-16 | 2 | 26.14 | 1997-06-01 |
19 | 5 | 1997-07-22 | 2 | 28.14 | 1997-07-01 |
20 | 5 | 1997-09-15 | 3 | 40.47 | 1997-09-01 |
21 | 5 | 1997-12-08 | 4 | 46.46 | 1997-12-01 |
22 | 5 | 1997-12-12 | 3 | 40.47 | 1997-12-01 |
23 | 5 | 1998-01-03 | 3 | 37.47 | 1998-01-01 |
24 | 6 | 1997-01-01 | 1 | 20.99 | 1997-01-01 |
25 | 7 | 1997-01-01 | 2 | 28.74 | 1997-01-01 |
26 | 7 | 1997-10-11 | 7 | 97.43 | 1997-10-01 |
27 | 7 | 1998-03-22 | 9 | 138.50 | 1998-03-01 |
28 | 8 | 1997-01-01 | 1 | 9.77 | 1997-01-01 |
29 | 8 | 1997-02-13 | 1 | 13.97 | 1997-02-01 |
... | ... | ... | ... | ... | ... |
69629 | 23556 | 1997-09-27 | 3 | 31.47 | 1997-09-01 |
69630 | 23556 | 1998-01-03 | 2 | 28.98 | 1998-01-01 |
69631 | 23556 | 1998-06-07 | 2 | 28.98 | 1998-06-01 |
69632 | 23557 | 1997-03-25 | 1 | 14.37 | 1997-03-01 |
69633 | 23558 | 1997-03-25 | 2 | 28.13 | 1997-03-01 |
69634 | 23558 | 1997-05-18 | 3 | 45.51 | 1997-05-01 |
69635 | 23558 | 1997-06-24 | 2 | 23.74 | 1997-06-01 |
69636 | 23558 | 1998-02-25 | 4 | 48.22 | 1998-02-01 |
69637 | 23559 | 1997-03-25 | 2 | 23.54 | 1997-03-01 |
69638 | 23559 | 1997-05-18 | 3 | 35.31 | 1997-05-01 |
69639 | 23559 | 1997-06-27 | 3 | 52.80 | 1997-06-01 |
69640 | 23560 | 1997-03-25 | 1 | 18.36 | 1997-03-01 |
69641 | 23561 | 1997-03-25 | 2 | 30.92 | 1997-03-01 |
69642 | 23561 | 1998-01-28 | 1 | 15.49 | 1998-01-01 |
69643 | 23561 | 1998-05-29 | 3 | 37.05 | 1998-05-01 |
69644 | 23562 | 1997-03-25 | 2 | 29.33 | 1997-03-01 |
69645 | 23563 | 1997-03-25 | 1 | 10.77 | 1997-03-01 |
69646 | 23563 | 1997-10-04 | 2 | 47.98 | 1997-10-01 |
69647 | 23564 | 1997-03-25 | 1 | 11.77 | 1997-03-01 |
69648 | 23564 | 1997-05-21 | 1 | 11.77 | 1997-05-01 |
69649 | 23564 | 1997-11-30 | 3 | 46.47 | 1997-11-01 |
69650 | 23565 | 1997-03-25 | 1 | 11.77 | 1997-03-01 |
69651 | 23566 | 1997-03-25 | 2 | 36.00 | 1997-03-01 |
69652 | 23567 | 1997-03-25 | 1 | 20.97 | 1997-03-01 |
69653 | 23568 | 1997-03-25 | 1 | 22.97 | 1997-03-01 |
69654 | 23568 | 1997-04-05 | 4 | 83.74 | 1997-04-01 |
69655 | 23568 | 1997-04-22 | 1 | 14.99 | 1997-04-01 |
69656 | 23569 | 1997-03-25 | 2 | 25.74 | 1997-03-01 |
69657 | 23570 | 1997-03-25 | 3 | 51.12 | 1997-03-01 |
69658 | 23570 | 1997-03-26 | 2 | 42.96 | 1997-03-01 |
69659 rows × 5 columns
1.进行用户消费趋势分析(按月)¶
- 每月的消费总金额
- 每月的消费次数
- 每月的产品购买量
- 每月的消费人数
In [5]:
grouped_month=df.groupby(by='month')
order_month_amount=grouped_month.TOTAL.sum()
order_month_amount.head()
Out[5]:
month 1997-01-01 299060.17 1997-02-01 379590.03 1997-03-01 393155.27 1997-04-01 142824.49 1997-05-01 107933.30 Name: TOTAL, dtype: float64
In [6]:
#可视化显示在页面
%matplotlib inline
#更改设计风格
plt.style.use('ggplot')
order_month_amount.plot()
Out[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6a8a8208>
由上图可知,消费金额在前三个月达到最高峰,后续消费较为稳定,有轻微下降趋势
In [7]:
grouped_month.ID.count().plot()
Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6a449860>
由上图可知,消费次数在前三个月达到最高峰,后续消费次数较为稳定,有轻微下降趋势
In [8]:
grouped_month.NUM.sum().plot()
Out[8]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6a4ec5c0>
由上图可知,每月购买产品量在前三个月达到最高峰,后续购买量较为稳定,有轻微下降趋势
In [9]:
grouped_month.ID.apply(lambda x :len(set(x))).plot()
Out[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6a906438>
由上图可知,每月消费人数在前三个月达到最高峰,每月消费人数较为稳定,有轻微下降趋势
2.用户个体消费分析¶
- 用户消费金额、消费次数的描述统计
- 用户消费金额和消费次数的散点图
- 用户消费金额的分布图
- 用户消费次数的分布图
- 用户累计消费金额占比
In [10]:
grouped_user=df.groupby('ID')
In [11]:
grouped_user.sum().describe()
Out[11]:
NUM | TOTAL | |
---|---|---|
count | 23570.000000 | 23570.000000 |
mean | 7.122656 | 106.080426 |
std | 16.983531 | 240.925195 |
min | 1.000000 | 0.000000 |
25% | 1.000000 | 19.970000 |
50% | 3.000000 | 43.395000 |
75% | 7.000000 | 106.475000 |
max | 1033.000000 | 13990.930000 |
- 用户平均购买了7张CD,但中位值只有3,说明小部分用户购买大多数的CD
- 用户平均消费106元,中位值有43,判断同上,有极值干扰
In [12]:
grouped_user.sum().query('TOTAL<4000').plot.scatter(x='TOTAL',y='NUM')
Out[12]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6abad7b8>
用户消费总金额集中在1500元以下,购买总CD数量集中在100张以下
In [13]:
grouped_user.sum().query('NUM<100').TOTAL.plot.hist(bins=20)
Out[13]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6abcc128>
大多数用户消费金额在250元以下
In [14]:
grouped_user.sum().query('NUM<100').NUM.plot.hist(bins=20)
Out[14]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6bbd75c0>
大多数用户购买CD数量在20张以下
In [15]:
user_cumsum=grouped_user.sum().sort_values('TOTAL').apply(lambda x :x.cumsum()/x.sum())
user_cumsum.reset_index().TOTAL.plot()
Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6bd3d198>
按照用户消费金额进行升序排列,50%的用户仅贡献了15%的消费额度,消费总额排名前5000的贡献了60%的消费额度
3.用户消费行为¶
- 用户第一次消费(首购)
- 用户最后一次消费
- 新老客消费比
- 多数用户仅消费一次?
- 每月新老客占比?
- 用户分层
- RFM
- 新、老、活跃、回流、流失
- 用户购买周期(按订单)
- 用户消费周期描述
- 用户消费周期分布
- 用户生命周期(按照第一次&最后一次消费)
- 用户生命周期描述
- 用户生命周期分布
In [16]:
grouped_user.min().DATE.value_counts().plot()
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6bd0c128>
用户第一次购买分布,集中在前三个月。并且在2-11至2-25有一次剧烈波动,具体再分析
In [17]:
grouped_user.max().DATE.value_counts().plot()
Out[17]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6bddde48>
用户最后一次购买比第一次购买分布广,大部分最后一次购买集中在前三个月,说明很多用户第一次购买后不再进行购买,随着时间,最后一次购买数在增加,消费呈流失上升的状况。
In [18]:
user_life=grouped_user.DATE.agg(['min','max']).reset_index()
user_life.head()
Out[18]:
ID | min | max | |
---|---|---|---|
0 | 1 | 1997-01-01 | 1997-01-01 |
1 | 2 | 1997-01-12 | 1997-01-12 |
2 | 3 | 1997-01-02 | 1998-05-28 |
3 | 4 | 1997-01-01 | 1997-12-12 |
4 | 5 | 1997-01-01 | 1998-01-03 |
In [19]:
(user_life['min']==user_life['max']).value_counts()
Out[19]:
True 12054 False 11516 dtype: int64
有一半用户, 就 消费了一次
In [20]:
user_life['regDate']=user_life['min'].values.astype('datetime64[M]')
user_life
Out[20]:
ID | min | max | regDate | |
---|---|---|---|---|
0 | 1 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
1 | 2 | 1997-01-12 | 1997-01-12 | 1997-01-01 |
2 | 3 | 1997-01-02 | 1998-05-28 | 1997-01-01 |
3 | 4 | 1997-01-01 | 1997-12-12 | 1997-01-01 |
4 | 5 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
5 | 6 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
6 | 7 | 1997-01-01 | 1998-03-22 | 1997-01-01 |
7 | 8 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
8 | 9 | 1997-01-01 | 1998-06-08 | 1997-01-01 |
9 | 10 | 1997-01-21 | 1997-01-21 | 1997-01-01 |
10 | 11 | 1997-01-01 | 1998-02-20 | 1997-01-01 |
11 | 12 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
12 | 13 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
13 | 14 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
14 | 15 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
15 | 16 | 1997-01-01 | 1997-09-10 | 1997-01-01 |
16 | 17 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
17 | 18 | 1997-01-04 | 1997-01-04 | 1997-01-01 |
18 | 19 | 1997-01-01 | 1997-06-10 | 1997-01-01 |
19 | 20 | 1997-01-01 | 1997-01-18 | 1997-01-01 |
20 | 21 | 1997-01-01 | 1997-01-13 | 1997-01-01 |
21 | 22 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
22 | 23 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
23 | 24 | 1997-01-01 | 1998-01-20 | 1997-01-01 |
24 | 25 | 1997-01-01 | 1998-06-08 | 1997-01-01 |
25 | 26 | 1997-01-01 | 1997-01-26 | 1997-01-01 |
26 | 27 | 1997-01-07 | 1997-01-12 | 1997-01-01 |
27 | 28 | 1997-01-01 | 1997-03-08 | 1997-01-01 |
28 | 29 | 1997-01-01 | 1998-04-26 | 1997-01-01 |
29 | 30 | 1997-01-01 | 1997-02-14 | 1997-01-01 |
... | ... | ... | ... | ... |
23540 | 23541 | 1997-03-25 | 1997-04-02 | 1997-03-01 |
23541 | 23542 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23542 | 23543 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23543 | 23544 | 1997-03-25 | 1998-01-24 | 1997-03-01 |
23544 | 23545 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23545 | 23546 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23546 | 23547 | 1997-03-25 | 1997-04-07 | 1997-03-01 |
23547 | 23548 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23548 | 23549 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23549 | 23550 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23550 | 23551 | 1997-03-25 | 1997-09-11 | 1997-03-01 |
23551 | 23552 | 1997-03-25 | 1997-04-03 | 1997-03-01 |
23552 | 23553 | 1997-03-25 | 1997-03-28 | 1997-03-01 |
23553 | 23554 | 1997-03-25 | 1998-02-01 | 1997-03-01 |
23554 | 23555 | 1997-03-25 | 1998-06-10 | 1997-03-01 |
23555 | 23556 | 1997-03-25 | 1998-06-07 | 1997-03-01 |
23556 | 23557 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23557 | 23558 | 1997-03-25 | 1998-02-25 | 1997-03-01 |
23558 | 23559 | 1997-03-25 | 1997-06-27 | 1997-03-01 |
23559 | 23560 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23560 | 23561 | 1997-03-25 | 1998-05-29 | 1997-03-01 |
23561 | 23562 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23562 | 23563 | 1997-03-25 | 1997-10-04 | 1997-03-01 |
23563 | 23564 | 1997-03-25 | 1997-11-30 | 1997-03-01 |
23564 | 23565 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23565 | 23566 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23566 | 23567 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23567 | 23568 | 1997-03-25 | 1997-04-22 | 1997-03-01 |
23568 | 23569 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
23569 | 23570 | 1997-03-25 | 1997-03-26 | 1997-03-01 |
23570 rows × 4 columns
In [21]:
user_life_month=df.groupby(['month','ID']).count().reset_index()
In [22]:
user_life_month=user_life_month.merge(user_life,on='ID')
In [23]:
user_life_month
Out[23]:
month | ID | DATE | NUM | TOTAL | min | max | regDate | |
---|---|---|---|---|---|---|---|---|
0 | 1997-01-01 | 1 | 1 | 1 | 1 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
1 | 1997-01-01 | 2 | 2 | 2 | 2 | 1997-01-12 | 1997-01-12 | 1997-01-01 |
2 | 1997-01-01 | 3 | 1 | 1 | 1 | 1997-01-02 | 1998-05-28 | 1997-01-01 |
3 | 1997-03-01 | 3 | 1 | 1 | 1 | 1997-01-02 | 1998-05-28 | 1997-01-01 |
4 | 1997-04-01 | 3 | 1 | 1 | 1 | 1997-01-02 | 1998-05-28 | 1997-01-01 |
5 | 1997-11-01 | 3 | 2 | 2 | 2 | 1997-01-02 | 1998-05-28 | 1997-01-01 |
6 | 1998-05-01 | 3 | 1 | 1 | 1 | 1997-01-02 | 1998-05-28 | 1997-01-01 |
7 | 1997-01-01 | 4 | 2 | 2 | 2 | 1997-01-01 | 1997-12-12 | 1997-01-01 |
8 | 1997-08-01 | 4 | 1 | 1 | 1 | 1997-01-01 | 1997-12-12 | 1997-01-01 |
9 | 1997-12-01 | 4 | 1 | 1 | 1 | 1997-01-01 | 1997-12-12 | 1997-01-01 |
10 | 1997-01-01 | 5 | 2 | 2 | 2 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
11 | 1997-02-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
12 | 1997-04-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
13 | 1997-05-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
14 | 1997-06-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
15 | 1997-07-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
16 | 1997-09-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
17 | 1997-12-01 | 5 | 2 | 2 | 2 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
18 | 1998-01-01 | 5 | 1 | 1 | 1 | 1997-01-01 | 1998-01-03 | 1997-01-01 |
19 | 1997-01-01 | 6 | 1 | 1 | 1 | 1997-01-01 | 1997-01-01 | 1997-01-01 |
20 | 1997-01-01 | 7 | 1 | 1 | 1 | 1997-01-01 | 1998-03-22 | 1997-01-01 |
21 | 1997-10-01 | 7 | 1 | 1 | 1 | 1997-01-01 | 1998-03-22 | 1997-01-01 |
22 | 1998-03-01 | 7 | 1 | 1 | 1 | 1997-01-01 | 1998-03-22 | 1997-01-01 |
23 | 1997-01-01 | 8 | 1 | 1 | 1 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
24 | 1997-02-01 | 8 | 1 | 1 | 1 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
25 | 1997-06-01 | 8 | 1 | 1 | 1 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
26 | 1997-07-01 | 8 | 1 | 1 | 1 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
27 | 1997-11-01 | 8 | 2 | 2 | 2 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
28 | 1997-12-01 | 8 | 1 | 1 | 1 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
29 | 1998-03-01 | 8 | 1 | 1 | 1 | 1997-01-01 | 1998-03-29 | 1997-01-01 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
55349 | 1997-06-01 | 23556 | 1 | 1 | 1 | 1997-03-25 | 1998-06-07 | 1997-03-01 |
55350 | 1997-07-01 | 23556 | 2 | 2 | 2 | 1997-03-25 | 1998-06-07 | 1997-03-01 |
55351 | 1997-09-01 | 23556 | 1 | 1 | 1 | 1997-03-25 | 1998-06-07 | 1997-03-01 |
55352 | 1998-01-01 | 23556 | 1 | 1 | 1 | 1997-03-25 | 1998-06-07 | 1997-03-01 |
55353 | 1998-06-01 | 23556 | 1 | 1 | 1 | 1997-03-25 | 1998-06-07 | 1997-03-01 |
55354 | 1997-03-01 | 23557 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55355 | 1997-03-01 | 23558 | 1 | 1 | 1 | 1997-03-25 | 1998-02-25 | 1997-03-01 |
55356 | 1997-05-01 | 23558 | 1 | 1 | 1 | 1997-03-25 | 1998-02-25 | 1997-03-01 |
55357 | 1997-06-01 | 23558 | 1 | 1 | 1 | 1997-03-25 | 1998-02-25 | 1997-03-01 |
55358 | 1998-02-01 | 23558 | 1 | 1 | 1 | 1997-03-25 | 1998-02-25 | 1997-03-01 |
55359 | 1997-03-01 | 23559 | 1 | 1 | 1 | 1997-03-25 | 1997-06-27 | 1997-03-01 |
55360 | 1997-05-01 | 23559 | 1 | 1 | 1 | 1997-03-25 | 1997-06-27 | 1997-03-01 |
55361 | 1997-06-01 | 23559 | 1 | 1 | 1 | 1997-03-25 | 1997-06-27 | 1997-03-01 |
55362 | 1997-03-01 | 23560 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55363 | 1997-03-01 | 23561 | 1 | 1 | 1 | 1997-03-25 | 1998-05-29 | 1997-03-01 |
55364 | 1998-01-01 | 23561 | 1 | 1 | 1 | 1997-03-25 | 1998-05-29 | 1997-03-01 |
55365 | 1998-05-01 | 23561 | 1 | 1 | 1 | 1997-03-25 | 1998-05-29 | 1997-03-01 |
55366 | 1997-03-01 | 23562 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55367 | 1997-03-01 | 23563 | 1 | 1 | 1 | 1997-03-25 | 1997-10-04 | 1997-03-01 |
55368 | 1997-10-01 | 23563 | 1 | 1 | 1 | 1997-03-25 | 1997-10-04 | 1997-03-01 |
55369 | 1997-03-01 | 23564 | 1 | 1 | 1 | 1997-03-25 | 1997-11-30 | 1997-03-01 |
55370 | 1997-05-01 | 23564 | 1 | 1 | 1 | 1997-03-25 | 1997-11-30 | 1997-03-01 |
55371 | 1997-11-01 | 23564 | 1 | 1 | 1 | 1997-03-25 | 1997-11-30 | 1997-03-01 |
55372 | 1997-03-01 | 23565 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55373 | 1997-03-01 | 23566 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55374 | 1997-03-01 | 23567 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55375 | 1997-03-01 | 23568 | 1 | 1 | 1 | 1997-03-25 | 1997-04-22 | 1997-03-01 |
55376 | 1997-04-01 | 23568 | 2 | 2 | 2 | 1997-03-25 | 1997-04-22 | 1997-03-01 |
55377 | 1997-03-01 | 23569 | 1 | 1 | 1 | 1997-03-25 | 1997-03-25 | 1997-03-01 |
55378 | 1997-03-01 | 23570 | 2 | 2 | 2 | 1997-03-25 | 1997-03-26 | 1997-03-01 |
55379 rows × 8 columns
In [24]:
user_life_month['isNew']=user_life_month.apply(lambda x:True if x.month==x.regDate else np.NaN,axis=1)
In [25]:
user_life_month_count=user_life_month.groupby('month').count().reset_index()
In [26]:
user_life_month_count
Out[26]:
month | ID | DATE | NUM | TOTAL | min | max | regDate | isNew | |
---|---|---|---|---|---|---|---|---|---|
0 | 1997-01-01 | 7846 | 7846 | 7846 | 7846 | 7846 | 7846 | 7846 | 7846 |
1 | 1997-02-01 | 9633 | 9633 | 9633 | 9633 | 9633 | 9633 | 9633 | 8476 |
2 | 1997-03-01 | 9524 | 9524 | 9524 | 9524 | 9524 | 9524 | 9524 | 7248 |
3 | 1997-04-01 | 2822 | 2822 | 2822 | 2822 | 2822 | 2822 | 2822 | 0 |
4 | 1997-05-01 | 2214 | 2214 | 2214 | 2214 | 2214 | 2214 | 2214 | 0 |
5 | 1997-06-01 | 2339 | 2339 | 2339 | 2339 | 2339 | 2339 | 2339 | 0 |
6 | 1997-07-01 | 2180 | 2180 | 2180 | 2180 | 2180 | 2180 | 2180 | 0 |
7 | 1997-08-01 | 1772 | 1772 | 1772 | 1772 | 1772 | 1772 | 1772 | 0 |
8 | 1997-09-01 | 1739 | 1739 | 1739 | 1739 | 1739 | 1739 | 1739 | 0 |
9 | 1997-10-01 | 1839 | 1839 | 1839 | 1839 | 1839 | 1839 | 1839 | 0 |
10 | 1997-11-01 | 2028 | 2028 | 2028 | 2028 | 2028 | 2028 | 2028 | 0 |
11 | 1997-12-01 | 1864 | 1864 | 1864 | 1864 | 1864 | 1864 | 1864 | 0 |
12 | 1998-01-01 | 1537 | 1537 | 1537 | 1537 | 1537 | 1537 | 1537 | 0 |
13 | 1998-02-01 | 1551 | 1551 | 1551 | 1551 | 1551 | 1551 | 1551 | 0 |
14 | 1998-03-01 | 2060 | 2060 | 2060 | 2060 | 2060 | 2060 | 2060 | 0 |
15 | 1998-04-01 | 1437 | 1437 | 1437 | 1437 | 1437 | 1437 | 1437 | 0 |
16 | 1998-05-01 | 1488 | 1488 | 1488 | 1488 | 1488 | 1488 | 1488 | 0 |
17 | 1998-06-01 | 1506 | 1506 | 1506 | 1506 | 1506 | 1506 | 1506 | 0 |
In [27]:
user_life_month_count['isOld']=pd.DataFrame(user_life_month_count['ID']-user_life_month_count['isNew'])
user_life_month_count.drop(columns=['DATE','NUM','TOTAL','min','max','regDate'],inplace=True)
In [28]:
user_life_month_count.rename(columns={'ID':'all'},inplace=True)
In [29]:
user_life_month_count
Out[29]:
month | all | isNew | isOld | |
---|---|---|---|---|
0 | 1997-01-01 | 7846 | 7846 | 0 |
1 | 1997-02-01 | 9633 | 8476 | 1157 |
2 | 1997-03-01 | 9524 | 7248 | 2276 |
3 | 1997-04-01 | 2822 | 0 | 2822 |
4 | 1997-05-01 | 2214 | 0 | 2214 |
5 | 1997-06-01 | 2339 | 0 | 2339 |
6 | 1997-07-01 | 2180 | 0 | 2180 |
7 | 1997-08-01 | 1772 | 0 | 1772 |
8 | 1997-09-01 | 1739 | 0 | 1739 |
9 | 1997-10-01 | 1839 | 0 | 1839 |
10 | 1997-11-01 | 2028 | 0 | 2028 |
11 | 1997-12-01 | 1864 | 0 | 1864 |
12 | 1998-01-01 | 1537 | 0 | 1537 |
13 | 1998-02-01 | 1551 | 0 | 1551 |
14 | 1998-03-01 | 2060 | 0 | 2060 |
15 | 1998-04-01 | 1437 | 0 | 1437 |
16 | 1998-05-01 | 1488 | 0 | 1488 |
17 | 1998-06-01 | 1506 | 0 | 1506 |
新用户购买集中在前三个月,三个月后无新用户购买,均为老用户购买
In [30]:
rfm=df.pivot_table(index='ID',values=['NUM','TOTAL','DATE'],aggfunc={'NUM':'sum','TOTAL':'sum','DATE':'max'})
rfm
Out[30]:
DATE | NUM | TOTAL | |
---|---|---|---|
ID | |||
1 | 1997-01-01 | 1 | 11.77 |
2 | 1997-01-12 | 6 | 89.00 |
3 | 1998-05-28 | 16 | 156.46 |
4 | 1997-12-12 | 7 | 100.50 |
5 | 1998-01-03 | 29 | 385.61 |
6 | 1997-01-01 | 1 | 20.99 |
7 | 1998-03-22 | 18 | 264.67 |
8 | 1998-03-29 | 18 | 197.66 |
9 | 1998-06-08 | 6 | 95.85 |
10 | 1997-01-21 | 3 | 39.31 |
11 | 1998-02-20 | 4 | 58.55 |
12 | 1997-01-01 | 4 | 57.06 |
13 | 1997-01-01 | 4 | 72.94 |
14 | 1997-01-01 | 2 | 29.92 |
15 | 1997-01-01 | 4 | 52.87 |
16 | 1997-09-10 | 8 | 79.87 |
17 | 1997-01-01 | 5 | 73.22 |
18 | 1997-01-04 | 1 | 14.96 |
19 | 1997-06-10 | 11 | 175.12 |
20 | 1997-01-18 | 46 | 653.01 |
21 | 1997-01-13 | 4 | 75.11 |
22 | 1997-01-01 | 1 | 14.37 |
23 | 1997-01-01 | 2 | 24.74 |
24 | 1998-01-20 | 4 | 57.77 |
25 | 1998-06-08 | 12 | 137.53 |
26 | 1997-01-26 | 6 | 102.69 |
27 | 1997-01-12 | 10 | 135.87 |
28 | 1997-03-08 | 7 | 90.99 |
29 | 1998-04-26 | 28 | 435.81 |
30 | 1997-02-14 | 2 | 28.34 |
... | ... | ... | ... |
23541 | 1997-04-02 | 2 | 57.34 |
23542 | 1997-03-25 | 5 | 77.43 |
23543 | 1997-03-25 | 2 | 50.76 |
23544 | 1998-01-24 | 12 | 134.63 |
23545 | 1997-03-25 | 1 | 24.99 |
23546 | 1997-03-25 | 1 | 13.97 |
23547 | 1997-04-07 | 2 | 23.54 |
23548 | 1997-03-25 | 2 | 23.54 |
23549 | 1997-03-25 | 2 | 27.13 |
23550 | 1997-03-25 | 2 | 25.28 |
23551 | 1997-09-11 | 12 | 264.63 |
23552 | 1997-04-03 | 4 | 49.38 |
23553 | 1997-03-28 | 8 | 98.58 |
23554 | 1998-02-01 | 3 | 36.37 |
23555 | 1998-06-10 | 14 | 189.18 |
23556 | 1998-06-07 | 15 | 203.00 |
23557 | 1997-03-25 | 1 | 14.37 |
23558 | 1998-02-25 | 11 | 145.60 |
23559 | 1997-06-27 | 8 | 111.65 |
23560 | 1997-03-25 | 1 | 18.36 |
23561 | 1998-05-29 | 6 | 83.46 |
23562 | 1997-03-25 | 2 | 29.33 |
23563 | 1997-10-04 | 3 | 58.75 |
23564 | 1997-11-30 | 5 | 70.01 |
23565 | 1997-03-25 | 1 | 11.77 |
23566 | 1997-03-25 | 2 | 36.00 |
23567 | 1997-03-25 | 1 | 20.97 |
23568 | 1997-04-22 | 6 | 121.70 |
23569 | 1997-03-25 | 2 | 25.74 |
23570 | 1997-03-26 | 5 | 94.08 |
23570 rows × 3 columns
In [31]:
rfm['R']=-(rfm['DATE']-rfm['DATE'].max())/np.timedelta64(1,'D')
rfm
Out[31]:
DATE | NUM | TOTAL | R | |
---|---|---|---|---|
ID | ||||
1 | 1997-01-01 | 1 | 11.77 | 545.0 |
2 | 1997-01-12 | 6 | 89.00 | 534.0 |
3 | 1998-05-28 | 16 | 156.46 | 33.0 |
4 | 1997-12-12 | 7 | 100.50 | 200.0 |
5 | 1998-01-03 | 29 | 385.61 | 178.0 |
6 | 1997-01-01 | 1 | 20.99 | 545.0 |
7 | 1998-03-22 | 18 | 264.67 | 100.0 |
8 | 1998-03-29 | 18 | 197.66 | 93.0 |
9 | 1998-06-08 | 6 | 95.85 | 22.0 |
10 | 1997-01-21 | 3 | 39.31 | 525.0 |
11 | 1998-02-20 | 4 | 58.55 | 130.0 |
12 | 1997-01-01 | 4 | 57.06 | 545.0 |
13 | 1997-01-01 | 4 | 72.94 | 545.0 |
14 | 1997-01-01 | 2 | 29.92 | 545.0 |
15 | 1997-01-01 | 4 | 52.87 | 545.0 |
16 | 1997-09-10 | 8 | 79.87 | 293.0 |
17 | 1997-01-01 | 5 | 73.22 | 545.0 |
18 | 1997-01-04 | 1 | 14.96 | 542.0 |
19 | 1997-06-10 | 11 | 175.12 | 385.0 |
20 | 1997-01-18 | 46 | 653.01 | 528.0 |
21 | 1997-01-13 | 4 | 75.11 | 533.0 |
22 | 1997-01-01 | 1 | 14.37 | 545.0 |
23 | 1997-01-01 | 2 | 24.74 | 545.0 |
24 | 1998-01-20 | 4 | 57.77 | 161.0 |
25 | 1998-06-08 | 12 | 137.53 | 22.0 |
26 | 1997-01-26 | 6 | 102.69 | 520.0 |
27 | 1997-01-12 | 10 | 135.87 | 534.0 |
28 | 1997-03-08 | 7 | 90.99 | 479.0 |
29 | 1998-04-26 | 28 | 435.81 | 65.0 |
30 | 1997-02-14 | 2 | 28.34 | 501.0 |
... | ... | ... | ... | ... |
23541 | 1997-04-02 | 2 | 57.34 | 454.0 |
23542 | 1997-03-25 | 5 | 77.43 | 462.0 |
23543 | 1997-03-25 | 2 | 50.76 | 462.0 |
23544 | 1998-01-24 | 12 | 134.63 | 157.0 |
23545 | 1997-03-25 | 1 | 24.99 | 462.0 |
23546 | 1997-03-25 | 1 | 13.97 | 462.0 |
23547 | 1997-04-07 | 2 | 23.54 | 449.0 |
23548 | 1997-03-25 | 2 | 23.54 | 462.0 |
23549 | 1997-03-25 | 2 | 27.13 | 462.0 |
23550 | 1997-03-25 | 2 | 25.28 | 462.0 |
23551 | 1997-09-11 | 12 | 264.63 | 292.0 |
23552 | 1997-04-03 | 4 | 49.38 | 453.0 |
23553 | 1997-03-28 | 8 | 98.58 | 459.0 |
23554 | 1998-02-01 | 3 | 36.37 | 149.0 |
23555 | 1998-06-10 | 14 | 189.18 | 20.0 |
23556 | 1998-06-07 | 15 | 203.00 | 23.0 |
23557 | 1997-03-25 | 1 | 14.37 | 462.0 |
23558 | 1998-02-25 | 11 | 145.60 | 125.0 |
23559 | 1997-06-27 | 8 | 111.65 | 368.0 |
23560 | 1997-03-25 | 1 | 18.36 | 462.0 |
23561 | 1998-05-29 | 6 | 83.46 | 32.0 |
23562 | 1997-03-25 | 2 | 29.33 | 462.0 |
23563 | 1997-10-04 | 3 | 58.75 | 269.0 |
23564 | 1997-11-30 | 5 | 70.01 | 212.0 |
23565 | 1997-03-25 | 1 | 11.77 | 462.0 |
23566 | 1997-03-25 | 2 | 36.00 | 462.0 |
23567 | 1997-03-25 | 1 | 20.97 | 462.0 |
23568 | 1997-04-22 | 6 | 121.70 | 434.0 |
23569 | 1997-03-25 | 2 | 25.74 | 462.0 |
23570 | 1997-03-26 | 5 | 94.08 | 461.0 |
23570 rows × 4 columns
In [32]:
rfm.rename(columns={'NUM':'F','TOTAL':'M'},inplace=True)
In [33]:
rfm.drop(columns=['DATE'],inplace=True)
rfm
Out[33]:
F | M | R | |
---|---|---|---|
ID | |||
1 | 1 | 11.77 | 545.0 |
2 | 6 | 89.00 | 534.0 |
3 | 16 | 156.46 | 33.0 |
4 | 7 | 100.50 | 200.0 |
5 | 29 | 385.61 | 178.0 |
6 | 1 | 20.99 | 545.0 |
7 | 18 | 264.67 | 100.0 |
8 | 18 | 197.66 | 93.0 |
9 | 6 | 95.85 | 22.0 |
10 | 3 | 39.31 | 525.0 |
11 | 4 | 58.55 | 130.0 |
12 | 4 | 57.06 | 545.0 |
13 | 4 | 72.94 | 545.0 |
14 | 2 | 29.92 | 545.0 |
15 | 4 | 52.87 | 545.0 |
16 | 8 | 79.87 | 293.0 |
17 | 5 | 73.22 | 545.0 |
18 | 1 | 14.96 | 542.0 |
19 | 11 | 175.12 | 385.0 |
20 | 46 | 653.01 | 528.0 |
21 | 4 | 75.11 | 533.0 |
22 | 1 | 14.37 | 545.0 |
23 | 2 | 24.74 | 545.0 |
24 | 4 | 57.77 | 161.0 |
25 | 12 | 137.53 | 22.0 |
26 | 6 | 102.69 | 520.0 |
27 | 10 | 135.87 | 534.0 |
28 | 7 | 90.99 | 479.0 |
29 | 28 | 435.81 | 65.0 |
30 | 2 | 28.34 | 501.0 |
... | ... | ... | ... |
23541 | 2 | 57.34 | 454.0 |
23542 | 5 | 77.43 | 462.0 |
23543 | 2 | 50.76 | 462.0 |
23544 | 12 | 134.63 | 157.0 |
23545 | 1 | 24.99 | 462.0 |
23546 | 1 | 13.97 | 462.0 |
23547 | 2 | 23.54 | 449.0 |
23548 | 2 | 23.54 | 462.0 |
23549 | 2 | 27.13 | 462.0 |
23550 | 2 | 25.28 | 462.0 |
23551 | 12 | 264.63 | 292.0 |
23552 | 4 | 49.38 | 453.0 |
23553 | 8 | 98.58 | 459.0 |
23554 | 3 | 36.37 | 149.0 |
23555 | 14 | 189.18 | 20.0 |
23556 | 15 | 203.00 | 23.0 |
23557 | 1 | 14.37 | 462.0 |
23558 | 11 | 145.60 | 125.0 |
23559 | 8 | 111.65 | 368.0 |
23560 | 1 | 18.36 | 462.0 |
23561 | 6 | 83.46 | 32.0 |
23562 | 2 | 29.33 | 462.0 |
23563 | 3 | 58.75 | 269.0 |
23564 | 5 | 70.01 | 212.0 |
23565 | 1 | 11.77 | 462.0 |
23566 | 2 | 36.00 | 462.0 |
23567 | 1 | 20.97 | 462.0 |
23568 | 6 | 121.70 | 434.0 |
23569 | 2 | 25.74 | 462.0 |
23570 | 5 | 94.08 | 461.0 |
23570 rows × 3 columns
In [34]:
def rfm_fumc(x):
level = x.apply(lambda x:'1' if x>=1 else '0')
label=level.R+level.F+level.M
d={
'111':'重要价值客户',
'011':'重要保持客户',
'101':'重要发展客户',
'001':'重要挽留客户',
'110':'一般价值客户',
'010':'一般保持客户',
'100':'一般发展客户',
'000':'一般挽留客户',
}
result=d[label]
return result
In [35]:
rfm['label']=rfm[['R','F','M']].apply(lambda x: x-x.mean()).apply(rfm_fumc,axis=1)
In [36]:
rfm.loc[rfm.label=='重要价值客户','color']='g'
rfm.loc[~(rfm.label=='重要价值客户'),'color']='r'
rfm.plot.scatter('F','R',c=rfm.color)
Out[36]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6c26a550>
In [37]:
rfm.groupby('label').sum()
Out[37]:
F | M | R | |
---|---|---|---|
label | |||
一般价值客户 | 182 | 1767.11 | 8512.0 |
一般保持客户 | 492 | 5100.77 | 7782.0 |
一般发展客户 | 29915 | 445233.28 | 6983699.0 |
一般挽留客户 | 15428 | 215075.77 | 621894.0 |
重要价值客户 | 9849 | 147180.09 | 286676.0 |
重要保持客户 | 105509 | 1555586.51 | 476502.0 |
重要发展客户 | 2322 | 49905.80 | 174340.0 |
重要挽留客户 | 4184 | 80466.30 | 96009.0 |
从RFM分层可知,大部分用户为重要保持客户,但是这是由于极值的影响,所有RFM的划分标准应该以业务为准
- 尽量用小部分的用户覆盖大部分的额度
- 不要为了数据好看划分等级
In [38]:
pivoted_counts=df.pivot_table(index='ID',columns='month',values='DATE',aggfunc='count').fillna(0)
pivoted_counts
Out[38]:
month | 1997-01-01 00:00:00 | 1997-02-01 00:00:00 | 1997-03-01 00:00:00 | 1997-04-01 00:00:00 | 1997-05-01 00:00:00 | 1997-06-01 00:00:00 | 1997-07-01 00:00:00 | 1997-08-01 00:00:00 | 1997-09-01 00:00:00 | 1997-10-01 00:00:00 | 1997-11-01 00:00:00 | 1997-12-01 00:00:00 | 1998-01-01 00:00:00 | 1998-02-01 00:00:00 | 1998-03-01 00:00:00 | 1998-04-01 00:00:00 | 1998-05-01 00:00:00 | 1998-06-01 00:00:00 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ID | ||||||||||||||||||
1 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
4 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
5 | 2.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
6 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
7 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
8 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
9 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
10 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
11 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
12 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
13 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
14 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
15 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
16 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
17 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
18 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
19 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
21 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
22 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
24 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
25 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 2.0 |
26 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
27 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
28 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
29 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 0.0 | 2.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
30 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
23541 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23542 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23543 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23544 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23545 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23546 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23547 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23548 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23549 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23550 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23551 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23552 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23553 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23554 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23555 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
23556 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
23557 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23558 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23559 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23560 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23561 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
23562 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23563 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23564 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23565 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23566 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23567 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23568 | 0.0 | 0.0 | 1.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23569 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23570 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23570 rows × 18 columns
In [39]:
df_purchase=pivoted_counts.applymap(lambda x:1 if x>0 else 0)
df_purchase
Out[39]:
month | 1997-01-01 00:00:00 | 1997-02-01 00:00:00 | 1997-03-01 00:00:00 | 1997-04-01 00:00:00 | 1997-05-01 00:00:00 | 1997-06-01 00:00:00 | 1997-07-01 00:00:00 | 1997-08-01 00:00:00 | 1997-09-01 00:00:00 | 1997-10-01 00:00:00 | 1997-11-01 00:00:00 | 1997-12-01 00:00:00 | 1998-01-01 00:00:00 | 1998-02-01 00:00:00 | 1998-03-01 00:00:00 | 1998-04-01 00:00:00 | 1998-05-01 00:00:00 | 1998-06-01 00:00:00 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ID | ||||||||||||||||||
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
5 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
8 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
9 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
10 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
11 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
12 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
13 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
14 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
15 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
16 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
17 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
18 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
19 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
20 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
21 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
22 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
24 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
25 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
26 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
27 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
28 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
29 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
30 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
23541 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23542 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23543 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23544 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
23545 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23546 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23547 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23548 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23549 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23550 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23551 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23552 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23553 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23554 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
23555 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
23556 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
23557 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23558 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
23559 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23560 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23561 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
23562 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23563 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23564 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23565 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23566 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23567 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23568 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23569 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23570 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23570 rows × 18 columns
In [40]:
def active_status(data):
status=[]
for i in range(18):
if data[i]==0:
if len(status)>0:
if status[i-1]=='unreg':
status.append('unreg')
else:
status.append('unactive')
else:
status.append('unreg')
else:
if len(status)==0:
status.append('new')
else:
if status[i-1]=='unactive':
status.append('return')
elif status[i-1]=='unreg':
status.append('new')
else:
status.append('active')
return status
In [41]:
puchaser_status=df_purchase.apply(active_status,axis=1)
In [42]:
puchaser_status
Out[42]:
month | 1997-01-01 00:00:00 | 1997-02-01 00:00:00 | 1997-03-01 00:00:00 | 1997-04-01 00:00:00 | 1997-05-01 00:00:00 | 1997-06-01 00:00:00 | 1997-07-01 00:00:00 | 1997-08-01 00:00:00 | 1997-09-01 00:00:00 | 1997-10-01 00:00:00 | 1997-11-01 00:00:00 | 1997-12-01 00:00:00 | 1998-01-01 00:00:00 | 1998-02-01 00:00:00 | 1998-03-01 00:00:00 | 1998-04-01 00:00:00 | 1998-05-01 00:00:00 | 1998-06-01 00:00:00 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ID | ||||||||||||||||||
1 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
2 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
3 | new | unactive | return | active | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive | return | unactive |
4 | new | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive |
5 | new | active | unactive | return | active | active | active | unactive | return | unactive | unactive | return | active | unactive | unactive | unactive | unactive | unactive |
6 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
7 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive |
8 | new | active | unactive | unactive | unactive | return | active | unactive | unactive | unactive | return | active | unactive | unactive | return | unactive | unactive | unactive |
9 | new | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return |
10 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
11 | new | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive |
12 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
13 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
14 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
15 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
16 | new | unactive | unactive | unactive | unactive | unactive | return | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
17 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
18 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
19 | new | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
20 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
21 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
22 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
24 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive |
25 | new | unactive | unactive | unactive | unactive | unactive | return | active | unactive | return | unactive | unactive | unactive | unactive | unactive | return | active | active |
26 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
27 | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
28 | new | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
29 | new | active | active | active | active | unactive | return | unactive | return | unactive | return | unactive | unactive | unactive | unactive | return | unactive | unactive |
30 | new | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
23541 | unreg | unreg | new | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23542 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23543 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23544 | unreg | unreg | new | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive |
23545 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23546 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23547 | unreg | unreg | new | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23548 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23549 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23550 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23551 | unreg | unreg | new | unactive | unactive | return | unactive | return | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23552 | unreg | unreg | new | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23553 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23554 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive |
23555 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | return | unactive | unactive | unactive | unactive | return | active |
23556 | unreg | unreg | new | unactive | unactive | return | active | unactive | return | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | return |
23557 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23558 | unreg | unreg | new | unactive | return | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive |
23559 | unreg | unreg | new | unactive | return | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23560 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23561 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | return | unactive |
23562 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23563 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23564 | unreg | unreg | new | unactive | return | unactive | unactive | unactive | unactive | unactive | return | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23565 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23566 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23567 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23568 | unreg | unreg | new | active | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23569 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23570 | unreg | unreg | new | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive | unactive |
23570 rows × 18 columns
In [43]:
purchase_static_ct=puchaser_status.replace('unreg',np.nan).apply(lambda x:pd.value_counts(x))
purchase_static_ct
Out[43]:
month | 1997-01-01 00:00:00 | 1997-02-01 00:00:00 | 1997-03-01 00:00:00 | 1997-04-01 00:00:00 | 1997-05-01 00:00:00 | 1997-06-01 00:00:00 | 1997-07-01 00:00:00 | 1997-08-01 00:00:00 | 1997-09-01 00:00:00 | 1997-10-01 00:00:00 | 1997-11-01 00:00:00 | 1997-12-01 00:00:00 | 1998-01-01 00:00:00 | 1998-02-01 00:00:00 | 1998-03-01 00:00:00 | 1998-04-01 00:00:00 | 1998-05-01 00:00:00 | 1998-06-01 00:00:00 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
active | NaN | 1157.0 | 1681 | 1773.0 | 852.0 | 747.0 | 746.0 | 604.0 | 528.0 | 532.0 | 624.0 | 632.0 | 512.0 | 472.0 | 571.0 | 518.0 | 459.0 | 446.0 |
new | 7846.0 | 8476.0 | 7248 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
return | NaN | NaN | 595 | 1049.0 | 1362.0 | 1592.0 | 1434.0 | 1168.0 | 1211.0 | 1307.0 | 1404.0 | 1232.0 | 1025.0 | 1079.0 | 1489.0 | 919.0 | 1029.0 | 1060.0 |
unactive | NaN | 6689.0 | 14046 | 20748.0 | 21356.0 | 21231.0 | 21390.0 | 21798.0 | 21831.0 | 21731.0 | 21542.0 | 21706.0 | 22033.0 | 22019.0 | 21510.0 | 22133.0 | 22082.0 | 22064.0 |
In [44]:
purchase_static_ct.fillna(0).T.apply(lambda x:x/x.sum(),axis=1)
Out[44]:
active | new | return | unactive | |
---|---|---|---|---|
month | ||||
1997-01-01 | 0.000000 | 1.000000 | 0.000000 | 0.000000 |
1997-02-01 | 0.070886 | 0.519299 | 0.000000 | 0.409815 |
1997-03-01 | 0.071319 | 0.307510 | 0.025244 | 0.595927 |
1997-04-01 | 0.075223 | 0.000000 | 0.044506 | 0.880272 |
1997-05-01 | 0.036148 | 0.000000 | 0.057785 | 0.906067 |
1997-06-01 | 0.031693 | 0.000000 | 0.067543 | 0.900764 |
1997-07-01 | 0.031650 | 0.000000 | 0.060840 | 0.907510 |
1997-08-01 | 0.025626 | 0.000000 | 0.049555 | 0.924820 |
1997-09-01 | 0.022401 | 0.000000 | 0.051379 | 0.926220 |
1997-10-01 | 0.022571 | 0.000000 | 0.055452 | 0.921977 |
1997-11-01 | 0.026474 | 0.000000 | 0.059567 | 0.913958 |
1997-12-01 | 0.026814 | 0.000000 | 0.052270 | 0.920916 |
1998-01-01 | 0.021723 | 0.000000 | 0.043487 | 0.934790 |
1998-02-01 | 0.020025 | 0.000000 | 0.045779 | 0.934196 |
1998-03-01 | 0.024226 | 0.000000 | 0.063174 | 0.912601 |
1998-04-01 | 0.021977 | 0.000000 | 0.038990 | 0.939033 |
1998-05-01 | 0.019474 | 0.000000 | 0.043657 | 0.936869 |
1998-06-01 | 0.018922 | 0.000000 | 0.044972 | 0.936105 |
由上表可知,每月的用户消费状态变化
- 活跃用户,持续消费的用户,对应的是消费运营的质量
- 回流用户,之前不消费本月才消费,对应的是换回运营
- 不活跃用户,对应的是流失
In [45]:
diff=grouped_user.apply(lambda x :x.DATE-x.DATE.shift())
In [46]:
diff.reset_index().DATE.describe()
Out[46]:
count 46089 mean 68 days 23:22:13.567662 std 91 days 00:47:33.924168 min 0 days 00:00:00 25% 10 days 00:00:00 50% 31 days 00:00:00 75% 89 days 00:00:00 max 533 days 00:00:00 Name: DATE, dtype: object
In [47]:
(diff/np.timedelta64(1,'D')).hist(bins=20)
Out[47]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6e42ad30>
- 订单周期呈指数分布
- 用户的平均购买周期是68天
- 绝大部分用户的购买周期都低于100天
In [48]:
(user_life['max']-user_life['min']).describe()
Out[48]:
count 23570 mean 134 days 20:55:36.987696 std 180 days 13:46:43.039788 min 0 days 00:00:00 25% 0 days 00:00:00 50% 0 days 00:00:00 75% 294 days 00:00:00 max 544 days 00:00:00 dtype: object
In [49]:
((user_life['max']-user_life['min'])/np.timedelta64(1,'D')).hist(bins=40)
Out[49]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6c2622e8>
- 用户的生命周期受只购买一次的用户影响比较厉害(可以排除)
- 用户均消费134天, 中位数仅0天
In [50]:
u_1=((user_life['max']-user_life['min'])/np.timedelta64(1,'D')).reset_index()[0]
u_1[u_1>0].hist(bins=40)
Out[50]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c6ef7b518>
4.复购率与回购率分析¶
- 复购率
- 自然月内,购买多次的用户占总用户比
- 回购率
- 曾经购买过的用户在某一时期内的再次购买的占比
In [51]:
pivoted_counts
Out[51]:
month | 1997-01-01 00:00:00 | 1997-02-01 00:00:00 | 1997-03-01 00:00:00 | 1997-04-01 00:00:00 | 1997-05-01 00:00:00 | 1997-06-01 00:00:00 | 1997-07-01 00:00:00 | 1997-08-01 00:00:00 | 1997-09-01 00:00:00 | 1997-10-01 00:00:00 | 1997-11-01 00:00:00 | 1997-12-01 00:00:00 | 1998-01-01 00:00:00 | 1998-02-01 00:00:00 | 1998-03-01 00:00:00 | 1998-04-01 00:00:00 | 1998-05-01 00:00:00 | 1998-06-01 00:00:00 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ID | ||||||||||||||||||
1 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
4 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
5 | 2.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
6 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
7 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
8 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
9 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
10 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
11 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
12 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
13 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
14 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
15 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
16 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
17 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
18 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
19 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
21 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
22 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
24 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
25 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 2.0 |
26 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
27 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
28 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
29 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 0.0 | 2.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
30 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
23541 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23542 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23543 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23544 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23545 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23546 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23547 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23548 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23549 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23550 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23551 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23552 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23553 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23554 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23555 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |
23556 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
23557 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23558 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23559 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23560 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23561 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
23562 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23563 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23564 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23565 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23566 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23567 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23568 | 0.0 | 0.0 | 1.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23569 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23570 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
23570 rows × 18 columns
In [52]:
(((pivoted_counts[pivoted_counts>1]).count())/((pivoted_counts[pivoted_counts>0]).count())).plot()
Out[52]:
<matplotlib.axes._subplots.AxesSubplot at 0x21c70a66f28>