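pandas resample returns nothing

Question:

I am learning to use the pandas resample() function, but the code below does not return anything, contrary to what I expected. I am resampling a time series to daily frequency.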

import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 

range = pd.date_range('2015-01-01','2015-12-31',freq='15min') 
df = pd.DataFrame(index = range) 

df['speed'] = np.random.randint(low=0, high=60, size=len(df.index)) 
df['distance'] = df['speed'] * 0.25 
df['cumulative_distance'] = df.distance.cumsum() 

print(df.head())

weekly_summary = pd.DataFrame() 
weekly_summary['speed'] = df.speed.resample('D').mean() 
weekly_summary['distance'] = df.distance.resample('D').sum() 

print(weekly_summary.head())

Output:

                     speed  distance  cumulative_distance
2015-01-01 00:00:00     40     10.00                10.00
2015-01-01 00:15:00      6      1.50                11.50
2015-01-01 00:30:00     31      7.75                19.25
2015-01-01 00:45:00     41     10.25                29.50
2015-01-01 01:00:00     59     14.75                44.25

[5 rows x 3 columns]
Empty DataFrame
Columns: [speed, distance]
Index: []

[0 rows x 2 columns]

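This works for me. What version of pandas are you using? Mine is 0.19.1. Perhaps the confusing part is that you first create an empty df and then assign a new column to it; it may be that the old version does not enlarge the df. – EdChum

My version is 0.13.1; this probably fails in older versions. – daydayup

@daydayup You may want to update pandas, a lot has changed in a few years :) – miradulo

Answer

How you do this depends on your pandas version.

In pandas 0.19.0, your code works as expected: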

In [7]: pd.__version__ 
Out[7]: '0.19.0' 

In [8]: df.speed.resample('D').mean().head() 
Out[8]: 
2015-01-01    28.562500
2015-01-02    30.302083
2015-01-03    30.864583
2015-01-04    29.197917
2015-01-05    30.708333
Freq: D, Name: speed, dtype: float64
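
To make the expected behaviour easy to check by hand, here is a minimal sketch for a recent pandas version. The two-day index and the constant speeds (10 on the first day, 20 on the second) are assumptions chosen for illustration; they are not the asker's random data.

```python
import pandas as pd

# Two days of 15-minute samples (96 per day), with values that are easy to verify.
index = pd.date_range('2015-01-01', periods=2 * 96, freq='15min')
df = pd.DataFrame({'speed': [10] * 96 + [20] * 96}, index=index)
df['distance'] = df['speed'] * 0.25

# resample('D') groups the rows by calendar day; the aggregation is applied per group.
daily_speed = df['speed'].resample('D').mean()
daily_distance = df['distance'].resample('D').sum()

print(daily_speed)      # one mean per day
print(daily_distance)   # one total per day
```

Each day has 96 samples, so the daily distance should be 96 * 2.5 = 240 for the first day and 96 * 5 = 480 for the second.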

In older versions your solution may not work, but at least in 0.14.1 you can adjust it like this:

>>> pd.__version__ 
'0.14.1' 
>>> df.speed.resample('D').mean() 
29.41087328767123 
>>> df.speed.resample('D', how='mean').head() 
2015-01-01    29.354167
2015-01-02    26.791667
2015-01-03    31.854167
2015-01-04    26.593750
2015-01-05    30.312500
Freq: D, Name: speed, dtype: float64
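
A likely reading of the scalar above: before pandas 0.18, resample computed its result immediately (with how='mean' as the default) rather than returning a deferred object, so .resample('D').mean() was the mean of the daily means, a single number. Assigning a single number to a column of an empty DataFrame adds the column but no rows, which would match the empty frame in the question. The sketch below imitates both steps on a current pandas; the one-week index and the seeded random data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Assumed sample data: one week of 15-minute readings with a fixed seed.
rng = np.random.default_rng(0)
index = pd.date_range('2015-01-01', periods=7 * 96, freq='15min')
speed = pd.Series(rng.integers(0, 60, size=len(index)), index=index, name='speed')

daily_mean = speed.resample('D').mean()    # one value per day (what the asker wanted)
mean_of_daily_means = daily_mean.mean()    # a single float (what the old call chain produced)

summary = pd.DataFrame()
summary['speed'] = mean_of_daily_means     # scalar into an empty frame: column added, no rows

print(type(mean_of_daily_means))
print(summary)
```

In the how='mean' form shown above, the old API returns the per-day Series directly, so the assignment has an index to work with.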

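Thanks for your help. Just upgraded to 0.19.0 to avoid the hassle. – daydayup

Answer

This looks like a problem with older pandas versions; newer versions enlarge the df when you assign a new column whose index does not have the same shape. What should work is to not create an empty df, but instead pass the result of the first resample call as the data argument of the DataFrame constructor: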

In [8]: 
range = pd.date_range('2015-01-01','2015-12-31',freq='15min') 
df = pd.DataFrame(index = range) 
df['speed'] = np.random.randint(low=0, high=60, size=len(df.index)) 
df['distance'] = df['speed'] * 0.25 
df['cumulative_distance'] = df.distance.cumsum() 
print(df.head())
weekly_summary = pd.DataFrame(df.speed.resample('D').mean()) 
weekly_summary['distance'] = df.distance.resample('D').sum() 
print(weekly_summary.head()) 

                     speed  distance  cumulative_distance
2015-01-01 00:00:00     28       7.0                  7.0
2015-01-01 00:15:00      8       2.0                  9.0
2015-01-01 00:30:00     10       2.5                 11.5
2015-01-01 00:45:00     56      14.0                 25.5
2015-01-01 01:00:00      6       1.5                 27.0
                speed  distance
2015-01-01  27.895833    669.50
2015-01-02  29.041667    697.00
2015-01-03  27.104167    650.50
2015-01-04  28.427083    682.25
2015-01-05  27.854167    668.50

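Here I pass the result of the resample call as the data argument of the DataFrame constructor; it takes the index and the column name from that Series and creates a single-column df: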

weekly_summary = pd.DataFrame(df.speed.resample('D').mean())

Subsequent column assignments should then work as expected.
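
On a recent pandas version, the same summary can also be built in one step, with no empty DataFrame involved, by resampling the whole frame and giving agg one aggregation per column. This is only a sketch under assumptions: it needs the deferred resample API (pandas 0.18 or later), it uses a one-week seeded sample rather than the original data, and the name daily_summary is my own choice.

```python
import numpy as np
import pandas as pd

# Assumed sample data: one week of 15-minute readings with a fixed seed.
rng = np.random.default_rng(0)
index = pd.date_range('2015-01-01', periods=7 * 96, freq='15min')
df = pd.DataFrame({'speed': rng.integers(0, 60, size=len(index))}, index=index)
df['distance'] = df['speed'] * 0.25

# One resample call; the dict maps each column to its own aggregation.
daily_summary = df.resample('D').agg({'speed': 'mean', 'distance': 'sum'})

print(daily_summary.head())
```

The result already carries the daily index, so there is no later column assignment that could go wrong.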

