How to add a time index from date and time columns in pandas

Question:

I have an OHLC DataFrame as follows:

 
trade_date trade_time open_price high_price low_price close_price volumn 
    19911223  15:00  27.70  27.9  27.60  27.80 1270 
    19911224  15:00  27.90  29.3  27.00  29.05 1050 
    19911225  15:00  29.15  30.0  29.10  29.30 2269 
    19911226  15:00  29.30  29.3  28.00  28.00 1918 
    19911227  15:00  28.00  28.5  28.00  28.45 2105 
    19911228  15:00  28.40  29.3  28.40  29.25 1116 
    19911230  15:00  29.30  29.4  28.80  28.80 1059 
    ........ 

How can I combine the trade_date and trade_time columns into a time series index? The similar questions I have looked at are all based on read_csv ....

Are 'trade_date' and 'trade_time' strings? – filmor 2014-09-29 09:11:18

I think you should accept Jeff's answer, as it will be much faster than mine – EdChum 2014-09-30 08:09:04

Assuming trade_date has dtype int64 and trade_time is a str, the following will work:

In [26]: 
# use strptime to parse the combined date and time text into a datetime
import datetime as dt
def datetime(x):
    # trade_date is an int (e.g. 19911223), so convert it to str before joining
    return dt.datetime.strptime(str(x.trade_date) + ' ' + x.trade_time, '%Y%m%d %H:%M')
# create a datetime column by applying the conversion to each row
df['datetime'] = df.apply(datetime, axis=1)
# set the index to this column; by default set_index also removes it from the columns
df.set_index('datetime', inplace=True)
df 
Out[26]: 
        trade_date trade_time open_price high_price low_price \ 
datetime                   
1991-12-23 15:00:00 19911223  15:00  27.70  27.9  27.6 
1991-12-24 15:00:00 19911224  15:00  27.90  29.3  27.0 
1991-12-25 15:00:00 19911225  15:00  29.15  30.0  29.1 
1991-12-26 15:00:00 19911226  15:00  29.30  29.3  28.0 
1991-12-27 15:00:00 19911227  15:00  28.00  28.5  28.0 
1991-12-28 15:00:00 19911228  15:00  28.40  29.3  28.4 
1991-12-30 15:00:00 19911230  15:00  29.30  29.4  28.8 

        close_price volumn 
datetime         
1991-12-23 15:00:00  27.80 1270 
1991-12-24 15:00:00  29.05 1050 
1991-12-25 15:00:00  29.30 2269 
1991-12-26 15:00:00  28.00 1918 
1991-12-27 15:00:00  28.45 2105 
1991-12-28 15:00:00  29.25 1116 
1991-12-30 15:00:00  28.80 1059 

In [27]: 

df.index.dtype 
Out[27]: 
dtype('<M8[ns]') 

Thanks, it works.... – firefoxuser 2014-09-30 01:22:13

Here is a fully vectorized solution.

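Convert the trade_date column to a datetime64[ns] dtype (it can be int64 or object dtype beforehand). Convert trade_time to a timedelta64[ns] dtype. You need to hint that the time is in hh:mm form by appending a seconds component.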

Adding a datetime and a timedelta yields a datetime.

In [5]: pd.to_datetime(df['trade_date'],format='%Y%m%d') + pd.to_timedelta(df['trade_time'] + ':00') 
Out[5]: 
0 1991-12-23 15:00:00 
1 1991-12-24 15:00:00 
2 1991-12-25 15:00:00 
3 1991-12-26 15:00:00 
4 1991-12-27 15:00:00 
5 1991-12-28 15:00:00 
6 1991-12-30 15:00:00 
dtype: datetime64[ns] 
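
A small scalar sketch of the same arithmetic may help show what each piece contributes. Appending ':00' turns '15:00' into the unambiguous hh:mm:ss string '15:00:00', and adding the resulting timedelta to a midnight timestamp gives the combined datetime. The literal values are simply the first row of the sample data, and the variable names are chosen for illustration only.

```python
import pandas as pd

# First row of the sample data, used purely for illustration.
date = pd.to_datetime("19911223", format="%Y%m%d")  # midnight on 1991-12-23
offset = pd.to_timedelta("15:00" + ":00")           # parsed as hh:mm:ss, i.e. 15 hours

# datetime + timedelta -> datetime
combined = date + offset
print(combined)
```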

You can then set the index directly:

In [6]: df.index = pd.to_datetime(df['trade_date'],format='%Y%m%d') + pd.to_timedelta(df['trade_time'] + ':00') 

In [7]: df 
Out[7]: 
        trade_date trade_time open_price high_price low_price close_price volumn 
1991-12-23 15:00:00 19911223  15:00  27.70  27.9  27.6  27.80 1270 
1991-12-24 15:00:00 19911224  15:00  27.90  29.3  27.0  29.05 1050 
1991-12-25 15:00:00 19911225  15:00  29.15  30.0  29.1  29.30 2269 
1991-12-26 15:00:00 19911226  15:00  29.30  29.3  28.0  28.00 1918 
1991-12-27 15:00:00 19911227  15:00  28.00  28.5  28.0  28.45 2105 
1991-12-28 15:00:00 19911228  15:00  28.40  29.3  28.4  29.25 1116 
1991-12-30 15:00:00 19911230  15:00  29.30  29.4  28.8  28.80 1059
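
For anyone who wants to try this without the original file, here is a self-contained sketch of the vectorized approach. The rows are retyped from the question and only close_price is kept for brevity. The `.astype(str)` call is an added assumption so the snippet behaves the same whether trade_date is int64 or str, and dropping the two source columns afterwards is an optional extra step that is not part of the answer above.

```python
import pandas as pd

# Sample rows retyped from the question (only one price column kept for brevity).
df = pd.DataFrame({
    "trade_date": [19911223, 19911224, 19911225],
    "trade_time": ["15:00", "15:00", "15:00"],
    "close_price": [27.80, 29.05, 29.30],
})

# Vectorized combination: the date at midnight plus the time-of-day offset.
# astype(str) makes the date parsing independent of the column's original dtype.
combined = (pd.to_datetime(df["trade_date"].astype(str), format="%Y%m%d")
            + pd.to_timedelta(df["trade_time"] + ":00"))

# Use the combined values as the index, naming it "datetime".
df.index = pd.DatetimeIndex(combined, name="datetime")

# Optional: drop the source columns now that the index carries the same information.
df = df.drop(columns=["trade_date", "trade_time"])
print(df)
```

With a DatetimeIndex in place, date-based selection such as `df.loc["1991-12-24"]` should work on the result.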