How to add a time index from date and time columns in pandas
Question:
I have an OHLC DataFrame as follows:
trade_date  trade_time  open_price  high_price  low_price  close_price  volumn
19911223    15:00       27.70       27.9        27.60      27.80        1270
19911224    15:00       27.90       29.3        27.00      29.05        1050
19911225    15:00       29.15       30.0        29.10      29.30        2269
19911226    15:00       29.30       29.3        28.00      28.00        1918
19911227    15:00       28.00       28.5        28.00      28.45        2105
19911228    15:00       28.40       29.3        28.40      29.25        1116
19911230    15:00       29.30       29.4        28.80      28.80        1059
........
How can I combine the trade_date and trade_time columns into a time series index? I looked through similar questions, but they are all based on read_csv ....
Answer
Assuming trade_date has dtype int64 and trade_time is str, the following will work:
In [26]:
# use strptime to parse the combined date and time strings into a datetime
import datetime as dt
def datetime(x):
    return dt.datetime.strptime(str(x.trade_date) + '' + x.trade_time, '%Y%m%d%H:%M')
# create a datetime column; call apply row by row to do the conversion
df['datetime'] = df.apply(lambda row: datetime(row), axis=1)
# set the index to this datetime; by default set_index drops it as a column
df.set_index('datetime', inplace=True)
df
Out[26]:
                     trade_date trade_time  open_price  high_price  low_price  \
datetime
1991-12-23 15:00:00    19911223      15:00       27.70        27.9       27.6
1991-12-24 15:00:00    19911224      15:00       27.90        29.3       27.0
1991-12-25 15:00:00    19911225      15:00       29.15        30.0       29.1
1991-12-26 15:00:00    19911226      15:00       29.30        29.3       28.0
1991-12-27 15:00:00    19911227      15:00       28.00        28.5       28.0
1991-12-28 15:00:00    19911228      15:00       28.40        29.3       28.4
1991-12-30 15:00:00    19911230      15:00       29.30        29.4       28.8

                     close_price  volumn
datetime
1991-12-23 15:00:00        27.80    1270
1991-12-24 15:00:00        29.05    1050
1991-12-25 15:00:00        29.30    2269
1991-12-26 15:00:00        28.00    1918
1991-12-27 15:00:00        28.45    2105
1991-12-28 15:00:00        29.25    1116
1991-12-30 15:00:00        28.80    1059
In [27]:
df.index.dtype
Out[27]:
dtype('<M8[ns]')
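The row-by-row apply above can also be written as a single column-wide parse. The following is only a minimal sketch and not part of the original answer: it assumes trade_date is int64 and trade_time is an "HH:MM" string, uses just three of the sample rows, and keeps only close_price for brevity.

```python
import pandas as pd

# Sample rows copied from the question; trade_date is int64, trade_time is str.
df = pd.DataFrame({
    "trade_date": [19911223, 19911224, 19911225],
    "trade_time": ["15:00", "15:00", "15:00"],
    "close_price": [27.80, 29.05, 29.30],
})

# Build one string per row, e.g. "19911223 15:00", and parse the whole column at once.
stamp = df["trade_date"].astype(str) + " " + df["trade_time"]
df["datetime"] = pd.to_datetime(stamp, format="%Y%m%d %H:%M")

# set_index moves the column into the index and drops it from the columns by default.
df = df.set_index("datetime")
```

This avoids calling a Python function once per row, which tends to matter on large frames.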
Thanks, it works.... – firefoxuser 2014-09-30 01:22:13
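Answer
Here is a fully vectorized solution.
Convert the trade_date column to a datetime64[ns] dtype (it can be int64 or object dtype beforehand). Convert trade_time to a timedelta64[ns] dtype. You need to hint that the time is hh:mm by appending the seconds component.
Adding a datetime and a timedelta yields a datetime.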
In [5]: pd.to_datetime(df['trade_date'],format='%Y%m%d') + pd.to_timedelta(df['trade_time'] + ':00')
Out[5]:
0 1991-12-23 15:00:00
1 1991-12-24 15:00:00
2 1991-12-25 15:00:00
3 1991-12-26 15:00:00
4 1991-12-27 15:00:00
5 1991-12-28 15:00:00
6 1991-12-30 15:00:00
dtype: datetime64[ns]
Then you can set the index directly:
In [6]: df.index = pd.to_datetime(df['trade_date'],format='%Y%m%d') + pd.to_timedelta(df['trade_time'] + ':00')
In [7]: df
Out[7]:
                     trade_date trade_time  open_price  high_price  low_price  close_price  volumn
1991-12-23 15:00:00    19911223      15:00       27.70        27.9       27.6        27.80    1270
1991-12-24 15:00:00    19911224      15:00       27.90        29.3       27.0        29.05    1050
1991-12-25 15:00:00    19911225      15:00       29.15        30.0       29.1        29.30    2269
1991-12-26 15:00:00    19911226      15:00       29.30        29.3       28.0        28.00    1918
1991-12-27 15:00:00    19911227      15:00       28.00        28.5       28.0        28.45    2105
1991-12-28 15:00:00    19911228      15:00       28.40        29.3       28.4        29.25    1116
1991-12-30 15:00:00    19911230      15:00       29.30        29.4       28.8        28.80    1059
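For reuse, the vectorized steps above could be wrapped in a small helper. This is a sketch under stated assumptions, not code from the answer: the function name with_datetime_index, its parameters, and the choice to drop the two source columns are all hypothetical, and trade_time is assumed to hold "HH:MM" strings.

```python
import pandas as pd

def with_datetime_index(frame, date_col="trade_date", time_col="trade_time"):
    """Return a copy of frame indexed by the combined date and time columns.

    Hypothetical helper; the column names default to those in the question.
    """
    # astype(str) lets the same format string handle int64 and object columns.
    dates = pd.to_datetime(frame[date_col].astype(str), format="%Y%m%d")
    # Append ":00" so that "15:00" becomes the hh:mm:ss string "15:00:00".
    times = pd.to_timedelta(frame[time_col] + ":00")
    # Assumption: the source columns are redundant once the index exists.
    out = frame.drop(columns=[date_col, time_col])
    out.index = pd.DatetimeIndex(dates + times, name="datetime")
    return out

# Usage with two of the sample rows from the question.
df = pd.DataFrame({
    "trade_date": [19911223, 19911224],
    "trade_time": ["15:00", "15:00"],
    "close_price": [27.80, 29.05],
})
indexed = with_datetime_index(df)
```

The helper returns a new frame rather than modifying df in place, so the original columns remain available if they are still needed.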
Are 'trade_date' and 'trade_time' strings? – filmor 2014-09-29 09:11:18
I think you should accept Jeff's answer, as it will be much faster than mine – EdChum 2014-09-30 08:09:04