I am trying to generate 15-minute OHLCV data from a list of prices and amounts. Example data:
price amount
unix_timestamp
2018-01-05 12:33:52 15861.00000 0.194755
2018-01-05 12:33:52 15860.00000 0.050000
2018-01-05 12:33:53 15860.00000 0.100000
2018-01-05 12:33:53 15860.00000 0.234208
2018-01-05 12:33:54 15860.00000 0.021911
2018-01-05 12:33:56 15861.00000 0.205245
...
Here is how I generate the OHLCV data, using ffill to fill the missing intervals:
ohlcv = data.resample(minutes).agg({
    "price": "ohlc",
    "amount": "sum",
}).rename(columns={'amount': 'volume'}).ffill()
However, the result contains a volume of 0 for intervals with no data, because the sum of an empty interval is 0 rather than NaN, so nothing gets forward filled:
open high low close volume
unix_timestamp
2018-01-05 12:30:00 15861.0 15946.0 15860.0 15891.0 246.554694
2018-01-05 12:45:00 15893.0 15912.0 15780.0 15877.0 608.036132
2018-01-05 13:00:00 15877.0 15950.0 15862.0 15950.0 303.742717
2018-01-05 13:15:00 15947.0 15956.0 15900.0 15939.0 347.864213
2018-01-05 13:30:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-05 13:45:00 15947.0 15956.0 15900.0 15939.0 0.000000
...
2018-01-22 10:45:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-22 11:00:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-22 11:15:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-22 11:30:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-22 11:45:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-22 12:00:00 15947.0 15956.0 15900.0 15939.0 0.000000
2018-01-22 12:15:00 11327.0 11327.0 11250.0 11250.0 193.271647
How do I forward fill the volume instead of getting zeroes when the sum should be NaN?
1 Answer
The problem is that the sum function returns 0 for bins that contain only NaNs or no rows at all.
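As a small illustration of that behavior (a sketch using throwaway Series named empty and all_nan, which are not part of the question's data), sum yields 0 where other reductions yield NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for a resample bin with no trades
empty = pd.Series([], dtype="float64")
all_nan = pd.Series([np.nan, np.nan])

# sum skips NaNs by default and returns 0.0 when nothing is left to add
empty_sum = empty.sum()
all_nan_sum = all_nan.sum()

# max (like the other OHLC reductions) has no value to report, so it gives NaN
empty_max = empty.max()

print(empty_sum, all_nan_sum, empty_max)
```

This is why the price columns of an empty bin are NaN and can be forward filled, while the volume column is already 0 and is left alone by ffill.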
The solution is to replace those zeros with NaN using mask, and then apply ffill:
print(data)
price amount
unix_timestamp
2018-01-05 12:33:52 15861.0 0.194755
2018-01-05 12:33:52 15860.0 0.050000
2018-01-05 12:33:53 15860.0 0.100000
2018-01-05 13:33:53 15860.0 0.234208
2018-01-05 14:33:54 15860.0 0.021911
2018-01-05 16:33:56 15861.0 0.205245
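# Resample into 15-minute bins; empty bins get NaN prices but a volume of 0
ohlcv = data.resample('15min').agg({
    "price": "ohlc",
    "amount": "sum",
}).rename(columns={'amount': 'volume'})
# Bins with no trades have a NaN open price
m = ohlcv.loc[:, ('price', 'open')].isnull()
# Turn the zero volume of those bins back into NaN so ffill can fill it
ohlcv.loc[:, ('volume', 'volume')] = ohlcv.loc[:, ('volume', 'volume')].mask(m)
ohlcv = ohlcv.ffill()
print(ohlcv)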
price volume
open high low close volume
unix_timestamp
2018-01-05 12:30:00 15861.0 15861.0 15860.0 15860.0 0.344755
2018-01-05 12:45:00 15861.0 15861.0 15860.0 15860.0 0.344755
2018-01-05 13:00:00 15861.0 15861.0 15860.0 15860.0 0.344755
2018-01-05 13:15:00 15861.0 15861.0 15860.0 15860.0 0.344755
2018-01-05 13:30:00 15860.0 15860.0 15860.0 15860.0 0.234208
2018-01-05 13:45:00 15860.0 15860.0 15860.0 15860.0 0.234208
2018-01-05 14:00:00 15860.0 15860.0 15860.0 15860.0 0.234208
2018-01-05 14:15:00 15860.0 15860.0 15860.0 15860.0 0.234208
2018-01-05 14:30:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 14:45:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 15:00:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 15:15:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 15:30:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 15:45:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 16:00:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 16:15:00 15860.0 15860.0 15860.0 15860.0 0.021911
2018-01-05 16:30:00 15861.0 15861.0 15861.0 15861.0 0.205245
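A possible alternative, not part of the answer above: the resampler's sum accepts min_count, and with min_count=1 an empty bin sums to NaN instead of 0, so no mask is needed. The sketch below rebuilds the sample data shown above and aggregates price and amount separately, which is an assumption made to keep the columns flat (open, high, low, close, volume) rather than a MultiIndex.

```python
import pandas as pd

# Sample trades copied from the printed data above
idx = pd.to_datetime([
    "2018-01-05 12:33:52", "2018-01-05 12:33:52", "2018-01-05 12:33:53",
    "2018-01-05 13:33:53", "2018-01-05 14:33:54", "2018-01-05 16:33:56",
])
data = pd.DataFrame(
    {
        "price": [15861.0, 15860.0, 15860.0, 15860.0, 15860.0, 15861.0],
        "amount": [0.194755, 0.05, 0.1, 0.234208, 0.021911, 0.205245],
    },
    index=idx,
)
data.index.name = "unix_timestamp"

resampled = data.resample("15min")

# OHLC of the price; bins without trades come out as NaN
ohlc = resampled["price"].ohlc()

# min_count=1 requires at least one valid value, so empty bins are NaN, not 0
volume = resampled["amount"].sum(min_count=1).rename("volume")

# Combine and forward fill prices and volume together
ohlcv = pd.concat([ohlc, volume], axis=1).ffill()
print(ohlcv)
```

On this sample the values should match the output above, only with single-level column names.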