What is the idiomatic way of converting a pandas DatetimeIndex to (an iterable of) Unix time? This is probably not the way to go:
[time.mktime(t.timetuple()) for t in my_data_frame.index.to_pydatetime()]
2 Answers
#1
68
As DatetimeIndex is an ndarray under the hood, you can do the conversion without a list comprehension (much faster).
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: from datetime import datetime
In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
...: index = pd.DatetimeIndex(dates)
...:
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000],
dtype=int64)
In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)
%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop
%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop
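The integer cast above relies on the index storing nanoseconds. As a hedged alternative sketch, assuming a reasonably recent pandas, you can subtract the epoch and floor-divide by a one-second Timedelta, which does not depend on the index's internal resolution. The variable names `epoch` and `unix_seconds` are only illustrative.

```python
from datetime import datetime

import pandas as pd

dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
index = pd.DatetimeIndex(dates)

# Subtracting the epoch gives a TimedeltaIndex; floor-dividing by one second
# gives whole seconds since 1970-01-01 as integers.
epoch = pd.Timestamp("1970-01-01")
unix_seconds = (index - epoch) // pd.Timedelta("1s")

print(list(unix_seconds))
```

This should give the same seconds as `index.astype(np.int64) // 10**9` for a timezone-naive, nanosecond-resolution index like the one in this answer.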
#2
31
Note: a Timestamp's value is just Unix time in nanoseconds (so floor-divide it by 10**9 to get seconds):
[t.value // 10 ** 9 for t in tsframe.index]
For example:
In [1]: t = pd.Timestamp('2000-02-11 00:00:00')
In [2]: t
Out[2]: <Timestamp: 2000-02-11 00:00:00>
In [3]: t.value
Out[3]: 950227200000000000L
In [4]: time.mktime(t.timetuple())
Out[4]: 950227200.0
As @root points out, it's faster to extract the array of values directly:
tsframe.index.astype(np.int64) // 10 ** 9
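One caveat about the `time.mktime` comparison in the example: `time.mktime` interprets the time tuple as local time, so it only matches `t.value // 10**9` on a machine whose local timezone is UTC. A minimal sketch of a timezone-independent check, using the standard library's `calendar.timegm` (which treats the tuple as UTC) in place of `time.mktime`; the names `from_value` and `from_timegm` are only illustrative:

```python
import calendar

import pandas as pd

t = pd.Timestamp("2000-02-11 00:00:00")

# Nanoseconds since the epoch, floor-divided down to whole seconds.
from_value = t.value // 10**9

# calendar.timegm treats the time tuple as UTC, unlike time.mktime,
# which applies the machine's local timezone.
from_timegm = calendar.timegm(t.timetuple())

print(from_value, from_timegm)
```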