I have a pandas DataFrame with dtype=numpy.datetime64. In the data I want to change
'2011-11-14T00:00:00.000000000'
to:
'2010-11-14T00:00:00.000000000'
or to some other year. The timedelta is not known, only the year number to assign. This displays the year as an int:
Dates_profit.iloc[50][stock].astype('datetime64[Y]').astype(int)+1970
but I can't assign a value this way. Does anyone know how to assign a year to a numpy.datetime64?
3 solutions
#1
Consider the following approach:
In [115]: df
Out[115]:
Date
0 2000-01-01
1 2001-02-02
2 2002-03-03
3 2003-04-04
4 2004-05-05
In [116]: df.loc[:, 'Date'] = df['Date'].apply(lambda x: x.replace(year=1999))
In [117]: df
Out[117]:
Date
0 1999-01-01
1 1999-02-02
2 1999-03-03
3 1999-04-04
4 1999-05-05
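To reproduce the session above outside IPython, a minimal self-contained sketch might look like this. The DataFrame construction is an assumption reconstructed from the printed output; only the replace call comes from the answer.

```python
import pandas as pd

# Assumed sample data, reconstructed from the Out[115] listing above
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2000-01-01", "2001-02-02", "2002-03-03", "2003-04-04", "2004-05-05",
    ])
})

# Timestamp.replace returns a new Timestamp with the given year,
# so the result has to be assigned back to the column
df["Date"] = df["Date"].apply(lambda ts: ts.replace(year=1999))

print(df)
```

Note that replace raises ValueError if a row holds 29 February and the target year is not a leap year.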
#2
numpy.datetime64 objects are hard to work with. To update a value, it is normally easier to convert the date to a standard Python datetime object, make the change, and then convert it back to a numpy.datetime64 value:
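import numpy as np
from datetime import datetime, timedelta
dt64 = np.datetime64('2011-11-14T00:00:00.000000000')
# convert to a timestamp in seconds since the epoch
# (no 'Z' suffix: NumPy has deprecated parsing timezone markers)
ts = (dt64 - np.datetime64('1970-01-01T00:00:00')) / np.timedelta64(1, 's')
# naive UTC datetime from the timestamp
dt = datetime(1970, 1, 1) + timedelta(seconds=float(ts))
# update year; replace() returns a new object, so keep the result
dt = dt.replace(year=2010)
# convert back to numpy.datetime64:
dt64 = np.datetime64(dt)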
There might be simpler ways, but this works, at least.
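One possibly simpler route, offered as a sketch rather than as part of the answer: cast the value to microsecond precision and call .item(), which returns a datetime.datetime for that unit (for nanosecond precision it returns a plain int instead). The variable name new_dt64 is only illustrative.

```python
import numpy as np

dt64 = np.datetime64('2011-11-14T00:00:00.000000000')

# Nanosecond values convert to an int, so cast to microseconds first;
# .item() then yields a standard datetime.datetime
dt = dt64.astype('datetime64[us]').item()

# datetime objects are immutable: replace() returns a new object
dt = dt.replace(year=2010)

# Back to numpy.datetime64, restoring nanosecond precision
new_dt64 = np.datetime64(dt).astype('datetime64[ns]')
print(new_dt64)
```

This drops any sub-microsecond part of the original value, which does not matter for dates like the one in the question.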
#3
This vectorised solution gives the same result as iterating in pandas with x.replace(year=n), but on large arrays it is at least 10 times faster.
It is important to remember that the year assigned to the datetime64 values should be a leap year. With the Python datetime library, datetime(2012,2,29).replace(year=2011) raises a ValueError, because 2011 has no 29 February. Here, the function 'replace_year' will simply move 2012-02-29 to 2011-03-01.
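The leap-day failure mentioned above can be checked with the standard library alone. This short sketch uses only datetime; the variable name leap_day is illustrative.

```python
from datetime import datetime

leap_day = datetime(2012, 2, 29)

# A leap target year keeps 29 February valid
print(leap_day.replace(year=2008))

# A non-leap target year has no 29 February, so replace() raises ValueError
try:
    leap_day.replace(year=2011)
except ValueError as exc:
    print("ValueError:", exc)
```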
I'm using numpy v 1.13.1.
import numpy as np
import pandas as pd

def replace_year(x, year):
    """ Year must be a leap year for this to work """
    # Add the offset of x from JAN-01 of its own year to year-01-01
    x_year = np.datetime64(str(year)+'-01-01') + (x - x.astype('M8[Y]'))
    # Day 59 of the year is Feb-29 in a leap year and Mar-01 otherwise,
    # so leap_day_offset is 1 for source dates in a non-leap year, else 0
    # (the builtin int replaces np.int, which was removed in NumPy 1.24)
    yr_mn = x.astype('M8[Y]') + np.timedelta64(59,'D')
    leap_day_offset = (yr_mn.astype('M8[M]') - yr_mn.astype('M8[Y]') - 1).astype(int)
    # However, days in non-leap years prior to March-01 need no shift,
    # so correct for the previous step by removing the extra day
    non_leap_yr_beforeMarch1 = (x.astype('M8[D]') - x.astype('M8[Y]')).astype(int) < 59
    non_leap_yr_beforeMarch1 = np.logical_and(non_leap_yr_beforeMarch1, leap_day_offset).astype(int)
    day_offset = np.datetime64('1970') - (leap_day_offset - non_leap_yr_beforeMarch1).astype('M8[D]')
    # Finally, apply the day offset
    x_year = x_year - day_offset
    return x_year

x = np.arange('2012-01-01', '2014-01-01', dtype='datetime64[h]')
x_datetime = pd.to_datetime(x)
x_year = replace_year(x, 1992)
x_datetime = x_datetime.map(lambda ts: ts.replace(year=1992))
print(x)
print(x_year)
print(x_datetime)
print(np.all(x_datetime.values == x_year))
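If the target year cannot be guaranteed to be a leap year, a different vectorised technique is to do the arithmetic in calendar months instead of days. The sketch below is not the function above: replace_year_any is a hypothetical helper name, and it assumes a four-digit target year. It keeps the month and the offset within the month, so 29 February rolls over to 1 March when the target year is not a leap year.

```python
import numpy as np

def replace_year_any(x, year):
    """Hypothetical helper: set the year of a datetime64 array to `year`."""
    x = np.asarray(x)
    year_start = x.astype('M8[Y]')    # start of each source year
    month_start = x.astype('M8[M]')   # start of each source month
    # Whole months elapsed since January of the source year (0..11)
    months_into_year = month_start - year_start.astype('M8[M]')
    # Day and time offset inside the month, in the unit of x
    within_month = x - month_start
    # Same calendar month in the target year, then re-apply the offset
    target_month = np.datetime64(str(year), 'Y').astype('M8[M]') + months_into_year
    return target_month + within_month

x = np.array(['2012-02-29T12', '2013-03-01T00', '2012-12-31T23'],
             dtype='datetime64[h]')
print(replace_year_any(x, 2011))
```

For a leap target year such as 1992 this should agree with x.replace(year=1992) element by element; for a non-leap target it differs only in how 29 February is handled.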