How do I change the year value in a numpy datetime64?

Time: 2022-02-27 15:52:16

I have a pandas DataFrame with dtype=numpy.datetime64. In the data I want to change

'2011-11-14T00:00:00.000000000'

to:

'2010-11-14T00:00:00.000000000'

or some other year. The timedelta is not known, only the year number to assign. This displays the year as an int:

Dates_profit.iloc[50][stock].astype('datetime64[Y]').astype(int)+1970

but I can't assign a value this way. Does anyone know how to assign a year to a numpy.datetime64?

3 solutions

#1



Consider the following approach:


In [115]: df
Out[115]:
        Date
0 2000-01-01
1 2001-02-02
2 2002-03-03
3 2003-04-04
4 2004-05-05

In [116]: df.loc[:, 'Date'] = df['Date'].apply(lambda x: x.replace(year=1999))

In [117]: df
Out[117]:
        Date
0 1999-01-01
1 1999-02-02
2 1999-03-03
3 1999-04-04
4 1999-05-05

#2



numpy.datetime64 objects are hard to work with. To update a value, it is usually easier to convert the date to a standard Python datetime object, make the change, and then convert it back to a numpy.datetime64 value:

import numpy as np
from datetime import datetime, timezone

dt64 = np.datetime64('2011-11-14T00:00:00.000000000')

# convert to a POSIX timestamp (seconds since the epoch):
ts = (dt64 - np.datetime64('1970-01-01T00:00:00')) / np.timedelta64(1, 's')

# naive UTC datetime from the timestamp
dt = datetime.fromtimestamp(float(ts), tz=timezone.utc).replace(tzinfo=None)

# update year; replace() returns a new object, so keep the result
dt = dt.replace(year=2010)

# convert back to numpy.datetime64:
dt64 = np.datetime64(dt)

There might be simpler ways, but this works, at least.
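
One possibly simpler route stays entirely in numpy: split the value into "months since January" and "time since the start of the month", then reattach both to the new year. This is a sketch under stated assumptions; the helper name `set_year` is hypothetical and not part of numpy.

```python
import numpy as np

def set_year(x, year):
    """Return x (datetime64 scalar or array) with its year set to `year`."""
    month_start = x.astype("M8[M]")                  # truncate to the month
    month_of_year = month_start - x.astype("M8[Y]")  # whole months since January
    rest = x - month_start                           # day and time within the month
    new_month = np.datetime64(f"{year:04d}", "Y").astype("M8[M]") + month_of_year
    return new_month + rest

dt64 = np.datetime64("2011-11-14T00:00:00.000000000")
print(set_year(dt64, 2010))
```

Because the day of the month is added back as a plain offset, 29 February rolls over to 1 March when the target year is not a leap year.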


#3



This vectorised solution gives the same result as iterating in pandas with x.replace(year=n), but on large arrays it is at least 10x faster.

Keep in mind that the year the datetime64 values are moved to should be a leap year. With the Python datetime library, datetime(2012,2,29).replace(year=2011) raises a ValueError. Here, the function 'replace_year' instead moves 2012-02-29 to 2011-03-01.
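
The standard-library behaviour mentioned above can be checked directly. In this small sketch, `replace_year_clamped` is a hypothetical helper (not from the answer) that shows one way to avoid the error by falling back to 28 February.

```python
from datetime import datetime

leap_day = datetime(2012, 2, 29)

# replace() refuses to build 2011-02-29, which does not exist.
try:
    leap_day.replace(year=2011)
    raised = False
except ValueError:
    raised = True

def replace_year_clamped(dt, year):
    """Hypothetical helper: fall back to 28 February when the date is invalid."""
    try:
        return dt.replace(year=year)
    except ValueError:
        return dt.replace(year=year, day=28)

print(raised)
print(replace_year_clamped(leap_day, 2011))
```

Clamping to 28 February is a different convention from the roll-over to 1 March used by 'replace_year'; which one is appropriate depends on the data.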

I'm using numpy v 1.13.1.


import numpy as np
import pandas as pd

def replace_year(x, year):
    """ Year must be a leap year for this to work """
    # Add number of days x is from JAN-01 to year-01-01 
    x_year = np.datetime64(str(year)+'-01-01') +  (x - x.astype('M8[Y]'))

    # A non-leap source year needs a 1-day offset: leap_day_offset is 1 for those years, else 0
    yr_mn = x.astype('M8[Y]') + np.timedelta64(59,'D')
    leap_day_offset = (yr_mn.astype('M8[M]') - yr_mn.astype('M8[Y]') - 1).astype(int)

    # However, days before March-01 in non-leap years need no shift,
    # so cancel the offset from the previous step for them
    non_leap_yr_beforeMarch1 = (x.astype('M8[D]') - x.astype('M8[Y]')).astype(int) < 59
    non_leap_yr_beforeMarch1 = np.logical_and(non_leap_yr_beforeMarch1, leap_day_offset).astype(int)
    day_offset = np.datetime64('1970') - (leap_day_offset - non_leap_yr_beforeMarch1).astype('M8[D]')

    # Finally, apply the day offset 
    x_year = x_year - day_offset
    return x_year


x = np.arange('2012-01-01', '2014-01-01', dtype='datetime64[h]')
x_datetime = pd.to_datetime(x)

x_year = replace_year(x, 1992)
x_datetime = x_datetime.map(lambda x: x.replace(year=1992))

print(x)
print(x_year)
print(x_datetime)
print(np.all(x_datetime.values == x_year))
