Converting a python pandas DataFrame to an R dataframe for use with rpy2

Time: 2021-12-12 22:55:11

I am having trouble converting a pandas DataFrame in Python to an R object, for future use in R using rpy2.

The new pandas release 0.8.0 (released a few weeks ago) has a function to convert pandas DataFrames to R data frames. The problem is in converting the first column of my pandas DataFrame, which consists of consecutive Python datetime objects (a time series). The conversion to an R data frame returns a StrVector of the dates and times, rather than a vector of R datetime objects, which I believe are called "POSIXct" objects.

I know that a string of the type returned can be converted to a POSIXct with the command "as.POSIXct('yyyy-mm-dd hh:mm:ss')". Unfortunately, I have not been able to figure out how to convert all the strings in the StrVector to POSIXct using python and rpy2. The dates need to be in the POSIXct format to be used with the TTR library in R. Below is the relevant python code:

import pandas
from pandas import *
import pandas.rpy.common as com
import rpy2.robjects as robjects
r = robjects.r
r.library('TTR')        #library contains the function ADX, to be used later

dataframe = read_csv('file_name', parse_dates = [0], names  = ['Date','Col1','Col2','Col3'])     #parse_dates = [0] parses the 1st column into datetime objects
r_dataframe = com.convert_to_r_dataframe(dataframe)

ADX = r['ADX']          #creating a name for an R function in python
adx = ADX(r_dataframe)    #will not work because the dates in r_dataframe are in a StrVector

Furthermore, I do not believe that the StrVector can be iterated through to convert each element to a POSIXct object individually, due to the definition of a StrVector. Maybe there is a way to cast a StrVector to a generic vector?

Any help/insight into this matter is greatly appreciated. I am a novice programmer and have been working on this for a couple hours now to no avail.

Thank you!

3 Answers

#1


4  

Your ADX call fails because ADX expects an xts or matrix-like object with 3 columns: High, Low, Close. Your object contains 4 columns. Drop the date column before passing r_dataframe to ADX and everything should work. You can then add the datetime column back to the ADX output.
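
A minimal sketch of the "drop the date column" suggestion, assuming the data is held as a plain column-oriented mapping (something like `dataframe.to_dict('list')` would produce one from pandas). The helper names `split_off_dates` and `adx_without_dates` are hypothetical, the column names are the placeholders from the question, and the rpy2 part has not been checked against any particular rpy2 or TTR version:

```python
def split_off_dates(columns, date_col="Date"):
    """Return (dates, remaining_columns) from a column-oriented mapping."""
    rest = {name: values for name, values in columns.items() if name != date_col}
    return list(columns[date_col]), rest


def adx_without_dates(columns, date_col="Date"):
    """Call TTR's ADX on the three price columns only (needs rpy2, R and TTR)."""
    # Imported lazily so split_off_dates can be used without R installed.
    import rpy2.robjects as robjects
    from rpy2.robjects.packages import importr

    ttr = importr("TTR")
    dates, hlc = split_off_dates(columns, date_col)
    # Assumption: the remaining columns are numeric and ordered High, Low, Close.
    r_hlc = robjects.DataFrame(
        {name: robjects.FloatVector(values) for name, values in hlc.items()}
    )
    # The dates are returned alongside so they can be re-attached to the output.
    return dates, ttr.ADX(r_hlc)
```

The dates never cross into R here, so the StrVector versus POSIXct question does not arise on this route.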

Or, if you can set the row.names attribute of your R data.frame to the values of the Date column and then remove the Date column, you can convert your R data.frame to an xts object by calling as.xts(r.data.frame). Then you can pass that to ADX and convert the result back to a pandas DataFrame.
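
The xts route needs a time-based index, which is where POSIXct comes in. R's as.POSIXct is vectorised, so it can take a whole character vector at once. The sketch below instead parses the timestamps in Python and hands R numeric seconds since 1970-01-01. It builds the xts object with xts(..., order.by = ...) rather than the row.names approach described above. The helper names `to_epoch_seconds` and `to_xts` are hypothetical, UTC is an assumption about the data, and the rpy2 part is untested against specific rpy2 or xts versions:

```python
from datetime import datetime, timezone


def to_epoch_seconds(stamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse timestamp strings as UTC and return seconds since 1970-01-01."""
    return [
        datetime.strptime(s, fmt).replace(tzinfo=timezone.utc).timestamp()
        for s in stamps
    ]


def to_xts(stamps, columns):
    """Build an xts object indexed by POSIXct times (needs rpy2, R and xts)."""
    # Imported lazily so to_epoch_seconds can be used without R installed.
    import rpy2.robjects as robjects
    from rpy2.robjects.packages import importr

    xts = importr("xts")
    as_posixct = robjects.r["as.POSIXct"]
    # Numeric seconds plus an origin give a POSIXct vector in a single call.
    index = as_posixct(
        robjects.FloatVector(to_epoch_seconds(stamps)),
        origin="1970-01-01",
        tz="UTC",
    )
    frame = robjects.DataFrame(
        {name: robjects.FloatVector(values) for name, values in columns.items()}
    )
    # Convert to a numeric matrix first; importr exposes order.by as order_by.
    return xts.xts(robjects.r["as.matrix"](frame), order_by=index)
```

The resulting object could then be passed to ADX in place of the plain data frame.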

#2


1  

dalejung on GitHub has recently done quite a bit of work on a tighter pandas-xts interface with rpy2. You might get in touch with him or join the PyData mailing list.

#3


-1  

This isn't the answer you asked for, but how about using the piper library?

It's just a "pipe" between python and R, so conversion problems rarely occur. https://pypi.python.org/pypi/piper
