This is a beginner's question: how do you save a 2D NumPy array to a file in (compressed) R format using rpy2? To be clear, I want to save it from Python via rpy2 and later read it in using R. I would like to avoid CSV because the amount of data will be large.
4 solutions
#1
6
Looks like you want the save command. I would use the pandas R interface and do something like the following.
import numpy as np
from rpy2.robjects import r
# note: pandas.rpy was removed in pandas 0.20, so this import needs an older pandas
import pandas.rpy.common as com
from pandas import DataFrame
a = np.array([range(5), range(5)])
df = DataFrame(a)
df = com.convert_to_r_dataframe(df)
r.assign("foo", df)
r("save(foo, file='here.gzip', compress=TRUE)")
There may be a more elegant way, though; I'm open to better suggestions. In R, the file saved above would then be loaded like this:
> load("here.gzip")
> foo
  X0 X1 X2 X3 X4
0  0  1  2  3  4
1  0  1  2  3  4
You can bypass pandas and use numpy2ri from rpy2, with something like:
import numpy as np
from rpy2.robjects import r
from rpy2.robjects.numpy2ri import numpy2ri
a = np.array([[i*2147483647**2 for i in range(5)], range(5)], dtype="uint64")
a = np.array(a, dtype="float64") # <- convert to double precision numeric since R doesn't have unsigned ints
ro = numpy2ri(a)
r.assign("bar", ro)
r("save(bar, file='another.gzip', compress=TRUE)")
Then, in R:
> load("another.gzip")
> bar
     [,1]         [,2]         [,3]         [,4]         [,5]
[1,]    0 4.611686e+18 9.223372e+18 1.383506e+19 1.844674e+19
[2,]    0 1.000000e+00 2.000000e+00 3.000000e+00 4.000000e+00
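The float64 conversion above is needed because R has no unsigned integer type, but a double represents integers exactly only up to 2**53, so large uint64 values can be rounded on their way into R. Here is a minimal sketch of a check for that, written in plain Python so it runs without NumPy or R. The helper name survives_float64 is hypothetical and is not part of rpy2 or NumPy.

```python
def survives_float64(value: int) -> bool:
    """Return True if the integer is unchanged by a round trip through a Python float (an IEEE 754 double)."""
    return int(float(value)) == value


big = 2147483647 ** 2  # the multiplier used in the example above

print(survives_float64(2 ** 53))      # exactly representable
print(survives_float64(2 ** 53 + 1))  # first integer that is not
print(survives_float64(big))          # rounded when stored as a double
```

If exact integer values matter on the R side, it may be worth running a check like this on the extremes of the data before converting.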
#2
3
Suppose you have a pandas DataFrame called data. The following code let me store its values as an R matrix, save it, and then load it in R (RStudio).
Save the data from Python:
# Take only the values of the dataframe
B=data.values
import rpy2.robjects as ro
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()
nr,nc = B.shape
Br = ro.r.matrix(B, nrow=nr, ncol=nc)
ro.r.assign("B", Br)
ro.r("save(B, file='here.Rdata')")
Then go to R and write this
load("D:/.../here.Rdata")
This has done the job for me!
#3
2
Here's an example without pandas that adds column and row names:
import warnings
import numpy as np
from rpy2.robjects import rinterface, r, IntVector, FloatVector, StrVector
# older (<2.1) versions of rpy2 use globalEnv instead of globalenv
# let's fix it a little
if not hasattr(rinterface, 'globalenv'):
    warnings.warn('Old version of rpy2 detected')
    rinterface.globalenv = rinterface.globalEnv
var_name = 'r_var'
vals = np.arange(20,dtype='float').reshape(4,5)
# transpose because R is column major vs python is row major
r_vals = FloatVector(vals.T.ravel())
# make it a matrix
rinterface.globalenv[var_name]=r['matrix'](r_vals,nrow=vals.shape[0])
# give it some row and column names
r("rownames(%s) <- c%s"%(var_name,tuple('ABCDEF'[i] for i in range(vals.shape[0]))))
r("colnames(%s) <- c%s"%(var_name,tuple(range(vals.shape[1]))))
#save it to file
r.save(var_name,file='r_from_py.rdata')
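The transpose in the code above matters because R's matrix() fills column by column by default, while a NumPy array flattens row by row. The ordering can be sketched in plain Python, without NumPy or rpy2, to make it explicit. The helper names flatten_column_major and r_matrix_element are hypothetical and exist only for this illustration.

```python
def flatten_column_major(rows):
    """Flatten a list of equal-length rows column by column, the order R's matrix() fills by default."""
    n_cols = len(rows[0])
    return [row[j] for j in range(n_cols) for row in rows]


def r_matrix_element(flat, n_rows, i, j):
    """Element at 0-based (row i, column j) of an n_rows-row matrix filled column-major from flat."""
    return flat[j * n_rows + i]


rows = [[0, 1, 2],
        [3, 4, 5]]
flat = flatten_column_major(rows)
print(flat)  # [0, 3, 1, 4, 2, 5]

# Every element lands where it started, so the R matrix matches the Python rows.
print(all(r_matrix_element(flat, len(rows), i, j) == rows[i][j]
          for i in range(len(rows)) for j in range(len(rows[0]))))
```

Passing the row-major flattening instead would still produce a matrix of the right shape, but with the values scrambled, which is easy to miss on large data.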
#4
2
An alternative to rpy2 is to write a mat-file and load this mat-file from R.
In Python:
import os
import numpy as np
import scipy.io
os.chdir("/home/user/proj")  # specify a path to save to
x = np.linspace(0, 2 * np.pi, 100)
y = np.cos(x)
scipy.io.savemat('test.mat', dict(x=x, y=y))
Example copied from: "Converting" Numpy arrays to Matlab and vice versa
In R:
library(R.matlab)
object_list = readMat("/home/user/proj/test.mat")
I'm a beginner in Python.