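Python is flexible and convenient to program in, while R offers a very large collection of modeling methods, so combining the two is worth exploring. This post briefly shows how to call R from Python and walks through working examples step by step.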
Installing rpy2
NOTICE: the environment used in this post is Ubuntu 14 + Python 2.7 + R 3.0.1.
rpy2 makes it possible to call R from within Python, which makes it a very useful tool.
1. Installation
Problem: installing with pip or easy_install fails. Solution: install the distribution package from the command line with sudo apt-get install python-rpy2
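After installing, it can help to confirm that the Python interpreter you are running actually sees rpy2. The following is a minimal sketch of such a check; the function name rpy2_version is hypothetical, and the code assumes only that rpy2 exposes a __version__ attribute.

```python
from __future__ import print_function


def rpy2_version():
    """Return the installed rpy2 version string, or None if rpy2 cannot be imported."""
    try:
        import rpy2
    except ImportError:
        return None
    # __version__ is assumed to be present; fall back to "unknown" otherwise.
    return getattr(rpy2, "__version__", "unknown")


version = rpy2_version()
if version is None:
    print("rpy2 is not installed for this Python interpreter")
else:
    print("rpy2 is available, version: " + str(version))
```

If the check reports that rpy2 is missing even though the apt package is installed, the script is probably being run with a different Python interpreter than the one the package was installed for.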
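2. Calling R scripts
Problem 2.1: when the R script loads an additional package, e.g. library(randomForest), R reports that the package does not exist. Solution: first check whether the package is actually installed in R. If it is, check where it was installed. When a package is installed by a non-administrator user, R cannot write to /usr/R/library or site-library, so the package ends up in a different location and cannot be loaded when R is called from Python. Copying the package into one of those two directories fixes the problem.
Problem 2.2: when the R script defines your own function, the value it returns should be a variable, not an expression.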
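For Problem 2.1, an alternative to copying the package is to tell the embedded R session where the extra library directory is. The sketch below is only an illustration under stated assumptions: the helper name add_lib_path_code and the directory /home/user/R/library are hypothetical, and .libPaths() is R's standard function for reading and setting the library search path. The rpy2 call is guarded so the snippet still prints the generated R code when rpy2 is not available.

```python
from __future__ import print_function
import json


def add_lib_path_code(extra_path):
    """Build R code that puts extra_path at the front of R's library search path."""
    # json.dumps produces a double-quoted, backslash-escaped literal, which R
    # also accepts for plain ASCII paths (an assumption of this sketch).
    return ".libPaths(c({0}, .libPaths()))".format(json.dumps(extra_path))


# Hypothetical location: use the directory that CONTAINS the randomForest
# folder, as reported by "sudo find / -name 'randomForest'".
user_lib = "/home/user/R/library"
r_code = add_lib_path_code(user_lib)
print(r_code)

try:
    import rpy2.robjects as robjects
except Exception:
    # rpy2 (or R itself) is not available here, so only the R code is shown.
    robjects = None

if robjects is not None:
    robjects.r(r_code)                 # extend the search path inside the embedded R
    print(robjects.r(".libPaths()"))   # show the paths R will now search
```

Note that R silently ignores directories that do not exist, so the printed search path is the easiest way to see whether the new entry was accepted.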
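3. Once the two problems above are solved, you can for the most part call R from Python without trouble.
There is still the question of converting data formats between the two languages. Python to R: a Python list is converted to an R vector with StrVector(), IntVector(), FloatVector(), ComplexVector(), FactorVector(), or BoolVector(). R to Python: R data also has to be converted back to Python.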
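The list-to-vector conversion can be sketched as follows. This is a minimal illustration, not a complete converter: the helper names vector_class_name and to_r_vector are hypothetical, only four of the vector classes are covered, and the input list is assumed to be non-empty and homogeneous. The rpy2 import is guarded, so without rpy2 the script only prints which class would be used.

```python
from __future__ import print_function


def vector_class_name(values):
    """Pick the rpy2 vector class name that matches a homogeneous Python list."""
    # bool must be tested before int, because bool is a subclass of int in Python.
    if all(isinstance(v, bool) for v in values):
        return "BoolVector"
    if all(isinstance(v, int) for v in values):
        return "IntVector"
    if all(isinstance(v, (int, float)) for v in values):
        return "FloatVector"
    return "StrVector"


def to_r_vector(values):
    """Convert a Python list to an R vector; returns None if rpy2 is unavailable."""
    try:
        import rpy2.robjects as robjects
    except Exception:
        return None
    name = vector_class_name(values)
    if name == "StrVector":
        values = [str(v) for v in values]
    # Look up IntVector, FloatVector, BoolVector or StrVector on rpy2.robjects.
    return getattr(robjects, name)(values)


for sample in ([1, 2, 3], [1.5, 2.0], [True, False], ["a", "b"]):
    r_vec = to_r_vector(sample)
    # list(r_vec) is the simplest way back from an R vector to a Python list.
    print(vector_class_name(sample), None if r_vec is None else list(r_vec))
```

Going in the other direction, an R vector returned by rpy2 can be indexed and iterated like a Python sequence, which is why list(r_vec) is enough for simple cases.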
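4. Test code: python_R.py, JustTest.R, JustTest1.R
Usage examples
Below are three pieces of test code. They have been tested and work, so you can use them for practice.
The points to watch out for are explained in the code comments.
############## Test code: python_R.py ########################################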
# -*- coding:utf-8 -*-
# objective: try to use R-script in python
# @author: yujianmin
# time: 2015.06.23
# python version: 2.7.6
# help(model)
# reference:
# rpy2--http://blog.sina.com.cn/s/blog_77b74e97010194mw.html
# rpy--http://www.dataguru.cn/thread-334440-1-1.html
import rpy2.robjects as robjects
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
######### Run R code passed as a string ############
## robjects.r('Rcode')
robjects.r('pi')
######### Run an R script #################
## robjects.r.source('path/R-script.r')
robjects.r.source('../R_code/JustTest.R')## HERE deal with using new package like 'randomForest'
print robjects.r('iris.rf')
print robjects.r('confusion')
robjects.r.source('../R_code/JustTest1.R')## HERE deal with using self-defined function with/without arg
robjects.r.Test()
robjects.r.Test1()
a = robjects.r.Test1(4)
print a
print a[0]
######## Converting between Python and R data formats: look this up yourself ##
################# Test code: JustTest.R ############################################
# try to use R-script in python---here is the R-script
# @author: yujianmin
# time: 2015.06.23
# R version: 3.0.2
# help(function_name)---function help document
# help(package="package_name")---package help document
# function_name--function code
library(randomForest)
## NOTICE: if you do not have permission to write to /R/library or R/site-library,
## randomForest is installed to a different path when you install it.
## You can find it with "sudo find / -name 'randomForest'" and copy it to R/library,
## or append that path to R's library search path (I guess).
## use data set iris
data = iris
table(data$Species)
## create a randomForest model to classify the iris species
iris.rf <- randomForest(Species~., data = data, importance=TRUE, proximity=TRUE)
print('--------here is the random model-------')
print(iris.rf)
print('--------here is the names of model-----')
print(names(iris.rf))
confusion = iris.rf$confusion
print(confusion)
############# Test code: JustTest1.R ############################################
# try to use R-script in python---here is the R-script-defined function
# @author: yujianmin
# time: 2015.06.23
# R version: 3.0.2
# help(function_name)---function help document
# help(package="package_name")---package help document
# function_name--function code
Test <- function(){
  print('this is a self-defined function without arg')
  print('-------hello world--------')
}
Test1 <- function(a=3){
  print('this is a self-defined function with arg')
  print('result is arg*2')
  result = a*2
  ## NOTICE: return the variable 'result', not the expression (a*2).
  ## Otherwise it fails with an error saying the arg would not be used.
  return(result)
}
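The same call pattern can be tried without separate script files. The following is a minimal sketch, assuming rpy2 and R are installed: the R function mirrors Test1 but is defined from a string, and the Python helper name call_double is hypothetical. When rpy2 cannot be imported, the helper returns None instead of failing.

```python
from __future__ import print_function

# R source for a function equivalent to Test1 above, kept in a Python string.
R_SOURCE = """
function(a = 3) {
  result <- a * 2
  return(result)
}
"""


def call_double(a):
    """Evaluate R_SOURCE with rpy2 and call it; returns None if rpy2 is unavailable."""
    try:
        import rpy2.robjects as robjects
    except Exception:
        return None
    r_func = robjects.r(R_SOURCE)  # evaluating the string yields a callable R function
    r_result = r_func(a)           # the R result comes back as a vector
    return r_result[0]             # element 0 holds the scalar value


value = call_double(4)
print("rpy2 not available" if value is None else value)
```

As in python_R.py, the value returned by R is a vector even for a single number, which is why the result is read with index 0.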