Is there an object-relational mapping package in R?

Time: 2021-07-16 16:54:29

(By object-relational mapping, I mean what is described here: Wikipedia: Object-relational mapping.)

Here is how I imagine this could work in R: a kind of "virtual data frame" is linked to a database and returns the results of SQL queries when accessed. For instance, head(virtual_list) would actually return the results of (select * from mapped_table limit 5) on the mapped database.
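
To make the idea concrete, here is a minimal sketch of such a handle. It uses S3 for brevity, and every name in it (mapped_table, run_query, fake_query) is hypothetical rather than taken from an existing package. The query executor is passed in as a function, so the sketch only records the SQL that head() would send instead of talking to a real database.

```r
# Hypothetical constructor: `run_query` is any function that takes an SQL
# string and returns a data frame (e.g. a wrapper around DBI::dbGetQuery).
mapped_table <- function(table, run_query) {
  structure(list(table = table, run_query = run_query),
            class = "mapped_table")
}

# head() on the handle becomes a LIMIT query instead of fetching everything
head.mapped_table <- function(x, n = 6L, ...) {
  sql <- sprintf("SELECT * FROM %s LIMIT %d", x$table, as.integer(n))
  x$run_query(sql)
}

# Stand-in executor that only records the SQL it receives; a real one would
# send the query to the database and return the rows.
sent <- character()
fake_query <- function(sql) {
  sent <<- c(sent, sql)
  data.frame()
}

virtual_list <- mapped_table("mapped_table", fake_query)
invisible(head(virtual_list, 5))
print(sent)
```

Other accessors (dim, colnames, column extraction) would need their own methods, each building a matching query.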

I have found this post by John Myles White, but there seems to have been no progress in the last 3 years.

Is there a working package that implements this?

If not,

  1. Would it be useful?
  2. What would be the best way to implement it (S4?)?

6 answers

#1


10  

The very recent package dplyr is implementing this (amongst other amazing features).

Here are illustrations from the examples of function src_mysql():

library(dplyr)

# Connection basics ---------------------------------------------------------
# To connect to a database first create a src:
my_db <- src_mysql(host = "blah.com", user = "hadley",
  password = "pass")
# Then reference a tbl within that src
my_tbl <- tbl(my_db, "my_table")

# Methods -------------------------------------------------------------------
batting <- tbl(lahman_mysql(), "Batting")
dim(batting)
colnames(batting)
head(batting)
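
The example above needs a running MySQL server. As a self-contained variant, the sketch below points dplyr at an in-memory SQLite database through a DBI connection. It assumes the dplyr, dbplyr, DBI and RSQLite packages are installed; the table name my_frame and its contents are made up for illustration. The filter is translated to SQL and only runs in the database when collect() is called.

```r
library(dplyr)
library(dbplyr)

# In-memory SQLite database with a small made-up table
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "my_frame", data.frame(foo = c(10, 50, 60)))

# A lazy reference to the table: no rows are fetched yet
my_frame <- tbl(con, "my_frame")

# The pipeline is translated to SQL instead of being evaluated in R
big_foo <- my_frame %>% filter(foo > 40) %>% select(foo)
show_query(big_foo)

# collect() runs the query and returns an ordinary local data frame
result <- collect(big_foo)

DBI::dbDisconnect(con)
```

Only a subset of R expressions can be translated this way, so anything outside that subset has to be computed after collect().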

#2


7  

There is an old unsupported package, SQLiteDF, that does that. Build it from source and ignore the numerous error messages.

> # from example(sqlite.data.frame)
>
> library(SQLiteDF)
> iris.sdf <- sqlite.data.frame(iris)
> iris.sdf$Petal.Length[1:10] # $ done via SQL
 [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5

#3


2  

Looks like John Myles White has given up on it.

There is a bit of a workaround explained here.

#4


1  

I don't think it would be useful. R is not a real OOP language. The "central" data structure in R is the data frame, so there is no need for object-relational mapping here. What you want is a mapping between SQL tables and data frames, and the RMySQL and RODBC packages provide just that:

dbGetQuery returns the results of a query as a data frame, and dbWriteTable inserts data into a table or does a bulk update from a data frame. (Those are the DBI functions that RMySQL implements; with RODBC, the equivalents are sqlQuery and sqlSave.)
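
A minimal sketch of that round trip is shown below. It uses an in-memory SQLite database via RSQLite instead of MySQL so that it runs without a server (an assumption made for this example, not part of the answer), and the table name cars is made up.

```r
library(DBI)

# In-memory SQLite database, used here only so the example is self-contained
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Data frame -> table
dbWriteTable(con, "cars", mtcars)

# Query -> data frame
four_cyl <- dbGetQuery(con, "SELECT mpg, cyl FROM cars WHERE cyl = 4")

# Bulk insert of additional rows from a data frame
dbWriteTable(con, "cars", mtcars[1:2, ], append = TRUE)
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM cars")$n

dbDisconnect(con)
```

With RMySQL, only the dbConnect() call should need to change; the query and write calls go through the same DBI generics.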

#5


1  

Besides the various driver packages for querying databases (DBI, RODBC, RJDBC, RMySQL, ...) and dplyr, there's also sqldf: https://cran.r-project.org/web/packages/sqldf/

It automatically imports data frames into a temporary database and lets you query them via SQL. The database is deleted when the query finishes.
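
For illustration, a short sketch using the built-in mtcars data set, assuming sqldf is installed and falls back to its default SQLite backend:

```r
library(sqldf)

# sqldf copies `mtcars` into a temporary SQLite database, runs the query,
# and returns the result as an ordinary data frame.
counts <- sqldf("SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl ORDER BY cyl")
print(counts)
```

Note that this goes in the opposite direction from the question: the data starts in R and is pushed to a throwaway database, rather than living in a database and being mapped into R.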

#6


0  

As an experienced R user, I would not use this. First off, this 'virtual frame' would be slow to use, since you constantly need to synchronize between R memory and the database. It would also require locking the database table, since otherwise you have unpredictable results due to other edits happening at the same time.

Finally, I do not think R is suited for implementing a different evaluation of promise objects. Doing myFrame$foo[ myFrame$foo > 40 ] will still fetch the full foo column, since you cannot possibly implement a full translation scheme from R to SQL.

Therefore, I prefer to load a data frame from a query, use it, and write it back to the database if required.
