How do I manage multiple package locations (folders) in R?

Date: 2022-06-02 11:28:56

Before I upgrade to R-2.14, I want to take the opportunity to rationalise the folder structure of my installed packages.

At the moment I use the R default, i.e. all newly installed packages go to R_LIBS_USER. However, I really distinguish between two classes of package:

  • Packages I use repeatedly to do my work, e.g. plyr, data.table, etc.
  • Packages I install just to experiment with (often to replicate a question or answer on *)

Since install.packages offers one the option to specify a lib argument, this is clearly possible.

Is there an easy way to manage package locations, e.g. by creating some sensible settings or a wrapper function in .Rprofile or Rprofile.site?

4 Solutions

#1


21  

Hadley's excellent package devtools provides a function dev_mode.
http://www.inside-r.org/packages/cran/devtools/docs/dev_mode

Here you can find an example usage: https://gist.github.com/1150934

Basically,

library(devtools)
dev_mode(TRUE, path = "anywhere-you-want-to-install")  # switch to a separate development library
install.packages("anything-that-you-want-to-install")  # installs into that library

is a powerful way to do this.

#2


25  

There are numerous options for that. The first thing I did was adapt my Rprofile.site to contain the following line, making my default library path a directory not included in my R installation.

 .libPaths(c("D:/R/Library",.libPaths()))

This makes D:/R/Library my default path without losing the other paths. You can add two paths to that one, say D:/R/Library/Work and D:/R/Library/Test. The one that's put in the first position is the default one used if you don't specify lib in install.packages().
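
The ordering rule described above can be checked with a small sketch. The directory names here are hypothetical stand-ins created under tempdir() (in place of D:/R/Library/Work and D:/R/Library/Test), so the example does not touch a real installation. Note that .libPaths() only keeps directories that already exist, which is why they are created first.

```r
# Hypothetical library locations under a temporary directory
lib_root <- file.path(tempdir(), "Library")
lib_work <- file.path(lib_root, "Work")
lib_test <- file.path(lib_root, "Test")
for (d in c(lib_work, lib_test)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Prepend both trees; the existing paths are kept after them
old_paths <- .libPaths()
.libPaths(c(lib_work, lib_test, old_paths))

# The first entry is where install.packages() installs when lib is not given
default_lib <- .libPaths()[1]
print(default_lib)
```

The change only lasts for the current session; calling .libPaths(old_paths) should put the earlier search path back.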

Then you can assign two variables in your Rprofile.site. These are assigned in the base namespace, so they are always accessible, and because their names start with a dot they are not listed by ls() and therefore survive rm(list = ls()). Something like

 .libwork <- 'D:/R/Library/Work'
 .libtest <- 'D:/R/Library/Test'

which allows you to install packages like:

 install.packages('aPackage',lib=.libwork)
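
Since the question asks about a wrapper function, here is one possible sketch built on the two variables above. The helper names lib_for and install_to are made up for illustration, and the paths again live under tempdir() rather than D:/R/Library.

```r
# Hypothetical locations; in practice these would be set in Rprofile.site
.libwork <- file.path(tempdir(), "Library", "Work")
.libtest <- file.path(tempdir(), "Library", "Test")

# Hypothetical helper: map a short label to a library directory,
# creating the directory if it does not exist yet
lib_for <- function(where = c("work", "test")) {
  where <- match.arg(where)
  lib <- switch(where, work = .libwork, test = .libtest)
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  lib
}

# Hypothetical wrapper around install.packages()
install_to <- function(pkgs, where = "work", ...) {
  utils::install.packages(pkgs, lib = lib_for(where), ...)
}

# Usage (not run here, it needs a CRAN mirror):
# install_to("plyr")                      # goes to .libwork
# install_to("aPackage", where = "test")  # goes to .libtest
```

A package installed this way is only found by library() if its directory is on .libPaths(), or if lib.loc is passed explicitly.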

There are other options too I guess, but this is how I would roll.

#3


4  

You are supposed to be able to specify several library paths/trees via a colon-separated list of paths (semicolon-separated on Windows) in the environment variable R_LIBS. I couldn't get this to work reliably on R 2.13.1-patched - it only ever takes the first entry. I did get R_LIBS and R_LIBS_USER to work reliably on my system - I normally set only the former.
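
When a multi-entry R_LIBS seems to be ignored, it can help to look at what R actually sees. The sketch below splits the variable on the platform's path separator and reports which entries exist on disk; directories that do not exist are dropped from the library search path, which may or may not be what happened in the case above. The variable names are illustrative only.

```r
# Read R_LIBS as seen by the current session ("" if it is not set)
raw <- Sys.getenv("R_LIBS", unset = "")

# Split on the platform separator: ":" on Unix-alikes, ";" on Windows
entries <- if (nzchar(raw)) {
  strsplit(raw, .Platform$path.sep, fixed = TRUE)[[1]]
} else {
  character(0)
}

# One row per entry, flagging whether the directory exists
r_libs_report <- data.frame(
  path = entries,
  exists = dir.exists(path.expand(entries)),
  stringsAsFactors = FALSE
)
print(r_libs_report)
```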

.libPaths() can add new paths to the set of library trees searched. I'd just add the appropriate calls to .libPaths(new) in my .Rprofile to add the relevant trees for each session. Then you can choose where to install packages at install time - i.e. which tree to use.
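
Once several trees are on the search path, it is useful to see what lives where. A minimal sketch using only base and utils functions; the variable name pkgs_by_tree is illustrative.

```r
# List the installed packages found in each library tree on the search path
pkgs_by_tree <- lapply(.libPaths(), function(tree) {
  rownames(utils::installed.packages(lib.loc = tree))
})
names(pkgs_by_tree) <- .libPaths()

# Number of packages per tree
print(vapply(pkgs_by_tree, length, integer(1)))

# The tree a given package would be loaded from
print(dirname(find.package("stats")))
```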

#4


2  

To answer, I have to give a bit of context.

For the purposes of reproducibility, I try to script things, including my entire R setup. I have a script "initializeR.r" that, among other things, installs packages, and I've arranged packages in bundles, such as those relating to caching, those relating to visualization, sampling, spatial stats, etc. - my own little task views, if you will.

For instance, here is a snippet:

Packages <- list()  # assumed to be created earlier in the script

# Profiling & testing
Packages$CodingTools = c("codetools", "debug", "profr", "proftools", "RUnit")

I combine some of the bundles into a "Major" (or primary) package list and others go into the "Secondary" list. I make sure to install everything on the primary list - these are needed to have a reasonable R environment, to use my own scripts, functions, and packages, etc. (Btw, some packages are assigned to multiple bundles, but only a few; I de-dupe before processing an aggregated list.)
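
The bundle aggregation and de-duplication step might look roughly like this. Only the CodingTools bundle comes from the snippet above; the Caching bundle, its contents and the choice of primary bundles are invented for illustration.

```r
# Bundles as a named list; "Caching" is a hypothetical bundle that
# deliberately shares one package with "CodingTools"
Packages <- list(
  CodingTools = c("codetools", "debug", "profr", "proftools", "RUnit"),
  Caching     = c("R.cache", "codetools")
)

# Hypothetical choice of bundles that make up the primary list
primary_bundles <- c("CodingTools", "Caching")

# Flatten the chosen bundles and drop duplicates
primary <- unique(unlist(Packages[primary_bundles], use.names = FALSE))

# Only the packages not yet installed would need installing
missing_pkgs <- setdiff(primary, rownames(utils::installed.packages()))
# install.packages(missing_pkgs)  # not run here, needs a CRAN mirror
```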

I then specify a platform-specific default library, and install to there. However, this idea can be extended to include optional locations for each package bundle (or package): just map from a bundle name, e.g. "CodingTools", to a unique directory (library path), say "D:/R/Library/CodingTools". This can be done in the initialization script, with matching lists & default options, or the locations could be stored elsewhere, such as a hash table, JSON, or a database.
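
A minimal sketch of the bundle-to-directory mapping, with a fallback to the default library. The names lib_root, default_lib, bundle_libs and lib_for_bundle are hypothetical, and tempdir() stands in for "D:/R/Library".

```r
# Stand-in for "D:/R/Library"
lib_root <- file.path(tempdir(), "Library")

# Default library used when a bundle has no location of its own
default_lib <- file.path(lib_root, "Default")

# Optional per-bundle locations, keyed by bundle name
bundle_libs <- c(CodingTools = file.path(lib_root, "CodingTools"))

# Look up the library for a bundle, falling back to the default
lib_for_bundle <- function(bundle) {
  if (bundle %in% names(bundle_libs)) bundle_libs[[bundle]] else default_lib
}

print(lib_for_bundle("CodingTools"))
print(lib_for_bundle("Sampling"))  # no entry, so the default library is used
```

The result can be passed as the lib argument of install.packages() for each bundle in turn.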

As others have said, the default library paths need to be communicated to R. That can be done in Rprofile.site. In my case, I have another script that is used to initialize the R instance the way I'd like it. I try to avoid external parameter files that are read by R (e.g. .Rprofile), and instead do all initializations via function calls in my own package (though the parameters are still external). This tends to make it easier for me to debug and reproduce my work. So, my library paths can be included in the same kind of JSON where my data file locations are specified.

Personally, I want to get away from defining the bundles inside the script and instead use JSON, as I can more easily create different JSON files for different setup configurations. I already do this for most other purposes of reproducible work.
