How do I get a list of the built-in datasets in R?

Time: 2020-12-26 18:09:30

Can someone please explain how to get a list of the built-in datasets and the packages they come from?

2 answers

#1


24  

There are several ways to find the included datasets in R:

1: data() lists the datasets of all attached packages (those on the search path), not only the ones from the datasets package; the results are grouped by package.

2: data(package = .packages(all.available = TRUE)) lists the datasets of every package installed on your computer, including packages that are not loaded.

3: data(package = "packagename") lists the datasets of that specific package, so data(package = "plyr") gives the datasets in the plyr package.
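As a minimal sketch of what these calls return: besides printing a listing, data() invisibly returns an object whose results element is a character matrix, so the listing can be inspected like any other matrix. The variable names ds and attached are only illustrative.

```r
# Datasets of a single package; `results` is a character matrix
ds <- data(package = "datasets")$results
colnames(ds)                      # includes "Package", "Item" and "Title"
head(ds[, c("Item", "Title")])

# Datasets of all attached packages, counted per package
attached <- data()$results
table(attached[, "Package"])
```

Working with the results matrix avoids reading the listing by eye when many packages are attached.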


If you want to know in which package a dataset is located (e.g. the acme dataset), you can do:

dat <- as.data.frame(data(package = .packages(all.available = TRUE))$results)
dat[dat$Item=="acme", c(1,3,4)]

which gives:

    Package Item                  Title
107    boot acme Monthly Excess Returns
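One caveat with an exact match on Item: some entries have the form "beaver1 (beavers)", meaning the object name followed by the file it is stored in, and an exact comparison will miss them. The sketch below wraps the lookup in a helper that compares only the object name. The function name find_dataset and its packages argument are hypothetical, introduced here for illustration.

```r
# Hypothetical helper: find the package(s) that provide a dataset
find_dataset <- function(name, packages = .packages(all.available = TRUE)) {
  res <- as.data.frame(data(package = packages)$results,
                       stringsAsFactors = FALSE)
  # Entries such as "beaver1 (beavers)" hold the object name plus the
  # file name; keep only the object name for the comparison
  object <- sub(" .*$", "", res$Item)
  res[object == name, c("Package", "Item", "Title")]
}

# Restricting the search to one package keeps the call fast
hit <- find_dataset("beaver1", packages = "datasets")
hit
```

Called without the packages argument, the helper searches every installed package, as in the answer above.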

#2


1  

I often also need to know the structure of the available datasets, so I created dataStr in my misc package.

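dataStr <- function(package="datasets", ...)
  {
  # names of all datasets listed for the package
  d <- data(package=package, envir=new.env(), ...)$results[,"Item"]
  # for entries like "beaver1 (beavers)", keep only the object name
  d <- sapply(strsplit(d, split=" ", fixed=TRUE), "[", 1)
  d <- d[order(tolower(d))]
  # note: get() only finds the objects if the package is attached
  for(x in d){
    message(x, ":  ", paste(class(get(x)), collapse=", "))
    str(get(x)) # str() prints directly and returns NULL
    message()   # blank line between datasets
    }
  }
dataStr()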

Note that the console output is quite long.

The output looks like this:

[...]

warpbreaks:  data.frame
'data.frame':   54 obs. of  3 variables:
 $ breaks : num  26 30 54 25 70 52 51 26 67 18 ...
 $ wool   : Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
 $ tension: Factor w/ 3 levels "L","M","H": 1 1 1 1 1 1 1 1 1 2 ...

WorldPhones:  matrix
 num [1:7, 1:7] 45939 60423 64721 68484 71799 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:7] "1951" "1956" "1957" "1958" ...
  ..$ : chr [1:7] "N.Amer" "Europe" "Asia" "S.Amer" ...

WWWusage:  ts
 Time-Series [1:100] from 1 to 100: 88 84 85 85 84 85 83 85 88 89 ...
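If the full str() output is more than you need, a compact alternative is to collect only the class of each dataset in a data frame. This is a sketch, not part of dataStr: the name dataset_overview is hypothetical, and it assumes the package stores its data lazy-loaded (as the datasets package does), so that getExportedValue() can fetch each object without the package being attached.

```r
# Hypothetical helper: one row per dataset with its class
dataset_overview <- function(package = "datasets") {
  items <- data(package = package)$results[, "Item"]
  # Drop the "(file)" suffix from entries like "beaver1 (beavers)"
  objects <- sort(unique(sub(" .*$", "", items)))
  data.frame(
    dataset = objects,
    class = vapply(objects, function(x) {
      # Assumes lazy-loaded data; this is what package::name does
      paste(class(getExportedValue(package, x)), collapse = ", ")
    }, character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

overview <- dataset_overview()
head(overview)
```

Because the result is a data frame, it can be filtered, for example with overview[overview$class == "ts", ] to list only the time series.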

Edit: To get more informative output, and to use it for packages that are not loaded or for all packages on the search path, use the revised online version:

source("https://raw.githubusercontent.com/brry/berryFunctions/master/R/dataStr.R")
