1: Reading an image
readJPEG(jpeg)
readJPEG() belongs to the R package jpeg.
For usage of the readJPEG function, see: http://www.biostatistic.net/thread-48901-1-1.html
Example:
> library(jpeg)
> img<-readJPEG("getdata-jeff.jpg",native=TRUE)
> head(img)
[1] -11494710 -11494710 -11494710 -11494710 -11494710 -11494710
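The large negative integers above come from native=TRUE, which packs each pixel into a single integer. A minimal sketch of the difference between the two return types, assuming the jpeg package is installed; the small random test image and the temporary file are hypothetical stand-ins for getdata-jeff.jpg:

```r
library(jpeg)

# Hypothetical 4 x 6 RGB test image with random values in [0, 1]
set.seed(1)
arr <- array(runif(4 * 6 * 3), dim = c(4, 6, 3))
tmp <- tempfile(fileext = ".jpg")
writeJPEG(arr, target = tmp)

# Default: numeric array (height x width x channels), values in [0, 1]
img_array <- readJPEG(tmp)

# native = TRUE: nativeRaster, one packed integer per pixel
img_native <- readJPEG(tmp, native = TRUE)

dim(img_array)
dim(img_native)
class(img_native)
```

The native form is mainly useful for fast plotting; for numeric work on pixel values the default array form is usually easier to handle.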
2: String replacement:
> text <- "Hello Adam!\nHello Ava!"
> text
[1] "Hello Adam!\nHello Ava!"
> sub(pattern="Adam", replacement="world", text)
[1] "Hello world!\nHello Ava!"
> text
[1] "Hello Adam!\nHello Ava!"
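Note that sub() replaces only the first match and returns a new string; the original variable is left unchanged. A small sketch comparing it with gsub(), which replaces every match (the input string here is a made-up example):

```r
# Hypothetical input with two occurrences of the pattern
greeting <- "Hello Adam!\nHello Adam!"

first_only <- sub("Adam", "world", greeting)    # replaces the first match only
all_matches <- gsub("Adam", "world", greeting)  # replaces every match

first_only
all_matches
greeting  # unchanged, because R strings are not modified in place
```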
Other common string operations: concatenation, searching, splitting, and extraction.
Reference: http://developer.51cto.com/art/201305/393692.htm
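The four operations listed above can be sketched with base R functions. This is only one possible choice of functions, and the sample strings are made up for illustration:

```r
s <- "Hello Adam!"

# Concatenation
joined <- paste("Hello", "Ava", sep = " ")

# Searching: does the string contain the pattern, and where does it start?
found <- grepl("Adam", s, fixed = TRUE)
position <- as.integer(regexpr("Adam", s, fixed = TRUE))

# Splitting on a separator (strsplit returns a list, so take element 1)
parts <- strsplit("a,b,c", ",", fixed = TRUE)[[1]]

# Extraction by character positions
name <- substr(s, 7, 10)

joined
found
position
parts
name
```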
3: Merging data: merge
> library(data.table)
> f<-"getdata-data-GDP.csv"
> dtGDP <- data.table(read.csv(f, skip = 4, nrows = 215))
> View(dtGDP)
> dtGDP <- dtGDP[X != ""]
> dtGDP <- dtGDP[, list(X, X.1, X.3, X.4)]
> View(dtGDP)
> setnames(dtGDP, c("X", "X.1", "X.3", "X.4"), c("CountryCode", "rankingGDP",
+ "Long.Name", "gdp"))
> f<-"getdata-data-EDSTATS_Country.csv"
> dtEd <- data.table(read.csv(f))
> View(dtEd)
> dt <- merge(dtGDP, dtEd, all = TRUE, by = c("CountryCode"))
> View(dt)
> View(dtEd)
> sum(!is.na(unique(dt$rankingGDP)))
[1] 189
> a<-dt[order(rankingGDP, decreasing = TRUE)]
> a[13]
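To see what all = TRUE does in the merge above, here is a minimal sketch on two made-up tables (the country codes and values are hypothetical, not taken from the GDP or EDSTATS files). It assumes the data.table package is installed:

```r
library(data.table)

# Hypothetical stand-ins for dtGDP and dtEd
gdp <- data.table(CountryCode = c("AAA", "BBB", "CCC"),
                  rankingGDP = c(1L, 2L, 3L))
edu <- data.table(CountryCode = c("BBB", "CCC", "DDD"),
                  Income.Group = c("Low income", "High income: OECD", "Low income"))

# all = TRUE keeps unmatched rows from both sides (full outer join)
full <- merge(gdp, edu, by = "CountryCode", all = TRUE)

# The default keeps only codes present in both tables (inner join)
inner <- merge(gdp, edu, by = "CountryCode")

nrow(full)
nrow(inner)

# Rows that came only from edu have NA in rankingGDP
matched_rankings <- sum(!is.na(full$rankingGDP))

# Sort by ranking in descending order, as in the transcript
sorted <- full[order(rankingGDP, decreasing = TRUE)]
sorted
```

With the full outer join, counting non-missing rankings is a quick check of how many rows carry GDP data after the merge.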
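4: Mean of the GDP ranking (column X.1 in the raw file, renamed rankingGDP above) for each distinct value of the Income.Group column
dt[, mean(rankingGDP, na.rm = TRUE), by = Income.Group]
Result: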
           Income.Group        V1
1: High income: nonOECD  91.91304
2:           Low income 133.72973
3:  Lower middle income 107.70370
4:  Upper middle income  92.13333
5:    High income: OECD  32.96667
6:                   NA 131.00000
7:                            NaN
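The last two rows of the result are presumably a group whose Income.Group is NA and a group whose Income.Group is an empty string with no non-missing rankings. A minimal sketch on made-up data shows how such rows arise (the table and the column name meanRank are hypothetical; data.table is assumed to be installed):

```r
library(data.table)

# Hypothetical data: one NA group and one empty-string group with no ranking
toy <- data.table(
  Income.Group = c("Low income", "Low income", "High income: OECD", NA, ""),
  rankingGDP   = c(10, 20, 3, 7, NA)
)

# Naming the result column avoids the default name V1
res <- toy[, .(meanRank = mean(rankingGDP, na.rm = TRUE)), by = Income.Group]
res
```

A group in which every value is NA yields NaN, because na.rm = TRUE leaves nothing to average.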
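5: Split a column into five quantile groups, compute the interval boundaries of those groups, and count how many values of a given column fall into each interval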
breaks <- quantile(dt$rankingGDP, probs = seq(0, 1, 0.2), na.rm = TRUE)
dt$quantileGDP <- cut(dt$rankingGDP, breaks = breaks)
dt[Income.Group == "Lower middle income", .N, by = c("Income.Group", "quantileGDP")]
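One detail worth knowing: by default cut() uses intervals that are open on the left, so the smallest value (here, the rank equal to the first break) gets NA instead of a group. A minimal sketch on made-up data, assuming data.table is installed; the table toy and its columns are hypothetical:

```r
library(data.table)

# Hypothetical data: ranks 1..10, alternating between two groups
toy <- data.table(rank = 1:10, grp = rep(c("A", "B"), 5))

# Six break points give five quantile intervals
breaks <- quantile(toy$rank, probs = seq(0, 1, 0.2), na.rm = TRUE)

# Default: left-open intervals, so the minimum falls outside the first one
toy[, q_default := cut(rank, breaks = breaks)]

# include.lowest = TRUE closes the first interval on the left
toy[, q_closed := cut(rank, breaks = breaks, include.lowest = TRUE)]

n_missing_default <- sum(is.na(toy$q_default))
n_missing_closed <- sum(is.na(toy$q_closed))

# Count the rows of one group within each quantile interval
counts <- toy[grp == "A", .N, by = q_closed]
counts
```

Whether this matters for the query above depends on which row holds the smallest ranking; if that row is not in the group being counted, the result is the same either way.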