In R, what would be the best way to separate the following data into a table with 2 columns?
March 09, 2018
0.084752
March 10, 2018
0.084622
March 11, 2018
0.084622
March 12, 2018
0.084437
March 13, 2018
0.084785
March 14, 2018
0.084901
I considered using a for loop but was advised against it. I do not know how to parse things very well, so if the best method involves this process please be as clear as possible.
The final table should look something like this:
https://i.stack.imgur.com/u5hII.png
Thank you!
3 Solutions
#1
1
Input:
input <- c("March 09, 2018",
"0.084752",
"March 10, 2018",
"0.084622",
"March 11, 2018",
"0.084622",
"March 12, 2018",
"0.084437",
"March 13, 2018",
"0.084785",
"March 14, 2018",
"0.084901")
Method:
library(dplyr)
library(lubridate)
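# Fill a 2-column matrix row by row, so each date/value pair becomes one row
df <- matrix(input, ncol = 2, byrow = TRUE) %>%
  # Go through a data frame first so the columns get the names V1 and V2
  as.data.frame(stringsAsFactors = FALSE) %>%
  as_tibble() %>%
  # Parse the month-day-year strings as dates and the values as numbers
  mutate(V1 = mdy(V1), V2 = as.numeric(V2))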
Output:
df
# A tibble: 6 x 2
  V1             V2
  <date>      <dbl>
1 2018-03-09 0.0848
2 2018-03-10 0.0846
3 2018-03-11 0.0846
4 2018-03-12 0.0844
5 2018-03-13 0.0848
6 2018-03-14 0.0849
Use names() or rename() to rename the columns.
names(df) <- c("Date", "Value")
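The rename() route could look like the minimal sketch below. It assumes the default column names V1 and V2 from the method above, and it rebuilds a two-row stand-in for df so that it runs on its own; Date and Value are simply the names used in the names() example.

```r
library(dplyr)

# Two-row stand-in for the df built above (assumes default names V1 and V2)
df <- tibble(V1 = as.Date(c("2018-03-09", "2018-03-10")),
             V2 = c(0.084752, 0.084622))

# rename() takes new_name = old_name pairs and returns a new tibble
df <- df %>% rename(Date = V1, Value = V2)
names(df)
```

Unlike names(df) <- ..., rename() names each old column explicitly, so it does not depend on column order.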
#2
1
data.table::fread can read "...a string (containing at least one \n)....". The 'f' in fread stands for 'fast', so the code below should work on fairly large chunks of data as well.
require(data.table)
x = 'March 09, 2018
0.084752
March 10, 2018
0.084622
March 11, 2018
0.084622
March 12, 2018
0.084437
March 13, 2018
0.084785
March 14, 2018
0.084901'
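o = fread(x, sep = '\n', header = FALSE)  # one column (V1), one row per input line
o[, V1L := shift(V1, type = "lead")]       # put the following line next to each line
o[, keep := (1:.N) %% 2 != 0]              # flag the odd rows, i.e. the date lines
z = o[(keep)]                               # keep only the date/value pairs
z[, keep := NULL]                           # drop the helper column
z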
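Both columns of z are still character at this point. As a possible follow-up, the sketch below converts the value column and renames the columns in place. It rebuilds a two-row stand-in for z so that it runs on its own, and the names date and value are illustrative choices, not part of the answer above.

```r
library(data.table)

# Two-row stand-in for z from the code above: both columns are character
z <- data.table(V1  = c("March 09, 2018", "March 10, 2018"),
                V1L = c("0.084752", "0.084622"))

# Replace the value column with its numeric version, by reference
z[, V1L := as.numeric(V1L)]

# Rename both columns in place; "date" and "value" are illustrative names
setnames(z, old = c("V1", "V1L"), new = c("date", "value"))
z
```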
#3
0
result = data.frame(matrix(input, ncol = 2, byrow = TRUE), stringsAsFactors = FALSE)
result
#               X1       X2
# 1 March 09, 2018 0.084752
# 2 March 10, 2018 0.084622
# 3 March 11, 2018 0.084622
# 4 March 12, 2018 0.084437
# 5 March 13, 2018 0.084785
# 6 March 14, 2018 0.084901
You should next adjust the names and classes, something like this:
names(result) = c("date", "value")
result$value = as.numeric(result$value)
# etc.
Using Nik's nice input:
input = c(
"March 09, 2018",
"0.084752",
"March 10, 2018",
"0.084622",
"March 11, 2018",
"0.084622",
"March 12, 2018",
"0.084437",
"March 13, 2018",
"0.084785",
"March 14, 2018",
"0.084901"
)
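The "# etc." in the names-and-classes step presumably covers the date column as well. One way to do that in base R is sketched below, on a two-row copy of the data so that it runs on its own. It assumes an English locale, because %B matches the full month name in the current locale.

```r
# Two-row version of the data, shaped like result above
input <- c("March 09, 2018", "0.084752", "March 10, 2018", "0.084622")
result <- data.frame(matrix(input, ncol = 2, byrow = TRUE), stringsAsFactors = FALSE)
names(result) <- c("date", "value")
result$value <- as.numeric(result$value)

# %B = full month name (locale-dependent, English assumed), %d = day, %Y = 4-digit year
result$date <- as.Date(result$date, format = "%B %d, %Y")
str(result)
```

If the month names are not in the session's locale, as.Date() returns NA for those rows, so it is worth checking the result with anyNA(result$date).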