I am trying to use the following code:
x <- scan("myfile.txt", what="", sep="\n")
b <- grep('/^one/(.*?)/^four/', x, ignore.case = TRUE, perl = TRUE, value = TRUE,
fixed = FALSE, useBytes = FALSE, invert = FALSE)
to extract a portion of the text from myfile.txt, which contains:
zero
one
two
three
four
five
The output I'm expecting is:
one
two
three
four
I want to include the "one" and the "four"; I don't want to drop them :)
But somehow the regex is not working. The console gives no error, but no text either.
I am using print(b)
2 Solutions
#1
2
I'm not quite clear on what you're looking for, but just for fun...
R> x
[1] "zero" "one" "two" "three" "four" "five"
R> grep("one|four", x) # get the position of "one" and "four"
[1] 2 5
Subset x to include only the elements from "one" through "four":
R> x[do.call(seq, as.list(grep("one|four", x)))]
[1] "one" "two" "three" "four"
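A likely reason the original grep() call printed nothing: grep() tests each element of x on its own, so a pattern meant to span several lines can never match, and the slashes around the pattern are treated as literal characters in R. Here is a minimal sketch of the position-based approach with anchored patterns, so that an element such as "someone" would not be picked up. The vector x is recreated inline instead of being read from myfile.txt, and the names start, end and res are only illustrative.

```r
# Stand-in for scan("myfile.txt", what = "", sep = "\n")
x <- c("zero", "one", "two", "three", "four", "five")

# The pattern from the question is tried against every element separately,
# so no single element can match it
b <- grep('/^one/(.*?)/^four/', x, ignore.case = TRUE, perl = TRUE, value = TRUE)
length(b)

# Locate the first exact "one" and the first exact "four"
start <- grep("^one$", x, ignore.case = TRUE)[1]
end <- grep("^four$", x, ignore.case = TRUE)[1]

# Keep everything between the two markers, markers included
res <- x[start:end]
print(res)
```

This assumes both markers are present; if either is missing, start or end is NA and the subsetting fails.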
#2
1
gsub('one(.*)four','\\1',paste(x,collapse=''))
[1] "zerotwothreefive"
Or, to keep spaces between the words (here dat holds the same character vector as x):
gsub('one(.*)four','\\1',paste(dat,collapse=' '))
[1] "zero  two three  five"
Edit after Gsee's comment:
gsub('.*(one.*four).*','\\1',paste(dat,collapse=' '))
[1] "one two three four"
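If the result is needed as separate elements again rather than as one string, something along these lines might work. It is only a sketch: it uses regmatches() with regexpr() in place of gsub(), and it assumes that no element of dat contains a space, since the space is used to split the match back apart. The names s, m and words are illustrative.

```r
dat <- c("zero", "one", "two", "three", "four", "five")

# Collapse into a single string so that one regex can span several elements
s <- paste(dat, collapse = " ")

# Extract the span from the first "one" to the last "four" (greedy match)
m <- regmatches(s, regexpr("one.*four", s))

# Split the matched span back into individual elements
words <- strsplit(m, " ", fixed = TRUE)[[1]]
print(words)
```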
But I think there is no need for a regular expression here:
dat[seq(which(dat == 'one'),which(dat == 'four'))]
[1] "one" "two" "three" "four"
Of course, you can use min on the indices returned by which if they do not come out in the right order.
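For what it's worth, the which() idea can be wrapped into a small function that also copes with markers in the "wrong" order or missing altogether. This is just a sketch: extract_between is a made-up name, not a base R function, and taking min() and max() over all matching positions is one possible reading of the remark about ordering.

```r
# Hypothetical helper: return the elements spanning the two markers, inclusive
extract_between <- function(v, from, to) {
  i <- which(v == from)
  j <- which(v == to)
  # Nothing to extract if either marker is absent
  if (length(i) == 0 || length(j) == 0) return(character(0))
  idx <- c(i, j)
  # min/max make the result independent of which marker comes first
  v[min(idx):max(idx)]
}

dat <- c("zero", "one", "two", "three", "four", "five")
extract_between(dat, "one", "four")
extract_between(dat, "four", "one")
extract_between(dat, "one", "nine")
```

The first two calls give the same four elements, and the last one gives an empty character vector because "nine" does not occur in dat.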