I have a CSV file, and I want to extract each column as a string so I can use it with the getSymbols function from the quantmod package.
The csv file looks like this:
AEGR,Aegerion Pharmaceuticals Inc
AKS,AK Steel Holding Corp
ALXA,Alexza Pharmaceuticals Inc
CCL,Carnival Corporation
CECO,Career Education Corp
CDXS,Codexis Inc
And I use this code to read the file:
data<-read.csv(file='CAPM/allquotes.csv',header=F)
symbols=gettext(data[,1])
symbol.names=gettext(data[,2])
getSymbols(symbols)
I get this error:
Error in download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open URL 'http://chart.yahoo.com/table.csv?s=ALXA&a=0&b=01&c=2007&d=5&e=16&f=2012&g=d&q=q&y=0&z=ALXA&x=.csv'
In addition: Warning message:
In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open: HTTP status was '404 Not Found'
When I enter the symbols one by one, it works fine. I've also noticed that when I move to the end of the last line, the margins appear corrupted. In the image you can see that, for the values of 'symbols', the end of the line sits a few spaces further to the right than it should (you can tell from the color of the opening parenthesis).
2 Answers
#1
4
Your CSV has hidden characters in it, namely a left-to-right mark. Since you are using RStudio, you can remove it with gsub, using "\016" as the value for the pattern argument. Alternatively, instead of removing the hidden character that you don't want, you could keep only the characters that you know you DO want. For example, if your symbols will only contain letters and/or numbers, you could use something like gsub("[^A-Za-z0-9]", "", data[, 1])
library(quantmod)  # provides getSymbols
data <- read.csv(text="AEGR,Aegerion Pharmaceuticals Inc
AKS,AK Steel Holding Corp
ALXA,Alexza Pharmaceuticals Inc
CCL,Carnival Corporation
CECO,Career Education Corp
CDXS,Codexis Inc", header=FALSE, stringsAsFactors=FALSE)
# Remove the hidden character directly (this should work in RStudio):
#data[, 1] <- gsub("\016", "", data[, 1])
# Or keep only letters and digits (this should work anywhere):
data[, 1] <- gsub("[^A-Za-z0-9]", "", data[, 1])
symbols <- as.character(data[, 1])
getSymbols(symbols, src='yahoo')
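If you would rather strip only the left-to-right mark and leave every other character alone, one option is to name it by its Unicode code point, U+200E, whose UTF-8 encoding is the byte sequence e2 80 8e. The sketch below is an illustration, not part of the original fix: the sample vector and the variable names are made up, and it assumes the strings are read as UTF-8.

```r
# Hypothetical sample: two symbols carry a trailing left-to-right mark (U+200E)
symbols_raw <- c("AEGR", "AKS", "ALXA\u200e", "CCL\u200e")

# fixed = TRUE treats the pattern as a literal string rather than a regex
symbols_clean <- gsub("\u200e", "", symbols_raw, fixed = TRUE)

# The cleaned value should now have the same bytes as a plain "ALXA"
identical(charToRaw(symbols_clean[3]), charToRaw("ALXA"))
```

Compared with the whitelist pattern, this leaves characters such as "." or "-" in place, which may matter if some of your tickers contain them.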
After you read.csv, you can examine the data object to see that something is amiss.
s <- as.character(data[, 1])
str(s)
#chr [1:6] "AEGR" "AKS" "ALXA""| __truncated__ "CCL""| __truncated__ "CECO""| __truncated__ "CDXS""| __truncated__
str(s[3])
#chr "ALXA""| __truncated__
charToRaw(s[3])
#[1] 41 4c 58 41 e2 80 8e
# Compare what we have to what we think we have
charToRaw("ALXA")
#[1] 41 4c 58 41
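The same inspection can be done for the whole vector at once. This is a minimal sketch under the assumption that valid symbols contain only letters and digits; the sample vector and the name has_hidden are hypothetical.

```r
# Hypothetical sample with a hidden left-to-right mark on two entries
s <- c("AEGR", "AKS", "ALXA\u200e", "CCL\u200e")

# Flag entries containing anything other than letters and digits
has_hidden <- grepl("[^A-Za-z0-9]", s)
which(has_hidden)

# Show the Unicode code points of the flagged entries;
# the left-to-right mark is code point 8206 (hex 200E)
lapply(s[has_hidden], utf8ToInt)
```

Entries that look identical on screen but are flagged here are the ones worth cleaning before passing them to getSymbols.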
#2
0
I'm using the Systematic Investor Toolbox, which uses quantmod. Thanks to GSee, this is the solution that worked:
source('SystematicInvestorToolbox.r')
load.packages('quantmod')
dates='2012::2012'
data<-read.csv(file='CAPM/allquotes.csv',header=F,stringsAsFactors=F)
data[, 1] <- gsub("[^A-Za-z0-9]", "", data[, 1])
symbols=gettext(data[,1])
symbol.names=gettext(data[,2])
ia=aa.test.create.ia.custom(symbols,symbol.names,dates)
plot.ia(ia,(1:1))
It's worth noting that the left-to-right marks only appear in 'symbols', not in the quote names I extract into 'symbol.names'.
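A quick way to check which columns are affected is to test each one for the mark. This is only a sketch with made-up sample data standing in for the real file; the data frame and the name has_lrm are hypothetical.

```r
# Hypothetical stand-in for the CSV: the mark appears only in the first column
data <- data.frame(
  V1 = c("AEGR", "ALXA\u200e", "CCL\u200e"),
  V2 = c("Aegerion Pharmaceuticals Inc",
         "Alexza Pharmaceuticals Inc",
         "Carnival Corporation"),
  stringsAsFactors = FALSE
)

# For each column, report whether any value contains U+200E
has_lrm <- sapply(data, function(col) any(grepl("\u200e", col, fixed = TRUE)))
has_lrm
```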
Thanks for the help.