I would like to read an XML file with encoding="utf-8" into R
(it contains text in Hebrew).
I know about the XML package, but I haven't found any encoding
options in xmlToDataFrame.
I've tried:
library(XML)
data <- xmlToDataFrame("G:/G_RBT/Alexey/DB/kupot.xml")
but I have problems with the Hebrew text; I can't read it. I also tried:
data <- xmlParse("G:/G_RBT/Alexey/DB/kupot.xml",encoding="UTF-8")
and the encoding argument still doesn't help.
1 Answer
Sometimes you need some manual elbow grease:
library(XML)
library(httr)
# found this XML with Hebrew text
tmp <- GET("https://tiktickets.googlecode.com/svn-history/r102/trunk/war/ShowHalls.xml")
# decode the response body as UTF-8 text
doc <- content(tmp, as="text", encoding="UTF-8")
# skip the encoding bits at the beginning (drops the first character)
doc <- substr(doc, 2, nchar(doc))
doc_x <- xmlParse(doc, encoding="UTF-8")
# do the data frame conversion by hand, one column per child element
data.frame(name=xpathSApply(doc_x, "//ShowHall/name", xmlValue, encoding="UTF-8"),
           address=xpathSApply(doc_x, "//ShowHall/address", xmlValue, encoding="UTF-8"),
           phone1=xpathSApply(doc_x, "//ShowHall/phone1", xmlValue, encoding="UTF-8"),
           longitude=xpathSApply(doc_x, "//ShowHall/longitude", xmlValue, encoding="UTF-8"),
           latitude=xpathSApply(doc_x, "//ShowHall/latitude", xmlValue, encoding="UTF-8"))
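For a local file like the one in the question, the same by-hand approach might look roughly like the sketch below. Since kupot.xml isn't available, this writes a small made-up UTF-8 file to a temporary path first; the element names (ShowHall, name, phone1) and the sample values are assumptions for illustration only, and the final Encoding() call is an extra precaution rather than something the code above requires.

```r
library(XML)

# Hypothetical sample data standing in for the asker's file;
# the Hebrew text is written with \u escapes so the script itself stays ASCII.
xml_text <- paste0(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<ShowHalls>',
  '<ShowHall><name>\u05e9\u05dc\u05d5\u05dd</name><phone1>03-1234567</phone1></ShowHall>',
  '<ShowHall><name>\u05ea\u05d5\u05d3\u05d4</name><phone1>04-7654321</phone1></ShowHall>',
  '</ShowHalls>'
)

# Write the bytes as UTF-8 regardless of the session locale.
path <- tempfile(fileext = ".xml")
con <- file(path, open = "wb")
writeBin(charToRaw(enc2utf8(xml_text)), con)
close(con)

# Parse the local file and build the data frame column by column.
doc_x <- xmlParse(path, encoding = "UTF-8")
halls <- data.frame(
  name   = xpathSApply(doc_x, "//ShowHall/name", xmlValue, encoding = "UTF-8"),
  phone1 = xpathSApply(doc_x, "//ShowHall/phone1", xmlValue, encoding = "UTF-8"),
  stringsAsFactors = FALSE
)

# Mark the Hebrew column as UTF-8 so it is treated as such in non-UTF-8 locales.
Encoding(halls$name) <- "UTF-8"
```

To try it on the real file, replace path with the actual location and adjust the XPath expressions to the element names in that file.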