Accessing the Selenium API from R

Date: 2021-07-17 22:23:39

I am interested in using Selenium with R. The available commands are described in the WebDriver (Selenium 2) API documentation. Has any work been done on an R implementation, and how would I go about approaching this? The documentation mentions running a Selenium server and querying the API using JavaScript. Any help would be much appreciated.

3 Answers

#1


1  

Selenium can be driven from R over HTTP using the JsonWireProtocol.

First, start a Selenium server from the command line:

java -jar selenium-server-standalone-2.25.0.jar
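Before opening a session it can help to confirm that the server is reachable. The sketch below assumes the server listens on the default port 4444 and exposes the JsonWireProtocol `status` command under `/wd/hub/`. `statusURL` and `serverIsUp` are hypothetical helper names introduced here, not part of RCurl.

```r
# Build the status URL for a hub address (defaults are assumptions
# matching a server started locally with no extra options).
statusURL <- function(host = "localhost", port = 4444) {
  paste0("http://", host, ":", port, "/wd/hub/status")
}

# Return TRUE if the server answers the status request, FALSE otherwise.
# Any connection error is caught and treated as "not up".
serverIsUp <- function(url = statusURL()) {
  res <- tryCatch(RCurl::getURL(url), error = function(e) NULL)
  !is.null(res) && nzchar(res)
}
```

Calling `serverIsUp()` right after launching the jar gives a quick yes/no answer before any session is requested.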

A new Firefox browser session can then be opened as follows:

library(RCurl)
library(RJSONIO)
library(XML)

baseURL<-"http://localhost:4444/wd/hub/"
server<-list(desiredCapabilities=list(browserName='firefox',javascriptEnabled=TRUE))

getURL(paste0(baseURL,"session"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(server))

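# List the active sessions; the first one is taken below, which assumes
# no other session was already open on this server.
serverDetails<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"sessions"),binary=TRUE)))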
serverId<-serverDetails$value[[1]]$id
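Every step in this answer repeats the same pattern: build a URL under the hub, send JSON with a given HTTP method, and parse the JSON reply. A minimal sketch of that pattern as two helpers follows. `wdURL` and `wdRequest` are hypothetical names introduced for illustration, and the package functions are called with `::` so the definitions load even before the packages are attached.

```r
baseURL <- "http://localhost:4444/wd/hub/"

# Join path pieces onto the hub URL, e.g. wdURL("session", id, "url").
wdURL <- function(..., base = baseURL) {
  paste0(base, paste(c(...), collapse = "/"))
}

# Send one command and return the parsed JSON reply.
# `body` is an R list serialised to JSON; NULL sends no payload.
wdRequest <- function(method, url, body = NULL) {
  args <- list(url,
               customrequest = method,
               httpheader = c("Content-Type" = "application/json;charset=UTF-8"),
               binary = TRUE)
  if (!is.null(body)) args$postfields <- RJSONIO::toJSON(body)
  raw <- do.call(RCurl::getURLContent, args)
  RJSONIO::fromJSON(rawToChar(raw))
}
```

With these, navigating to a page would read `wdRequest("POST", wdURL("session", serverId, "url"), list(url = "http://www.google.com"))`.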

Navigate to Google:

getURL(paste0(baseURL,"session/",serverId,"/url"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(list(url="http://www.google.com")))

Get the ID of the search box:

elementDetails<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/element"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(list(using="xpath",value="//*[@id=\"gbqfq\"]")),binary=TRUE))
       )

elementId<-elementDetails$value
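In the JsonWireProtocol, a successful "find element" reply carries the element reference under the key `ELEMENT` inside `value`. A failed lookup returns a non-zero `status` and no such key. The sketch below makes that check explicit. `elementIdFrom` is a hypothetical helper, and the reply is hand-written with illustrative values rather than captured from a real server.

```r
# Extract the element ID from a parsed "find element" reply.
# Works whether the JSON parser returned `value` as a list or a named vector.
elementIdFrom <- function(reply) {
  value <- reply$value
  if (is.null(value) || !"ELEMENT" %in% names(value)) {
    stop("no element reference in reply; the locator may not have matched")
  }
  as.character(value[["ELEMENT"]])
}

# Hand-written reply in the shape the protocol describes (illustrative values).
reply <- list(sessionId = "abc", status = 0, value = c(ELEMENT = "0"))
elementId <- elementIdFrom(reply)
```

Failing early here gives a clearer error than a malformed URL in the next request.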

Type a search query into the box:

rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/element/",elementId,"/value"),
       customrequest="POST",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       postfields=toJSON(list(value=list("\uE009","a","\uE009",'\b','Selenium api in R')))
       ,binary=TRUE))
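The key list in that request is terse, so here it is spelled out. As I read the JsonWireProtocol, `\uE009` is the Control key and modifier keys stay held until they are sent again, so the sequence is Ctrl down, `a`, Ctrl up (select all), a backspace to delete the selection, and then the text. `clearAndType` and `KEY_CONTROL` are hypothetical names used only for this sketch.

```r
# WebDriver key code for the Control modifier.
KEY_CONTROL <- "\uE009"

# Build the key list: Ctrl down, "a" (select all), Ctrl up,
# backspace (delete the selection), then the text to type.
clearAndType <- function(text) {
  list(KEY_CONTROL, "a", KEY_CONTROL, "\b", text)
}

# Payload in the shape the "/value" command expects.
payload <- list(value = clearAndType("Selenium api in R"))
```

The `payload` list can be passed to `toJSON` and sent as `postfields` exactly as in the request above.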

Retrieve the HTML source of the results page:

googData<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/source"),
       customrequest="GET",
       httpheader=c('Content-Type'='application/json;charset=UTF-8'),
       binary=TRUE
       ))
       )

Extract the suggested links:

gxml<-htmlParse(googData$value)
urls<-unname(xpathSApply(gxml,"//*[@class='l']/@href"))
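The XPath step can be tried offline on a small HTML string, which makes it easier to adjust the selector when a site's markup changes. The markup below is a hypothetical stand-in, not real Google output, and `extractLinks` is a helper name introduced here.

```r
# Small stand-in for a results page (hypothetical markup).
html <- '<html><body>
  <a class="l" href="http://example.com/one">One</a>
  <a class="x" href="http://example.com/skip">Skip</a>
  <a class="l" href="http://example.com/two">Two</a>
</body></html>'

# Parse the HTML text and return the href of every element with the given class.
extractLinks <- function(html, cls = "l") {
  doc <- XML::htmlParse(html, asText = TRUE)
  xpath <- sprintf("//*[@class='%s']/@href", cls)
  unname(XML::xpathSApply(doc, xpath))
}

urls <- extractLinks(html)
```

Only the two anchors with class `l` are selected; the `x` anchor is skipped.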

Close the session:

getURL(paste0(baseURL,"session/",serverId),
       customrequest="DELETE",
       httpheader=c('Content-Type'='application/json;charset=UTF-8')
       )
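If a step fails midway, the browser session stays open on the server. One way to guard against that, sketched here with base R's `on.exit`, is to wrap the work in a function that always closes the session. `withSession` is a hypothetical helper, and the demonstration uses stand-in functions instead of real HTTP calls.

```r
# Run `action(sessionId)` and always call `closeSession(sessionId)` afterwards,
# even if `action` raises an error.
withSession <- function(sessionId, action, closeSession) {
  on.exit(closeSession(sessionId), add = TRUE)
  action(sessionId)
}

# Offline demonstration with stand-ins that only record what happened.
events <- character(0)
result <- tryCatch(
  withSession("abc",
              action = function(id) stop("scrape failed"),
              closeSession = function(id) events <<- c(events, paste("closed", id))),
  error = function(e) conditionMessage(e))
```

In real use, `closeSession` would issue the DELETE request shown above.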

#2


0  

I would like to scrape the soccer match tables for every round but don't know how to do it. I would appreciate it if you could shed some light on this. I am using R to connect to Selenium-Server-Standalone:

elementDetails<-fromJSON(rawToChar(getURLContent(paste0(baseURL,"session/",serverId,"/element"),
   customrequest="POST",
   httpheader=c('Content-Type'='application/json;charset=UTF-8'),
   postfields=toJSON(list(using="xpath",value="//*[@id=\"Match_Table\"]")),binary=TRUE))
   )

elementId<-elementDetails$value

#3


0  

The relenium package (Selenium for R) was recently developed; it imports Selenium through the rJava package. It is mainly intended for web scraping. Disclaimer: I'm one of the developers.
