Can't figure out why I can't retrieve a simple string with XPath

Date: 2022-09-11 20:55:40

I can't figure out why this very simple snippet fails to retrieve a simple string with XPath:

var page = new WebPage();
page.open('http://free.fr', function (status) {
    if (status !== 'success') {
        console.log('Unable to access network');
    } else {
        function getElementByXpath(path) {
          return document.evaluate(path, document, null, XPathResult.STRING_TYPE, null).stringValue;
        }

        console.log( getElementByXpath("//title/text()") );
    }
    phantom.exit();
});

It always returns nothing.

What am I missing to print the title value?

1 solution

#1


PhantomJS has two contexts. Only the DOM context (page context) has access to the DOM, but it is sandboxed. You get access to the DOM context through page.evaluate. But remember that:

Note: The arguments and the return value to the evaluate function must be a simple primitive object. The rule of thumb: if it can be serialized via JSON, then it is fine.

Closures, functions, DOM nodes, etc. will not work!

This means that you cannot pass any DOM node that you find to the outer context. There is a document object outside of the DOM context, but it doesn't do anything. It's only a relic of the way PhantomJS is built on top of QtWebKit.
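
To get a feel for the "JSON rule of thumb", here is a small sketch that can be run in plain JavaScript, outside PhantomJS. The helper crossBoundary is hypothetical and only approximates the boundary with a JSON round trip; it is not how PhantomJS implements the transfer, just a way to see which kinds of values survive.

```javascript
// Hypothetical helper (not a PhantomJS API): approximates what survives
// the trip between the two contexts by doing a JSON round trip.
function crossBoundary(value) {
    var json = JSON.stringify(value);
    // JSON.stringify returns undefined for functions and for undefined itself
    return json === undefined ? undefined : JSON.parse(json);
}

// Plain strings and plain objects come through intact (objects as copies).
var asString = crossBoundary('Page title');
var asObject = crossBoundary({ title: 'Page title', links: 3 });

// A bare function is lost entirely.
var asFunction = crossBoundary(function () {});

// A function stored on an object is silently dropped; the data stays.
var withMethod = crossBoundary({ title: 'x', getTitle: function () {} });

console.log(asString);
console.log(JSON.stringify(asObject));
console.log(typeof asFunction);
console.log(JSON.stringify(withMethod));
```

So the practical approach is to extract the string (or a plain object of strings and numbers) inside the page context and hand only that back.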

Here's an example fix:

var page = new WebPage();
page.onConsoleMessage = function(msg){
    console.log("remote: " + msg);
};
page.open('http://google.fr', function (status) {
    if (status !== 'success') {
        console.log('Unable to access network');
    } else {
        page.evaluate(function(){
            function getElementByXpath(path) {
              return document.evaluate(path, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
            }

            console.log( getElementByXpath("//head/title/text()").textContent );
        });
    }
    phantom.exit();
});
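
As a variation, since a plain string can cross the boundary, the title can also be returned from page.evaluate and printed in the outer context, without going through onConsoleMessage. The following is a minimal sketch along those lines, reusing the XPath expression and STRING_TYPE from the question. The function name readTitle is made up for this example, and the typeof phantom guard is only there so the file can also be loaded outside PhantomJS without failing.

```javascript
// Meant to run inside the page context, where `document` is the real DOM.
// It returns a plain string, which can be passed back to the outer context.
function readTitle() {
    return document.evaluate(
        '//title/text()', document, null,
        XPathResult.STRING_TYPE, null
    ).stringValue;
}

// Outer (PhantomJS) context: only drive the page when running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://google.fr', function (status) {
        if (status !== 'success') {
            console.log('Unable to access network');
        } else {
            // readTitle is executed in the page; its return value comes back here
            var title = page.evaluate(readTitle);
            console.log('title: ' + title);
        }
        phantom.exit();
    });
}
```

With STRING_TYPE the XPath result is already a string, so there is no DOM node that would need to cross the boundary.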
