Puppeteer: How to handle multiple tabs?

Time: 2022-12-20 19:24:02

Scenario: Web form for developer app registration with a two-part workflow.

Page 1: Fill out developer app details and click on button to create Application ID, which opens, in a new tab...

Page 2: The App ID page. I need to copy the App ID from this page, then close the tab and go back to Page 1 and fill in the App ID (saved from Page 2), then submit the form.

I understand basic usage - how to open Page 1 and click the button which opens Page 2 - but how do I get a handle on Page 2 when it opens in a new tab?

Example:

const puppeteer = require('puppeteer');

(async() => {
    const browser = await puppeteer.launch({headless: false, executablePath: '/Applications/Google Chrome.app'});
    const page = await browser.newPage();

    // go to the new bot registration page
    await page.goto('https://register.example.com/new', {waitUntil: 'networkidle'});

    // fill in the form info
    const form = await page.$('new-app-form');

    await page.focus('#input-appName');
    await page.type('App name here');

    await page.focus('#input-appDescription');
    await page.type('short description of app here');

    await page.click('.get-appId'); //opens new tab with Page 2

    // handle Page 2
    // get appID from Page 2
    // close Page 2

    // go back to Page 1
    await page.focus('#input-appId');
    await page.type(appIdSavedFromPage2);

    // submit the form
    await form.evaluate(form => form.submit());

    browser.close();
})();

Update 2017-10-25

Still looking for a good usage example.

6 Solutions

#1


4  

This will work for you in the latest alpha branch:

// start listening before the click so the 'targetcreated' event is not missed
const newPagePromise = new Promise(x => browser.once('targetcreated', target => x(target.page())));
await page.click('my-link');
// handle Page 2: you can access the new page's DOM through the newPage object
const newPage = await newPagePromise;
await newPage.waitForSelector('#appid');
// query the new tab (newPage), not the original page
const appidHandle = await newPage.$('#appid');
const appID = await newPage.evaluate(element => element.innerHTML, appidHandle);
await newPage.close();
[...]
// back to page 1 interactions

Be sure to use the latest puppeteer version (from the GitHub master branch) by setting the package.json dependency to:

"dependencies": {
    "puppeteer": "git://github.com/GoogleChrome/puppeteer"
},

Source: JoelEinbinder @ https://github.com/GoogleChrome/puppeteer/issues/386#issuecomment-343059315
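The snippet above can be wrapped into two small helpers so the question's workflow reads top to bottom. This is only a sketch: the names `openInNewTab` and `readAndClose` are made up for illustration, and it assumes the first target created after the click is the new tab (`targetcreated` can also fire for other kinds of targets, so a real script may want to check the target before using it).

```javascript
// Hypothetical helper: resolves with the Page for the next tab the browser opens.
// `trigger` is a caller-supplied async function that performs the click.
async function openInNewTab(browser, trigger) {
  // Subscribe before triggering so the 'targetcreated' event cannot be missed.
  const newTarget = new Promise(resolve => browser.once('targetcreated', resolve));
  await trigger();
  const target = await newTarget;
  return target.page();
}

// Hypothetical helper: reads the trimmed text of `selector` on the new tab,
// closes that tab, and returns the text.
async function readAndClose(newPage, selector) {
  await newPage.waitForSelector(selector);
  const text = await newPage.$eval(selector, el => el.textContent.trim());
  await newPage.close();
  return text;
}
```

In the question's script this might be used as `const newPage = await openInNewTab(browser, () => page.click('.get-appId'));` followed by `const appId = await readAndClose(newPage, '#appid');`, where `#appid` is the selector assumed in the answer above.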

#2


1  

A new patch was committed two days ago, and you can now use browser.pages() to access all Pages in the current browser. Works fine, I tried it myself yesterday :)

Edit:

An example of how to get the JSON value of a new page opened from a 'target: _blank' link.

const page = await browser.newPage();
await page.goto(url, {waitUntil: 'load'});

// click on a 'target:_blank' link
await page.click(someATag);

// get all the currently open pages as an array
let pages = await browser.pages();

// get the last element of the array (the newly opened tab) and do some
// hocus-pocus to get it as JSON...
const newPage = pages[pages.length - 1];
const aHandle = await newPage.evaluateHandle(() => document.body);

const resultHandle = await newPage.evaluateHandle(body => 
  body.innerHTML, aHandle);

// get the JSON value of the page.
let jsonValue = await resultHandle.jsonValue();

// ...do something with JSON
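Picking the new tab by its position in the browser.pages() array depends on how many tabs happen to be open, and the new tab may not be listed yet right after the click. One possible alternative, sketched below, is to snapshot browser.pages() before the click and poll until a page appears that was not in the snapshot. The helper names, the timeout, and the polling interval are illustrative, and the sketch assumes browser.pages() returns the same Page objects on every call.

```javascript
// Hypothetical helper: returns the pages in `after` that are not in `before`.
function findNewPages(before, after) {
  const known = new Set(before);
  return after.filter(p => !known.has(p));
}

// Hypothetical helper: polls browser.pages() until a page that is not in
// `before` shows up, or throws once `timeoutMs` has elapsed.
async function waitForNewPageByDiff(browser, before, timeoutMs = 5000, intervalMs = 100) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const fresh = findNewPages(before, await browser.pages());
    if (fresh.length > 0) return fresh[0];
    // Wait a little before asking the browser again.
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error('No new page appeared within ' + timeoutMs + ' ms');
}
```

Usage would look like `const before = await browser.pages(); await page.click(someATag); const newPage = await waitForNewPageByDiff(browser, before);`.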

#3


0  

You can't currently. Follow https://github.com/GoogleChrome/puppeteer/issues/386 to find out when this ability is added to puppeteer (hopefully soon).

#4


0  

In theory, you could override the window.open function to always open "new tabs" on your current page and navigate via history.

Your workflow would then be:

  1. Override the window.open function:

    await page.evaluateOnNewDocument(() => {
      window.open = (url) => {
        top.location = url
      }
    })
    
  2. Go to your first page and perform some actions:

    await page.goto(PAGE1_URL)
    // ... do stuff on page 1
    
  3. Navigate to your second page by clicking the button and perform some actions there:

    await page.click('#button_that_opens_page_2')
    await page.waitForNavigation()
    // ... do stuff on page 2, extract any info required on page 1
    // e.g. const handle = await page.evaluate(() => { ... })
    
  4. Return to your first page:

    await page.goBack()
    // or: await page.goto(PAGE1_URL)
    // ... do stuff on page 1, injecting info saved from page 2
    

This approach obviously has its drawbacks, but I find it drastically simplifies multi-tab navigation, which is especially useful if you're already running parallel jobs on multiple tabs. Unfortunately, the current API doesn't make this an easy task.

#5


0  

If the new tab is caused by a target="_blank" attribute, you can remove the need to switch pages by setting target="_self".

Example:

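// page.$ returns a promise, so it must be awaited to get the element handle
const element = await page.$(selector);

// make the link open in the current tab instead of a new one
await page.evaluate((el) => {
    el.target = '_self';
}, element);

await element.click();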

#6


0  

If your click action triggers a page load, then any scripts run afterwards are effectively lost. To get around this, trigger the action (a click in this case) but do not await it. Instead, wait for the page load:

page.click('.get-appId');
await page.waitForNavigation();

This will allow your script to effectively wait for the next pageload event before proceeding with further actions.
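A common way to write the same idea is to start waiting for the navigation before the click is issued and then await both together, so a fast navigation is not missed. The helper below is a minimal sketch with an illustrative name, `clickAndWaitForNavigation`. Note that it only helps when the click navigates the current tab: when the click opens a new tab, as in the question, the original page does not navigate and the wait would time out.

```javascript
// Hypothetical helper: clicks `selector` and resolves once the navigation
// triggered by that click has finished on the same page.
async function clickAndWaitForNavigation(page, selector, options = {}) {
  // waitForNavigation is started first so the listener is already in place
  // when the click triggers the navigation.
  const [response] = await Promise.all([
    page.waitForNavigation(options),
    page.click(selector),
  ]);
  return response;
}
```

For example: `await clickAndWaitForNavigation(page, '.get-appId');`.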
