Can I find an element using regex with Python and Selenium?

Date: 2022-03-12 19:24:44

I need to click a dropdown list and then click a hidden element within it. The HTML will be generated by JavaScript, so I won't know the id or class name, but I will know that it contains a certain phrase. Can I find an element by regex and then click it with Selenium?

2 answers

#1


8  

You cannot do a regex-based search with the built-in Selenium WebDriver locators, but there are several alternatives that might help:

  • contains() and starts-with() XPath functions:

    //div[contains(., "Desired text")]
    //div[starts-with(., "Desired text")]
    
  • preceding, preceding-sibling, following and following-sibling axes, which might help if you know the relative position of a newly generated block of elements you need to locate
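
As a rough illustration of the contains() approach, here is a minimal sketch that builds the XPath for a known phrase and clicks the first match once it becomes clickable. The helper names (xpath_literal, contains_xpath, click_by_phrase) and the 10-second timeout are assumptions made for this example, and `driver` is assumed to be an existing WebDriver instance. XPath 1.0 has no escape character inside string literals, so a phrase containing both quote types is assembled with concat().

```python
def xpath_literal(text: str) -> str:
    """Quote text as an XPath 1.0 string literal."""
    if '"' not in text:
        return f'"{text}"'
    if "'" not in text:
        return f"'{text}'"
    # Both quote kinds are present: split on double quotes and rejoin with concat().
    parts = text.split('"')
    return "concat(" + ", '\"', ".join(f'"{p}"' for p in parts) + ")"


def contains_xpath(phrase: str, tag: str = "div") -> str:
    """Build //tag[contains(., "phrase")] for the given phrase."""
    return f"//{tag}[contains(., {xpath_literal(phrase)})]"


def click_by_phrase(driver, phrase: str, tag: str = "div", timeout: float = 10):
    """Wait until an element containing the phrase is clickable, then click it."""
    # Imported here so the pure helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, contains_xpath(phrase, tag))
    element = WebDriverWait(driver, timeout).until(EC.element_to_be_clickable(locator))
    element.click()
    return element
```

Note that contains(., ...) also matches every ancestor of the element that holds the text, so the first hit may be an outer wrapper; passing a more specific tag usually narrows this down.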

There are also CSS selectors for partial match on element attributes:

a[href*=desiredSubstring]  # contains
a[href^=desiredSubstring]  # starts-with
a[href$=desiredSubstring]  # ends-with
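
The unquoted form above only works when the substring is a valid CSS identifier; a phrase with spaces or punctuation has to be quoted. A small sketch of building such selectors follows. The names attr_selector and find_by_attr are hypothetical helpers for this example, and `driver` is assumed to be an existing WebDriver instance.

```python
# Maps a readable match mode to the CSS attribute operator shown above.
_OPERATORS = {"contains": "*=", "starts-with": "^=", "ends-with": "$="}


def attr_selector(tag: str, attr: str, value: str, mode: str = "contains") -> str:
    """Build a CSS selector that partially matches an attribute value."""
    # Escape backslashes and double quotes so the value stays inside the quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[{attr}{_OPERATORS[mode]}"{escaped}"]'


def find_by_attr(driver, tag: str, attr: str, value: str, mode: str = "contains"):
    """Return all elements whose attribute partially matches the value."""
    # Imported here so attr_selector() can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    return driver.find_elements(By.CSS_SELECTOR, attr_selector(tag, attr, value, mode))
```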

And you can always find more elements than you need and filter them afterwards in Python, for example:

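import re

from selenium.webdriver.common.by import By

# `driver` is assumed to be an already initialised WebDriver instance.
# The final dot is escaped so that it matches a literal period.
pattern = re.compile(r"^Some \w+ text\.$")

# find_elements_by_css_selector() was removed in Selenium 4; use find_elements() with By.
elements = driver.find_elements(By.CSS_SELECTOR, "div.some_class")
for element in elements:
    match = pattern.match(element.text)
    if match:
        print(element.text)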
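
To get from filtering to the click the question asks about, the loop can be wrapped in a helper that returns the first matching element. This is only a sketch: first_match and its get_text parameter are names invented for this example, and the SimpleNamespace objects merely stand in for WebElements. One caveat worth knowing is that WebElement.text returns only visible text, so for options that are still hidden it may be an empty string; reading the textContent attribute is a common workaround.

```python
import re
from types import SimpleNamespace


def first_match(elements, pattern, get_text=lambda el: el.text):
    """Return the first element whose text matches the regex, or None."""
    regex = re.compile(pattern)
    for el in elements:
        # get_text may return None (e.g. a missing attribute), so fall back to "".
        if regex.search(get_text(el) or ""):
            return el
    return None


# Stand-in objects with a .text attribute, used here instead of real WebElements.
demo = [SimpleNamespace(text="Other entry"), SimpleNamespace(text="Some hidden text.")]
found = first_match(demo, r"^Some \w+ text\.$")

# With Selenium, assuming `driver` exists and the dropdown has been opened:
#   options = driver.find_elements(By.CSS_SELECTOR, "div.some_class")
#   target = first_match(options, r"Some \w+ text",
#                        get_text=lambda el: el.get_attribute("textContent"))
#   if target is not None:
#       target.click()
```

Selenium refuses to click an element that is not displayed, so the dropdown normally has to be opened first so that the option becomes visible.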

#2


0  

You can import the re module to apply regular expressions to the page source. The snippet below looks through a table and, for each row that has 3 cells, grabs the text between the <b></b> tags in the first cell.

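import re
from lxml import html, etree

# `browser` is assumed to be an already initialised WebDriver instance.
tree = html.fromstring(browser.page_source)
party_table = tree.xpath("//table")
assert len(party_table) == 1

CURRENT_PARTIES = []
for row in party_table[0].xpath("tbody/tr"):
    cells = row.xpath("td")
    if len(cells) != 3:
        continue

    if cells[1].text == "represented by":
        # encoding="unicode" makes tostring() return str rather than bytes,
        # so the str pattern can be applied to it in Python 3.
        cell_html = etree.tostring(cells[0], encoding="unicode")
        match = re.search(r'<b>(.+?)</b>', cell_html, re.IGNORECASE)
        if match:
            CURRENT_PARTIES.append(match.group(1))
            print("MATCH:", match.group(1))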
