Can I find an element using regex with Python and Selenium?

I need to open a dropdown list and click a hidden element within it. The HTML is generated by JavaScript, so I won't know the element's id or class name in advance, but I do know it will contain a certain phrase. Can I find an element by regex and then click it with Selenium?

2 solutions

#1

Selenium WebDriver's built-in locators do not support regex matching, but several alternatives can help:

contains() and starts-with() XPath functions:

```
//div[contains(., "Desired text")]
//div[starts-with(., "Desired text")]
```
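
To make the XPath approach concrete, here is a minimal sketch of how the phrase could be turned into a locator and clicked. The helper names (`xpath_literal`, `contains_xpath`, `click_by_phrase`) are hypothetical, not part of Selenium, and the sketch assumes the Selenium 4 Python bindings. It matches on `text()[contains(., ...)]` rather than plain `contains(., ...)` so that ancestors such as `<body>`, whose text also includes the phrase, are not matched.

```python
def xpath_literal(text):
    """Quote `text` as an XPath 1.0 string literal (XPath has no escape character)."""
    if '"' not in text:
        return f'"{text}"'
    if "'" not in text:
        return f"'{text}'"
    # Both quote types are present: split on double quotes and rejoin with concat().
    pieces = ", '\"', ".join(f'"{part}"' for part in text.split('"'))
    return f"concat({pieces})"


def contains_xpath(phrase, tag="*"):
    """Build an XPath for `tag` elements that have a direct text node containing `phrase`."""
    return f"//{tag}[text()[contains(., {xpath_literal(phrase)})]]"


def click_by_phrase(driver, phrase, tag="*", timeout=10):
    """Wait until an element containing `phrase` is clickable, then click it."""
    # Imported lazily so the two helpers above can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, contains_xpath(phrase, tag))
    WebDriverWait(driver, timeout).until(EC.element_to_be_clickable(locator)).click()


print(contains_xpath("Desired text", tag="div"))
```

For a hidden dropdown entry, the dropdown itself usually has to be clicked first so that the entry becomes visible; after that, something like `click_by_phrase(driver, "Desired text")` should be able to wait for it and click it.
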
The preceding, preceding-sibling, following and following-sibling axes, which can help if you know where the newly generated block sits relative to an element you can already locate.

There are also CSS selectors for partial match on element attributes:

a[href*=desiredSubstring]  # contains
a[href^=desiredSubstring]  # starts-with
a[href$=desiredSubstring]  # ends-with
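
Since the question is about an id or class name that contains a known phrase, the same operators can be applied to those attributes. A small sketch of building such a selector in Python follows; `attr_selector` is a hypothetical helper, and quoting the value is my own addition so that phrases with spaces or quotes still produce a valid selector.

```python
def attr_selector(tag, attribute, value, mode="contains"):
    """Build a CSS attribute selector that partially matches `value`."""
    operators = {"contains": "*=", "starts-with": "^=", "ends-with": "$="}
    # Escape backslashes and double quotes so the value is a valid CSS string.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[{attribute}{operators[mode]}"{escaped}"]'


print(attr_selector("li", "id", "phrase"))
print(attr_selector("a", "href", "desiredSubstring", mode="ends-with"))
```

The resulting string can be passed to `driver.find_element(By.CSS_SELECTOR, ...)` in Selenium 4.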

You can also locate more elements than you need and filter them afterwards in Python, for example:

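import re

from selenium.webdriver.common.by import By

# Matches text such as "Some sample text."; the final dot is escaped so it is literal.
pattern = re.compile(r"^Some \w+ text\.$")

# `driver` is an already-initialized WebDriver instance.
# Selenium 4 removed find_elements_by_css_selector in favour of find_elements(By...).
elements = driver.find_elements(By.CSS_SELECTOR, "div.some_class")
for element in elements:
    if pattern.match(element.text):
        print(element.text)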
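
The filtering idea can be extended to the id and class attributes as well, which is closer to what the question asks. The sketch below is only an illustration: `find_by_regex` and `FakeElement` are hypothetical names, and `FakeElement` stands in for a real `WebElement` so the example runs without a browser. It reads `textContent` instead of `.text` because `.text` returns only visible text and is empty for hidden elements.

```python
import re


def find_by_regex(elements, pattern, attributes=("id", "class")):
    """Return the elements whose text content or listed attributes match `pattern`."""
    regex = re.compile(pattern)
    matches = []
    for element in elements:
        values = [element.get_attribute("textContent")]
        values.extend(element.get_attribute(name) for name in attributes)
        # Missing attributes come back as None, so skip empty values.
        if any(value and regex.search(value) for value in values):
            matches.append(element)
    return matches


class FakeElement:
    """Hypothetical stand-in for a WebElement so the sketch runs without a browser."""

    def __init__(self, **attrs):
        self.attrs = attrs

    def get_attribute(self, name):
        return self.attrs.get(name)


items = [
    FakeElement(id="opt-17-red", textContent="Red"),
    FakeElement(id="opt-18-blue", textContent="Blue"),
    FakeElement(id="menu-header", textContent="Pick a colour"),
]
hits = find_by_regex(items, r"^opt-\d+-blue$")
print([element.get_attribute("id") for element in hits])
```

With a real driver, `items` would come from something like `driver.find_elements(By.CSS_SELECTOR, "li")`, and the dropdown would need to be opened before calling `.click()` on a match.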

#2

You can also run Python's re module against the page source. The snippet below walks through a table and, for each row that has exactly 3 cells and whose second cell reads "represented by", grabs the text between the <b></b> tags in the first cell.

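import re
from lxml import html, etree

# `browser` is an already-initialized Selenium WebDriver instance.
tree = html.fromstring(browser.page_source)
party_table = tree.xpath("//table")
assert len(party_table) == 1  # the page is expected to contain exactly one table

CURRENT_PARTIES = []
for row in party_table[0].xpath("tbody/tr"):
    cells = row.xpath("td")
    if len(cells) != 3:
        continue

    if cells[1].text == "represented by":
        # encoding="unicode" makes tostring() return str rather than bytes,
        # so it can be searched with a str pattern on Python 3.
        cell_html = etree.tostring(cells[0], encoding="unicode")
        match = re.search(r"<b>(.+?)</b>", cell_html, re.IGNORECASE)
        if match:
            print("MATCH:", match.group(1))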