I started using Selenium yesterday to help scrape some data, and I'm having a difficult time wrapping my head around the selector engine. I know lxml, BeautifulSoup, jQuery and Sizzle have similar engines. What I'm trying to do is:
- Wait up to 10 seconds for the page to load completely
- Make sure at least ten span.eN elements are present (two load on the initial page load and more are injected afterwards)
- Then start processing the data with BeautifulSoup
I am struggling with the Selenium conditions for either finding the nth element or locating specific text that only exists in the nth element. I keep getting errors (timeout, NoSuchElement, etc.).
from selenium import webdriver

url = "http://someajaxiandomain.com/that-injects-html-after-pageload.aspx"
wd = webdriver.Chrome()
wd.implicitly_wait(10)
wd.get(url)
# what I've tried
# .find_element_by_xpath("//span[@class='eN'][10]"))
# .until(EC.text_to_be_present_in_element(By.CSS_SELECTOR, "css=span[class='eN']:contains('foo')"))
1 Solution
To do this, you need to understand Selenium's Explicit Waits and the Expected Conditions they wait for.
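As a rough mental model, an explicit wait is a polling loop: it calls a condition with the driver over and over until the condition returns something truthy or the timeout expires. The sketch below is not Selenium's implementation. `FakeDriver`, `poll_until` and `at_least_ten` are hypothetical names made up for illustration, and the fake driver simply pretends that two more elements appear on every poll so the example runs without a browser.

```python
import time


class FakeDriver:
    """Hypothetical stand-in for a WebDriver: reports more elements on each poll."""

    def __init__(self):
        self.calls = 0

    def find_elements(self, by, value):
        # Pretend two more matching elements have been injected since the last poll.
        self.calls += 1
        return ["span.eN"] * (2 * self.calls)


def poll_until(driver, condition, timeout=2.0, poll_frequency=0.01):
    """Simplified model of an explicit wait: retry condition(driver) until truthy."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() > deadline:
            raise TimeoutError("condition was not met in time")
        time.sleep(poll_frequency)


def at_least_ten(driver):
    """Condition: true once ten or more span.eN elements are found."""
    return len(driver.find_elements("css selector", "span.eN")) >= 10


driver = FakeDriver()
poll_until(driver, at_least_ten)
print(driver.calls)  # number of polls it took before the condition held
```

The real WebDriverWait works along the same lines, with a default polling interval of half a second and its own TimeoutException, which is why a condition only has to answer "is it ready yet?" for a single moment in time.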
In your case, you can write a custom Expected Condition that waits until the number of elements found by a locator is at least n:
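from selenium.common.exceptions import StaleElementReferenceException

class wait_for_n_elements_to_be_present(object):
    """Expected Condition: true once at least `count` elements match `locator`."""
    def __init__(self, locator, count):
        self.locator = locator  # (By, value) tuple, e.g. (By.CSS_SELECTOR, 'span.eN')
        self.count = count
    def __call__(self, driver):
        try:
            # driver.find_elements is public API; EC._find_elements is a private
            # helper and is not available in every Selenium version
            elements = driver.find_elements(*self.locator)
            return len(elements) >= self.count
        except StaleElementReferenceException:
            return False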
Usage:
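from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

n = 10  # specify how many elements to wait for
wait = WebDriverWait(driver, 10)  # driver is your WebDriver (wd in the question); waits up to 10 seconds
wait.until(wait_for_n_elements_to_be_present((By.CSS_SELECTOR, 'span.eN'), n))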
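One possible variation, since `wait.until` returns whatever truthy value the condition produced: have the condition return the matched elements instead of `True`, so the wait hands you the list directly. The sketch below writes the condition as a closure instead of a class. `n_elements_present` and `StubDriver` are hypothetical names, and the stub replaces a real browser so the example can run on its own. The string `"css selector"` is the value of `By.CSS_SELECTOR`.

```python
def n_elements_present(locator, count):
    """Hypothetical helper: build a condition that yields the elements themselves."""
    def condition(driver):
        elements = driver.find_elements(*locator)
        # A non-empty list is truthy, so a wait can hand it straight back.
        return elements if len(elements) >= count else False
    return condition


class StubDriver:
    """Stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self, found):
        self.found = found

    def find_elements(self, by, value):
        return list(self.found)


condition = n_elements_present(("css selector", "span.eN"), 3)
print(condition(StubDriver(["a", "b"])))       # too few matches yet
print(condition(StubDriver(["a", "b", "c"])))  # enough matches
```

With a real driver this would be used as `spans = wait.until(n_elements_present((By.CSS_SELECTOR, 'span.eN'), n))`.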
Alternatively, you could probably just use a built-in Expected Condition such as presence_of_element_located or visibility_of_element_located and wait for a single span.eN element to be present or visible. Note that this returns as soon as the first match appears, so it does not guarantee that all ten have loaded. Example:
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'span.eN')))