I want to scrape all the data from a page that uses infinite scroll. The following Python code works:

for i in range(100):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)

This means that every time I scroll to the bottom, I wait 5 seconds, which is generally enough for the page to finish loading the newly generated content. However, this is not time efficient, because the page may finish loading the new content in well under 5 seconds. How can I detect that the page has finished loading the new content each time I scroll down? If I can detect this, I can scroll down again as soon as the page is ready instead of always waiting the full 5 seconds.

1 Answer

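By default, the webdriver's .get() method waits for the initial page load to complete. To wait for content that appears after that, use the WebDriverWait class together with an expected condition to wait until a specific element is present on the page: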

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException

browser = webdriver.Firefox()
browser.get("url")
delay = 3  # maximum time to wait, in seconds

try:
    # Returns as soon as the element is present; raises TimeoutException after `delay` seconds.
    myElem = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
    print("Page is ready!")
except TimeoutException:
    print("Loading took too much time!")
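
For the infinite-scroll case in the question, one possible approach is to wait for the page height to grow after each scroll rather than for a fixed element. The following is only a minimal sketch under some assumptions: the page signals new content by increasing document.body.scrollHeight, and the helper name scroll_until_no_new_content and its parameters (timeout, poll, max_scrolls) are hypothetical, not part of Selenium. It only relies on the driver's execute_script method and uses a plain polling loop instead of WebDriverWait.

```python
import time


def scroll_until_no_new_content(driver, timeout=5.0, poll=0.2, max_scrolls=100):
    """Scroll to the bottom repeatedly and stop once the page height stops growing.

    Returns the number of scrolls that produced new content.
    `driver` is any object with an `execute_script` method (e.g. a Selenium webdriver).
    """
    get_height = "return document.body.scrollHeight;"
    last_height = driver.execute_script(get_height)
    loaded = 0

    for _ in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Poll until the height grows or the timeout expires,
        # so a fast page is not held back by a fixed sleep.
        deadline = time.monotonic() + timeout
        new_height = last_height
        while time.monotonic() < deadline:
            new_height = driver.execute_script(get_height)
            if new_height > last_height:
                break
            time.sleep(poll)

        if new_height <= last_height:
            # Nothing new arrived within the timeout: assume the end was reached.
            break

        last_height = new_height
        loaded += 1

    return loaded
```

With this sketch, each scroll continues as soon as the height changes, and the full timeout is only spent once, on the final scroll that loads nothing. A page that loads content without changing its height would need a different signal, such as the number of matching elements.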
