
I run into a stale element reference every time I run this double for loop: the page refreshes after I click the back element. I tried calling the driver again inside the for loop to pull the element from the DOM afresh, but this only finds the first element and does not iterate through the loop.

Code here:

import time
from selenium import webdriver

driver = webdriver.Chrome()
time.sleep(3)
driver.get("https://www.elizabethnj.org/Directory.aspx")
driver.maximize_window()
while True:
    driver.get("https://www.elizabethnj.org/Directory.aspx")
    driver.maximize_window()
    divs = driver.find_elements('xpath', '//div[@class="topmenu"]/div')
    for div in divs:
        # div = driver.find_element('xpath', '//div[@class="topmenu"]/div')
        time.sleep(2)
        div.click()
        time.sleep(2)
        contacts = driver.find_elements('xpath', '//*[@id="cityDirectoryDepartmentDetails"]/tbody/tr')
        for contact in contacts:
            print(contact.text)
        back = driver.find_element('xpath', '//*[@id="CityDirectoryLeftMargin"]/div[3]/span')
        back.click()
  • There is no actual error message and no indication of which line fails; please improve your question. By the looks of it, try increasing the sleeps, or use a wait.until function to make sure back has finished loading. Commented Apr 30, 2024 at 10:09
  • Don't click() in the loop: find_elements gives references to objects in memory, and when you click(), the browser removes these objects to load the new page, so the old references can no longer find the objects (even if you load the same page again). First get all the URLs as strings, then use a for-loop that loads pages with .get(url) instead of click(). Commented Apr 30, 2024 at 16:44
  • If you click() elements on a page, the browser may move objects to a different place in memory, and the references may stop working. You may have to count all the divs (len(divs)) and then run a for-loop like for index in range(len(divs)): that always reruns divs = driver.find_elements(...) to get fresh references, using divs[index] to access the next element. Commented Apr 30, 2024 at 16:47
  • There are other problems: (1) some divs are empty and need to be skipped (when their .text is empty); (2) some pages don't have the contacts table, so the link to the previous page is in div[2] instead of div[3]. You could get all the divs and use [-1] in Python, but you can also use driver.back() (the browser's back button) or simply run get(url) to load the main page again. Commented Apr 30, 2024 at 17:42

1 Answer


Selenium gives you references to elements in the browser's memory. When you click() to load a new page, the browser removes those elements from memory, and later Selenium can no longer find them, even if you go back to the page, because the new elements may sit in different places in memory.

A similar problem can sometimes occur when click() changes something on the page (e.g. folds or unfolds a list), because that can also move elements in memory.
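The effect can be modelled without a browser. The toy classes below (pure Python, illustrative only, not real Selenium) mimic the behaviour: references collected before a page load stop working after it, and re-finding by index on every iteration avoids the problem:

```python
# Toy model of a stale element reference (illustrative only, not Selenium).
class FakePage:
    """Holds elements; every navigation invalidates all existing references."""
    def __init__(self, labels):
        self.generation = 0
        self.labels = labels

    def find_elements(self):
        # Each call hands out references tied to the current page "generation".
        gen = self.generation
        return [FakeElement(self, gen, label) for label in self.labels]

    def navigate(self):
        # Loading a new page (even the same URL) creates new element objects,
        # so references from the previous generation no longer match anything.
        self.generation += 1


class FakeElement:
    def __init__(self, page, generation, label):
        self.page = page
        self.generation = generation
        self.label = label

    def click(self):
        if self.generation != self.page.generation:
            raise RuntimeError("stale element reference")
        self.page.navigate()  # clicking loads a new page


page = FakePage(["a", "b"])
elements = page.find_elements()
elements[0].click()            # works: the reference is current
try:
    elements[1].click()        # fails: the first click reloaded the page
except RuntimeError as exc:
    print(exc)                 # stale element reference

# The fix from this answer: re-find and index on every iteration.
page = FakePage(["a", "b"])
for index in range(len(page.find_elements())):
    element = page.find_elements()[index]  # fresh reference each time
    element.click()                        # never stale
```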

There are two methods:

  • harder: count all the divs at the start, and then in every loop iteration get all the divs again and use the index to pick the next element: div = divs[index].
divs = driver.find_elements('xpath', '//div[@class="topmenu"]/div')
print('len(divs):', len(divs))

for index in range(len(divs)):

    divs = driver.find_elements('xpath', '//div[@class="topmenu"]/div')
    div = divs[index]

    # ... code ...

    # go back to main page
    driver.get("https://www.elizabethnj.org/Directory.aspx")
  • simpler: in your code click() loads new pages, so you can first collect all href values as strings and later use .get(url) to load the contact pages. This doesn't need to go back to the main page, so it may also run faster.

BTW: there are two other problems.

  • Some div elements are empty, without text and without a link to a page, so you have to skip them (e.g. check for empty text).
  • Some pages don't have the contacts table, and on them the link to the main page is [2] instead of [3]. It is simpler to use driver.back() to return to the previous page, or you could simply run driver.get(url) again to load the main page.

Working code for the harder version:

import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.maximize_window()

driver.get("https://www.elizabethnj.org/Directory.aspx")
time.sleep(3)

divs = driver.find_elements('xpath', '//div[@class="topmenu"]/div')
print('len(divs):', len(divs))

#for index, div in enumerate(divs):
#    print(f'{index} >>>', div.text)
    
for index in range(len(divs)):

    divs = driver.find_elements('xpath', '//div[@class="topmenu"]/div')
    div = divs[index]
    
    text = div.text.strip()
    
    if not text:
        print(f'{index} >>>  --- empty ---')
        continue
        
    print(f'{index} >>>', div.text)
    div.click()
    time.sleep(2)
    
    contacts = driver.find_elements('xpath', '//*[@id="cityDirectoryDepartmentDetails"]/tbody/tr')        
    for contact in contacts:
        print('  contact:', contact.text)

    print('<<< back')
    # ... some pages don't have contacts and link is in `[2]` instead of `[3]`        
    #back = driver.find_element('xpath', '//*[@id="CityDirectoryLeftMargin"]/div[3]/span')
    #back.click()
    # ... or ...    
    driver.back()
    # ... or ...
    #driver.get("https://www.elizabethnj.org/Directory.aspx")

    time.sleep(2)

input('Press ENTER to close')

Working code for the simpler version:

import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.maximize_window()

driver.get("https://www.elizabethnj.org/Directory.aspx")
time.sleep(3)

divs = driver.find_elements('xpath', '//div[@class="topmenu"]/div')
print('len(divs):', len(divs))

# --- first get all HREF as strings ---

data = []

for index, div in enumerate(divs):
    text = div.text.strip()
    
    if not text:
        print(f'{index} >>>  --- empty ---')
        continue

    url = div.find_element('xpath', './/a').get_attribute('href')
    print(f'{index} >>>', text)
    
    data.append( (text, url) ) 

# --- next visit all pages (without going back to the main page) ---
    
for index, (text, url) in enumerate(data):

    print(f'{index} >>>', text)
    
    driver.get(url)
    time.sleep(2)
    
    contacts = driver.find_elements('xpath', '//*[@id="cityDirectoryDepartmentDetails"]/tbody/tr')        
    for contact in contacts:
        print('  contact:', contact.text)

    # it doesn't need to go back to main page
    
input('Press ENTER to close')