
I'm trying to scrape this website:

https://www.novanthealth.org/home/patients--visitors/locations/clinics.aspx?behavioral-health=yes

I want to get the clinic names and addresses. This is the Python code I'm using:

from selenium import webdriver
import pandas as pd
import time

#driver = webdriver.Chrome()
specialty = ["behavioral-health", "dermatology", "colon", "ear-nose-and-throat", "endocrine", "express", "family-practice", "foot-and-ankle",
             "gastroenterology", "heart-%26-vascular", "hepatobiliary-and-pancreas", "infectious-disease", "inpatient", "internal-medicine",
             "neurology", "nutrition", "ob%2Fgyn", "occupational-medicine", "oncology", "orthopedics", "osteoporosis", "pain-management",
             "pediatrics", "plastic-surgery", "pulmonary", "rehabilitation", "rheumatology", "sleep", "spine", "sports-medicine", "surgical", "urgent-care",
             "urology", "weight-loss", "wound-care", "pharmacy"]
name = []
address = []

for q in specialty:
    driver = webdriver.Chrome()
    driver.get("https://www.novanthealth.org/home/patients--visitors/locations/clinics.aspx?" + q + "=yes")
    x = driver.find_element_by_class_name("loc-link-right")
    num_page = str(x.text).split(" ")
    x.click()

    for i in num_page:
        btn = driver.find_element_by_xpath('//*[@id="searchResults"]/div[2]/div[2]/button[' + i + ']')
        btn.click()
        time.sleep(8)  # instead of this, use an explicit wait
        temp = driver.find_element_by_class_name("gray-background").text
        temp0 = temp.replace("Get directions Website View providers\n", "")

        x_temp = temp0.split("\n\n\n")

        for j in range(0, len(x_temp) - 1):
            temp1 = x_temp[j].split("Phone:")
            name.append(temp1[0].split("\n")[1])
            temp3 = temp1[1].split("Office hours:")
            temp4 = temp3[0].split("\n")
            temp5 = temp4[1:len(temp4)]
            address.append(" ".join(temp5))
    driver.close()

This code works fine if I use it for only one specialty at a time, but when I pass the specialties in a loop as above, it fails in the second iteration with this error:

Traceback (most recent call last):
  File "<stdin>", line 10, in <module>
  File "C:\Anaconda2\lib\site-packages\selenium\webdriver\remote\webelement.py", line 77, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Anaconda2\lib\site-packages\selenium\webdriver\remote\webelement.py", line 493, in _execute
    return self._parent.execute(command, params)
  File "C:\Anaconda2\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 249, in execute
    self.error_handler.check_response(response)
  File "C:\Anaconda2\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 193, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotVisibleException: Message: element not visible
(Session info: chrome=46.0.2490.80)
(Driver info: chromedriver=2.19.346078 (6f1f0cde889532d48ce8242342d0b84f94b114a1), platform=Windows NT 6.1 SP1 x86_64)

I don't have much experience with Python; any help would be appreciated.

  • You have to make your webdriver wait a few seconds until the corresponding element appears on the page. Have a look at the WebDriverWait function. Commented Apr 24, 2017 at 7:44
  • I was already going through the documentation on that, but was facing some issues implementing it. Can you give a sample code for it? Thanks! Commented Apr 24, 2017 at 7:51
  • here it is stackoverflow.com/a/41832157/3297613 Commented Apr 24, 2017 at 7:53
  • @AvinashRaj I added `wait = WebDriverWait(driver, 10)` and `wait.until(EC.presence_of_element_located((By.ID, "searchResults")))` above `btn = driver.find_element_by_xpath('//*[@id="searchResults"]/div[2]/div[2]/button['+i+']')`. This time it ran for 2 iterations but gave the same error in the third iteration. Commented Apr 24, 2017 at 8:21
  • @Vaibhav: it is worth avoiding asking directly for "a sample code" here. That is usually understood to mean "will you do my work for me", even if that is not the actual intent. Commented Apr 24, 2017 at 8:21

2 Answers


The error message tells you why it doesn't work:

ElementNotVisibleException: Message: element not visible

The element is not visible if you have not scrolled down to it.

You have to scroll down the list, depending on the size of your browser window,

OR

just extract the data from the page source, which is easier.


Usually I would do this in Selenium Basic, an Excel plugin. You can use the same logic in Python. I tried this in VBA and it works fine for me.

Private assert As New assert
Private driver As New Selenium.ChromeDriver

Sub sel_novanHealth()
    Set ObjWB = ThisWorkbook
    Set ObjExl_Sheet1 = ObjWB.Worksheets("Sheet1")
    Dim Name As Variant

    'Open the website
    driver.get "https://www.novanthealth.org/home/patients--visitors/locations.aspx"
    driver.Window.Maximize
    driver.Wait (1000)

    'Find out the total number of pages to be scraped
    lnth = driver.FindElementsByXPath("//button[@class='paginate_button']").Count

    'Loop over the pages
    For y = 2 To lnth
        'Loop over the elements on each page
        For x = 1 To 10
            Name = driver.FindElementsByXPath("//div[@class='span12 loc-heading']")(x).Text
            'Element 2
            'Element 3
        Next x
        driver.FindElementsByXPath("//button[@class='paginate_button']")(y).Click
    Next y

    driver.Wait (1000)
End Sub

