Source Code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from bs4 import BeautifulSoup

path = "C:\\Python27\\chromedriver\\chromedriver"
# Open Chrome
driver = webdriver.Chrome(executable_path=path)
# Load the home page
driver.get("http://www.thehindu.com/")
# 10-second delay so the page can finish loading
time.sleep(10)
elem = driver.find_element_by_id("searchString")
# Enter the keyword and submit the search
elem.send_keys("unilever")
elem.send_keys(Keys.RETURN)
# Wait again for the results page to load
time.sleep(10)

# Problem here: reading the source of the results page
page = driver.page_source
soup = BeautifulSoup(page, 'lxml')
print(soup)

Above is the code. I want to scrape data from "http://www.thehindu.com/". The script types "unilever" into the search box and is redirected to the results page.

Link for Search Page

My question is: how can I get the source code of the search results page? Basically, I want the news related to "Unilever".

  • Try driver.page_source? Commented Feb 3, 2016 at 6:22
  • If I get the source code manually and with your method, @The6thSense, I get different results! Commented Feb 3, 2016 at 6:30
  • possible duplicate: stackoverflow.com/questions/22739514/… Commented Feb 3, 2016 at 7:07
  • @shivshankar They tend to differ because Selenium runs the page's JavaScript and returns the source after those scripts have modified it. Just compare the two and you will see the resemblance. Commented Feb 3, 2016 at 7:12
  • @KitFung, this is different Commented Feb 3, 2016 at 8:25

1 Answer

You can get the text inside <body>:

body = driver.find_element_by_tag_name("body")
bodyText = body.get_attribute("innerText")

Then you can find your keyword in bodyText.
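
As a rough sketch of that last step, the keyword search could be done line by line once the body text is in a plain string. The helper name find_keyword_lines and the sample text are made up for illustration; in the real script the string would come from body.get_attribute("innerText").

```python
def find_keyword_lines(body_text, keyword):
    """Return the non-empty lines of body_text that mention keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for line in body_text.splitlines():
        stripped = line.strip()
        if stripped and needle in stripped.lower():
            matches.append(stripped)
    return matches


# Stand-in for body.get_attribute("innerText"); this text is invented for the example.
sample_text = """Home
Unilever to expand operations
Sports
Markets: Hindustan Unilever shares rise
"""

for headline in find_keyword_lines(sample_text, "unilever"):
    print(headline)
```

Matching is case-insensitive so that "Unilever" and "unilever" are both found. Navigation labels and other short lines that do not mention the keyword are simply skipped.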
