I am making a webpage 'crawler' that parses a webpage and then searches it for a word or set of words. My problem is this: the data I am looking for is contained in the parsed webpage (I used a specific word from the page as a test), yet the program reports that the data has not been found.

    from html.parser import HTMLParser
    import urllib.request


    class dataFinder(HTMLParser):
        def open_webpage(self):
            # Download the page and keep its decoded text on the instance
            request = urllib.request.Request('https://www.summet.com/dmsi/html/readingTheWeb.html')  # Insert webpage
            response = urllib.request.urlopen(request)
            web_page = response.read()
            self.webpage_text = web_page.decode()
            return self.webpage_text

        def handle_data(self, data):
            # Compare the data passed in against the word being searched for
            wordtofind = 'PaperBackSwap.com'
            if data == wordtofind:
                print('Match found:', data)
            else:
                print('No matches found')


    p = dataFinder()
    print(p.open_webpage())
    p.handle_data(p.webpage_text)

I have run the program without the open_webpage function, using the feed method instead, and that version works and finds the data. The version above, however, does not.
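For comparison, the feed-based variant looked roughly like the sketch below. This is a reconstruction under assumptions, not the exact code: the class name `FeedDataFinder`, the `matches` list, and the inline `SAMPLE_HTML` string (a stand-in for the downloaded page so the sketch runs offline) are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the downloaded page, so this runs without a network.
SAMPLE_HTML = (
    "<html><body><p>Trade books at "
    "<a href='#'>PaperBackSwap.com</a> today.</p></body></html>"
)


class FeedDataFinder(HTMLParser):
    """Assumed shape of the feed-based version; names are illustrative."""

    def __init__(self, word_to_find):
        super().__init__()
        self.word_to_find = word_to_find
        self.matches = []

    def handle_data(self, data):
        # The parser calls this once per text fragment between tags,
        # so `data` is a short piece of text rather than the whole page.
        if data == self.word_to_find:
            self.matches.append(data)


finder = FeedDataFinder('PaperBackSwap.com')
finder.feed(SAMPLE_HTML)  # the parser itself invokes handle_data
finder.close()

if finder.matches:
    print('Match found:', finder.matches[0])
else:
    print('No matches found')
```

The only difference I can see is that here the parser calls handle_data itself, once per text fragment, whereas in my code above I call handle_data by hand with the entire page text.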
Any help in solving this problem is appreciated.