I have been trying to figure out how to use asyncio and aiohttp inside a class. If I run the script without the class (just using the functions as they are), everything works fine. As soon as I move the functions into a class and use that class from Main.py, the script locks up without any errors. I am not sure where to go from here; I am guessing I have to set up my class differently for it to work. If anyone knows why this does not work, I would greatly appreciate it if you could point out what I am doing wrong. Thank you for your time.

Fetch.py

import asyncio
from aiohttp import ClientSession

class Fetch:
 def __init__(self, proxy=None):
  self.proxy = proxy
  self.headers =  {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}

 def set_headers(self, headers):
  if not headers:
   headers = self.headers
  return headers

 def set_proxy(self, proxy):
  if proxy:
   p = proxy
  else:
   p = self.proxy
  return "http://{}".format(p)

 async def get_fetch(self, session, url, headers=None, proxy=None, params=None, timeout=9):
  array = []
  while True:
   try:
    async with session.get(url, headers=self.set_headers(headers), proxy=self.set_proxy(proxy), params=params, timeout=timeout) as r:
     print (r.status)
     if r.status == 200:
      obj = await r.read()
      array.append(obj)
      break
   except:
    pass
  return array

 async def get_bound(self, sem, session, url):
  async with sem:
   array = await self.get_fetch(session, url)
   return array

 async def get_run(self, urls, semaphores=400):
  tasks = []
  sem = asyncio.Semaphore(semaphores)

  async with ClientSession() as session:
   for url in urls:
    task = asyncio.ensure_future(self.get_bound(sem, session, url))
    tasks.append(task)

  responses = await asyncio.gather(*tasks)
  return responses

 def get(self, urls):
  loop = asyncio.get_event_loop()
  future = asyncio.ensure_future(self.get_run(urls))
  array = loop.run_until_complete(future)
  loop.close()
  return [ent for sublist in array for ent in sublist]

Main.py

from Browser import Fetch
from bs4 import BeautifulSoup

proxy = 'xxx.xxx.xxx.xxx:xxxxx'
fetch = Fetch(proxy)

if __name__ == '__main__':
 urls = ['http://ip4.me','http://ip4.me','http://ip4.me']
 array = fetch.get(urls)
 for obj in array:
  soup = BeautifulSoup(obj, 'html.parser')
  for ip in soup.select('tr +  tr td font'):
   print(ip.get_text())
  • You don't need to do anything special with your class for async programming. I suspect your problem is the bare "except" catching all errors. Catch specific errors, or at the very least log the errors you're getting. Commented Feb 17, 2018 at 21:05
  • You are right, I changed it to except Exception as e: and I am getting the error Session is closed. Why would it be closed when I use it in the Class, and not closed when it is not used inside of the Class? Commented Feb 17, 2018 at 21:08
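
To illustrate the comment above: a bare `except: pass` inside `while True` retries forever and hides the error, which looks like a lock-up. Below is a minimal sketch of a bounded retry that logs each failure instead. The names `fetch_with_retries`, `always_closed`, and `max_attempts` are hypothetical and not part of the original code, and the failing coroutine only stands in for a request on a closed session.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def fetch_with_retries(fetch, url, max_attempts=3):
    """Await fetch(url), retrying at most max_attempts times (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await fetch(url)
        except (RuntimeError, asyncio.TimeoutError) as exc:
            # Log the failure instead of silently swallowing it.
            logger.warning("attempt %d for %s failed: %r", attempt, url, exc)
    return None  # give up instead of looping forever


calls = []


async def always_closed(url):
    # Stand-in for a request made on an already closed session.
    calls.append(url)
    raise RuntimeError("Session is closed")


result = asyncio.run(fetch_with_retries(always_closed, "http://ip4.me"))
```

With this shape the script stops after three attempts and the log shows the actual exception, rather than spinning silently.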

1 Answer

Your indentation is wrong.

async with ClientSession() as session:
    for url in urls:
        task = asyncio.ensure_future(self.get_bound(sem, session, url))
        tasks.append(task)

responses = await asyncio.gather(*tasks)
return responses

Bring the last two lines back inside the with block.

Your code looks similar to https://pawelmhm.github.io/asyncio/python/aiohttp/2016/04/22/asyncio-aiohttp.html. In that reference, the await on the gathered responses and the related statements are inside the with block. In your code the with block exits first, so the ClientSession is closed before the HTTP calls have run, which is why you get "Session is closed".
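
The difference can be reproduced without any network access. The sketch below uses a hypothetical `FakeSession` class as a stand-in for `aiohttp.ClientSession` (it is not aiohttp's API, only an async context manager that marks itself closed on exit); `gather_outside` mirrors the layout in the question and `gather_inside` mirrors the fix.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession (no network needed)."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.closed = True  # leaving the block closes the session

    async def get(self, url):
        await asyncio.sleep(0)  # yield to the event loop, like real I/O would
        if self.closed:
            raise RuntimeError("Session is closed")
        return "body of {}".format(url)


async def gather_outside(urls):
    # Layout from the question: gather() is awaited after the block exits.
    async with FakeSession() as session:
        tasks = [asyncio.ensure_future(session.get(u)) for u in urls]
    return await asyncio.gather(*tasks, return_exceptions=True)


async def gather_inside(urls):
    # Fixed layout: gather() is awaited while the session is still open.
    async with FakeSession() as session:
        tasks = [asyncio.ensure_future(session.get(u)) for u in urls]
        return await asyncio.gather(*tasks, return_exceptions=True)


urls = ["http://ip4.me", "http://ip4.me"]
broken = asyncio.run(gather_outside(urls))  # every entry is a RuntimeError
fixed = asyncio.run(gather_inside(urls))    # every entry is a response body
```

`ensure_future` only schedules the tasks; they do not start running until the coroutine next awaits something. In `gather_outside` that first await is the `gather` call, which happens after the session has already been closed.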

On a side note, please consider a standard indentation style for your code. A single space makes it really hard to spot these easy mistakes.

4 Comments

Wow, so much time wasted... thank you! And what do you mean by "standard indentation style"? Use a tab instead of single space? Or double space instead of single space?
@antfuentes87 Details about python indentation style are here: python.org/dev/peps/pep-0008/#indentation, but the gist is indeed that yes, you need significantly visible indentation spacing in order to easily see what's in a block and what's not
@antfuentes87 The standard Python indentation level is 4 spaces, as shown in Antoine's response (and in virtually all Python code one finds on the Internet). That way it is clearly visible what is indented under what.
@user4815162342 Will start using 4 spaces from now on :)
