
When I run this, it lists the websites in the database one by one with their response codes, and it takes about 10 seconds to get through a very small list. It should be much faster; it doesn't seem to be running asynchronously, but I'm not sure why.

import dblogin
import aiohttp
import asyncio
import async_timeout

dbconn = dblogin.connect()
dbcursor = dbconn.cursor(buffered=True)
dbcursor.execute("SELECT thistable FROM adatabase")
website_list = dbcursor.fetchall()

async def fetch(session, url):
    # async_timeout's timeout is an async context manager
    async with async_timeout.timeout(30):
        async with session.get(url, ssl=False) as response:
            await response.read()
            return response.status, url

async def main():
    async with aiohttp.ClientSession() as session:
        for all_urls in website_list:
            url = all_urls[0]
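            # fetch is awaited here, so each request must finish before the next begins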
            resp = await fetch(session, url)
            print(resp, url)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()

dbcursor.close()
dbconn.close()

1 Answer


This article explains the details. What you need to do is wrap each fetch call in a task (a Future), collect those tasks in a list, and then pass the list to either asyncio.wait or asyncio.gather, depending on your needs.

Your code would look something like this:

async def fetch(session, url):
    # async_timeout's timeout is an async context manager
    async with async_timeout.timeout(30):
        async with session.get(url, ssl=False) as response:
            await response.read()
            return response.status, url

async def main():
    tasks = []
    async with aiohttp.ClientSession() as session:
        for all_urls in website_list:
            url = all_urls[0]
            # schedule each fetch immediately instead of awaiting it,
            # so the requests can run concurrently
            task = asyncio.create_task(fetch(session, url))
            tasks.append(task)

        # gather waits for all tasks and returns their results in order
        responses = await asyncio.gather(*tasks)
        for status, url in responses:
            print(status, url)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
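
If you need asyncio.wait instead (for example, to impose an overall timeout and handle unfinished tasks yourself), a rough sketch could look like this — it reuses fetch and website_list from above, and the 60-second timeout is an arbitrary value of mine:

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.create_task(fetch(session, row[0])) for row in website_list]
        # wait returns two sets: tasks that completed and tasks still pending
        done, pending = await asyncio.wait(tasks, timeout=60)
        for task in done:
            # result() re-raises the task's exception if the request failed
            status, url = task.result()
            print(status, url)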

Also, are you sure that the loop.close() call is needed? The docs say:

The loop must not be running when this function is called. Any pending callbacks will be discarded.

This method clears all queues and shuts down the executor, but does not wait for the executor to finish.
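
On Python 3.7+ you can sidestep loop management entirely with asyncio.run, which creates a new event loop, runs the coroutine to completion, and closes the loop for you:

if __name__ == '__main__':
    # creates, runs, and closes the event loop in one call (Python 3.7+)
    asyncio.run(main())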


As mentioned in the docs and in the link that @user4815162342 posted, it is better to use create_task instead of ensure_future when you know the argument is a coroutine. Note that create_task was added in Python 3.7, so earlier versions should continue to use ensure_future. Also keep in mind that create_task requires a running event loop, which is why it is called inside main() above rather than at the top level.
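
If you have to support interpreters older than 3.7, one option is to pick the function once based on the version. This is just a sketch, and schedule_task is an illustrative name of mine rather than a standard API:

import sys

if sys.version_info >= (3, 7):
    schedule_task = asyncio.create_task    # preferred from 3.7 on; needs a running loop
else:
    schedule_task = asyncio.ensure_future  # also accepts arbitrary awaitables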


3 Comments

Please consider using create_task rather than ensure_future when you know the argument is a coroutine. ensure_future is a specialized function designed to be used by combinators such as gather to convert arbitrary awaitables into futures.
Thanks for the tip, I didn't know that. I'll edit my post.
Thanks! That article really helped and I was able to get it working for the most part. It just requires a bit of tweaking to get it to do what I want.
