
When I run this, it lists the websites in the database one by one with their response codes, and it takes about 10 seconds to get through a very small list. It should be much faster; it doesn't seem to be running asynchronously, but I'm not sure why.

import dblogin
import aiohttp
import asyncio
import async_timeout

dbconn = dblogin.connect()
dbcursor = dbconn.cursor(buffered=True)
dbcursor.execute("SELECT thistable FROM adatabase")
website_list = dbcursor.fetchall()

async def fetch(session, url):
    # async_timeout's timeout is an async context manager
    async with async_timeout.timeout(30):
        async with session.get(url, ssl=False) as response:
            await response.read()
            return response.status, url

async def main():
    async with aiohttp.ClientSession() as session:
        for all_urls in website_list:
            url = all_urls[0]
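            # fetch is awaited here, so each request must finish before the next begins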
            resp = await fetch(session, url)
            print(resp, url)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()

dbcursor.close()
dbconn.close()

1 Answer


This article explains the details. What you need to do is wrap each fetch call in a task (a Future), collect those tasks in a list, and then pass the list to either asyncio.wait or asyncio.gather, depending on your needs.

Your code would look something like this:

async def fetch(session, url):
    # async_timeout's timeout is an async context manager
    async with async_timeout.timeout(30):
        async with session.get(url, ssl=False) as response:
            await response.read()
            return response.status, url

async def main():
    tasks = []
    async with aiohttp.ClientSession() as session:
        for all_urls in website_list:
            url = all_urls[0]
            # schedule each fetch immediately instead of awaiting it,
            # so the requests can run concurrently
            task = asyncio.create_task(fetch(session, url))
            tasks.append(task)

        # gather waits for all tasks and returns their results in order
        responses = await asyncio.gather(*tasks)
        for status, url in responses:
            print(status, url)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
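
If you need asyncio.wait instead (for example, to impose an overall timeout and handle unfinished tasks yourself), a rough sketch could look like this — it reuses fetch and website_list from above, and the 60-second timeout is an arbitrary value of mine:

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.create_task(fetch(session, row[0])) for row in website_list]
        # wait returns two sets: tasks that completed and tasks still pending
        done, pending = await asyncio.wait(tasks, timeout=60)
        for task in done:
            # result() re-raises the task's exception if the request failed
            status, url = task.result()
            print(status, url)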

Also, are you sure that the loop.close() call is needed? The docs say:

The loop must not be running when this function is called. Any pending callbacks will be discarded.

This method clears all queues and shuts down the executor, but does not wait for the executor to finish.
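
On Python 3.7+ you can sidestep loop management entirely with asyncio.run, which creates a new event loop, runs the coroutine to completion, and closes the loop for you:

if __name__ == '__main__':
    # creates, runs, and closes the event loop in one call (Python 3.7+)
    asyncio.run(main())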


As mentioned in the docs and in the link that @user4815162342 posted, it is better to use create_task instead of ensure_future when you know the argument is a coroutine. Note that create_task was added in Python 3.7, so earlier versions should continue to use ensure_future. Also keep in mind that create_task requires a running event loop, which is why it is called inside main() above rather than at the top level.
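
If you have to support interpreters older than 3.7, one option is to pick the function once based on the version. This is just a sketch, and schedule_task is an illustrative name of mine rather than a standard API:

import sys

if sys.version_info >= (3, 7):
    schedule_task = asyncio.create_task    # preferred from 3.7 on; needs a running loop
else:
    schedule_task = asyncio.ensure_future  # also accepts arbitrary awaitables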


3 Comments

Please consider using create_task rather than ensure_future when you know the argument is a coroutine. ensure_future is a specialized function designed to be used by combinators such as gather to convert arbitrary awaitables into futures.
Thanks for the tip, I didn't know that. I'll edit my post.
Thanks! That article really helped and I was able to get it working for the most part. It just requires a bit of tweaking to get it to do what I want.
