
I am learning Python's asyncio and testing a lot of code that uses it.
Below is code in which I try to subscribe to multiple WebSocket streams using asyncio and aiohttp.

I do not understand why, when coro(item1, item2) is executed as a task, it does not enter the async with ... block (i.e. "A" is printed but "B" is not).
Could anyone help me understand the reason for this?

(I already have working code; I simply want to understand the mechanism behind this.)

Code

import aiohttp
import asyncio
import json

async def coro(
               item1,
               item2):
    print("A")

    async with aiohttp.ClientSession() as session:
        async with session.ws_connect(url='URL') as ws:
            print("B")

            await asyncio.gather(ws.send_json(item1),
                                 ws.send_json(item2))
            print("C")

            async for msg in ws:
                print(msg)

async def ws_connect(item1,
                     item2):
    task = asyncio.create_task(coro(item1, item2))
    return task

async def main():

    item1 = {
        "method": "subscribe",
        "params": {'channel': "..."}
    }
    item2 = {
        "method": "subscribe",
        "params": {'channel': "..."}
    }

    ws_task = await ws_connect(item1, item2)
    print("D")

asyncio.run(main())

Output

D
A

1 Answer


B is never printed because you never await the returned task, only the function that returned it.

The subtle mistake is in return task followed by await ws_connect(item1, item2).

TL;DR: use return await task.

The key to understanding the program's output is that context switches in the asyncio event loop can occur only in a few places, in particular at await expressions. At such a point, the event loop may suspend the current coroutine and continue with another.

First, you create a ws_connect coroutine and immediately await it. This forces the event loop to suspend main and actually run ws_connect, because there is nothing else to run.

Since ws_connect contains none of the points that allow a context switch, coro() does not start while ws_connect is running.

The only thing create_task does is bind the coroutine to a task object and add it to the event loop's queue. But you never await the task; you just return it like any ordinary return value. Now ws_connect() finishes and the event loop can choose to run any of the tasks. It continues with main, probably because main has been waiting on ws_connect().

Okay, main prints D and returns. Now what?

There is some extra await in asyncio.run which gives coro() a chance to start, hence the printed A (but only after D). Yet nothing forces asyncio.run to wait for coro(), so when coro() yields back to the context loop through async with, the run finishes and the program exits, which leaves coro() unfinished.

If you add an extra await asyncio.sleep(1) after print("D"), the loop will again suspend main for at least some time and continue with coro(), which would print B had the URL been correct.
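The sequence described so far can be reproduced without a network connection. The following is only a sketch: the asyncio.sleep(0.1) call is an assumed stand-in for the wait that happens inside async with in the question, and the events list replaces the print calls so the order can be inspected.

```python
import asyncio

events = []  # records the order in which the markers are reached


async def coro():
    events.append("A")
    # Assumption: stands in for the network wait inside `async with`.
    await asyncio.sleep(0.1)
    events.append("B")  # not reached: the program ends first


async def ws_connect():
    # create_task only schedules coro(); nothing here waits for it.
    task = asyncio.create_task(coro())
    return task


async def main():
    ws_task = await ws_connect()  # awaits ws_connect, not the task
    events.append("D")


asyncio.run(main())
print(events)
```

With CPython's default event loop this should record D first and A second, with B missing, which matches the output shown in the question.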

Actually, context switching is a little more complicated, because an ordinary await on a coroutine usually does not switch unless the execution really needs to block on I/O or something similar. await asyncio.sleep(0) or yield* guarantees a true context switch without the extra blocking.

*yield from inside an __await__ method.
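A small sketch of that difference, using hypothetical helper names (background, no_suspend) that are not part of the question's code: awaiting a coroutine that never suspends runs it inline, while await asyncio.sleep(0) hands control to the event loop once.

```python
import asyncio

order = []


async def background():
    order.append("background")


async def no_suspend():
    # Contains no await that actually suspends, so it runs inline.
    order.append("no_suspend")


async def main():
    task = asyncio.create_task(background())  # scheduled, not started
    await no_suspend()           # no switch: background has not run yet
    order.append("after plain await")
    await asyncio.sleep(0)       # yields to the loop once: background runs
    order.append("after sleep(0)")
    await task                   # already finished; keeps the task referenced


asyncio.run(main())
print(order)
```

The background task should appear in the list only after the sleep(0) line, not after the plain await.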

The lesson here is simple: never return awaitables from async methods, because it leads to exactly this kind of mistake. Always use return await by default. At worst you get a runtime error when the returned object is not actually awaitable (like return await some_string), and that can easily be spotted and fixed.
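Applied to the question's structure, return await could look like the sketch below. As an assumption for the sake of a runnable example, asyncio.sleep(0.01) replaces the WebSocket connection and coro() returns a placeholder string.

```python
import asyncio

events = []


async def coro():
    events.append("A")
    # Assumption: stands in for connecting and reading from the socket.
    await asyncio.sleep(0.01)
    events.append("B")
    return "done"  # placeholder result


async def ws_connect():
    task = asyncio.create_task(coro())
    # Awaiting the task here suspends ws_connect until coro() finishes.
    return await task


async def main():
    result = await ws_connect()
    events.append("D")
    return result


result = asyncio.run(main())
print(events, result)
```

Here D is recorded only after coro() has run to completion, so the order becomes A, B, D.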

On the other hand, returning awaitables from ordinary functions is OK and makes the function act as if it were asynchronous, although one should be careful when mixing these two approaches. Personally, I prefer the first approach, as it shifts the responsibility onto the writer of the function rather than the user, who will be warned by linters, which usually detect non-awaited coroutine calls but not returned awaitables. So another solution would be to make ws_connect an ordinary function; then the await in await ws_connect would apply to the returned value (the task), not the function itself.
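A sketch of this second approach, again with asyncio.sleep(0.01) as an assumed stand-in for the WebSocket work. Note that ws_connect is now declared with def, not async def, and must be called while the event loop is running because it calls create_task.

```python
import asyncio

events = []


async def coro():
    events.append("A")
    # Assumption: stands in for the WebSocket connection and reads.
    await asyncio.sleep(0.01)
    events.append("B")


def ws_connect():
    # Ordinary function: calling it returns the Task directly.
    return asyncio.create_task(coro())


async def main():
    # The await now applies to the returned Task, so main waits for coro().
    await ws_connect()
    events.append("D")


asyncio.run(main())
print(events)
```

Because the await targets the task itself, main does not reach D until coro() has finished.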


5 Comments

One of the things I was not aware of was the fact that "...the coro yields back to the context loop through async with..." By "context loop" do you mean the event loop? Why does it give control back to the loop? (Is it because the session is an awaitable object?)
Yes, sorry, I meant the event loop. It is due to how async context managers work: they too have a pair of async methods, __aenter__ and __aexit__, which are awaited. See the async with syntax in PEP 492. You can see there VAR = await aenter(mgr), which can make a context switch depending on how __aenter__ is implemented. Digging into aiohttp, it goes deep, but there is a blocking await in aiohttp/client.py:754 making the actual request to the URL, which is a blocking IO call because it surely uses sockets underneath.
@koyamashinji You can check most of the event loop for yourself using a debugger like pdb, but it's not very readable code and some of the async stuff is hidden behind syntactic sugar similar to async with.
The deeper I go into the code, the lower the level gets and gosh it's so tough to understand! Anyway, thank you for the helpful answers and comments, I'll look into it more.
@koyamashinji Yea, it's not meant for us mere mortals ;) The point is that you never awaited the coroutine and it did not get enough chances to finish before the program exited. Although the async approach is cooperative multitasking (you control where the functions are interrupted), their scheduling is still nondeterministic and cannot be relied upon.
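The point made in the comments about async with can be illustrated with a hand-written async context manager. This is a sketch only: FakeConnection is a hypothetical class, not aiohttp's implementation, and asyncio.sleep(0) stands in for whatever real I/O an __aenter__ might await.

```python
import asyncio

log = []


class FakeConnection:
    """Hypothetical context manager whose __aenter__ suspends once."""

    async def __aenter__(self):
        log.append("entering")
        await asyncio.sleep(0)  # assumption: stand-in for real I/O
        log.append("entered")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        log.append("exit")
        return False  # do not swallow exceptions


async def user():
    async with FakeConnection():
        log.append("body")


async def other():
    log.append("other ran")


async def main():
    t1 = asyncio.create_task(user())
    t2 = asyncio.create_task(other())
    await asyncio.gather(t1, t2)


asyncio.run(main())
print(log)
```

The other task gets to run between "entering" and "entered", which shows that entering an async with block is itself a place where the event loop can switch to something else.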
