
I am trying to learn async, and I am trying to get whois information for a batch of domains. I found the aiowhois library, but it has only a few lines of documentation, which is not enough for a newbie like me.

This code runs without errors, but I don't know how to print the data from the parsed_whois variable, which shows up as a coroutine object.

resolv = aiowhois.Whois(timeout=10)

async def coro(url, sem):
    parsed_whois = await resolv.query(url)

async def main():
    tasks = []
    sem = asyncio.Semaphore(4)

    for url in domains:
        task = asyncio.Task(coro(url, sem))
        tasks.append(task)
    await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
  • I think you can avoid using tasks. Just apply gather to coro(url, sem) directly. You can rename the list of tasks to coros if you like. Commented Nov 24, 2019 at 20:16
  • What do you use the semaphore for? Commented Nov 24, 2019 at 20:22
  • This code was assembled from parts of other programs; I'm still not very clear about everything here =( Commented Nov 25, 2019 at 9:07
  • Not answering your question, but just helping for the future: especially in gTLDs, whois is dying, and the new protocol to use is RDAP. Since it is based on HTTPS, any async HTTP library will be able to handle it without problems. Unless there are very good reasons, new software should be built on RDAP today, not whois. Also, in both cases the input should be a domain name, not a URL. Commented Nov 25, 2019 at 17:43
  • This information is very useful for me!! I didn't know that, thanks a lot! Commented Nov 26, 2019 at 15:06

2 Answers


You can avoid creating tasks explicitly. Just pass the coroutines to gather directly; it wraps them in tasks for you. In case you are confused about the difference, this SO Q&A might help you (especially the second answer).

You can have each coroutine return its result, without resorting to global variables:

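import asyncio
import aiowhois

resolv = aiowhois.Whois(timeout=10)  # same resolver as in the question

async def coro(url):
    # Return the parsed result instead of discarding it
    return await resolv.query(url)

async def main():
    domains = ...
    ops = [coro(url) for url in domains]
    # gather returns the results in the same order as ops
    rets = await asyncio.gather(*ops)
    print(rets)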

Please see the official asyncio docs to learn more about gather, wait, and the other options for running coroutines concurrently.
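One gather option that may be useful for a batch of lookups is return_exceptions=True, which collects failures as values instead of raising the first one. Here is a minimal sketch; fake_query is a hypothetical stand-in for resolv.query (so the example runs without aiowhois or network access), and the ".invalid" failure rule is made up purely for illustration:

```python
import asyncio

async def fake_query(domain):
    # Hypothetical stand-in for resolv.query(domain); not part of aiowhois.
    await asyncio.sleep(0.01)
    if domain.endswith(".invalid"):
        raise ValueError(f"lookup failed for {domain}")
    return {"domain": domain}

async def main(domains):
    # With return_exceptions=True, a failed lookup appears in the result
    # list as an exception object instead of aborting the whole batch.
    results = await asyncio.gather(
        *(fake_query(d) for d in domains), return_exceptions=True
    )
    # gather preserves input order, so results line up with domains.
    return dict(zip(domains, results))

outcome = asyncio.run(main(["example.com", "broken.invalid"]))
for domain, result in outcome.items():
    if isinstance(result, Exception):
        print(domain, "failed:", result)
    else:
        print(domain, "->", result)
```

Without return_exceptions=True, the first exception would propagate out of the await on gather and the other results would be lost to the caller.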

Note: if you are using Python 3.7 or later, you can also replace the get_event_loop() / run_until_complete() pair with just

asyncio.run(main())

Note 2: I have removed the semaphore from my code, as it's unclear why you need it and where.
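If the semaphore was meant to cap the number of simultaneous lookups, one way it could be wired in is sketched below. This is an assumption about the intent, not something the question states; fake_query and limited_query are hypothetical names, with fake_query standing in for resolv.query so the example runs on its own:

```python
import asyncio

async def fake_query(domain):
    # Hypothetical stand-in for resolv.query(domain); not part of aiowhois.
    await asyncio.sleep(0.01)
    return {"domain": domain}

async def limited_query(domain, sem):
    # At most 4 coroutines (the semaphore's initial value) can be inside
    # this block at once; the rest wait here until a slot frees up.
    async with sem:
        return await fake_query(domain)

async def main(domains):
    # Create the semaphore inside the running event loop.
    sem = asyncio.Semaphore(4)
    return await asyncio.gather(*(limited_query(d, sem) for d in domains))

results = asyncio.run(main(["example.com", "example.org", "example.net"]))
print(results)
```

The semaphore only has an effect when it is actually acquired around the awaited call; passing it as an argument without using it, as in the question, does nothing.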


5 Comments

I see what you mean, will try to modify my code. Thank you for your response
As SO etiquette, if my contribution helped you, please show your appreciation by upvoting it and/or accepting it as the best answer. Many thanks
I don't have enough reputation to vote at the moment (I have 11 and need 15), but I will when I get there. Thanks
you never did and it's been almost a year :o
honey bees don't live a year :(
all_parsed_whois = []  # make a global

async def coro(url, sem):
    all_parsed_whois.append(await resolv.query(url))

If you want the data as soon as it is available, you could use task.add_done_callback().

python asyncio add_done_callback with async def
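A minimal sketch of the callback approach, under the assumption that each lookup is wrapped in its own task. fake_query is a hypothetical stand-in for resolv.query, and on_done and seen are illustrative names:

```python
import asyncio

async def fake_query(domain):
    # Hypothetical stand-in for resolv.query(domain); not part of aiowhois.
    await asyncio.sleep(0.01)
    return {"domain": domain}

seen = []

def on_done(task):
    # Called by the event loop once the task has finished.
    # task.result() re-raises the exception if the lookup failed.
    result = task.result()
    seen.append(result)
    print("finished:", result)

async def main(domains):
    tasks = []
    for d in domains:
        task = asyncio.create_task(fake_query(d))
        task.add_done_callback(on_done)  # a plain function, not a coroutine
        tasks.append(task)
    # Still wait for everything so the loop is not closed early.
    await asyncio.gather(*tasks)

asyncio.run(main(["example.com", "example.org", "example.net"]))
```

The callback receives the finished task as its only argument and must be a regular function; results arrive in completion order, which is not necessarily the order of the input list.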
