
I have a basic script to import UK postcodes into a database with SQLAlchemy. To improve efficiency I am attempting to do this with asyncio, following the docs and a number of "guide" blog posts.

The code below works (no exception is thrown and the rows import into the database correctly), but it appears to run synchronously: the files and their rows are processed strictly in order, whereas I would expect the files and rows to be processed in whatever order they happen to complete. I can't see why. How can it be fixed so that the import of each row in a given CSV does not block the import of the next?

import csv
import os
import asyncio
from db.db import Session, engine, Base
from Models.PostCode import PostCode
from sqlalchemy.exc import IntegrityError

Base.metadata.create_all(engine)

session = Session()
csv_path = os.path.dirname(os.path.realpath(__file__)) + '/postcode_lists/'

def runImport(fname):
    with open(csv_path + fname + '_postcodes.csv', newline='') as csvfile:
        reader = csv.DictReader(csvfile)

        tasks = [asyncio.ensure_future(saveRow(row)) for row in reader]

        loop = asyncio.get_event_loop()

        responses = loop.run_until_complete(asyncio.gather(*tasks, return_exceptions=True))
        loop.close()

        return responses


async def saveRow(row):
    if ('In Use?' not in row) or (row['In Use?']=='Yes'):
        await persist(row)

def persist(row):
    EXISTS = len(session.query(PostCode).filter(PostCode.postcode == row['Postcode']).all())
    if EXISTS == 0:
        pc = PostCode(
            row['Postcode'],
            )

        session.add(pc)
        session.commit()
        print(pc)
        return pc


datasets = ['CM', 'CO']
for d in datasets:
    runImport(d)

print('Done')

Output example

<PostCode(postcode='CA7 5HU')>
<PostCode(postcode='CA7 5HW')>
<PostCode(postcode='CA7 5HX')>
<PostCode(postcode='CA7 5HY')>
<PostCode(postcode='CA7 5HZ')>
<PostCode(postcode='CA7 5JB')>

I am expecting a somewhat jumbled output instead of alpha-ordered as per the CSV.

  • I cannot find any async operation in your code. There is no difference between your code and a simple ordered task queue. Could you point out which part of your code is async? If there is one, it should use the await keyword. Commented Oct 5, 2018 at 22:41
  • @Sraw I missed that bit. I updated the snippet with an await however it's the same synchronous output on the rows. This is the first time I am using asyncio, using this to learn the basics. From examples online, I can't see what's fundamentally wrong here. The example I am following is at djangostars.com/blog/asynchronous-programming-in-python-asyncio Commented Oct 5, 2018 at 23:12
  • The problem is there is no real async operation in your case. You put an await in but that's meaningless as the awaited expression is still a blocking one. Or let's say your whole program is blocking instead of including some async operation. Commented Oct 5, 2018 at 23:16
  • If you want to take advantage of asyncio, your code needs to contain real async operations and use the corresponding async libraries. For example, use aiohttp to make async HTTP requests. But in your case, the file system operations and the database access are all blocking. There is no real async operation, which means await never causes a switch. Commented Oct 5, 2018 at 23:17
  • Ahh ok. I assumed wrongly that whilst waiting for one db action, it could plod along with the next thus speeding it up a bit. Thanks for your help. If you could, add your comments as an answer and I'll accept it as this isn't something that is abundantly clear from tutorial sources. Commented Oct 5, 2018 at 23:27

1 Answer


Basically, your problem is that there is no real async operation in your code.

asyncio runs on an event loop, which is callback based. A genuinely asynchronous operation suspends the current task at the await, and the event loop then switches to another task that is ready to run.
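To make the switching concrete, here is a minimal sketch, not taken from the question's code. The coroutine names `cooperative` and `blocking` are made up for illustration, and `asyncio.run` assumes Python 3.7 or later. The first coroutine awaits something that really suspends (`asyncio.sleep(0)`), the second never does.

```python
import asyncio

order = []  # records the order in which the steps actually run

async def cooperative(name):
    for i in range(2):
        order.append(f"{name}{i}")
        # A real suspension point: control returns to the event loop,
        # which can then run another ready task.
        await asyncio.sleep(0)

async def blocking(name):
    for i in range(2):
        # Nothing here ever suspends, so the task runs to completion
        # before the loop can pick up the next one.
        order.append(f"{name}{i}")

async def main():
    await asyncio.gather(cooperative("a"), cooperative("b"))
    interleaved = list(order)
    order.clear()
    await asyncio.gather(blocking("a"), blocking("b"))
    sequential = list(order)
    return interleaved, sequential

interleaved, sequential = asyncio.run(main())
print(interleaved)  # ['a0', 'b0', 'a1', 'b1']
print(sequential)   # ['a0', 'a1', 'b0', 'b1']
```

Only the version with a real suspension point produces interleaved output. Marking a function `async` is not enough on its own.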

But all of your tasks are fully blocking, so none of them ever suspends and the loop never switches. That means your code snippet behaves exactly like an ordered task queue.
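If you have to keep a blocking library, one possible workaround (a sketch under assumptions, not the only option) is to hand each blocking call to a thread pool with `loop.run_in_executor`, so that the event loop has something real to await. In the example below `persist_blocking` is a hypothetical stand-in that just sleeps. It is not your `persist` function. With SQLAlchemy you would also need a separate session per worker thread, since a `Session` is not meant to be shared between threads.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def persist_blocking(postcode):
    # Hypothetical stand-in for the blocking query + commit.
    time.sleep(0.2)
    return postcode

async def save_row(loop, pool, postcode):
    # run_in_executor returns an awaitable, so this await really suspends
    # the task while the worker thread does the blocking work.
    return await loop.run_in_executor(pool, persist_blocking, postcode)

async def main(postcodes):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        tasks = [save_row(loop, pool, p) for p in postcodes]
        return await asyncio.gather(*tasks)

start = time.perf_counter()
results = asyncio.run(main(["CA7 5HU", "CA7 5HW", "CA7 5HX", "CA7 5HY"]))
elapsed = time.perf_counter() - start
print(results)
print(f"{elapsed:.2f}s")  # well under the 0.8s a sequential run would take
```

Note that `asyncio.gather` still returns the results in the order the tasks were passed in. It is the work itself that overlaps. Whether this actually speeds up a database import depends on the database and driver, so treat it as something to measure rather than a guaranteed win.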
