Map raw SQL to multiple related Django models

Question

For performance reasons I can't use Django's ORM query methods, and I have to use raw SQL for some complex queries. I want to find a way to map the results of a SQL query to several models.

I know I can use the following statement to map the query results to one model, but I can't figure out how to use it to map to related models (as I can with select_related in Django).

model_instance = MyModel(**dict(zip(field_names, row_data)))

Is there a relatively easy way to map fields of related tables that are also in the query result set?

S.Lott · Accepted Answer · 2009-04-17 11:54:36Z


First, can you prove the ORM is what's hurting your performance? Sometimes performance problems are simply poor database design or improper indexes. Usually this comes from trying to force-fit Django's ORM onto a legacy database design. Stored procedures and triggers can have an adverse impact on performance -- especially when working with Django, where the trigger code is expected to be in the Python model code.

Sometimes poor performance is an application issue. This includes needless order-by operations being done in the database.

The most common performance problem is an application that "over-fetches" data: casually calling the .all() method and building large in-memory collections will crush performance. Django query sets should be touched as little as possible, so that the query set iterator itself is handed to the template for display.
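To make the over-fetching point concrete, here is a minimal sketch that uses the standard-library sqlite3 module instead of Django query sets, so that it runs standalone. The `item` table and the function names `over_fetch` and `stream` are hypothetical, chosen only for illustration.

```python
import sqlite3
from itertools import islice

# Hypothetical table with 1000 rows, standing in for a large model table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1000)],
)

QUERY = "SELECT id, name FROM item ORDER BY id"


def over_fetch(conn, wanted):
    """Build the whole result list in memory, then use a small slice of it."""
    rows = conn.execute(QUERY).fetchall()  # every row is materialised here
    return rows[:wanted], len(rows)


def stream(conn, wanted):
    """Pull rows from the cursor one at a time and stop once enough are read."""
    cursor = conn.execute(QUERY)
    return list(islice(cursor, wanted))
```

Both functions return the same first rows, but `over_fetch` holds all 1000 rows in memory to do so. In Django, the comparable tools are slicing a query set before it is evaluated and `QuerySet.iterator()`.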

Once you choose to bypass the ORM, you have to fight the Object-Relational Impedance Mismatch problem. Again. Specifically, relational "navigation" has no concept of "related": it has to be a first-class fetch of a relational set using foreign keys. Assembling a complex in-memory object model via SQL is simply hard. Circular references make this very hard; resolving FKs into collections is hard.
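For the simplest case, a single foreign key with no cycles, the hand assembly might look like the sketch below. It assumes plain dataclasses as stand-ins for Django models and an in-memory SQLite database; the `Author` and `Book` classes, the field tuples, and `fetch_books_with_authors` are all hypothetical. Each joined row is split by column position, and the `zip` idiom from the question is applied to each slice.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Author:  # stand-in for a Django model
    id: int
    name: str


@dataclass
class Book:  # stand-in for a Django model with a foreign key to Author
    id: int
    title: str
    author_id: int
    author: Optional[Author] = None


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (10, 'First', 1), (11, 'Second', 1), (12, 'Third', 2);
""")

BOOK_FIELDS = ("id", "title", "author_id")
AUTHOR_FIELDS = ("id", "name")


def fetch_books_with_authors(conn):
    """Map each joined row onto a Book plus its related Author."""
    sql = """
        SELECT b.id, b.title, b.author_id, a.id, a.name
        FROM book b JOIN author a ON a.id = b.author_id
        ORDER BY b.id
    """
    split = len(BOOK_FIELDS)
    authors = {}  # one Author instance per primary key
    books = []
    for row in conn.execute(sql):
        book = Book(**dict(zip(BOOK_FIELDS, row[:split])))
        author_data = dict(zip(AUTHOR_FIELDS, row[split:]))
        if author_data["id"] not in authors:
            authors[author_data["id"]] = Author(**author_data)
        book.author = authors[author_data["id"]]
        books.append(book)
    return books
```

The `authors` dictionary keeps one object per author row, so two books by the same author share an instance. Reverse relations, nullable keys, and circular references each need more bookkeeping than this, which is where the difficulty described above comes from.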

If you're going to use raw SQL, you have two choices.

1. Eschew "select related" -- it doesn't exist -- and it's painful to implement.
2. Invent your own ORM-like "select related" features. A common approach is to add stateful getters that (a) check a private cache to see whether the related object has already been fetched and, (b) if it hasn't, fetch it from the database and update the cache.

In the process of inventing your own stateful getters, you'll be reinventing Django's, and you'll probably discover that it isn't the ORM layer, but a database design or an application design issue.
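A stateful getter of the kind described in the second choice could be sketched roughly as follows. This is not Django's implementation; it assumes plain Python classes and the standard-library sqlite3 module, and the names `Book`, `Author`, `_author_cache`, and `queries_run` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann');
    INSERT INTO book VALUES (10, 'First', 1);
""")

queries_run = []  # records each lazy fetch, for demonstration only


class Author:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class Book:
    def __init__(self, conn, id, title, author_id):
        self._conn = conn
        self.id = id
        self.title = title
        self.author_id = author_id
        self._author_cache = None  # private cache for the related object

    @property
    def author(self):
        # (a) Check the private cache first.
        if self._author_cache is None:
            # (b) Cache miss: fetch the related row and remember it.
            sql = "SELECT id, name FROM author WHERE id = ?"
            queries_run.append(sql)
            row = self._conn.execute(sql, (self.author_id,)).fetchone()
            if row is None:
                raise LookupError(f"no author with id {self.author_id}")
            self._author_cache = Author(*row)
        return self._author_cache


book = Book(conn, *conn.execute("SELECT id, title, author_id FROM book").fetchone())
```

The first access to `book.author` runs one query; later accesses return the cached object. Used in a loop over many books, this still issues one query per book, which is the cost that a joined fetch is meant to avoid.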


3 Comments

Michael Over a year ago

The performance problem is due to the way I had to work around some limitations in the ORM itself. The database design is good (no legacy database). Maybe I should ask if there is a simpler way to write the query with Django. In SQL the query is really simple. But that would be a topic on its own. ;-)

S.Lott Over a year ago

That's my point -- get the Django ORM query to actually work in the Django ORM and everything will be better. Whatever the "limitations" are, it may be a simple misunderstanding or an application design issue that can be fixed.

stealthwang Over a year ago

Answer doesn't resolve question, instead suggests that the issue is likely elsewhere. Voted down. Even a well designed DB schema may have performance issues when scaling on a specific DB platform due to the platform's shortcomings. Those shortcomings may be completely overcome with optimizations in raw queries, and implying that's an application or schema design issue is baseless. Sometimes the problem /is/ the database and that's the question that was asked.
