Let's imagine a simple Food model with a name and an expiration date. My goal is to automatically delete the object once the expiration date is reached.

I want the objects deleted from the database (PostgreSQL in my case) right after exp_date is reached. I do not want to filter by exp_date__gt=datetime.datetime.now() in my code and then have cron/Celery run a script every so often that filters by exp_date__lt=datetime.datetime.now() and deletes the matches.

from django.db import models

class Food(models.Model):
    name = models.CharField(max_length=200)
    exp_date = models.DateTimeField()

* I could do it in a vanilla view when the object is accessed via an endpoint, or with DRF like so:

import datetime

from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView

class GetFood(APIView):

    def check_date(self, food):
        """
        Delete the food if its expiration date has passed.
        Return False if it was deleted, True otherwise.
        """
        if food.exp_date <= datetime.datetime.now():
            food.delete()
            return False
        return True

    def get(self, request, *args, **kwargs):
        id = self.kwargs["id"]

        if Food.objects.filter(pk=id).exists():
            food = Food.objects.get(pk=id)

            if not self.check_date(food):
                return Response({"error": "not found"}, status.HTTP_404_NOT_FOUND)
            else:
                return Response({"food": food.name}, status.HTTP_200_OK)
        else:
            return Response({"error": "not found"}, status.HTTP_404_NOT_FOUND)

but that would not delete the object if no one tries to access it via an endpoint.

* I could also set up a cron job with a script that queries the database for every Food object whose expiration date has passed and deletes them, or even set up Celery. It would only need to run once a day if I were using DateField, but as I am using DateTimeField it would need to run every minute (every second for the needs of my project).

* I've also thought of a fancy workaround with a post_save signal and a while loop, like:

@receiver(post_save, sender=Food)
def delete_after_exp_date(sender, instance, created, **kwargs):
    if created:
        # Busy-wait until the expiration date is reached, then delete.
        while instance.exp_date > datetime.datetime.now():
            pass
        else:
            instance.delete()

I don't know if it would work, but it seems very inefficient (could someone please confirm?).

Voilà, thanks in advance if you know of ways or tools to achieve what I want to do, and thanks for reading!

  • Is there a hard need to delete the object? Could the consuming parts of your application not just filter by exp_date__gt=datetime.datetime.now()? Otherwise put it on something that receives a docs.djangoproject.com/en/dev/ref/signals/#pre-init signal of an object (say the User one); these fire constantly. To keep them sensible you could limit any db operation to a particular time (say only on every even second in a minute). Commented Dec 18, 2020 at 15:33
  • If the "need of your project" is to run a delete every second, then run a delete every second. What is the issue? Does the delete take more than a second to run? Do you not know how to schedule things? Are you doubting whether your project really does need to do this? Commented Dec 18, 2020 at 16:21
  • "Every second" means it would need to run every second to delete these objects immediately after exp_date is reached with 100/100 success rate. Where is the issue ? It would run every second for nothing even if no exp_date is going to expire in the next second, that's the issue and this topic is a way to determine how to auto delete an object after a certain date. I'm not doubting of anything and could use cron or celery to run a script every second without problem if that's really your concern. Commented Dec 18, 2020 at 16:29
  • @Heroe__: automatically deleting will be more cumbersome. It means you somehow need something to "schedule" this, like a queueing mechanism. Even that will never be "exact", and even if you somehow would manage that, if you later alter the expiration time, then it will result in more trouble to "cancel" this and reschedule. The best way to implement this is to filter, you can do this more transparent with packages like django-softdelete and occasionally delete the objects effectively. Commented Dec 18, 2020 at 16:32
  • @WillemVanOnsem "exact" is of course a manner of speaking; give or take 1 second is fine in my case. I know about django-softdelete and don't really care about recovering data. Do you know a way to schedule a deletion after an object is created? Maybe I should just use a post_save signal and add an individual task to the Celery queue every time. Commented Dec 18, 2020 at 16:46

2 Answers

I would advise against deleting the objects, or at least against actually deleting them at that exact moment. Scheduling tasks is cumbersome. Even if you manage to schedule this, the time at which the items are removed will always be slightly off from the scheduled time. It also means you make an extra query per element instead of removing the items in bulk. Furthermore, scheduling is inherently more complicated: you need something that persists the schedule. If the expiration date of some food is changed later, it takes extra logic to "cancel" the current schedule and create a new one. It also makes the system less reliable: besides the web server, the scheduler daemon has to run. If the daemon fails for some reason, expired food is no longer removed and your queries will keep returning it.
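
To make the "cancel and reschedule" cost concrete, here is a minimal in-memory sketch using only the standard library. ExpiryScheduler and its methods are hypothetical names invented for this illustration, not part of Django or Celery, and a real setup would also have to persist this state somewhere.

```python
import heapq
from datetime import datetime, timedelta, timezone


class ExpiryScheduler:
    """Hypothetical in-memory schedule of (exp_date, item_id) entries."""

    def __init__(self):
        self._heap = []     # (exp_date, item_id), earliest expiry first
        self._current = {}  # item_id -> the exp_date that is still valid

    def schedule(self, item_id, exp_date):
        # Rescheduling pushes a new entry; the old one stays in the heap
        # as a stale entry and has to be recognised and skipped later.
        self._current[item_id] = exp_date
        heapq.heappush(self._heap, (exp_date, item_id))

    def next_wakeup(self):
        """Return the earliest valid expiry, or None if nothing is scheduled."""
        self._drop_stale()
        return self._heap[0][0] if self._heap else None

    def pop_expired(self, now):
        """Remove and return the ids whose valid expiry is <= now."""
        expired = []
        while True:
            self._drop_stale()
            if not self._heap or self._heap[0][0] > now:
                return expired
            _, item_id = heapq.heappop(self._heap)
            del self._current[item_id]
            expired.append(item_id)

    def _drop_stale(self):
        # Discard heap entries that a later schedule() call has superseded.
        while self._heap and self._current.get(self._heap[0][1]) != self._heap[0][0]:
            heapq.heappop(self._heap)


start = datetime(2020, 12, 18, 16, 0, tzinfo=timezone.utc)
scheduler = ExpiryScheduler()
scheduler.schedule(1, start + timedelta(seconds=5))
scheduler.schedule(2, start + timedelta(seconds=30))
scheduler.schedule(1, start + timedelta(seconds=60))  # expiry of item 1 was edited
```

Even in this toy version, every change to an expiration date needs bookkeeping to invalidate the earlier entry, and all of it is lost if the process restarts.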

Therefore it might be better to combine two things: filter the records so that you only retrieve food that has not expired, and remove expired Food at some regular interval. You can easily filter the objects with:

from django.db.models.functions import Now

Food.objects.filter(exp_date__gt=Now())

to retrieve Food that is not expired. To make it more efficient, you can add a database index on the exp_date field:

class Food(models.Model):
    name = models.CharField(max_length=200)
    exp_date = models.DateTimeField(db_index=True)

If you need to filter often, you can even work with a Manager [Django-doc]:

from django.db.models.functions import Now

class FoodManager(models.Manager):

    def get_queryset(self, *args, **kwargs):
        return super().get_queryset(*args, **kwargs).filter(
            exp_date__gt=Now()
        )

class Food(models.Model):
    name = models.CharField(max_length=200)
    exp_date = models.DateTimeField(db_index=True)
    
    objects = FoodManager()

Now if you work with Food.objects, all Food that has expired is filtered out automatically.

Besides that, you can write a script that runs, for example, daily to remove the Food objects that have expired:

from django.db.models.functions import Now

Food._base_manager.filter(exp_date__lte=Now()).delete()

Update to the accepted answer. You may run into "RuntimeError: super(): no arguments" if get_queryset is defined without self as its first parameter (or outside the class). I found this answer helpful.

As per PEP 3135, which introduced "new super":

The new syntax:

super()

is equivalent to:

super(__class__, <firstarg>)

where __class__ is the class that the method was defined in, and <firstarg> is the first parameter of the method (normally self for instance methods, and cls for class methods).

While super is not a reserved word, the parser recognizes the use of super in a method definition and only passes in the class cell when this is found. Thus, calling a global alias of super without arguments will not necessarily work.

As such, you will still need to include self:

class FoodManager(models.Manager):

    def get_queryset(self, *args, **kwargs):
        return super().get_queryset(*args, **kwargs).filter(
            exp_date__gt=Now()
        )

Just something to keep in mind.
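
The behaviour can be reproduced without Django. In this small sketch, Base, Broken and Fixed are made-up classes that stand in for models.Manager and two versions of FoodManager:

```python
class Base:
    def get_queryset(self):
        return ["apple", "bread"]


class Broken(Base):
    # No explicit first parameter: zero-argument super() has nothing to bind to.
    def get_queryset(*args, **kwargs):
        return super().get_queryset(*args, **kwargs)


class Fixed(Base):
    # With self as the first parameter, super() resolves as expected.
    def get_queryset(self, *args, **kwargs):
        return [name.upper() for name in super().get_queryset(*args, **kwargs)]


def call_broken():
    """Return the error message raised by the broken method, or None."""
    try:
        Broken().get_queryset()
    except RuntimeError as exc:
        return str(exc)
    return None
```

Calling Fixed().get_queryset() works, while Broken().get_queryset() raises RuntimeError as soon as super() is evaluated.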
