Django with Celery - existing object not found

I am having a problem executing a Celery task from another Celery task.

Here is the problematic snippet (data object already exists in database, its attributes are just updated inside finalize_data function):

def finalize_data(data):
    data = update_statistics(data)
    data.save()
    from apps.datas.tasks import optimize_data
    optimize_data.delay(data.pk)

from celery import shared_task

@shared_task
def optimize_data(data_pk):
    # Look the object up again by primary key inside the worker
    data = Data.objects.get(pk=data_pk)
    # Do something with data

The get() call in optimize_data fails with "Data matching query does not exist."

If I retrieve the object by pk inside finalize_data, it works fine. It also works fine if I delay the Celery task call for some time.

This line:

optimize_data.apply_async((data.pk,), countdown=10)

instead of

optimize_data.delay(data.pk)

works fine. But I don't want to use hacks in my code. Is it possible that the .save() call is asynchronously blocking access to that row/object?


I'm guessing your caller is inside a transaction that hasn't committed before celery starts to process the task. Hence celery can't find the record. That is why adding a countdown makes it work.
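
To make that guess concrete, here is a minimal sketch using only the standard library's sqlite3 module rather than Django or Celery. The table name, the two connections, and the helper names are assumptions made up for the illustration: `writer` stands in for the code that saves the object, `reader` for the Celery worker. It assumes the row is written inside a transaction that is still open when the worker looks for it.

```python
import os
import sqlite3
import tempfile


def row_visible(conn, pk):
    """Return True if a row with this primary key is visible to `conn`."""
    rows = conn.execute("SELECT 1 FROM data WHERE id = ?", (pk,)).fetchall()
    return bool(rows)


def uncommitted_row_demo():
    """Return (visible_before_commit, visible_after_commit) as seen by the reader."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        writer = sqlite3.connect(path)  # hypothetical stand-in for the caller
        reader = sqlite3.connect(path)  # hypothetical stand-in for the worker
        try:
            writer.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, stats TEXT)")
            writer.commit()

            # The INSERT implicitly opens a transaction that stays open
            # until commit() is called on the writer connection.
            writer.execute("INSERT INTO data (id, stats) VALUES (1, 'updated')")
            before = row_visible(reader, 1)  # the "worker" looks too early

            writer.commit()
            after = row_visible(reader, 1)   # the "worker" looks after the commit
            return before, after
        finally:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(uncommitted_row_demo())
```

The second connection only sees the row once the first one has committed, which would be consistent with the task succeeding when it is delayed by a countdown.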

A 1 second countdown will probably work as well as the 10 second one in your example. I've used 1 second countdowns throughout code to deal with this issue.

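Another solution is to avoid running the caller inside a transaction (for example, ATOMIC_REQUESTS or a surrounding transaction.atomic() block), so that save() commits before the task is queued.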


I know that this is an old post but I stumbled on this problem today. Lee's answer pointed me in the right direction but I think a better solution exists today.

Using the on_commit handler provided by Django, this problem can be solved without hackish countdowns in the code, whose purpose might not be obvious to someone reading it later.

I'm not sure if this existed when the question was posted but I'm just posting the answer so that people who come here in the future know about the alternative.


You could use an on_commit hook to make sure the celery task isn't triggered until after the transaction commits?

Django docs: Performing actions after commit

It's a feature that was added in Django 1.9.

from django.db import transaction

def do_something():
    pass  # send a mail, invalidate a cache, fire off a Celery task, etc.

transaction.on_commit(do_something)

You can also wrap your function in a lambda:

transaction.on_commit(lambda: some_celery_task.delay('arg1'))

The function you pass in will be called immediately after a hypothetical database write made where on_commit() is called would be successfully committed.

If you call on_commit() while there isn’t an active transaction, the callback will be executed immediately.

If that hypothetical database write is instead rolled back (typically when an unhandled exception is raised in an atomic() block), your function will be discarded and never called.
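
The three rules quoted above can be illustrated with a toy model in plain Python. This is only a sketch of the described behaviour, not Django's implementation; the `ToyConnection` class and its methods are invented for the example.

```python
class ToyConnection:
    """Toy stand-in for a database connection; NOT Django's implementation."""

    def __init__(self):
        self.in_transaction = False
        self._callbacks = []

    def begin(self):
        self.in_transaction = True

    def on_commit(self, func):
        if self.in_transaction:
            self._callbacks.append(func)  # defer until the commit happens
        else:
            func()  # no active transaction: run immediately

    def commit(self):
        self.in_transaction = False
        callbacks, self._callbacks = self._callbacks, []
        for func in callbacks:
            func()

    def rollback(self):
        self.in_transaction = False
        self._callbacks.clear()  # discarded and never called


def on_commit_rules_demo():
    calls = []
    db = ToyConnection()

    # No active transaction: the callback runs right away.
    db.on_commit(lambda: calls.append("immediate"))

    # Inside a transaction: the callback waits for the commit.
    db.begin()
    db.on_commit(lambda: calls.append("after commit"))
    calls.append("still inside transaction")
    db.commit()

    # Rolled back: the callback is dropped.
    db.begin()
    db.on_commit(lambda: calls.append("never"))
    db.rollback()
    return calls


if __name__ == "__main__":
    print(on_commit_rules_demo())
```

Applied to the question's code, the call inside finalize_data would presumably become transaction.on_commit(lambda: optimize_data.delay(data.pk)), so the task is queued only once the row is visible to the worker.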
