Simple query causes memory leak in Django

I work at a company that has a large database, and I want to perform some update queries on it, but they seem to cause a huge memory leak. The query is as follows:

import datetime
import pytz

c = CallLog.objects.all()
for i in c:
    i.cdate = pytz.utc.localize(datetime.datetime.strptime(i.fixed_date, "%y-%m-%d %H:%M"))
    i.save()

I ran this in Django's interactive shell.

I even tried wrapping it in

with transaction.atomic():

but that didn't help. Do you have any idea how I can track down the source of the leak?

The dataset I am working on is about 27 million records.

fixed_date is a calculated property on the model.
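
For illustration, the property is roughly along these lines (the field names here are simplified placeholders, not the real ones):

from django.db import models

class CallLog(models.Model):
    raw_date = models.CharField(max_length=32)   # placeholder for the real source field
    cdate = models.DateTimeField(null=True)

    @property
    def fixed_date(self):
        # computed in Python, so it cannot be referenced in an F() expression
        return self.raw_date.strip()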

You could try something like this:

import datetime
import pytz
from django.core.paginator import Paginator

# only('cdate') loads just the pk and cdate; note that if fixed_date reads
# any other (deferred) field, Django will issue an extra query per object
p = Paginator(CallLog.objects.all().only('cdate'), 2000)
for page in range(1, p.num_pages + 1):
    for i in p.page(page).object_list:
        i.cdate = pytz.utc.localize(datetime.datetime.strptime(i.fixed_date, "%y-%m-%d %H:%M"))
        i.save()

Slicing a queryset does not load all the objects into memory just to get a subset; it adds LIMIT and OFFSET to the SQL query before hitting the database.
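
For example, you can inspect the SQL that a slice generates (the exact statement depends on your backend and table, but it will contain LIMIT/OFFSET rather than fetching every row):

# printing .query shows the SQL Django will send for this slice
print(CallLog.objects.all()[2000:4000].query)
# roughly: SELECT ... FROM ... LIMIT 2000 OFFSET 2000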

You could try iterating over the queryset in batches with the .iterator() method, and see if that improves anything:

import datetime
import pytz

# .iterator() streams results instead of caching the whole queryset in memory
for obj in CallLog.objects.all().iterator():
    obj.cdate = pytz.utc.localize(
        datetime.datetime.strptime(obj.fixed_date, "%y-%m-%d %H:%M"))
    obj.save()
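
If the row-by-row save() turns out to be too slow, one option is to batch the writes with bulk_update(). This is only a rough sketch; it assumes Django 2.2+ (where bulk_update() exists) and that cdate is an ordinary DateTimeField:

import datetime
import pytz

batch = []
for obj in CallLog.objects.all().iterator():
    obj.cdate = pytz.utc.localize(
        datetime.datetime.strptime(obj.fixed_date, "%y-%m-%d %H:%M"))
    batch.append(obj)
    if len(batch) >= 2000:
        # one round trip per batch of 2000 rows instead of one per row
        CallLog.objects.bulk_update(batch, ['cdate'])
        batch = []
if batch:
    CallLog.objects.bulk_update(batch, ['cdate'])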

Here is a related answer I found, but it is a few years old.

Try breaking it into smaller blocks (since you only have 4 GB of RAM):

c = CallLog.objects.filter(somefield=somevalue)

When it's necessary, I usually split the set on a character or a digit (for example, IDs ending in 1, 2, 3, 4, and so on).
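
A rough sketch of the same idea, walking the table in primary-key windows (this assumes CallLog has an auto-incrementing integer id):

import datetime
import pytz

step = 10000  # tune to what fits comfortably in 4 GB of RAM
last_id = CallLog.objects.order_by('-id').values_list('id', flat=True).first() or 0

for start in range(0, last_id + 1, step):
    # only one window of rows is held in memory at a time
    for obj in CallLog.objects.filter(id__gte=start, id__lt=start + step):
        obj.cdate = pytz.utc.localize(
            datetime.datetime.strptime(obj.fixed_date, "%y-%m-%d %H:%M"))
        obj.save()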

Comments
  • Why do you say that it is a memory leak? What is happening when you run this code? Any error stacktrace that you could add?
  • I have a machine with 4 GB of RAM, and when I run htop I can see all the memory being eaten by Django; the system freezes after that.
  • You need to use Redis or some kind of queue handler to process all the data one by one.
  • I think that piece of code updates the whole table with the same value, and I want to update every record with its own calculated fixed_date value. I thought about an F() expression, but it doesn't work with calculated properties.
  • @AhmedIbrahim I edited my answer. Try this; it should work a little better. If it takes a long time and is memory-consuming, you should think about creating an asynchronous task and letting the user know when it's done.
  • I'll try this one and get back to you
  • It works perfectly, just very slowly, but that doesn't matter as I am only doing this once.