Why is this Python script with Matplotlib so slow?

I'm trying to simulate coin tosses and profits and plot the graph in matplotlib:

from random import choice
import matplotlib.pyplot as plt
import time

start_time = time.time()
num_of_graphs = 2000
tries = 2000
coins = [150, -100]
last_loss = 0


for a in range(num_of_graphs):
    profit = 0
    line = []
    for i in range(tries):
        profit = profit + choice(coins)
        if (profit < 0 and last_loss < i):
            last_loss = i
        line.append(profit)
    plt.plot(line)
plt.show()

print("--- %s seconds ---" % (time.time() - start_time))
print("No losses after " + str(last_loss) + " iterations")

The end result is

--- 9.30498194695 seconds ---
No losses after 310 iterations

Why is it taking so long to run this script? If I change num_of_graphs to 10000, the script never finishes.

How would you optimize this?

Your measurement of execution time is too coarse: it lumps the simulation and the plotting together. The following version, which uses numpy, times the simulation separately from the plotting:

import matplotlib.pyplot as plt
import numpy as np
import time


def run_sims(num_sims, num_flips):
    start = time.time()
    sims = [np.random.choice(coins, num_flips).cumsum() for _ in range(num_sims)]
    end = time.time()
    print(f"sim time = {end-start}")
    return sims


def plot_sims(sims):
    start = time.time()
    for line in sims:
        plt.plot(line)
    end = time.time()
    print(f"plotting time = {end-start}")
    plt.show()


if __name__ == '__main__':
    num_sims = 2000
    num_flips = 2000
    coins = np.array([150, -100])  # module-level name read by run_sims

    plot_sims(run_sims(num_sims, num_flips))

Result:

sim time = 0.13962197303771973
plotting time = 6.621474981307983

As you can see, the simulation time is greatly reduced (with the original loops it was on the order of 7 seconds on my 2011 laptop). The remaining time is spent in plotting, which depends on matplotlib.
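Since plotting dominates, one thing that may be worth trying is drawing all paths as a single artist instead of thousands of separate lines. The sketch below uses matplotlib.collections.LineCollection for that. The function name plot_sims_collection and the Agg backend are assumptions made so the example runs without a display; no timing is claimed here, so measure on your own machine. Unlike plt.plot, the collection draws every path in one colour unless you pass colours explicitly.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection


def plot_sims_collection(sims):
    """Draw every simulated path as part of one LineCollection artist."""
    sims = np.asarray(sims)                 # shape (num_sims, num_flips)
    num_sims, num_flips = sims.shape
    x = np.arange(num_flips)
    # Build an array of shape (num_sims, num_flips, 2) holding (x, y) pairs.
    segments = np.stack([np.broadcast_to(x, sims.shape), sims], axis=-1)

    fig, ax = plt.subplots()
    collection = LineCollection(segments, linewidths=0.5)
    ax.add_collection(collection)
    # Set the limits explicitly rather than relying on autoscaling.
    ax.set_xlim(0, num_flips - 1)
    ax.set_ylim(sims.min(), sims.max())
    return fig, ax, collection
```

With the code above, this would be called as plot_sims_collection(run_sims(num_sims, num_flips)), followed by plt.show() or fig.savefig(...).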

To optimize code like this, I would always try to replace loops with vectorized operations using numpy or, depending on the specific need, other libraries that use numpy under the hood.

In this case, you could calculate and plot your profits this way:

import matplotlib.pyplot as plt
import time
import numpy as np

start_time = time.time()
num_of_graphs = 2000
tries = 2000
coins = [150, -100]

# Create a 2-D array with random choices
# rows for tries, columns for individual runs (graphs).
coin_tosses = np.random.choice(coins, (tries, num_of_graphs))

# Calculate 2-D array of profits by summing
# cumulatively over rows (trials).
profits = coin_tosses.cumsum(axis=0)

# Plot everything in one shot.
plt.plot(profits)
plt.show()

print("--- %s seconds ---" % (time.time() - start_time))

On my configuration, this code took approx. 6.3 seconds to run (6.2 of them plotting), while your code took almost 15 seconds.
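The vectorized version above no longer reports last_loss, which the original script printed. A minimal sketch of how it could be recovered from the profits array, assuming the same layout (rows are tries, columns are runs); the helper name last_loss_index is made up for illustration:

```python
import numpy as np


def last_loss_index(profits):
    """Return the largest try index at which any run is below zero (0 if none)."""
    # profits has shape (tries, num_of_graphs).
    any_loss = (profits < 0).any(axis=1)   # one boolean per try
    loss_rows = np.flatnonzero(any_loss)   # indices of tries with at least one loss
    return int(loss_rows[-1]) if loss_rows.size else 0
```

Calling last_loss_index(profits) gives the same quantity as last_loss in the question: the latest 0-based iteration at which at least one run was still in the red.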

Comments
  • There are probably better answers, but since you know how big line is going to be, the first thing I would do is use numpy and pre-allocate the array: line = np.zeros((2000,)) outside of either loop, followed by line[i] = profit inside the second loop. Allocate once and then keep overwriting.
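The pre-allocation suggested in the comment could look roughly like this. It is only a sketch: the helper name simulate_run is hypothetical, and given the timings in the first answer it addresses the simulation loop, not the plotting time.

```python
from random import choice

import numpy as np


def simulate_run(tries, coins, line):
    """Fill the pre-allocated array `line` in place with one run's running profit."""
    profit = 0
    for i in range(tries):
        profit += choice(coins)
        line[i] = profit   # overwrite the slot instead of appending to a list
    return line


tries = 2000
coins = [150, -100]
line = np.zeros((tries,))  # allocated once, outside of either loop
simulate_run(tries, coins, line)
```

If the same buffer is reused for every run, it is safer to pass line.copy() to plt.plot, since whether matplotlib keeps its own copy of the data may depend on the version.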