Python 3 Multiprocessing Pool

I'm learning to use Pool with multiprocessing, and I wrote this script as an exercise.

Can anyone tell me why using a normal for loop took less time than using a pool?

P.S.: My CPU has 2 cores.

Thank you very much.

from multiprocessing import Pool
from functools import reduce
import time

def one(n):
    a = n*n
    return a 

if __name__ == '__main__':
    l = list(range(1000))

    p = Pool()
    t = time.time()
    pol = p.map(one, l)
    result = reduce(lambda x,y: x+y, pol)
    print("Using Pool the result is: ", result, "Time: ", time.time() - t )
    p.close()
    p.join()

    def two(n):
        t = time.time()
        p_result = [] 

        for i in n:
            a = i*i 
            p_result.append(a)

        result = reduce(lambda x,y: x+y, p_result)
        print("Not using Pool the result is: ", result, "Time: ", time.time() - t)

    two(l)

Using Pool the result is: 332833500 Time: 0.14810872077941895

Not using Pool the result is: 332833500 Time: 0.0005018711090087891

I think there are several reasons at play here, but I would guess it largely comes down to the overhead of running multiple processes, mostly the synchronization and communication involved, as well as the fact that your non-parallelized code is written a bit more efficiently.

As a basis, here is how your unmodified code runs on my computer:

('Using Pool the result is: ', 332833500, 'Time: ', 0.0009129047393798828)
('Not using Pool the result is: ', 332833500, 'Time: ', 0.000598907470703125)

First of all, I would like to try to level the playing field by making the code of the two() function nearly identical to the parallelized code. Here is the modified two() function:

def two(l):
    t = time.time()

    p_result = map(one, l)

    result = reduce(lambda x,y: x+y, p_result)
    print("Not using Pool the result is: ", result, "Time: ", time.time() - t)

Now, this does not actually make a whole lot of difference in this case, but it will be important in a second to see that both cases are doing the exact same thing. Here is a sample output with this change:

('Using Pool the result is: ', 332833500, 'Time: ', 0.0009338855743408203)
('Not using Pool the result is: ', 332833500, 'Time: ', 0.0006031990051269531)

What I would like to illustrate now is that since the one() function is so computationally cheap, the overhead of the inter-process communication is outweighing the benefit of running it in parallel. I will modify the one() function as follows to force it to do a bunch of extra computation. Note that because of the changes to the two() function, this change will affect both the parallel and the single-threaded code.

def one(n):
    for i in range(100000):
        a = n*n
    return a

The reason for the for loop is to give each process a reason for existence. With your original code, each process simply does several multiplications, then has to send the list of results back to the parent process and wait to be given a new chunk. Sending and waiting takes much longer than completing a single chunk. By adding these extra cycles, each chunk takes longer to compute without changing the time needed for inter-process communication, and so we begin to see the parallelism pay off. Here are my results when I run the code with this change to the one() function:

('Using Pool the result is: ', 332833500, 'Time: ', 1.861448049545288)
('Not using Pool the result is: ', 332833500, 'Time: ', 3.444211959838867)

So there you have it. All you need is to give your child processes a bit more work, and they will be more worth your while.
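
As a side note, a related knob worth knowing about (not something the answer above uses) is the chunksize argument of Pool.map, which controls how many items are shipped to a worker per round of inter-process communication; larger chunks mean fewer round trips. Here is a minimal sketch of how you might experiment with it, where the value 250 is just an illustrative guess rather than a recommendation:

from multiprocessing import Pool
import time

def one(n):
    return n * n

if __name__ == '__main__':
    l = list(range(1000))
    with Pool() as p:
        t = time.time()
        # chunksize=250 hands each worker 250 items per round trip;
        # tune this for your own workload.
        pol = p.map(one, l, chunksize=250)
        print("With chunksize=250 the result is:", sum(pol), "Time:", time.time() - t)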

When using Pool, Python uses a global interpreter lock to synchronize multiple threads among multiple processes. That is, when one thread is running, all the other threads are stopped or waiting. Therefore, what you experience is sequential execution, not parallel execution. In your example, even if you distribute the work among multiple threads in the pool, they run sequentially due to the global interpreter lock. This also adds a lot of scheduling overhead.

From the Python docs on the global interpreter lock:

The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines.

Therefore, what you achieve is not true parallelism. If you need real multiprocessing capabilities in Python, you need to use Process objects, and that will require Queues to exchange data between processes.
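
For reference, here is a minimal sketch of that Process-plus-Queue pattern applied to the question's squaring task. The helper name square_chunk and the two-worker split are my own illustrative choices, not part of the original code:

from multiprocessing import Process, Queue

def square_chunk(chunk, q):
    # Each worker squares its chunk and puts the partial sum on the queue.
    q.put(sum(i * i for i in chunk))

if __name__ == '__main__':
    l = list(range(1000))
    q = Queue()
    # One worker per half of the data (one per core in the question's setup).
    procs = [Process(target=square_chunk, args=(l[:500], q)),
             Process(target=square_chunk, args=(l[500:], q))]
    for p in procs:
        p.start()
    result = q.get() + q.get()
    for p in procs:
        p.join()
    print("Using Process and Queue the result is:", result)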

The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine.

Found in the chapter "Process-based 'threading' interface" in the Python 2.7.16 documentation.
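
If you want to check for yourself that a Pool really does spread work across separate processes (and is therefore not serialized by a single interpreter's GIL), a quick sketch is to tag each result with the PID of the worker that produced it. The helper tagged_square below is my own illustration, not from the question:

from multiprocessing import Pool
import os

def tagged_square(n):
    # Return the square together with the PID of the worker that computed it.
    return n * n, os.getpid()

if __name__ == '__main__':
    with Pool() as p:
        results = p.map(tagged_square, range(1000))
    pids = {pid for _, pid in results}
    print("Worker PIDs observed:", pids)

On a machine with more than one core you would typically see more than one PID in the output, which is consistent with the documentation excerpt above.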

Comments
  • Thank you very much TallChuck, your answer was very enlightening.
  • The Global Interpreter Lock is only operative within a single process. If you run ps while you have a Pool going, you will see that it is in fact using multiple processes.
  • Yes, your argument is correct. But in my personal experience, the global interpreter lock matters when using threads rather than processes. What do you think?
  • Well, sure, but the code presented does not use threads, except insofar as the threading library is used to manage the processes within the multiprocessing library.
  • I see. Thanks for the details.
  • It looks like you copied that from docs.python.org/2/library/multiprocessing.html, and if you do so please state the source and make 100% clear the text is not yours. Also, prefer not to write answers that are nothing more than a copy of text from an off-site resource. Use them as backup, not as the core of your answer.