Limit the number of threads working in parallel

I am writing a function that copies a file from the local machine to remote servers, creating one thread per server so that the SFTP transfers run in parallel.

import threading

def copyToServer(hostname, username, password, destPath, localPath):
    # copies the file to the given host using the given credentials
    pass

threadsArray = []
for hostname in hostsList:
    username = defaultLogin
    password = defaultPassword
    thread = threading.Thread(target=copyToServer, args=(hostname, username, password, destPath, localPath))
    threadsArray.append(thread)
    thread.start()

This creates the threads and starts copying in parallel, but I want to limit it to about 50 threads at a time, since the total number of servers could be very large.

You need to adjust your code so that the threads share and keep track of a common value.

This can be done with a Semaphore object. The object holds an internal counter, initialised to your defined maximum, and every thread tries to acquire it. Each acquire decrements the counter; once the counter reaches zero, a further thread cannot acquire the semaphore and blocks until another thread releases it.
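The counter behaviour can be observed without starting any threads. The following is a minimal sketch that uses non-blocking acquires; the limit of 2 and the name max_slots are chosen only for this illustration.

```python
import threading

max_slots = 2  # hypothetical limit, chosen only for this demo
sema = threading.Semaphore(value=max_slots)

# Non-blocking acquires succeed until the internal counter reaches zero.
results = [sema.acquire(blocking=False) for _ in range(3)]
print(results)  # [True, True, False]

# Releasing increments the counter again, so the next acquire succeeds.
sema.release()
print(sema.acquire(blocking=False))  # True
```

With the default blocking=True, the third acquire would wait instead of returning False, which is what throttles the threads.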

A short example with a maximum of 5 threads in parallel shows that half of the 10 threads run immediately while the others block and wait:

import threading
import time

maxthreads = 5
sema = threading.Semaphore(value=maxthreads)
threads = list()

def task(i):
    with sema:  # blocks while 5 other threads already hold the semaphore
        print("start %s" % (i,))
        time.sleep(2)

for i in range(10):
    thread = threading.Thread(target=task, args=(i,))
    threads.append(thread)
    thread.start()

The output:

start 0
start 1
start 2
start 3
start 4

and after a couple of seconds, once the first threads have finished, the next threads are executed:

start 5
start 6
start 7
start 8
start 9
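A different technique from the semaphore above, and possibly a simpler fit for the question, is the standard library's concurrent.futures.ThreadPoolExecutor, which caps the number of worker threads through max_workers. The sketch below assumes a placeholder copy_to_server that only returns the host name; the helper copy_to_all, the host names and the credentials are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def copy_to_server(hostname, username, password, dest_path, local_path):
    # Placeholder standing in for the question's SFTP copy.
    return hostname


def copy_to_all(hosts, username, password, dest_path, local_path, max_workers=50):
    """Run copy_to_server for every host, at most max_workers at a time."""
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(copy_to_server, host, username, password, dest_path, local_path)
            for host in hosts
        ]
        for future in as_completed(futures):
            # result() re-raises any exception that occurred in the worker.
            finished.append(future.result())
    return finished


if __name__ == "__main__":
    hosts = ["host%d" % n for n in range(200)]  # hypothetical host names
    done = copy_to_all(hosts, "user", "secret", "/remote/path", "/local/file")
    print(len(done))
```

Unlike the semaphore version, only max_workers threads are ever created here; the remaining hosts wait in the executor's internal queue.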

#!/usr/bin/python
# -*- coding: utf-8 -*-
import time
from threading import Lock, Thread, active_count
from random import uniform # get some random time

thread_list = []
names = ['Alfa', ' Bravo', ' Charlie', ' Delta', ' Echo', ' Foxtrot', ' Golf', ' Hotel', ' India', ' Juliett', ' Kilo', ' Lima']
#-------------------------------------------------------------------------

def testFunction(inputName):
    waitTime = uniform(0.987, 2.345) # Random time between 0.987 and 2.345 seconds
    time.sleep(waitTime)
    print ('Finished working on name: ' + inputName)
#-------------------------------------------------------------------------

n_threads = 4 # define max child threads. 
for list_names in names:

    print ( 'Launching thread with name: ' + list_names )
    t = Thread(target=testFunction, args=(list_names,))
    thread_list.append(t)
    t.start()

    while active_count() > n_threads: # max thread count (includes parent thread)
        print ( '\n == Current active threads ==: ' + str(active_count()-1) )
        time.sleep(1) # block until active threads are less than 4

for ex in thread_list: # wait for all threads to finish
    ex.join()
#-------------------------------------------------------------------------
print ( '\n At this point we continue on main thread \n' )

This should give you something like this

# time ./threads.py
Launching thread with name: Alfa
Launching thread with name:  Bravo
Launching thread with name:  Charlie
Launching thread with name:  Delta

== Current active threads ==: 4

== Current active threads ==: 4
Finished working on name:  Bravo
Finished working on name:  Delta
Finished working on name: Alfa
Finished working on name:  Charlie
Launching thread with name:  Echo
Launching thread with name:  Foxtrot
Launching thread with name:  Golf
Launching thread with name:  Hotel

== Current active threads ==: 4

== Current active threads ==: 4
Finished working on name:  Hotel
Finished working on name:  Foxtrot
Launching thread with name:  India
Launching thread with name:  Juliett

== Current active threads ==: 4
Finished working on name:  Echo
Finished working on name:  Golf
Launching thread with name:  Kilo
Launching thread with name:  Lima

== Current active threads ==: 4
Finished working on name:  India
Finished working on name:  Juliett
Finished working on name:  Lima
Finished working on name:  Kilo

At this point we continue on main thread


real    0m6.945s
user    0m0.034s
sys     0m0.009s
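The loop above polls active_count() once per second and still starts one thread per name. As an alternative sketch, not the answer's own method, a fixed set of worker threads can pull items from a queue.Queue, so no polling is needed. The helper name run_with_workers and its parameters are assumptions made for this example.

```python
import queue
import threading


def run_with_workers(items, work, n_workers=4):
    """Process items with n_workers threads pulling from a shared queue."""
    tasks = queue.Queue()
    for item in items:
        tasks.put(item)  # everything is queued before the workers start

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                item = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            value = work(item)
            with results_lock:
                results.append(value)

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(sorted(run_with_workers(["Alfa", "Bravo", "Charlie"], str.upper)))
```

Results arrive in completion order, not input order, and an exception inside work() would end that worker early, so real code would probably want to catch and record errors.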

For those looking for a quick fix to limit the number of threads with the threading module in Python 3: the basic idea is to wrap the main function in a wrapper that contains the stop/go logic, and to run that wrapper in each thread.

The code below reuses the solution proposed by Andpei. The verbatim code from his post did not work for me, so my modified version, which did, follows.

Python3:

import threading
import time

maxthreads = 3
smphr = threading.Semaphore(value=maxthreads)
threads = list()

SomeInputCollection=("SomeInput1","SomeInput2","SomeInput3","SomeInput4","SomeInput5","SomeInput6")

def yourmainfunction(SomeInput):
    #main function
    print ("Your input was: "+ SomeInput)

def task(SomeInput):
    # yourmainfunction wrapped in a task
    print(threading.current_thread().name, 'Starting')
    with smphr:  # released automatically, even if yourmainfunction raises
        yourmainfunction(SomeInput)
        time.sleep(2)
        print(threading.current_thread().name, 'Exiting')


def main():
    threads = [threading.Thread(name="worker/task", target=task, args=(SomeInput,)) for SomeInput in SomeInputCollection]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
if __name__== "__main__":
  main()

Output:

worker/task Starting
Your input was: SomeInput1
worker/task Starting
Your input was: SomeInput2
worker/task Starting
Your input was: SomeInput3
worker/task Starting
worker/task Starting
worker/task Starting
worker/task Exiting
Your input was: SomeInput4
worker/task Exiting
worker/task Exiting
Your input was: SomeInput6
Your input was: SomeInput5
worker/task Exiting
worker/task Exiting
worker/task Exiting
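As the output shows, all six threads are started right away and the surplus ones simply wait on the semaphore. If the number of inputs is very large, one possible variation is to acquire the semaphore in the launching loop, before each thread is created. The sketch below assumes a hypothetical helper run_limited and a caller-supplied work function.

```python
import threading


def run_limited(inputs, work, max_threads=3):
    """Start one thread per input, with at most max_threads running work() at once."""
    slots = threading.Semaphore(max_threads)
    threads = []

    def task(item):
        try:
            work(item)
        finally:
            slots.release()  # free the slot even if work() raises

    for item in inputs:
        slots.acquire()  # blocks the launching loop until a slot is free
        t = threading.Thread(target=task, args=(item,))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()


if __name__ == "__main__":
    run_limited(["SomeInput1", "SomeInput2", "SomeInput3", "SomeInput4"], print)
```

A thread that has just released its slot may still be alive for a moment while it exits, so the number of live threads can briefly exceed max_threads by a small amount.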

Comments
  • This should be the accepted answer. Works great. BTW, do you have any method for adding a priority to each task using this solution? I have some tasks that should be able to acquire the semaphore with higher priority than others.