How do I select which GPU to run a job on?

In a multi-GPU computer, how do I designate which GPU a CUDA job should run on?

As an example, when installing CUDA I opted to install the NVIDIA_CUDA-<#.#>_Samples, then ran several instances of the nbody simulation, but they all ran on GPU 0; GPU 1 was completely idle (monitored using watch -n 1 nvidia-smi). Checking CUDA_VISIBLE_DEVICES using

echo $CUDA_VISIBLE_DEVICES

I found this was not set. I tried setting it using

CUDA_VISIBLE_DEVICES=1

then running nbody again, but it also went to GPU 0.

I looked at the related question, how to choose designated GPU to run CUDA program?, but the deviceQuery command is not in the CUDA 8.0 bin directory. In addition to $CUDA_VISIBLE_DEVICES, I saw other posts refer to the environment variable $CUDA_DEVICES, but these were not set and I did not find information on how to use them.

While not directly related to my question, using nbody -device=1 I was able to get the application to run on GPU 1, but using nbody -numdevices=2 did not run on both GPUs 0 and 1.

I am testing this on a system using the bash shell, running CentOS 6.8, with CUDA 8.0, two GTX 1080 GPUs, and NVIDIA driver 367.44.

I know that when writing CUDA code you can manage and control which CUDA resources to use, but how do I manage this from the command line when running a compiled CUDA executable?

The problem was caused by not setting the CUDA_VISIBLE_DEVICES variable within the shell correctly.

To specify CUDA device 1, for example, you would set CUDA_VISIBLE_DEVICES using

export CUDA_VISIBLE_DEVICES=1

or

CUDA_VISIBLE_DEVICES=1 ./cuda_executable

The former sets the variable for the life of the current shell; the latter sets it only for the lifespan of that particular executable invocation.
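As a minimal sketch of the difference (nbody and its -benchmark flag are from the CUDA samples; any compiled CUDA executable behaves the same way):

# Per-invocation: the variable exists only in the child process's environment
CUDA_VISIBLE_DEVICES=1 ./nbody -benchmark
echo "$CUDA_VISIBLE_DEVICES"   # still unset in the shell itself (if it was unset before)

# Per-session: every command run afterwards inherits the variable
export CUDA_VISIBLE_DEVICES=1
./nbody -benchmark             # now also runs on GPU 1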

If you want to specify more than one device, use

export CUDA_VISIBLE_DEVICES=0,1

or

CUDA_VISIBLE_DEVICES=0,1 ./cuda_executable
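Note that whatever you list is renumbered from zero inside the process, in the order given. For example (the device IDs here are illustrative):

CUDA_VISIBLE_DEVICES=2,3 ./cuda_executable

Inside that process, physical GPUs 2 and 3 appear as CUDA devices 0 and 1.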

Set the following two environment variables:

NVIDIA_VISIBLE_DEVICES=$gpu_id
CUDA_VISIBLE_DEVICES=0

where gpu_id is the ID of the selected GPU (a 0-based integer, as seen in the host system's nvidia-smi) that will be made available to the guest system (e.g. the Docker container environment).

You can verify that a different card is selected for each value of gpu_id by inspecting the Bus-Id parameter in nvidia-smi, run in a terminal in the guest system.
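As a quick check, the following nvidia-smi query (a standard option of the tool) prints each visible card's index alongside its Bus-Id; run it in the guest and compare against the host's listing:

nvidia-smi --query-gpu=index,pci.bus_id --format=csv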

More info

This method, based on NVIDIA_VISIBLE_DEVICES, exposes only a single card to the system (with local ID zero), hence we also hard-code the other variable, CUDA_VISIBLE_DEVICES, to 0 (mainly to prevent it from defaulting to an empty string, which would indicate no GPU).

Note that the environment variable must be set before the guest system is started (so there is no chance of doing it in your Jupyter Notebook's terminal), for instance using docker run -e NVIDIA_VISIBLE_DEVICES=0, or env in Kubernetes or OpenShift.
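As a sketch, assuming a hypothetical image named my-cuda-image and the nvidia-docker2 runtime (newer Docker versions would use --gpus instead):

gpu_id=1
docker run --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=$gpu_id \
    -e CUDA_VISIBLE_DEVICES=0 \
    my-cuda-image nvidia-smi

nvidia-smi inside the container should then list exactly one GPU, whose Bus-Id matches that of GPU 1 on the host.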

If you want GPU load-balancing, make gpu_id random at each guest system start.
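A minimal sketch of such load-balancing, run on the host before starting the guest (nvidia-smi -L prints one line per physical GPU; $RANDOM is a bash built-in):

num_gpus=$(nvidia-smi -L | wc -l)
gpu_id=$(( RANDOM % num_gpus ))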

If setting this from Python, make sure you use strings for all environment variable values, including numerical ones (os.environ accepts only strings).

The accepted solution based on CUDA_VISIBLE_DEVICES alone does not hide the other cards (different from the pinned one), and thus causes access errors if you try to use them in your GPU-enabled Python packages. With this solution, the other cards are not visible to the guest system, but other users can still access them and share their computing power on an equal basis, just like with CPUs (verified).

This is also preferable to solutions using Kubernetes/OpenShift controllers (resources.limits.nvidia.com/gpu), which impose a lock on the allocated card, removing it from the pool of available resources (so the number of containers with GPU access could not exceed the number of physical cards).

This has been tested with CUDA 8.0, 9.0, and 10.1 in Docker containers running Ubuntu 18.04, orchestrated by OpenShift 3.11.

In case someone else is doing this in Python and it is not working, try setting it before importing pycuda and tensorflow.

I.e.:

import os
# Make CUDA's device numbering match nvidia-smi (PCI bus order)
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# Restrict this process to GPU 0; must be set before CUDA is initialized
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
...
import pycuda.autoinit
import tensorflow as tf
...

As seen here.

You can also set the GPU on the command line so that you don't need to hard-code the device into your script (which may fail on systems without multiple GPUs). Say you want to run your script on GPU number 5; you can type the following on the command line, and it will run your script just this once on GPU 5:

CUDA_VISIBLE_DEVICES=5 python test_script.py
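One caveat: by default the CUDA runtime orders devices fastest-first, which does not always match nvidia-smi's numbering. If you need the two numberings to agree, also set CUDA_DEVICE_ORDER:

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=5 python test_script.py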

Comments
  • The nbody application has a command-line option to select the GPU to run on - you might want to study that code. For the more general case, CUDA_VISIBLE_DEVICES should work. If it does not, you're probably not using it correctly, and you should give a complete example of what you have tried. You should also indicate what OS you are working on and, for Linux, what shell (e.g. bash, csh, etc.). deviceQuery isn't necessary for any of this; it's just an example app that demonstrates the behavior of CUDA_VISIBLE_DEVICES. The proper environment variable name doesn't have a $ in it.
  • You'll need to learn more about the bash shell you are using. This: CUDA_VISIBLE_DEVICES=1 doesn't set the environment variable for subsequent commands (in fact, if that's all you put on the command line, it does nothing useful). This: export CUDA_VISIBLE_DEVICES=1 will set it for the remainder of that session. You may want to study how environment variables work in bash, how various commands affect them, and for how long.
  • deviceQuery is provided with CUDA 8, but you have to build it. The CUDA 8 installation guide for Linux explains how to build deviceQuery.
  • In /usr/local/cuda/bin, there is a cuda-install-samples-<version>.sh script, which you can use if the samples were not installed. Then, in the 1_Utilities folder of the NVIDIA_Samples installation directory, you will find deviceQuery. Just calling make in that folder will compile it for you. If I remember correctly, it will place the binary in the same folder.
  • So what will happen if CUDA_VISIBLE_DEVICES=0?
  • @KurianBenoy Setting CUDA_VISIBLE_DEVICES=0 will select GPU 0 for all CUDA tasks. I think this is the default behavior, as all my GPU tasks were going to GPU 0 before I set the variable, so it may not be necessary to set it at all, depending on your use case.
  • @StevenC.Howell I was thinking CUDA_VISIBLE_DEVICES=0 meant a CPU-only system. Thanks for clarifying.
  • @KurianBenoy CUDA_VISIBLE_DEVICES="" (an empty string) means CPU-only.