Ask Sawal

Discussion Forum

How to check cuda visible devices?

3 Answer(s) Available
Answer # 1 #

where gpu_id is the ID of your selected GPU (a 0-based integer, as seen in the host system's nvidia-smi) that will be made available to the guest system (e.g. to the Docker container environment).

You can verify that a different card is selected for each value of gpu_id by inspecting the Bus-Id parameter in nvidia-smi, run in a terminal in the guest system.

This method, based on NVIDIA_VISIBLE_DEVICES, exposes only a single card to the system (with local ID zero), hence we also hard-code the other variable, CUDA_VISIBLE_DEVICES, to 0 (mainly to prevent it from defaulting to an empty string, which would indicate no GPU).

Note that the environment variable must be set before the guest system is started (so there is no way to do it from your Jupyter Notebook's terminal), for instance using docker run -e NVIDIA_VISIBLE_DEVICES=0, or env in Kubernetes or OpenShift.
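As a sketch of the Docker case (the image name and tag here are illustrative; use whatever CUDA-enabled image you normally run):

```shell
# Expose host GPU 1 to the container, then pin the in-container
# CUDA device to its local ID 0 (the only card the guest can see).
docker run --rm \
  -e NVIDIA_VISIBLE_DEVICES=1 \
  -e CUDA_VISIBLE_DEVICES=0 \
  nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

Inside the container, nvidia-smi should list exactly one GPU, whose Bus-Id matches host GPU 1.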

If you want GPU load-balancing, make gpu_id random at each guest system start.

If setting this with Python, make sure you use strings for all environment variables, including numerical ones.
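A minimal illustration of the string requirement — os.environ rejects non-string values outright:

```python
import os

# Environment variables are always strings; assign "0", not the integer 0.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # correct: the string "0"

try:
    os.environ["NVIDIA_VISIBLE_DEVICES"] = 1   # wrong: an int
except TypeError:
    print("os.environ only accepts strings")
```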

[3]
Gwilym Kagey
Station Agent
Answer # 2 #

The GPUs installed on my server are as follows:

Until version 1.11 of PyTorch, I used the following code to switch GPUs.

Output:

Since version 1.12, the output has changed to the following.

This may be due to a change in the timing of reading environment variables.

GPU allocation to processes by the environment variable CUDA_VISIBLE_DEVICES does not seem to be working.

I tried the following ways of reloading the torch module, but did not get the expected behavior as in version 1.11:
  • del torch followed by re-import
  • importlib.reload()

Even with version 1.12, I can switch GPUs by using the following code.

Is there a way to change environment variables in the code? I need your help.
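One pattern consistent with the observation above (my assumption: since 1.12, PyTorch reads CUDA_VISIBLE_DEVICES at first import / first CUDA initialization, so reloading does not pick up later changes) is to set the variable at the very top of the script, before torch is ever imported:

```python
import os

# Must run before the first `import torch`: since PyTorch 1.12,
# changing the variable afterwards (even with del torch or
# importlib.reload) no longer affects which GPUs are visible.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"   # expose only host GPU 1

# import torch                      # import strictly after the assignment
# print(torch.cuda.device_count())  # the exposed GPU would appear as cuda:0
```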

[2]
Gaye Long
Actuary
Answer # 3 #
  • Verify the driver version by looking at /proc/driver/nvidia/version.
  • Verify the CUDA Toolkit version.
  • Verify running CUDA GPU jobs by compiling the samples and executing the deviceQuery or bandwidthTest programs.
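The three checks above can be sketched as the following commands (the sample paths assume a locally built cuda-samples checkout; adjust to where your samples live):

```shell
# 1. Driver version reported by the kernel module
cat /proc/driver/nvidia/version

# 2. CUDA Toolkit version (nvcc ships with the toolkit)
nvcc --version

# 3. Run compiled samples to confirm GPU jobs execute
./cuda-samples/Samples/1_Utilities/deviceQuery/deviceQuery
./cuda-samples/Samples/1_Utilities/bandwidthTest/bandwidthTest
```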
[1]
Anjelika Leachman
Promotional Model