Fix `Could not load library libcuda.so` in xformers (2026 Debug Guide)
Quick answer. The xformers error Could not load library libcuda.so means the dynamic loader cannot find the NVIDIA driver library. In most cases the cause is a missing symlink (the driver ships libcuda.so.1 but not libcuda.so), an LD_LIBRARY_PATH that omits the CUDA lib directory, or an xformers wheel built against a different CUDA than your driver exposes. Start with find / -name 'libcuda.so*' 2>/dev/null and ldconfig -p | grep libcuda before touching anything else.
What the error actually means
xformers links against the NVIDIA CUDA driver library, libcuda.so. When Python imports xformers, the dynamic loader searches every directory in LD_LIBRARY_PATH, then /etc/ld.so.cache, then the default system library directories, looking for that exact filename. If it cannot find it, you get:
RuntimeError: Could not load library libcuda.so. Error: libcuda.so: cannot open shared object file: No such file or directory
Note the distinction between libcuda.so (the NVIDIA driver library) and libcudart.so (the CUDA runtime, shipped with the toolkit and with PyTorch wheels). The driver library lives outside the CUDA Toolkit and ships with your GPU driver package. PyTorch can often run without a system libcuda.so because it bundles its own runtime, but xformers operators that compile or call into driver-level APIs (graph capture, Flash Attention 3, paged attention) need the real driver.
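To see the same lookup from Python without importing xformers, you can ask the loader for each filename directly. This is a minimal sketch using the standard ctypes module; the helper name probe is made up for illustration, and it only shows whether the loader can resolve a name, not whether xformers will be happy afterwards.

```python
import ctypes


def probe(names=("libcuda.so", "libcuda.so.1")):
    """Try to load each library name through the dynamic loader.

    Returns {name: None} on success, or {name: "<loader error text>"} on failure.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # same search rules as any other dlopen() call
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, err in probe().items():
        print(f"{name}: {'loaded' if err is None else err}")
```

If libcuda.so.1 loads but libcuda.so does not, the driver is present and only the unversioned name is missing.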
Step 1: confirm what is actually on disk
Run these two commands first. They tell you which fix below applies.
find / -name 'libcuda.so*' 2>/dev/null
ldconfig -p | grep libcuda
Typical good output on a working Ubuntu host with the NVIDIA driver installed:
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.570.86.10
If you see only libcuda.so.1 and the unversioned libcuda.so is missing, jump to Fix 1. If you see nothing at all, jump to Fix 3 (driver not installed). If you see the files but ldconfig -p shows nothing for libcuda, jump to Fix 2.
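The triage rules can be written down as a small function, which is handy if you check many hosts. This is a sketch under assumptions: pick_fix is a hypothetical helper, you pass it the lines printed by the two commands yourself, and it treats any path under a stubs/ directory as the toolkit's link-time stub rather than a real driver library.

```python
import os


def pick_fix(found_paths, ldconfig_lines):
    """Map the output of the two Step 1 commands to the fix that applies."""
    # Assumption: paths under .../stubs/ are the toolkit stub, not a driver.
    real = [p for p in found_paths if "/stubs/" not in p]
    if not real:
        return "Fix 3"  # no driver library on disk at all
    names = {os.path.basename(p) for p in real}
    if "libcuda.so" not in names:
        return "Fix 1"  # versioned files only, unversioned name missing
    if not any("libcuda.so" in line for line in ldconfig_lines):
        return "Fix 2"  # files exist but the loader cache does not list them
    return "ok"


if __name__ == "__main__":
    found = ["/usr/lib/x86_64-linux-gnu/libcuda.so.1"]
    cache = ["libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1"]
    print(pick_fix(found, cache))
```

The checks run in a fixed order, so rerun both commands after applying a fix in case a second problem was hiding behind the first.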
Fix 1: the missing libcuda.so symlink (most common)
On many systems — especially Docker base images, WSL2, and minimal cloud GPU instances — only the versioned libcuda.so.1 ships, and the unversioned link xformers wants is missing. Create it manually:
sudo ln -sf /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so
sudo ldconfig
Verify:
ldconfig -p | grep libcuda.so
# expect: libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so
NVIDIA documents this gap in its CUDA on WSL guide and in the long-running developer forum thread: the driver intentionally omits the unversioned symlink to keep the build-time stub library from being picked up at runtime, but the side effect is that any consumer asking for the plain name fails to load.
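If you would rather script this than run ln by hand (in an image build step, for example), the same idea looks roughly like the sketch below. The function name and the dry_run flag are illustrative, not part of any tool. Unlike ln -sf it never overwrites an existing libcuda.so, and writing into a system directory still needs root.

```python
import os


def ensure_unversioned_link(lib_dir, dry_run=True):
    """Create lib_dir/libcuda.so -> libcuda.so.1 if the plain name is missing."""
    target = os.path.join(lib_dir, "libcuda.so.1")
    link = os.path.join(lib_dir, "libcuda.so")
    if not os.path.exists(target):
        return "missing libcuda.so.1"  # nothing to link to; see Fix 3
    if os.path.lexists(link):
        return "already present"  # leave existing files and links alone
    if dry_run:
        return "would create"
    os.symlink("libcuda.so.1", link)  # relative link inside the same directory
    return "created"


if __name__ == "__main__":
    print(ensure_unversioned_link("/usr/lib/x86_64-linux-gnu"))
```

The default is a dry run, so calling it without arguments beyond the directory only reports what it would do. Run sudo ldconfig afterwards, as in the manual version.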
Fix 2: LD_LIBRARY_PATH or ldconfig not pointing at CUDA
If the file exists but ldconfig -p does not list it, your system library cache is stale or the CUDA lib directory was never registered. Two paths to fix it.
Per-shell (quick check):
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
python -c 'import xformers; print(xformers.__version__)'
If that works, make it permanent by registering the directory with ldconfig instead of relying on the environment variable:
echo '/usr/local/cuda/lib64' | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig
ldconfig -p | grep libcudart
On Arch the lib lives at /opt/cuda/targets/x86_64-linux/lib; on Ubuntu deb-installed CUDA it is /usr/local/cuda/lib64; on WSL2 the driver libs are at /usr/lib/wsl/lib.
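Before editing LD_LIBRARY_PATH it helps to know which of its directories actually contain a libcuda.so* file. The sketch below is one way to check; dirs_with_libcuda is a hypothetical helper and it only looks at the directories you pass in, not at the ldconfig cache.

```python
import glob
import os


def dirs_with_libcuda(ld_library_path):
    """Return {directory: [libcuda.so* filenames found there]} for each path entry."""
    hits = {}
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        pattern = os.path.join(glob.escape(directory), "libcuda.so*")
        hits[directory] = sorted(os.path.basename(p) for p in glob.glob(pattern))
    return hits


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for directory, files in dirs_with_libcuda(current).items():
        print(directory, files or "no libcuda here")
```

The pattern libcuda.so* deliberately does not match libcudart.so, so a directory that holds only the toolkit runtime shows up as empty.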
Fix 3: NVIDIA driver not actually installed
If find / -name 'libcuda.so*' returns nothing, or only a path under a stubs/ directory, you do not have a driver, only the CUDA Toolkit. The toolkit ships a build-time stub at /usr/local/cuda/lib64/stubs/libcuda.so, and pointing your runtime at the stub is a common foot-gun that gives confusing errors later. Install the real driver:
# Ubuntu / Debian
sudo apt update
sudo apt install -y nvidia-driver-570 # or whichever matches your card
sudo reboot
nvidia-smi # confirm the driver loaded
nvidia-smi must print a driver version and GPU list for xformers to work. If it does not, no userspace fix will rescue you: the kernel module did not load.
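For scripted checks, nvidia-smi can print just the driver version and GPU name as CSV, which is easier to parse than the default table. A minimal sketch, assuming nvidia-smi is on PATH when the driver is installed; the helper names are made up and the GPU name in the comment is only an illustration.

```python
import shutil
import subprocess


def parse_gpu_rows(text):
    """Parse 'driver_version, name' CSV lines into (driver, name) tuples."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            driver, _, name = line.partition(",")
            rows.append((driver.strip(), name.strip()))
    return rows


def query_driver():
    """Return one (driver_version, gpu_name) tuple per GPU, or [] if none is visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # binary missing: the driver package is probably not installed
    proc = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return parse_gpu_rows(proc.stdout) if proc.returncode == 0 else []


if __name__ == "__main__":
    # A line such as "570.86.10, NVIDIA A100" becomes ("570.86.10", "NVIDIA A100").
    print(query_driver() or "no driver or no GPU visible")
```

An empty result here means the same thing as a failing nvidia-smi: fix the driver before looking at xformers.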
Fix 4: WSL2 special case
WSL2 has its own rules. Per NVIDIA’s 2026 CUDA on WSL guide:
- Do not install a Linux NVIDIA driver inside WSL. The Windows host driver is exposed through /usr/lib/wsl/lib/libcuda.so.1.
- Use the wsl-ubuntu apt repository (developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/), not the regular Ubuntu one.
- Install cuda-toolkit-13-x, not the cuda meta-package: the meta-package pulls in a Linux driver that overwrites the host stub and breaks /dev/dxg.
If your WSL2 setup is missing the unversioned symlink (it usually is), use the Fix 1 command but adjust the path:
sudo ln -sf /usr/lib/wsl/lib/libcuda.so.1 /usr/lib/wsl/lib/libcuda.so
sudo ldconfig
Fix 5: Docker container missing --gpus all or wrong base image
The NVIDIA Container Toolkit mounts the host’s libcuda.so into the container at runtime. Without it the container has no driver library at all.
Run with explicit GPU access:
docker run --rm --gpus all nvidia/cuda:13.0.0-base-ubuntu22.04 nvidia-smi
If nvidia-smi fails inside the container, fix the host first: install nvidia-container-toolkit and restart the Docker daemon:
sudo apt install -y nvidia-container-toolkit
sudo systemctl restart docker
Build on top of nvidia/cuda:*-runtime-* or nvidia/cuda:*-devel-* images, not bare ubuntu. For xformers specifically, the NGC nvcr.io/nvidia/pytorch images are the safest choice: PyTorch, CUDA, and the matching driver shims are pre-validated together.
Fix 6: xformers wheel built against the wrong CUDA
If the symlink is correct and the driver loads but xformers still complains (or you see follow-on warnings like xFormers can’t load C++/CUDA extensions), your wheel was built for a different CUDA than your driver supports. Each xformers release is pinned to a specific PyTorch build, and PyTorch is pinned to a specific CUDA — mixing indexes produces silently-broken installs.
Detect the mismatch:
python -c "import torch; print(torch.__version__, torch.version.cuda)"
nvidia-smi | grep 'CUDA Version'
python -c "import xformers; print(xformers.__version__)"
The nvidia-smi CUDA version is the maximum your driver supports; torch.version.cuda must be less than or equal to it. If it is higher, reinstall both from the same matched PyTorch index:
# Pick the URL that matches your driver. For CUDA 12.8:
pip install --upgrade --force-reinstall \
torch xformers \
--index-url https://download.pytorch.org/whl/cu128
# For CUDA 13.0:
pip install --upgrade --force-reinstall \
torch xformers \
--index-url https://download.pytorch.org/whl/cu130
Pulling xformers from PyPI while torch is on the PyTorch index is the single most common cause of the silently-mismatched-wheel variant. Route both packages through the same --index-url.
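The "less than or equal" rule is easy to get wrong if you compare the version strings directly, because "12.10" sorts before "12.8" as text. The sketch below compares them as numbers instead. Both function names are hypothetical, and the inputs are whatever you read from torch.version.cuda and from the CUDA Version field of nvidia-smi.

```python
def parse_version(text):
    """Turn '12.8' into (12, 8) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split(".")[:2])


def wheel_fits_driver(torch_cuda, driver_cuda):
    """True if the CUDA version torch was built for is <= what the driver supports."""
    if torch_cuda is None:
        return False  # torch.version.cuda is None on CPU-only builds
    return parse_version(torch_cuda) <= parse_version(driver_cuda)


if __name__ == "__main__":
    # Example values only; substitute the versions your own machine reports.
    print(wheel_fits_driver("12.8", "13.0"))
    print(wheel_fits_driver("13.0", "12.8"))
```

A False result is the signal to reinstall torch and xformers from an index that matches the driver.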
Fix 7: conda env not active (or the wrong one)
If you installed xformers inside a conda env but Python is resolving to the system interpreter, you will load the system’s (empty) library path instead of the env’s. Verify:
which python
python -c 'import sys; print(sys.prefix)'
conda info --envs
The interpreter reported by which python must live inside the env you installed xformers into. If it does not, run conda activate <env> and retry.
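You can also ask the running interpreter where it would load xformers from, without importing it, so the check still works when the import itself is what fails. A small sketch using importlib; where_is is a hypothetical helper, and the inside_prefix flag is a plain string-prefix comparison that can be fooled by symlinked paths.

```python
import importlib.util
import sys


def where_is(package):
    """Report which interpreter is running and where it would find `package`."""
    spec = importlib.util.find_spec(package)  # locates the package without importing it
    origin = spec.origin if spec is not None else None
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "origin": origin,  # None means this interpreter cannot see the package
        "inside_prefix": origin is not None and origin.startswith(sys.prefix),
    }


if __name__ == "__main__":
    for key, value in where_is("xformers").items():
        print(f"{key}: {value}")
```

If origin is None, or it points outside the env you expected, you are running the wrong interpreter.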
Nuclear option: full xformers reinstall
When you have tried the above and want a clean slate:
pip uninstall -y xformers torch
pip cache purge
# Confirm your driver’s CUDA capability
nvidia-smi | grep 'CUDA Version'
# Install matched torch + xformers from the same index
pip install torch xformers --index-url https://download.pytorch.org/whl/cu128
xformers 0.0.35 (Feb 2026) requires PyTorch 2.10.0 and migrated to the PyTorch stable API/ABI, so any later PyTorch 2.10+ point release will keep working without rebuilding.
Verify the fix worked
Run all three:
python -c "import xformers; print(xformers.__version__)"
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
python -c "import xformers.ops as xops, torch; q=torch.randn(1,8,64,64,device='cuda',dtype=torch.float16); print(xops.memory_efficient_attention(q,q,q).shape)"
The third command actually exercises an xformers CUDA op: if it prints a shape, the install is real. If it raises, the error message will now point at the specific missing op rather than the loader.
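If you want all three checks in one run, with each failure reported instead of stopping at the first, something like the sketch below works. It reuses the same calls as the one-liners; the verify function and the report keys are made up for illustration.

```python
def verify():
    """Run the three checks in order and collect results instead of raising."""
    report = {}
    try:
        import torch
    except Exception as exc:
        report["torch"] = f"failed: {exc}"
        return report
    report["torch"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    try:
        import xformers
        import xformers.ops as xops
        report["xformers"] = xformers.__version__
    except Exception as exc:  # loader errors can surface as OSError or RuntimeError
        report["xformers"] = f"failed: {exc}"
        return report
    if report["cuda_available"]:
        try:
            q = torch.randn(1, 8, 64, 64, device="cuda", dtype=torch.float16)
            out = xops.memory_efficient_attention(q, q, q)
            report["attention_shape"] = tuple(out.shape)
        except Exception as exc:
            report["attention_shape"] = f"failed: {exc}"
    return report


if __name__ == "__main__":
    for key, value in verify().items():
        print(f"{key}: {value}")
```

Any value starting with "failed:" carries the original exception text, which is what you would paste into a bug report.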
Related errors with one-line fixes
- libnvrtc.so: cannot open shared object file. Fix: install the CUDA Toolkit (sudo apt install cuda-nvrtc-13-0); NVRTC ships with the toolkit, not the driver.
- libcudnn.so.9: cannot open shared object file. Fix: pip install nvidia-cudnn-cu13 or install cuDNN at the system level; xformers attention kernels need it.
- libcuda.so.1: version 'CUDA_X' not found. Fix: the driver is too old for the toolkit; upgrade the driver, not the toolkit.
- xFormers can’t load C++/CUDA extensions. Fix: almost always Fix 6 (wheel mismatch).
FAQ
Does this affect AMD GPUs?
No. libcuda.so is NVIDIA-specific. xformers ships experimental ROCm wheels (--index-url https://download.pytorch.org/whl/rocm7.1) that link against ROCm libraries instead.
Why does PyTorch work but xformers fails?
PyTorch bundles its own copy of the CUDA runtime (libcudart.so) inside the wheel, so it can run without anything CUDA-related installed system-wide as long as the driver loads. xformers operators that call into driver-level APIs — Flash Attention 3, paged attention, graph capture — need the real libcuda.so, which only the driver provides.
Can I run xformers without CUDA?
Yes, but with severe limits. CPU-only xformers exposes the building-block APIs but skips every fused attention kernel — you lose the entire performance reason for using it. For inference of a few tokens it works; for any real workload it does not.
Will pointing at the stub libcuda.so in /usr/local/cuda/lib64/stubs/ work?
No. The stub is link-time only and intentionally returns errors at runtime. NVIDIA omits the unversioned symlink in /usr/lib precisely to keep the stub from being picked up; if you symlink the stub into the runtime lib path you will get worse errors a few seconds later.
Is xformers still maintained in 2026?
Yes. The latest release on PyPI is 0.0.35 (February 2026), targeting PyTorch 2.10 via the stable ABI. The project is active on GitHub with regular releases against new CUDA versions.
Do I need to set LD_LIBRARY_PATH at all if ldconfig is configured?
No, and you should not. ldconfig-managed paths in /etc/ld.so.conf.d/*.conf are the standard mechanism — LD_LIBRARY_PATH is a debugging escape hatch that papers over the real fix and breaks subshells.
Why does the error not show up in a clean Colab notebook?
Colab pre-installs the NVIDIA driver and ships matched torch/xformers wheels — the libcuda symlink is already in place. The error shows up when you upgrade torch or xformers manually from a different index than the one Colab pinned.
References
- facebookresearch/xformers — canonical repo and installation docs
- CUDA on WSL User Guide — the WSL2-specific rules
- NVIDIA Container Toolkit install guide
- NVIDIA forum: libcuda.so should be a symlink
- Felix Sanz — PyTorch/CUDA/xFormers version compatibility