Bug #23458
Compute node nvidia pins are out of date, don't work
Status: Resolved
Priority: Normal
Assigned To:
Category: Deployment
Target version:
Story points: -
Release:
Release relationship: Auto
Description
packer-build-compute-image: #350
The following packages have unmet dependencies:
nvidia-driver-cuda : Depends: nvidia-persistenced (= 580.126.16-1) but it is not going to be installed
nvidia-xconfig : Depends: libnvidia-cfg1 (= 590.48.01-1) but 580.126.16-1 is to be installed
Recommends: libglx-nvidia0 (= 590.48.01-1) but 580.126.16-1 is to be installed
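To make the mismatch in the apt output easier to see, here is a minimal Python sketch that pulls out each dependency where apt wanted one exact version but had a different one selected. The function name `find_version_mismatches` and the regular expression are illustrative assumptions, not part of the Arvados build tooling, and the pattern only covers the "but X is to be installed" wording shown above.

```python
import re

# The dependency lines from the failed packer build, copied from the report above.
APT_OUTPUT = """\
nvidia-driver-cuda : Depends: nvidia-persistenced (= 580.126.16-1) but it is not going to be installed
nvidia-xconfig : Depends: libnvidia-cfg1 (= 590.48.01-1) but 580.126.16-1 is to be installed
Recommends: libglx-nvidia0 (= 590.48.01-1) but 580.126.16-1 is to be installed
"""

# Hypothetical pattern: matches
# "<Depends|Recommends>: <pkg> (= <wanted>) but <selected> is to be installed".
# Lines ending in "but it is not going to be installed" do not match.
MISMATCH = re.compile(
    r"(?:Depends|Recommends): (?P<pkg>\S+) \(= (?P<wanted>\S+)\) "
    r"but (?P<selected>\S+) is to be installed"
)


def find_version_mismatches(text):
    """Return (package, wanted, selected) for each exact-version conflict."""
    return [
        (m["pkg"], m["wanted"], m["selected"])
        for m in MISMATCH.finditer(text)
    ]


for pkg, wanted, selected in find_version_mismatches(APT_OUTPUT):
    print(f"{pkg}: wanted {wanted}, selected {selected}")
```

On this output the sketch reports `libnvidia-cfg1` and `libglx-nvidia0`, both wanted at 590.48.01-1 while 580.126.16-1 was selected, which is consistent with the pins lagging behind the versions in the repository.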
Updated by Brett Smith 28 days ago
With latest versions, this happens:
× nvidia-cdi-refresh.path - Trigger CDI refresh on NVIDIA driver or toolkit install / upgrade events
Loaded: loaded (/etc/systemd/system/nvidia-cdi-refresh.path; enabled; preset: enabled)
Active: failed (Result: unit-start-limit-hit) since Tue 2026-02-24 14:20:50 UTC; 3min 34s ago
Duration: 17.467s
Triggers: ● nvidia-cdi-refresh.service
Feb 24 14:20:32 ip-10-253-254-55 systemd[1]: Started nvidia-cdi-refresh.path - Trigger CDI refresh on NVIDIA driver or toolki>
Feb 24 14:20:50 ip-10-253-254-55 systemd[1]: nvidia-cdi-refresh.path: Failed with result 'unit-start-limit-hit'.
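For anyone checking build logs for the same failure, here is a small Python sketch that extracts the state and result from the `Active:` line of the status output above. The helper name `parse_active_line` and the regular expression are assumptions made for illustration; they are only known to fit the line format shown in this comment.

```python
import re

# The "Active:" line from the systemctl status output quoted above.
STATUS_LINE = (
    "Active: failed (Result: unit-start-limit-hit) "
    "since Tue 2026-02-24 14:20:50 UTC; 3min 34s ago"
)

# Hypothetical pattern: captures the state word and the parenthesized detail,
# dropping the optional "Result: " prefix.
ACTIVE = re.compile(r"Active: (?P<state>\S+) \((?:Result: )?(?P<detail>[^)]+)\)")


def parse_active_line(line):
    """Return (state, detail) from an 'Active:' status line, or None if absent."""
    m = ACTIVE.search(line)
    if m is None:
        return None
    return m["state"], m["detail"]


print(parse_active_line(STATUS_LINE))
```

A build check could treat any `failed` state for `nvidia-cdi-refresh.path` as an error and surface the result string in the report.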
Updated by Brett Smith 28 days ago
23458-compute-node-fixes @ d86eefcbbd150bc998886c6426cb9aca7dc918ef - packer-build-compute-image: #352
The "main" bugfix is c5d309263246b7dc5bc245544a737ea2aa81b43e but I took the opportunity to address related issues I encountered on the way, including some preparation for newer versions of NVIDIA tools.
- All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
  - Yes
- Anything not implemented (discovered or discussed during work) has a follow-up story.
  - N/A
- Code is tested and passing, both automated and manual; what manual testing was done is described.
  - tordo-xvhdp-hq055qr4zo0zlvp shows a working container
  - CUDA containers still don't work because of #23459 but that's almost certainly a separate issue.
- Tested code incorporates recent main branch changes.
  - Yes
- New or changed UI/UX has gotten feedback from stakeholders.
  - N/A
- Documentation has been updated.
  - N/A
- Behaves appropriately at the intended scale (describe intended scale).
  - No change
- Considered backwards and forwards compatibility issues between client and server.
  - No change
- Follows our coding standards and GUI style guidelines.
  - N/A
Updated by Brett Smith 28 days ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|45ad8baf9e7c7c6059b55bb7b17502e535a0a101.