Bug #23458

closed

Compute node NVIDIA pins are out of date and don't work

Added by Brett Smith about 1 month ago. Updated 28 days ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Deployment
Target version:
Story points: -
Release relationship: Auto

Description

packer-build-compute-image: #350

The following packages have unmet dependencies:
 nvidia-driver-cuda : Depends: nvidia-persistenced (= 580.126.16-1) but it is not going to be installed
 nvidia-xconfig : Depends: libnvidia-cfg1 (= 590.48.01-1) but 580.126.16-1 is to be installed
                  Recommends: libglx-nvidia0 (= 590.48.01-1) but 580.126.16-1 is to be installed
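
The error above mixes two driver versions: some packages are held at 580.126.16-1 while others require 590.48.01-1. As a rough illustration only, the following Python sketch pulls the exact-version constraints out of that error text and lists the ones apt could not satisfy. It assumes the error is available as a string; the function names `required_versions` and `mismatches` are hypothetical helpers, not part of the project's build tooling.

```python
import re
from collections import defaultdict

# Error text copied from the build log in this ticket.
APT_ERROR = """\
The following packages have unmet dependencies:
 nvidia-driver-cuda : Depends: nvidia-persistenced (= 580.126.16-1) but it is not going to be installed
 nvidia-xconfig : Depends: libnvidia-cfg1 (= 590.48.01-1) but 580.126.16-1 is to be installed
                  Recommends: libglx-nvidia0 (= 590.48.01-1) but 580.126.16-1 is to be installed
"""

# Matches "Depends: pkg (= version)" or "Recommends: pkg (= version)".
CONSTRAINT_RE = re.compile(r"(Depends|Recommends): (\S+) \(= ([^)]+)\)")
# Matches the candidate version in "but X is to be installed", if present.
CANDIDATE_RE = re.compile(r"but (\S+) is to be installed")


def required_versions(text):
    """Map each exactly-required version to the packages that must match it."""
    by_version = defaultdict(list)
    for line in text.splitlines():
        match = CONSTRAINT_RE.search(line)
        if match:
            by_version[match.group(3)].append(match.group(2))
    return dict(by_version)


def mismatches(text):
    """Return (package, required, candidate) where apt chose another version."""
    found = []
    for line in text.splitlines():
        constraint = CONSTRAINT_RE.search(line)
        candidate = CANDIDATE_RE.search(line)
        if constraint and candidate and constraint.group(3) != candidate.group(1):
            found.append((constraint.group(2), constraint.group(3), candidate.group(1)))
    return found


if __name__ == "__main__":
    for version, packages in required_versions(APT_ERROR).items():
        print(version, packages)
    for pkg, want, got in mismatches(APT_ERROR):
        print(f"{pkg}: wants {want}, candidate is {got}")
```

More than one key in the result of `required_versions` suggests that the pins and the repository contents have drifted apart, which matches the symptom reported here.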

Subtasks: 1 (0 open, 1 closed)

Task #23460: Review 23458-compute-node-fixes (Resolved, Stephen Smith, 02/24/2026)
Actions #1

Updated by Brett Smith 28 days ago

With latest versions, this happens:

× nvidia-cdi-refresh.path - Trigger CDI refresh on NVIDIA driver or toolkit install / upgrade events
     Loaded: loaded (/etc/systemd/system/nvidia-cdi-refresh.path; enabled; preset: enabled)
     Active: failed (Result: unit-start-limit-hit) since Tue 2026-02-24 14:20:50 UTC; 3min 34s ago
   Duration: 17.467s
   Triggers: ● nvidia-cdi-refresh.service

Feb 24 14:20:32 ip-10-253-254-55 systemd[1]: Started nvidia-cdi-refresh.path - Trigger CDI refresh on NVIDIA driver or toolki>
Feb 24 14:20:50 ip-10-253-254-55 systemd[1]: nvidia-cdi-refresh.path: Failed with result 'unit-start-limit-hit'.
Actions #2

Updated by Brett Smith 28 days ago

23458-compute-node-fixes @ d86eefcbbd150bc998886c6426cb9aca7dc918ef - packer-build-compute-image: #352

The "main" bugfix is c5d309263246b7dc5bc245544a737ea2aa81b43e but I took the opportunity to address related issues I encountered on the way, including some preparation for newer versions of NVIDIA tools.

  • All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
    • Yes
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • N/A
  • Code is tested and passing, both automated and manual; what manual testing was done is described.
  • Tested code incorporates recent main branch changes.
    • Yes
  • New or changed UI/UX has gotten feedback from stakeholders.
    • N/A
  • Documentation has been updated.
    • N/A
  • Behaves appropriately at the intended scale (describe intended scale).
    • No change
  • Considered backwards and forwards compatibility issues between client and server.
    • No change
  • Follows our coding standards and GUI style guidelines.
    • N/A
Actions #3

Updated by Brett Smith 28 days ago

  • Subtask #23460 added
Actions #4

Updated by Stephen Smith 28 days ago

Lgtm!

Actions #5

Updated by Brett Smith 28 days ago

  • Status changed from In Progress to Resolved