GPU hang / NV_ERR_INVALID_DEVICE / nvidia-modeset timeout on Ubuntu 24.04 HWE (Lenovo Legion Pro 5 RTX 4070 Laptop) #1092

@petritbahtiri123

NVIDIA Open GPU Kernel Modules Version

570.211.01 (open kernel module, Ubuntu package nvidia-driver-570-open)

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Ubuntu 24.04 LTS (HWE)

Kernel Release

6.17.0-19-generic

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 4070 Laptop GPU (UUID: GPU-c79a90b6-3bf5-cbd9-8b1f-6f0d233827b9)

Describe the bug

Problem

The system experienced GPU hangs and full desktop freezes that could only be cleared with a power cycle.

Observed errors:

  • NV_ERR_INVALID_DEVICE
  • nvidia-modeset: ERROR: GPU:0: Error while waiting for GPU progress
  • nvAssert failures

These errors occurred during normal desktop usage (Chrome, VS Code, Slack).
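
For anyone triaging a similar freeze, one rough way to check for the same signatures is to count them in the kernel log of the boot that hung. The following is a minimal Python sketch and is not taken from the attached scripts. The function names and dictionary keys are made up for illustration, and the regexes only cover the three messages quoted above, so other driver versions may word them differently.

```python
import re
import subprocess

# Signatures copied from the errors quoted in this report. The text that
# surrounds them in a real log is an assumption and may vary by driver version.
PATTERNS = {
    "invalid_device": re.compile(r"NV_ERR_INVALID_DEVICE"),
    "modeset_timeout": re.compile(
        r"nvidia-modeset: ERROR: GPU:\d+: Error while waiting for GPU progress"
    ),
    "nv_assert": re.compile(r"nvAssert", re.IGNORECASE),
}


def count_gpu_errors(log_text):
    """Count the log lines that match each failure signature."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def previous_boot_kernel_log():
    """Kernel messages from the previous boot (hypothetical helper).

    Assumes journald keeps logs across reboots; returns "" if journalctl
    is not available.
    """
    try:
        result = subprocess.run(
            ["journalctl", "-k", "-b", "-1", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return ""
    return result.stdout
```

After a forced reboot, `count_gpu_errors(previous_boot_kernel_log())` gives a quick per-signature tally. A count of zero everywhere only means these exact strings were not logged, not that the GPU was healthy.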

To Reproduce

Steps to Reproduce

  1. Install the NVIDIA open driver (580-open) on Ubuntu 24.04 HWE.
  2. Reboot into the system (PRIME on-demand, X11 session).
  3. Connect an external monitor (Dell U3421WE).
  4. Launch a typical desktop workload:
    • Google Chrome (multiple tabs)
    • VS Code
    • Slack
  5. Use the system normally for development work.

Additional Trigger Condition

  1. Switch the NVIDIA driver branch (here a downgrade, 580-open → 570-open) without fully removing the previous branch's packages.
  2. Reboot the system.
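
To check whether leftover packages from another branch are still present after such a switch, the `dpkg -l` listing can be scanned for branch numbers. This is a sketch under assumptions: the regex approximates Ubuntu's `nvidia-*-NNN[-open]` and `libnvidia-*-NNN` naming and will miss packages that embed a kernel version in their name, and `summarize_packages` and `is_mixed` are illustrative names, not part of any tool.

```python
import re
from collections import defaultdict

# Approximate match for Ubuntu package names that end in a driver branch,
# e.g. nvidia-driver-570-open or libnvidia-gl-580:amd64 (assumption).
BRANCH_RE = re.compile(
    r"nvidia-(?:[a-z0-9]+-)*?(\d{3})(?:-server)?(?:-open)?(?::\w+)?$"
)


def summarize_packages(listing):
    """Parse `dpkg -l` style text.

    Returns (installed, incomplete): installed maps a branch such as "570"
    to its fully installed ("ii") packages; incomplete lists branch packages
    in any state other than installed, config-only ("rc") or absent ("un").
    """
    installed = defaultdict(list)
    incomplete = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        state, name = fields[0], fields[1]
        match = BRANCH_RE.search(name)
        if not match:
            continue
        if state == "ii":
            installed[match.group(1)].append(name)
        elif state not in ("rc", "un"):
            incomplete.append(name)
    return {b: sorted(n) for b, n in installed.items()}, sorted(incomplete)


def is_mixed(listing):
    """True if more than one branch is installed or any package is half-done."""
    installed, incomplete = summarize_packages(listing)
    return len(installed) > 1 or bool(incomplete)
```

The input would be the text printed by `dpkg -l 'nvidia-*' 'libnvidia-*'`. A `True` result from `is_mixed` suggests that the old branch was not fully removed before rebooting.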

Result

  • GPU enters invalid state
  • Desktop freeze occurs
  • System requires power button reboot

Errors observed:

  • NV_ERR_INVALID_DEVICE
  • nvidia-modeset: ERROR: GPU:0: Error while waiting for GPU progress
  • nvAssert failures

Notes

The issue appears to be related to an inconsistent driver state left behind by upgrading/downgrading the open kernel modules, combined with hybrid GPU (PRIME) and external display usage.

Bug Incidence

Sometimes

nvidia-bug-report.log.gz

fix-nvidia-laptop-stability.sh

check-nvidia-stability.sh

More Info

Expected Behavior

The system should remain stable during normal desktop usage (browser, VS Code, Slack, external monitor), without GPU hangs and without requiring a power cycle.

Driver upgrades/downgrades should not leave the system in a state where the GPU enters an invalid device state.


Observed Behavior

After the driver transition (580-open → 570-open), the system entered an unstable state:

  • GPU hang
  • nvidia-modeset timeouts
  • NV_ERR_INVALID_DEVICE
  • full desktop freeze

Key Observation

The issue appears to be strongly related to an inconsistent driver state during upgrade/downgrade of the open kernel modules.

After fully removing the old packages and aligning everything to a single driver branch (570-open), the system became stable.


Additional Insight

  • Hybrid GPU (PRIME on-demand) + external monitor likely increases sensitivity to driver inconsistency
  • Chromium-based applications (Chrome, plus the Electron apps VS Code and Slack) may increase GPU workload but are not the root cause

Suggestion

It may be useful to validate:

  • driver upgrade/downgrade path integrity for open kernel modules
  • detection of mixed or partially upgraded driver states
  • recovery behavior when GPU enters invalid state (currently requires hard reboot)
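
As an illustration of what detecting a partially upgraded state could look like from user space, the version of the loaded `nvidia` module can be compared with the version of the module file that is installed on disk. This is only a sketch and not how the driver itself checks anything. It assumes `/sys/module/nvidia/version` is readable and that `modinfo -F version nvidia` resolves the module for the running kernel; the function names and the returned labels are invented for this example.

```python
import subprocess
from pathlib import Path


def loaded_module_version(sysfs=Path("/sys/module/nvidia/version")):
    """Version of the currently loaded nvidia module, or None if unavailable."""
    try:
        return sysfs.read_text().strip() or None
    except OSError:
        return None


def on_disk_module_version(module="nvidia"):
    """Version of the module file installed for the running kernel, or None."""
    try:
        result = subprocess.run(
            ["modinfo", "-F", "version", module],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None
    return result.stdout.strip() or None


def describe_state(loaded, on_disk):
    """Classify the pair of versions with an illustrative label."""
    if loaded is None or on_disk is None:
        return "unknown"
    if loaded == on_disk:
        return "consistent"
    # Compare the branch, i.e. the part before the first dot (570, 580, ...).
    if loaded.split(".")[0] != on_disk.split(".")[0]:
        return "branch-mismatch"
    return "version-mismatch"
```

Calling `describe_state(loaded_module_version(), on_disk_module_version())` before launching a desktop session would flag the case where one branch is still loaded while another is installed. It does not cover mismatches between the kernel module and the user-space libraries.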

Metadata

Labels

bug: Something isn't working