Random crashing, guaranteed crash with Remnant, DMP points to Nvidia GPU


Ksenobyte

Member
Local time
1:21 PM
Posts
4
OS
Windows 11
Hi!

My Desktop PC (HP Omen 30L) keeps crashing randomly. This started about a month ago.
It's pretty much guaranteed to crash while playing Remnant 2. Sometimes it crashes within a few minutes, sometimes I can play for hours before a crash. It also sometimes crashes while doing absolutely nothing. Yesterday it crashed while I was composing with Bitwig 5.

The minidump seems to point at the GPU. The temps don't spike, and the crash can come at high or low usage.
* I have run DDU to uninstall the GPU driver, and I have reseated the card.
* Ran memtest86+ for hours with no errors.
* Swapped the DIMMs around between slots.
* Ran the UEFI tests with no errors.
* Ran Cinebench and FurMark with no crashes.
* Disabled HAGS (hardware-accelerated GPU scheduling).
* Verified the integrity of Remnant 2's files in Steam, although the PC also crashes while doing other things.
* Tried a clean boot; still crashing.
* Other things that I have already forgotten, mostly procedures suggested in posts about similar errors.

No blue screen, only a black screen.
Windows 11 v22H2 build 22621.2215 (updated recently, which didn't help with the crashing)


What next..?
Thanks for any help!

EDIT 1: The PC is less than a year old (bought 10 months ago) and still under warranty.
EDIT 2: No user-set OC, only the HP Omen mandatory OC. Changed XMP to default; still crashes.
 
Windows Build/Version
Windows 11 v22H2 build 22621.2215

My Computer

System One

  • OS
    Windows 11
    Computer type
    PC/Desktop
    Manufacturer/Model
    HP Omen 30L
    CPU
    AMD Ryzen 7 5800X 8-Core Processor 3.80 GHz
    Motherboard
    HP 8876
    Memory
    DDR 4 48GB (2x16, 2x8)
    Graphics Card(s)
    Geforce RTX 3080 Ti, HP GA 102 A1
    Monitor(s) Displays
    Samsung Odyssey G9
    Screen Resolution
    5120x1440
    Hard Drives
    m.2, ssd, Hdd
    Cooling
    HP Liquid Cooler AIO (?)
When were the RAM modules installed?

What source was used to check for compatibility?
 

My Computer

System One

  • OS
    Windows 10
    Computer type
    Laptop
    Manufacturer/Model
    HP
    CPU
    Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz
    Motherboard
    Product : 190A Version : KBC Version 94.56
    Memory
    16 GB Total: Manufacturer : Samsung MemoryType : DDR3 FormFactor : SODIMM Capacity : 8GB Speed : 1600
    Graphics Card(s)
    NVIDIA Quadro K3100M; Intel(R) HD Graphics 4600
    Sound Card
    IDT High Definition Audio CODEC; PNP Device ID HDAUDIO\FUNC_01&VEN_111D&DEV_76E0
    Hard Drives
    Model Hitachi HTS727575A9E364
    Antivirus
    Microsoft Defender
    Other Info
    Mobile Workstation
I installed the extra RAM modules 1-2 months after purchasing the PC. No problems then.
I believe the source was... gasp... a couple of Reddit threads about HP Omen RAM upgrades.
 

These were some findings:

a) The RAM modules are mismatched and displayed different speeds.

b) There were misbehaving Nvidia GPU drivers

c) There were BSODs of types commonly seen with malfunctioning or incompatible hardware.

d) Possible drive problems



Please perform the following steps:

1) Uninstall the Nvidia GPU drivers using DDU, then reinstall them from the HP website.

2) Remove the two newest RAM modules.

3) Turn off Windows fast startup.

4) Make a new restore point.

5) Run HD Tune (free or trial version) on all drives.
Post images or share links for the results on these tabs:
a) Health
b) Benchmark
c) Full error scan

6) Run SeaTools for Windows, Long Generic test.
Post images or share links into this thread using OneDrive, Dropbox, or Google Drive.

7) The computer can be placed into safe mode to possibly reduce the frequency of BSODs and complete all of the drive tests.
If the BSODs are too frequent, alternative tests can be used.
 

Just to be clear, is that the order of operations? Remove GPU drivers, and after create restore point?
 

The above steps can be performed in any order, except for the restore point:
create it after steps 1 and 3.

Do you have an extended warranty?
If not, please check the status of the HP warranty.


Please uninstall AVG during the troubleshooting using the applicable uninstall tool:


If the computer is under warranty then seek a Return Merchandise Authorization (RMA).

These were the bugchecks logged during the past 9 months:

116
117
193
113
141
124
14F
15F
19C
1E
21
3B
50
7E
10D
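
To see at a glance how much of that list is graphics related, here is a minimal Python sketch. The table only names the video bugchecks from the list; the function name `split_bugchecks` and the grouping are my own illustration, not output from any tool used in this thread.

```python
# Names below are the standard Windows bugcheck names for these codes.
# Codes from the list that are not in this table are simply left unnamed.
VIDEO_BUGCHECKS = {
    0x113: "VIDEO_DXGKRNL_FATAL_ERROR",
    0x116: "VIDEO_TDR_FAILURE",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
    0x193: "VIDEO_DXGKRNL_LIVEDUMP",
}

# The bugcheck codes as listed above (hex, without the 0x prefix).
LOGGED = ["116", "117", "193", "113", "141", "124", "14F", "15F",
          "19C", "1E", "21", "3B", "50", "7E", "10D"]


def split_bugchecks(codes):
    """Split hex bugcheck strings into (video-related, other) lists."""
    video, other = [], []
    for text in codes:
        code = int(text, 16)
        if code in VIDEO_BUGCHECKS:
            video.append((text, VIDEO_BUGCHECKS[code]))
        else:
            other.append(text)
    return video, other


video, other = split_bugchecks(LOGGED)
for text, name in video:
    print(f"0x{text}: {name}")
print("not in the video table:", ", ".join(other))
```

Five of the fifteen codes fall in the video group, which fits the GPU being the first thing to swap-test.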
 
The most recent BSODs were related to the Nvidia GPU drivers / hardware.

Removing the mismatched RAM will remove a possible confounding factor.

The same goes for testing the drives.


The drive tests can run overnight, while you are sleeping, with the computer in safe mode.

For any new BSOD, post a new V2 share link in your newest post.


Code:
      Drive: C:
 Free Space: 110.0 GB
Total Space: 976.0 GB
File System: NTFS
      Model: WDC WD BLACK SDBPNTY-1T00-1106

      Drive: E:
 Free Space: 30.6 GB
Total Space: 953.9 GB
File System: NTFS
      Model: Samsung SSD 860 QVO 1TB

      Drive: F:
 Free Space: 440.0 GB
Total Space: 3815.4 GB
File System: NTFS
      Model: ST4000DM004-2CV104


Code:
Name    NVIDIA GeForce RTX 3080 Ti
PNP Device ID    PCI\VEN_10DE&DEV_2208&SUBSYS_88B8103C&REV_A1\4&10895AB3&0&0019
Adapter Type    NVIDIA GeForce RTX 3080 Ti, NVIDIA compatible
Adapter Description    NVIDIA GeForce RTX 3080 Ti
Adapter RAM    (1,048,576) bytes
Installed Drivers    C:\WINDOWS\System32\DriverStore\FileRepository\nvhdcig.inf_amd64_81158b77e02d898c\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvhdcig.inf_amd64_81158b77e02d898c\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvhdcig.inf_amd64_81158b77e02d898c\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvhdcig.inf_amd64_81158b77e02d898c\nvldumdx.dll
Driver Version    31.0.15.3713
INF File    oem19.inf (Section126 section)
Driver    C:\WINDOWS\SYSTEM32\DRIVERSTORE\FILEREPOSITORY\NVHDCIG.INF_AMD64_81158B77E02D898C\NVLDDMKM.SYS (31.0.15.3713, 56.42 MB (59,157,640 bytes), 9/3/2023 2:18 PM)
 

I rather suspect this is either a flaky graphics card or a badly seated graphics card. As @zbook says, many of those bugcheck codes are graphics card related, but the two dumps you uploaded are the most interesting. They are both 0x113 bugchecks - and that's rare...
Code:
VIDEO_DXGKRNL_FATAL_ERROR (113)
The dxgkrnl has detected that a violation has occurred. This resulted
in a condition that dxgkrnl can no longer progress.  By crashing, dxgkrnl
is attempting to get enough information into the minidump such that somebody
can pinpoint the crash cause. Any other values after parameter 1 must be
individually examined according to the subtype.
Arguments:
Arg1: 0000000000000019, The subtype of the bugcheck:
Arg2: 0000000000000001
Arg3: 00000000000010de
Arg4: 0000000000002208
The argument values here aren't documented anywhere, but arguments 3 and 4 are the VEN and DEV identifiers for the failing device - they point at the RTX 3080Ti graphics card. We could have suspected that of course, given the bugcheck code.
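
If you want to check that match yourself, here is a minimal Python sketch. It takes the PNP Device ID string posted earlier in this thread and compares its VEN and DEV fields with Arg3 and Arg4; the helper name `parse_pci_ids` is just something I made up for illustration.

```python
import re


def parse_pci_ids(pnp_device_id):
    """Return (vendor, device) as ints from a PCI PNP Device ID string."""
    match = re.search(r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})", pnp_device_id)
    if match is None:
        raise ValueError("no VEN_/DEV_ fields found")
    return int(match.group(1), 16), int(match.group(2), 16)


# PNP Device ID as reported for the RTX 3080 Ti earlier in this thread.
PNP_ID = r"PCI\VEN_10DE&DEV_2208&SUBSYS_88B8103C&REV_A1\4&10895AB3&0&0019"

# Arg3 and Arg4 from the 0x113 bugcheck.
ARG3 = 0x00000000000010DE
ARG4 = 0x0000000000002208

vendor, device = parse_pci_ids(PNP_ID)
print(f"vendor={vendor:04X} device={device:04X}")
print("matches bugcheck args:", (vendor, device) == (ARG3, ARG4))
```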

If we examine the call parameters on the call stack we can see exactly what went wrong...
Code:
15: kd> !stack -p
Call Stack : 12 frames
## Stack-Pointer    Return-Address   Call-Site
00 ffffa98d3ef1f1e8 fffff80684175685 nt!KeBugCheckEx+0
    Parameter[0] = 0000000000000113
    Parameter[1] = 0000000000000019
    Parameter[2] = 0000000000000001
    Parameter[3] = 00000000000010de
01 ffffa98d3ef1f1f0 fffff8068407a41c watchdog!WdLogSingleEntry5+3b45 (perf)
    Parameter[0] = 0000000000000000
    Parameter[1] = 0000000000000013
    Parameter[2] = (unknown)
    Parameter[3] = (unknown)
02 ffffa98d3ef1f2a0 fffff80683f03f84 dxgkrnl!DpiFdoHandleSurpriseRemoval+15c
    Parameter[0] = ffff868c3e197030
    Parameter[1] = ffff868c51a3aaa0
    Parameter[2] = (unknown)
    Parameter[3] = (unknown)
03 ffffa98d3ef1f2f0 fffff80683e80bc9 dxgkrnl!DpiFdoDispatchPnp+d4
    Parameter[0] = ffff868c3e197030
    Parameter[1] = ffff868c51a3aaa0
    Parameter[2] = (unknown)
    Parameter[3] = (unknown)
04 ffffa98d3ef1f390 fffff806978d4acc dxgkrnl!DpiDispatchPnp+e9
    Parameter[0] = ffff868c3e197030
    Parameter[1] = ffff868c51a3aaa0
    Parameter[2] = (unknown)
    Parameter[3] = (unknown)
05 ffffa98d3ef1f4b0 ffff868c3e402000 nvlddmkm+1094acc (leaf)
    Parameter[0] = (unknown)
    Parameter[1] = (unknown)
    Parameter[2] = (unknown)
    Parameter[3] = (unknown)
Notice the dxgkrnl!DpiFdoHandleSurpriseRemoval function call. That is a clue, because a graphics card should never be surprise removed. Parameter 0 here is the device object, and parameter 1 is the IRP address.

Displaying the device object shows that the graphics driver nvlddmkm.sys was accessing the device, as we might expect, and this all confirms that this is a graphics problem (it also shows the driver object address, which we will need later)...
Code:
15: kd> !devobj ffff868c3e197030
Device object (ffff868c3e197030) is for:
 InfoMask field not found for _OBJECT_HEADER at ffff868c3e197000
 \Driver\nvlddmkm DriverObject ffff868c3e0e4530
Current Irp 00000000 RefCount 0 Type 00000023 Flags 00002004
SecurityDescriptor ffff988cec9fbae0 DevExt ffff868c3e197180 DevObjExt ffff868c3e1987f8
ExtensionFlags (0000000000)
Characteristics (0x00000100)  FILE_DEVICE_SECURE_OPEN
AttachedTo (Lower) ffff868c35518d30 Name paged out
Device queue is not busy.

Displaying the IRP address we clearly see that 'surprise removal' of the card was the problem here...
Code:
15: kd> !irp ffff868c51a3aaa0
Irp is active with 3 stacks 3 is current (= 0xffff868c51a3ac00)
 No Mdl: No System Buffer: Thread ffff868c4f321040:  Irp stack trace.
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000

            Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000

            Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_SURPRISE_REMOVAL(17)]
            0  0 ffff868c3e197030 00000000 00000000-00000000
           \Driver\nvlddmkm
            Args: 00000000 00000000 00000000 00000000
Notice the IRP_MJ_PNP(1b) prefix to the IRP_MN_SURPRISE_REMOVAL text? IRP_MJ_PNP is the major function code, and the graphics driver's dispatch routine for it is what will be called if the card is surprise removed. If we use the driver object address (from the device object output above) we can see what address that routine is at...
Code:
15: kd> !drvobj ffff868c3e0e4530 7
fffff8067423ff78: Unable to get value of ObpRootDirectoryObject
fffff8067423ff78: Unable to get value of ObpRootDirectoryObject
Driver object (ffff868c3e0e4530) is for:
 \Driver\nvlddmkm

Driver Extension List: (id , addr)

Couldn't read extension at 0xffff868c3e1e98f0

Device Object list:
ffff868c4901f9f0  ffff868c4b15dc90  ffff868c4b232d10: Could not read device object


DriverEntry:   fffff8069a190b60    nvlddmkm
DriverStartIo: 00000000
DriverUnload:  fffff806978d6110    nvlddmkm
AddDevice:     fffff806978d3d10    nvlddmkm

Dispatch routines:
[00] IRP_MJ_CREATE                      fffff806978d41b0    nvlddmkm+0x10941b0
[01] IRP_MJ_CREATE_NAMED_PIPE           fffff806978d41b0    nvlddmkm+0x10941b0
[02] IRP_MJ_CLOSE                       fffff806978d41b0    nvlddmkm+0x10941b0
[03] IRP_MJ_READ                        fffff806978d41b0    nvlddmkm+0x10941b0
[04] IRP_MJ_WRITE                       fffff806978d41b0    nvlddmkm+0x10941b0
[05] IRP_MJ_QUERY_INFORMATION           fffff806978d41b0    nvlddmkm+0x10941b0
[06] IRP_MJ_SET_INFORMATION             fffff806978d41b0    nvlddmkm+0x10941b0
[07] IRP_MJ_QUERY_EA                    fffff806978d41b0    nvlddmkm+0x10941b0
[08] IRP_MJ_SET_EA                      fffff806978d41b0    nvlddmkm+0x10941b0
[09] IRP_MJ_FLUSH_BUFFERS               fffff806978d41b0    nvlddmkm+0x10941b0
[0a] IRP_MJ_QUERY_VOLUME_INFORMATION    fffff806978d41b0    nvlddmkm+0x10941b0
[0b] IRP_MJ_SET_VOLUME_INFORMATION      fffff806978d41b0    nvlddmkm+0x10941b0
[0c] IRP_MJ_DIRECTORY_CONTROL           fffff806978d41b0    nvlddmkm+0x10941b0
[0d] IRP_MJ_FILE_SYSTEM_CONTROL         fffff806978d41b0    nvlddmkm+0x10941b0
[0e] IRP_MJ_DEVICE_CONTROL              fffff806978d41b0    nvlddmkm+0x10941b0
[0f] IRP_MJ_INTERNAL_DEVICE_CONTROL     fffff806978d41b0    nvlddmkm+0x10941b0
[10] IRP_MJ_SHUTDOWN                    fffff806978d41b0    nvlddmkm+0x10941b0
[11] IRP_MJ_LOCK_CONTROL                fffff806978d41b0    nvlddmkm+0x10941b0
[12] IRP_MJ_CLEANUP                     00000000
[13] IRP_MJ_CREATE_MAILSLOT             00000000
[14] IRP_MJ_QUERY_SECURITY              00000000
[15] IRP_MJ_SET_SECURITY                00000000
[16] IRP_MJ_POWER                       00000000
[17] IRP_MJ_SYSTEM_CONTROL              00000000
[18] IRP_MJ_DEVICE_CHANGE               00000000
[19] IRP_MJ_QUERY_QUOTA                 00000000
[1a] IRP_MJ_SET_QUOTA                   00000000
[1b] IRP_MJ_PNP                         00000000
The IRP_MJ_PNP function is right at the end of that list of functions, and you'll note there is no address or driver listed. That's because the graphics driver doesn't have a surprise removal routine - because the graphics card should never be surprise removed! This is why you got the 0x113 bugcheck. When the kernel detected the surprise removal of the graphics card and looked up the driver object to find the address of a driver routine to handle that, it found only zeros. That caused the kernel to BSOD with a 0x113 bugcheck code.
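
For anyone who wants to pick the empty entries out of a long !drvobj listing without reading it by eye, here is a minimal Python sketch. It assumes lines in the same format as the output above and uses a few of them as sample input; the function name `null_dispatch_entries` is hypothetical. It only reports what the dump shows, namely which major functions are listed with a zero address.

```python
import re

# A few lines copied from the !drvobj output above.
DRVOBJ_LINES = """\
[10] IRP_MJ_SHUTDOWN                    fffff806978d41b0    nvlddmkm+0x10941b0
[11] IRP_MJ_LOCK_CONTROL                fffff806978d41b0    nvlddmkm+0x10941b0
[12] IRP_MJ_CLEANUP                     00000000
[16] IRP_MJ_POWER                       00000000
[1b] IRP_MJ_PNP                         00000000
"""

# Index in brackets, the IRP_MJ_ name, then the handler address in hex.
LINE = re.compile(r"^\[([0-9a-f]{2})\]\s+(IRP_MJ_\w+)\s+([0-9a-f]+)", re.M)


def null_dispatch_entries(text):
    """Return names of major functions whose listed handler address is zero."""
    return [name for _idx, name, addr in LINE.findall(text) if int(addr, 16) == 0]


print(null_dispatch_entries(DRVOBJ_LINES))
```

On the full listing this would return every entry from IRP_MJ_CLEANUP through IRP_MJ_PNP, since all of those are shown as zeros in this dump.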

The question, of course, is why the kernel detected a surprise removal of the graphics card. It may simply need re-seating, so I'd suggest you remove it and reinsert it fully. If you have another suitable PCIe slot then try it in the other slot. Also make sure any extra power cables are properly connected at both ends.

If re-seating doesn't work then I'd strongly suspect that the graphics card itself is flaky. In that case I suggest you print this, take it back to your builder, and ask for a replacement graphics card. If that BSODs too then it may well be the motherboard slot(s).
 

My Computer

System One

  • OS
    Windows
Tried Remnant 2 again after DDU, and sure enough the game crashed (it took a while though), but this time the PC itself didn't crash. It gave an Unreal Engine error, something like DXGI_ERROR_DEVICE_REMOVED with reason DXGI_ERROR_DEVICE_HUNG. Does this point in the same direction as before?
 

For any BSOD post a new V2 share link into the newest post.

Recheck the HP warranty status.

If the warranty is active then contact HP.

If the warranty is not active, then see if you can borrow a GPU card for swap testing (Nvidia or AMD).


After posting images of the drive tests, plan to run additional component hardware tests.
 

Ksenobyte said: "Tried Remnant 2 again after DDU, and sure enough the game crashed (it took a while though), but this time the PC itself didn't crash. It gave an Unreal Engine error, something like DXGI_ERROR_DEVICE_REMOVED with reason DXGI_ERROR_DEVICE_HUNG. Does this point in the same direction as before?"
Yes it does, that "device_removed" is the same as the "surprise remove" in the dumps.

Re-seat the card, try it in another slot, or see whether the builder will let you try another card.
 

The best tested drivers are typically displayed on the computer or motherboard manufacturer websites.

DXGI_ERROR_DEVICE_HUNG (0x887A0006): The application's device failed due to badly formed commands sent by the application. This is a design-time issue that should be investigated and fixed.
DXGI_ERROR_DEVICE_REMOVED (0x887A0005): The video card has been physically removed from the system, or a driver upgrade for the video card has occurred. The application should destroy and recreate the device. For help debugging the problem, call ID3D10Device::GetDeviceRemovedReason.
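
If a game log shows one of these as a raw number instead of a name, a small lookup can translate it. This is a minimal Python sketch covering only the two codes quoted above; the helper name `describe_hresult` is my own, and the masking step is there because I am assuming some logs print HRESULTs as signed 32-bit integers.

```python
# Only the two DXGI codes quoted above; anything else is reported as unknown.
DXGI_ERRORS = {
    0x887A0005: "DXGI_ERROR_DEVICE_REMOVED",
    0x887A0006: "DXGI_ERROR_DEVICE_HUNG",
}


def describe_hresult(value):
    """Name a DXGI HRESULT, accepting signed or unsigned 32-bit input."""
    code = value & 0xFFFFFFFF  # fold a negative (signed) value back to unsigned
    return DXGI_ERRORS.get(code, f"unknown HRESULT 0x{code:08X}")


print(describe_hresult(0x887A0006))
print(describe_hresult(-2005270523))  # 0x887A0005 printed as a signed integer
```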


Plan HP RMA or swap testing.
 

Is the PC in a stable location? There's no chance of it wobbling, getting knocked or bumped is there?
 
