DPC latency: nvlddmkm.sys

Hellbovine

Active Member
I don't know much about IRQ and some of the related topics, and I'm hoping someone can please point me in the right direction, or maybe help me solve this issue since we have a lot of really advanced techies here.

Running LatencyMon reveals that nvlddmkm.sys (the Nvidia graphics kernel-mode driver) keeps having DPC spikes up to 800 fairly frequently. This in itself is an extremely common issue that can be found all over Google, but I haven't come across any solutions yet, except for one, and it's not really a sensible one.

According to one of the replies in a LinusTechTips forum thread (https://linustechtips.com/topic/1222687-dpc-latency-issues-with-nvlddmkm-and-dxgkrnlsys/) this happens because the graphics card is sharing an IRQ with a problematic device. Indeed, according to msinfo32.exe, my card is sharing IRQ 16 with one of my motherboard's USB host controllers.
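For anyone who wants to check their own system for this kind of sharing, here is a minimal sketch. It assumes you have already copied the IRQ list out of msinfo32.exe by hand into (IRQ, device) pairs. The helper name `shared_irqs` and the device names in the sample are made up for illustration; only the IRQ 16 overlap mirrors what I saw.

```python
from collections import defaultdict


def shared_irqs(assignments):
    """Group devices by IRQ and keep only the IRQs used by more than one device."""
    by_irq = defaultdict(list)
    for irq, device in assignments:
        by_irq[irq].append(device)
    return {irq: devices for irq, devices in by_irq.items() if len(devices) > 1}


# Hypothetical rows, typed in by hand from the msinfo32 IRQ list.
sample = [
    (16, "Graphics card (hypothetical)"),
    (16, "USB host controller A (hypothetical)"),
    (23, "USB host controller B (hypothetical)"),
]

for irq, devices in shared_irqs(sample).items():
    print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

Anything this reports is only a list of suspects. A shared IRQ does not prove that the sharing device is what causes the spikes.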

The "solution" given on that forum is to do the following:
1) Uninstall all PCI/PCIe cards using Device Manager. Select to remove drivers if asked. Remember to show "hidden devices" and clear those too. (Use DDU to clear the GFX drivers properly too)

2) Uninstall whatever device is sharing an IRQ with the GFX (but obviously not if it's something vital)

3) Shut down the computer

4) Remove all power from your motherboard, and the battery - use the power switch to drain any caps

5) Physically remove the GFX card and other PCI/PCIe devices from the machine

6) Boot to BIOS and load defaults

7) Boot to Windows

8) Shut down again and power off normally to re-install the GFX card

9) Boot to Windows and install the basic GFX drivers, without GFExperience. Just GFX, PhysX, HDAudio, and then test it out

10) If that is all good, add the remaining devices back, if any

Note: if you have no onboard GFX/Video, then you have to have it installed for step 7 onwards... In this case, make a note of the key-presses needed to load BIOS defaults, and do that step blind (F10 and enter to save, when it starts rebooting again, hit the power button to turn off before it loads Windows). However you do it, you need at least one power-on without the GFX card before re-adding it

There just has to be a better way than this, right? I already have an idea in mind that I'm going to test tomorrow. I will go into my BIOS and disable USB ports one set at a time until I find the ones that are tied to this host controller (motherboards usually split the USB ports into 2 or 3 sets, meaning you'll have 2 or 3 IRQ assignments in total just for USB host controllers). This seems much simpler than doing all the steps listed in the solution, though I don't know if it will solve the problem.
 

Necrosaro

Member
Or you can use one of the Nvidia driver slimmer programs. Most of them can not only remove a lot of the garbage but also enable MSI mode.
 

Hellbovine

Active Member
Thank you for the replies, was a really busy couple of days so I haven't been able to follow-up on anything yet, but I'll check this out when I can get back on my PC.

I did see references to the MSI utility a few times on different forums, but honestly I just assumed they were talking about MSI the motherboard manufacturer, lol. I thought they were talking about like the MSI Afterburner program or whatever it's called :p
 

Necrosaro

Member
I think we have all been there with MSI, thinking it's one thing when it's the other.
 

garlin

Moderator
Staff member
That won't make a difference. First, the driver has to support switching IRQs, and some devices don't. Second, most drivers default to assignments based on PCI lanes (bus controller). When you have a laptop or small form-factor (SFF) PC, it's set by hardware design.

You need a special program like the MSI utility, or you have to hack the registry. Unlike the DOS days, modern installers don't let you pick an IRQ.
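For reference, the registry route can be sketched roughly like this. It assumes the documented `MSISupported` value under the device's `Interrupt Management\MessageSignaledInterruptProperties` key. The function name `msi_reg_file` and the device instance path in the example are hypothetical placeholders, so look up the real path in Device Manager (Details tab, "Device instance path"). The script only produces the text of a .reg file and does not touch the registry itself.

```python
def msi_reg_file(device_instance_path, enable=True):
    """Build .reg file text that sets MSISupported for one device.

    device_instance_path is something like r"PCI\\VEN_xxxx&DEV_xxxx...\\...",
    copied from Device Manager. Nothing is written to the registry here.
    """
    key = (
        "HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Enum\\"
        + device_instance_path
        + "\\Device Parameters\\Interrupt Management"
        + "\\MessageSignaledInterruptProperties"
    )
    value = 1 if enable else 0
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key}]\n"
        f'"MSISupported"=dword:{value:08x}\n'
    )


# Hypothetical instance path, for illustration only (10DE is Nvidia's vendor ID).
example_path = "PCI\\VEN_10DE&DEV_0000&SUBSYS_00000000&REV_A1\\0&00000000"
print(msi_reg_file(example_path))
```

Back up the registry before importing anything like this, and keep in mind that the device, its driver, and the chipset all have to support MSI for the setting to have any effect.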
 

garlin

Moderator
Staff member
Maybe. But mbk1969 says in his thread:
You see, for MSI-mode must participate: chipset, device and device drivers.

It would be easier to remap other devices which are more flexible. Any real-time chipset (graphics, audio, network) wants to be tied to its parent bus controller for lowest latency. Other I/O devices (SATA, USB) are less picky because disk I/O is slower than real-time work.

I think some PC configs are always doomed if you depend on the onboard chipsets, and the real answer is spending money on add-on cards.
I don't own a "performance rig", but you see a ton of complaints where no one can figure out a DPC solution. The motherboard vendors are optimizing for gamers, not for the audiophile market.
 

aviv00

Member
Maybe try moving the card to another PCIe slot, if it's a low-end GPU?
Or flashing an older BIOS?
Does disabling the USB hub "fix" the problem?

Mostly, this should be fixed by the OEM with a BIOS update.
 

Hellbovine

Active Member
Just posting an update for anyone following the thread:

I spent a while troubleshooting. First I disabled all 12 USB ports in my BIOS, with no change in DPC latency. Then I went into Device Manager and disabled the USB host controller that was sharing an IRQ with my GPU, and again no change. After I spent a bunch of time figuring out which physical USB ports correspond to the BIOS settings and to the different USB host controllers in the OS, it ended up just verifying that the problem isn't IRQ related. But at least I learned some new stuff and eliminated a suspect from the list.

I haven't tried the MSI vs IRQ thing yet, but with these initial troubleshooting results I doubt it will make a difference. I'll still try anyway, along with other things the next time I get a day to go at it. I'm going to focus on Nvidia next, trying different drivers, specifically the non-DCH ones. In LatencyMon the DirectX kernel also appears as one of the "issues", but I haven't investigated that yet since my theory is it's only spiking because of the Nvidia file. Also, DX is only spiking to like 80, which is well within an excellent DPC range--it only stands out because the rest of my system's DPC is so low in comparison.
 

Necrosaro

Member
Thanks for the update
 

Hellbovine

Active Member
I'm hoping someone could help me out please:

I want to try updating to the latest chipset drivers for my discontinued board to see if that changes anything. But here's the problem: a few years ago Intel did a site-wide wipe of all their legacy/discontinued drivers. I tried using the Wayback Machine to look up the old links, but without success. I may be able to use chipset drivers from the Microsoft Update Catalog, but that catalog confuses me when it comes to drivers. No matter what I search for, it comes up with way too many results, and I don't know how to figure out which ones actually apply to my system. So I literally have to download a dozen or two of the latest ones that seem appropriate and try them all, which usually doesn't work out for me.

I've never had to deal with this legacy issue before because I always kept current on my hardware for gaming and so I've always been able to find official drivers from the usual places without problem. And in a worst case scenario I had all the drivers saved to a USB for *if* they did get discontinued... But this PC I'm working on is a frankenstein build, it was a combination of parts from two computers, one of which didn't belong to me, and so I don't have backups of some of the drivers.

The motherboard is an Intel Corporation DZ77SL-50K, and in Device Manager the chipset shows up with the following under the USB controllers:
Intel(R) 7 Series/C216 Chipset Family USB Enhanced Host Controller - 1E2D
Intel(R) 7 Series/C216 Chipset Family USB Enhanced Host Controller - 1E26
Intel(R) USB 3.0 eXtensible Host Controller - 1.0 (Microsoft)

I tried narrowing things down by just searching for "1E2D" and "1E26" in both the catalog and Google, but it's not coming up with anything usable so far. Just a lot of outdated threads from other people experiencing issues with the same board.
 

Hellbovine

Active Member
Well, I'm stumped. I tried everything. I opened literally every single Google search result for "nvlddmkm.sys" in new tabs and read through all of them, trying every proposed solution.

Messed with bios, disabled stuff in device manager, changed drivers multiple times, switched to MSI mode, messed with all four HPET related settings, and a bunch of other things I'm forgetting now since my brain is fried from spending all day on this.

It seems to be such a common issue that I don't understand how it can go on for years without being addressed. I feel like this bug is probably the main culprit behind many of the performance issues that cause people to hang onto certain versions of W10, because as a workaround some specific combinations of driver versions and W10 versions make the issue go away. So the real question is: where is the issue, in Nvidia's drivers or in W10? I had no such problem on my XP computer, so I'm leaning towards something in the OS causing the problem.

This thread on Reddit (https://www.reddit.com/r/nvidia/comments/dj6iil/a_comment_on_nvidia_drivers_on_windows_10_with/) was probably the most active of all the places I found regarding this issue.

I renamed this thread to better represent the newer findings. If anyone has any ideas I'm all ears. I did all the basic stuff already, like disabling Defender, SysMain, the indexer, etc. The thing that sticks out the most here is that even in the Reddit thread the OP stated the same thing I did: using the basic Microsoft display driver with the default W10 install doesn't cause any problems. It's only after the Nvidia driver gets installed that it goes haywire (and I'm not using GeForce Experience either). I also specifically used the non-DCH driver and that made no difference. I'm also totally offline, so it's not like the Microsoft Store is messing with my control panel, nor is Windows Update doing anything.

I've tried 6 different driver versions now and they're all the same. I could keep going further and further back until maybe some old driver eliminates the issue, but that still doesn't solve the problem. There has to be a registry setting in the OS or Nvidia causing the conflict. For example, something along the lines of a power saving feature that got added at some point, or some new feature that hasn't matured yet or doesn't work right on certain hardware or BIOS/OS configurations... I'm sure there's a way to truly fix the problem, rather than just work around it. The problem with going really far back in drivers is that you then lose all the bugfixes that each of those drivers brought, and you just end up trading one problem for another, so I'd rather spend time figuring out what the root issue is instead.

One thing that bothers me a lot is how the OS StartMenuExperience seems to tie into the Nvidia drivers. To see what I mean for yourself, go into the Nvidia Control Panel and, at the top under "Desktop", select "Display GPU Activity Icon in Notification Area". Then left-click on the new icon that gets added to your tray and you can see the following running on the GPU:

searchapp.exe
textinputhost.exe

My theory is that those are the real culprits. Which makes a lot of sense, because frankly the new startmenu is kind of a disaster. I've seen too many bugs and quirks in it already before this issue even arose, so it's clearly not stable in the OS.
 

Clanger

Well-Known Member
sniff the output of the nvidia installer with regfromapp to see what it does with the registry.
make a backup of the current power scheme, install driver then compare default and "after" powerplans with PowerSettingsExplorer or quickcpu's very good power settings tool. if you see new entries/values can you alter them and see what happens. you can do that with PowerSettingsExplorer too.
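The before/after comparison described above can be sketched in a few lines, assuming you have already dumped both power plans into simple name-to-value mappings (by hand or from whatever tool you use). The function `diff_settings` and the setting names in the sample are hypothetical, not real Nvidia or Windows entries.

```python
def diff_settings(before, after):
    """Return (added, removed, changed) between two name->value mappings."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return added, removed, changed


# Hypothetical dumps taken before and after running the driver installer.
before = {"example_setting_a": 1, "example_setting_b": 100}
after = {"example_setting_a": 0, "example_setting_b": 100, "example_new_entry": 5}

added, removed, changed = diff_settings(before, after)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

The same comparison works for registry exports taken before and after the install, as long as they are reduced to key/value pairs first.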

edits
don't matter about the documentation, as long as you see the entry and its values you can fiddle around with them. prolly better doing it through the api with quickcpu or PowerSettingsExplorer.

might be better to duplicate your required plan, set that copy to Active, run the installer then the tools cos if you fkup you can easily go back to your default plan. quickcpu much better as it has more options to play around with, delete duplicate export backup rename change descriptions.
suggest playing around with new entries/values first, up down, see what happens.
MT_ used to do stuff with the registry but now he uses the api and custom powerplans.

have a look at bitsums own power plan, just need Park Control to add it, see if that sets nvidia stuff with the driver already installed.
 

Hellbovine

Active Member
Yeah good idea. I was already in a place I could do this right now, so I just checked the Windows power plan registry tree, and there is one new key that's added, for adaptive display, but it is empty, presumably because I don't have an Nvidia gsync monitor which probably uses that key. What I mean by empty is that it doesn't add any AC or DC values or anything, it's just a placeholder.

I hadn't looked into the Nvidia keys (the ones in HKLM\Software\NVIDIA Corporation\) until just now, to see if anything can be tweaked there. It's kind of barren though, so I'm not sure if anything will come out of that section unless I can find documentation on that stuff.
 

Hellbovine

Active Member
I submitted a ticket to Nvidia to see what they say. It'll probably take a few days before I can get it elevated to a senior tech though, since I have to jump through all the typical hoops of people trying to tell you to run sfc and all that garbage.

I read through a colossal number of links today. Looking at how different everyone's hardware is, we still all have the same issue, and the only two common factors are Nvidia+W10. I have a feeling that it's a far more widespread issue than it might seem, because the majority of computer users don't use LatencyMon and so it's just flying under the radar. I'm curious how many of our forum members here have Nvidia cards and, if they checked, would find they have the same issue without realizing it. I'll keep troubleshooting it tomorrow. I spent a solid 10 hours on it today, so I'm beat.
 