[How to Solve] Driver/library version mismatch

After the NVIDIA driver is updated on a server, running nvidia-smi often fails with:

Failed to initialize NVML: Driver/library version mismatch

The cause is that the NVIDIA kernel module loaded in the running kernel has not been updated: it is still the old version, while the user-space libraries are already the new one.
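One way to confirm this diagnosis is to compare the version of the module that is currently loaded with the version of the module installed on disk, used here as a proxy for the user-space library version. The following is a minimal sketch, assuming the driver exposes its version at /sys/module/nvidia/version and that modinfo can find the nvidia module for the running kernel. The function name check_nvidia_versions is made up for illustration.

```shell
#!/bin/sh
# Compare two driver version strings and report whether they match.
# Arguments: $1 = version of the loaded kernel module, $2 = version on disk.
# (check_nvidia_versions is a hypothetical helper, not part of the driver.)
check_nvidia_versions() {
    loaded=$1
    installed=$2
    if [ "$loaded" = "$installed" ]; then
        echo "match: $loaded"
    else
        echo "mismatch: loaded=$loaded installed=$installed"
        return 1
    fi
}

# Typical use on a machine with the driver installed (assumed paths):
#   check_nvidia_versions "$(cat /sys/module/nvidia/version)" \
#                         "$(modinfo -F version nvidia)"
```

If the two versions differ, the loaded module is stale and has to be replaced, either by a reboot or by the reload described here.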

1. Generally, restarting the machine solves the problem.

2. If the machine cannot be restarted for some reason, you can reload the kernel module instead.

In short, there are only two steps:

unload the nvidia kernel module

reload the nvidia kernel module

In terms of commands:

sudo rmmod nvidia

sudo nvidia-smi

When nvidia-smi finds that the kernel module is not loaded, it loads it automatically.

But things are rarely that simple. The unload usually fails:

$ sudo rmmod nvidia
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset nvidia_uvm

At this point we need to unload the driver piece by piece, which means knowing the dependencies between the kernel modules. The error message already tells us that nvidia_modeset and nvidia_uvm depend on nvidia, so they have to be unloaded first. lsmod shows the full picture:

$ lsmod | grep nvidia
nvidia_uvm            647168  0
nvidia_drm             53248  0
nvidia_modeset        790528  1 nvidia_drm
nvidia              12144640  152 nvidia_modeset,nvidia_uvm

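As you can see, nvidia has a use count of 152, and nvidia_modeset and nvidia_uvm are among its users. nvidia_modeset is in turn used by nvidia_drm. So nvidia_uvm and nvidia_modeset have to be unloaded first.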
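To read the use count and the dependent modules of a single module without scanning the listing by eye, a small awk filter is enough. This is only a sketch: module_usage is a hypothetical helper name, and it assumes the usual lsmod column layout (module, size, use count, optional comma-separated list of users).

```shell
#!/bin/sh
# Print the use count and the dependent modules of one module,
# reading lsmod-style text from stdin.
# (module_usage is a hypothetical helper name.)
module_usage() {
    awk -v mod="$1" '$1 == mod {
        # $3 is the use count; $4, if present, lists the modules using it.
        print "count=" $3 " users=" ($4 == "" ? "-" : $4)
    }'
}

# Typical use:
#   lsmod | module_usage nvidia
```

A module can be removed with rmmod only once its count is 0, so this is a quick way to see what still has to go first.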

First, let's see which processes are using the /dev/nvidia* devices:

sudo lsof -n -w /dev/nvidia*

Take note of these processes. If an unload fails later, remember to stop them.
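If the list is long, it can help to reduce it to the unique process IDs. The sketch below assumes the default lsof column layout, where the first line is a header and the PID is the second column. pids_from_lsof is a name invented for this example.

```shell
#!/bin/sh
# Extract the unique PIDs from regular lsof output read on stdin.
# Assumes the default layout: header on line 1, PID in column 2.
# (pids_from_lsof is a hypothetical helper name.)
pids_from_lsof() {
    awk 'NR > 1 { print $2 }' | sort -un
}

# Typical use:
#   sudo lsof -n -w /dev/nvidia* | pids_from_lsof
# lsof can also print bare PIDs itself with its -t option.
```

Review the resulting PIDs before stopping anything, since a display server or a long-running training job may be among them.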

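Unload nvidia_uvm and nvidia_modeset. In the listing above nvidia_modeset is used by nvidia_drm, so nvidia_drm has to be unloaded before it:

sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset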

Then check lsmod again. If the use count of nvidia has not dropped to 0, find the remaining processes with lsof, kill them, and repeat the unload.

Finally, unload nvidia itself and let nvidia-smi load the new module:

sudo rmmod nvidia
nvidia-smi
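The whole procedure can be collected into one function. This is a rough sketch under some assumptions: the four module names are the ones seen in the lsmod output of this post (other driver versions may load a different set), processes holding the devices have already been stopped, and reload_nvidia and DRY_RUN are names invented for the example.

```shell
#!/bin/sh
# Unload the NVIDIA modules in dependency order, then let nvidia-smi
# load the new module. With DRY_RUN=1 the commands are only printed.
# (reload_nvidia and DRY_RUN are hypothetical names.)
reload_nvidia() {
    for mod in nvidia_drm nvidia_uvm nvidia_modeset nvidia; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo rmmod $mod"
        elif grep -q "^$mod " /proc/modules; then
            # Only try modules that are actually loaded;
            # stop at the first one that is still in use.
            sudo rmmod "$mod" || return 1
        fi
    done
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo nvidia-smi"
    else
        sudo nvidia-smi
    fi
}

# Typical use: preview first, then run for real.
#   DRY_RUN=1; reload_nvidia
#   DRY_RUN=0; reload_nvidia
```

Running it once with DRY_RUN=1 shows the order of operations without touching the system. If a real run stops early, a module is still in use and the lsof step needs another look.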