gpu problem after random time in xfce/x11/buster/nvidia

Post by delare » 2019-10-18 15:43

Hi,

My system:
Debian 10 (Buster)
Kernel 4.19.0-6
Xfce and X11
All packages from the Buster sources, nothing from backports
AMD Ryzen 9 3900X
NVIDIA GeForce RTX 2060 with driver 418.74 (the version in the Buster sources)
Mainboard with X570 chipset
Dual-monitor setup
There is no second GPU; this is a desktop system (not an Optimus system)

My problem is that after a random amount of time, sometimes after days, sometimes once a week, the graphical output stutters for several seconds, and the mouse stutters too. It happens when opening or closing windows or programs. It becomes smooth again later (after about 30 to 60 seconds), but only as long as I don't start the wrong application. With the wrong application, it starts stuttering again until I kill the process.

As I write this, I am in exactly this situation, and I don't see any other warnings or errors in the log files. Xorg.0.log only shows this:
Code: Select all
[ 10780.005] (EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x00001360, 0x00001b78)
[ 10781.004] (EE) NVIDIA(GPU-0): WAIT (0, 8, 0x8000, 0x00001b78, 0x00001b78)
[ 10781.004] (II) event3  - PixArt Microsoft USB Optical Mouse: SYN_DROPPED event - some input events have been lost.
[ 10784.005] (EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x00001b78, 0x00001dac)
[ 10785.004] (EE) NVIDIA(GPU-0): WAIT (0, 8, 0x8000, 0x00001dac, 0x00001dac)
[ 11099.454] nvLock: client timed out, taking the lock
[ 11304.042] (EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x00003884, 0x0000388c)
[ 11305.032] (EE) NVIDIA(GPU-0): WAIT (0, 8, 0x8000, 0x0000388c, 0x0000388c)
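
To see how often these messages appear, the log can be scanned with a small helper. This is only a sketch: the function name count_nvidia_waits is made up, and the default path /var/log/Xorg.0.log is an assumption (the log may live elsewhere, for example under ~/.local/share/xorg/).

```shell
#!/bin/sh
# Hypothetical helper: count NVIDIA "WAIT" error lines in an Xorg log.
# The default path is an assumption; pass another path as the first argument.
count_nvidia_waits() {
    log="${1:-/var/log/Xorg.0.log}"
    # -c prints the number of matching lines, -F matches the text literally.
    grep -cF '(EE) NVIDIA(GPU-0): WAIT' "$log"
}
```

Running count_nvidia_waits before and after starting a suspect application would show whether that application adds new WAIT lines.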


Windows and applications that are already running do not trigger the problem, but they stutter too, because the whole display stutters until I kill the offending process. If I try to restart the display manager from a terminal, it hangs with:
Code: Select all
Oct 17 21:17:23 europa kernel: [17002.742251] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct 17 21:17:31 europa kernel: [17010.741209] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct 17 21:17:39 europa kernel: [17018.743601] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct 17 21:18:08 europa kernel: [17046.820025] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct 17 21:18:20 europa kernel: [17058.820933] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.


I tried nvidia-drm.modeset=1 on the kernel command line in GRUB, but that did not solve it.
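
To confirm that the option was actually picked up after a reboot, the module parameter can be read back from sysfs. This is a rough sketch: the function name modeset_enabled is made up, and I am assuming the module shows up as nvidia_drm under /sys/module and reports the value as Y or N.

```shell
#!/bin/sh
# Hypothetical helper: succeed if nvidia-drm modesetting is reported as enabled.
# The sysfs path is an assumption; the optional first argument overrides it.
modeset_enabled() {
    param="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    # Boolean module parameters are expected to read back as "Y" or "N".
    [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]
}
```

For example: modeset_enabled && echo "modeset is on" || echo "modeset is off or the module is not loaded".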

Another thing I just found: Visual Studio Code does not start while the problem is occurring, but it does start if I disable the GPU process for this application. All other programs start as long as they don't use the GPU. If I have a video running, the audio output plays normally without any stuttering.

If I start Firefox, it stutters at first but runs after a couple of seconds. My hardware acceleration setting is to use it if available. It looks like Firefox tries to get GPU acceleration but fails.

Any ideas? Has the GPU driver partially crashed? Can new processes not get a hardware device context? Is the GPU blocking new processes?

The only thing I can do is restart the system.
Last edited by delare on 2019-10-22 17:59, edited 1 time in total.
delare
 
Posts: 14
Joined: 2016-05-16 08:33

Re: gpu problem after random time in xfce/x11/buster/nvidia

Post by stevepusser » 2019-10-18 17:27

Perhaps the newer hardware support repo I've set up might resolve these problems. viewtopic.php?f=10&t=143971

It currently has a backported 5.3.2 kernel, Nvidia driver 435.21, and Mesa 19.1.6. You could also try a kernel from liquorix.net, but it's still in the 5.2 series, though I expect they'll jump to the 5.3 series soon.
stevepusser
 
Posts: 11258
Joined: 2009-10-06 05:53

Re: gpu problem after random time in xfce/x11/buster/nvidia

Post by delare » 2019-10-22 18:09

I had hoped there would be a solution without backports or experimental packages.

I will try it next weekend on a separate partition.

In addition, nvidia-smi fails to read some of the hardware sensor data but can still list the running processes:

Code: Select all
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.74       Driver Version: 418.74       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    On   | 00000000:08:00.0  On |                  N/A |
|ERR!   51C    P5   ERR! / 170W |    199MiB /  5901MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1086      G   /usr/lib/xorg/Xorg                           156MiB |
+-----------------------------------------------------------------------------+
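
The failed readings show up as ERR! in the table (fan speed and power usage here), so the broken state can be detected from a script. A minimal sketch, with the made-up function name smi_has_errors, that only looks for that marker in the table output:

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi table output on stdin and print the
# rows that contain an "ERR!" field. Exit status is 0 if any were found.
smi_has_errors() {
    grep -F 'ERR!'
}
```

Used as: nvidia-smi | smi_has_errors && echo "sensor readings are failing"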
delare
 
Posts: 14
Joined: 2016-05-16 08:33

Re: gpu problem after random time in xfce/x11/buster/nvidia

Post by delare » 2019-11-16 09:23

A small update on this. I think the NVIDIA driver (418.74) and/or X11 has a problem with the NVIDIA power management.

Since the last post I have always set PowerMizer to 'Prefer Maximum Performance', and the problem did not occur. Today I forgot to set it, and 2 hours later it crashed again.

Now I have put
Code: Select all
nvidia-settings -a "[gpu:0]/GpuPowerMizerMode=1"
into the Xfce autostart and hope this will help.

Regardless of whether 'Auto' or 'Adaptive' is selected, the driver runs into trouble after some time.
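
For the autostart entry, the command can be wrapped in a small script so that a missing nvidia-settings binary is reported instead of failing silently. This is just a sketch: the function name set_powermizer_max and the NVIDIA_SETTINGS override variable are made up, and only the nvidia-settings call itself is the one from above.

```shell
#!/bin/sh
# Hypothetical wrapper: set PowerMizer to mode 1 ('Prefer Maximum Performance').
# NVIDIA_SETTINGS is an assumed override, useful for trying the script out
# without touching the real tool; by default nvidia-settings is used.
set_powermizer_max() {
    tool="${NVIDIA_SETTINGS:-nvidia-settings}"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found, PowerMizer mode not changed" >&2
        return 1
    fi
    "$tool" -a "[gpu:0]/GpuPowerMizerMode=1"
}
```

The script would then end with a call to set_powermizer_max and be registered under Session and Startup > Application Autostart in Xfce.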
delare
 
Posts: 14
Joined: 2016-05-16 08:33

