[Linux-aus] strange ATI video issue
Russell Coker
russell at coker.com.au
Fri Feb 18 16:42:16 AEDT 2022
I have a system running typical KDE desktop stuff with Chrome and Thunderbird.
After running for a while it gets the following in /var/log/Xorg.0.log:
[928485.680] (WW) RADEON(0): flip queue failed: Device or resource busy
[928485.680] (WW) RADEON(0): Page flip failed: Device or resource busy
[928485.680] (EE) RADEON(0): present flip failed
[928485.890] (WW) RADEON(0): flip queue failed: Device or resource busy
[928485.890] (WW) RADEON(0): Page flip failed: Device or resource busy
[928485.890] (EE) RADEON(0): present flip failed
Then after that it gives messages like the following:
[928541.507] (WW) RADEON(0): flip queue failed: Cannot allocate memory
[928541.507] (WW) RADEON(0): Page flip failed: Cannot allocate memory
[928541.507] (EE) RADEON(0): present flip failed
[928542.008] (WW) RADEON(0): flip queue failed: Cannot allocate memory
[928542.008] (WW) RADEON(0): Page flip failed: Cannot allocate memory
[928542.008] (EE) RADEON(0): present flip failed
At that point normal X operations start failing. If the system is in use, the
window manager blocks. If the system isn't in use, it becomes impossible to
unlock the screen (it doesn't give a password prompt and doesn't respond to
"loginctl unlock-session" commands).
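
To see when the errors change from "Device or resource busy" to "Cannot
allocate memory", the log can be summarised by failure reason. The following is
a minimal Python sketch, assuming the log lines have exactly the format quoted
above. The names FLIP_RE and summarise_flip_failures are made up for this
example and are not part of any Xorg tool.

```python
import re

# Matches lines such as:
# [928485.680] (WW) RADEON(0): flip queue failed: Device or resource busy
# The timestamp may be left-padded with spaces inside the brackets.
FLIP_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\] \(WW\) RADEON\(\d+\): flip queue failed: (.+)$"
)


def summarise_flip_failures(lines):
    """Return {reason: (count, first_timestamp, last_timestamp)}.

    Only the "flip queue failed" line of each three-line group is counted,
    so each failed flip is counted once.
    """
    summary = {}
    for line in lines:
        match = FLIP_RE.match(line.strip())
        if match is None:
            continue
        timestamp = float(match.group(1))
        reason = match.group(2)
        count, first, _last = summary.get(reason, (0, timestamp, timestamp))
        summary[reason] = (count + 1, first, timestamp)
    return summary


# Sample input copied from the log excerpts quoted in this message.
SAMPLE = """\
[928485.680] (WW) RADEON(0): flip queue failed: Device or resource busy
[928485.680] (WW) RADEON(0): Page flip failed: Device or resource busy
[928485.680] (EE) RADEON(0): present flip failed
[928485.890] (WW) RADEON(0): flip queue failed: Device or resource busy
[928541.507] (WW) RADEON(0): flip queue failed: Cannot allocate memory
[928541.507] (EE) RADEON(0): present flip failed
[928542.008] (WW) RADEON(0): flip queue failed: Cannot allocate memory
"""

if __name__ == "__main__":
    for reason, (count, first, last) in summarise_flip_failures(
        SAMPLE.splitlines()
    ).items():
        print(f"{reason}: {count} failures between {first:.3f} and {last:.3f}")
```

To run it against the real log, pass an open file instead of the sample, for
example summarise_flip_failures(open("/var/log/Xorg.0.log")).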
This seems to be correlated with BOINC accessing the GPU when the system is
idle (I haven't yet done sufficient testing to prove this), but other systems
don't have such problems. Chrome also seems to be involved, but I don't know
if it's causing the problem or just making it noticeable.
The system with the problem is a Dell PowerEdge T320 with 96G of RAM, running
at 2560x1440 resolution, with the following video card according to lspci:
0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Bonaire XTX [Radeon R7 260X/360]
A system without that problem is a HP Proliant ML110 Gen 9 with 64G of RAM and
the following video card running at 3840x2160 resolution:
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (rev e5)
The Dell has this in the Radeon section of lspci -vv output:
Capabilities: [200 v1] Physical Resizable BAR
BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB
The HP has this in the lspci -vv output:
Capabilities: [200 v1] Physical Resizable BAR
BAR 0: current size: 4GB, supported: 256MB 512MB 1GB 2GB 4GB
Could a BAR size of a mere 256M be contributing to the video card problems, or
is it a symptom of some deeper problem?
Here's an article on the resizable BAR:
https://www.tomshardware.com/news/geforce-driver-465-89-resizable-bar-support
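
For comparing more machines than these two, the BAR lines can be pulled out of
saved "lspci -vv" output and checked against the largest supported size. This
is a rough Python sketch, assuming the line format shown above and that sizes
only use the KB, MB and GB suffixes. The function names are hypothetical.

```python
import re

# Matches lines such as:
# BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB
BAR_RE = re.compile(r"BAR (\d+): current size: (\S+), supported: (.+)")

# Assumption: lspci prints sizes with these binary-unit suffixes.
UNITS = {"KB": 2**10, "MB": 2**20, "GB": 2**30}


def to_bytes(size):
    """Convert a size string such as '256MB' or '4GB' to bytes."""
    match = re.fullmatch(r"(\d+)(KB|MB|GB)", size)
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def parse_resizable_bars(lspci_text):
    """Return one dict per resizable BAR line found in lspci -vv output."""
    bars = []
    for match in BAR_RE.finditer(lspci_text):
        current = match.group(2)
        supported = match.group(3).split()
        largest = max(supported, key=to_bytes)
        bars.append(
            {
                "bar": int(match.group(1)),
                "current": current,
                "largest_supported": largest,
                "at_max": to_bytes(current) == to_bytes(largest),
            }
        )
    return bars


# Sample input copied from the two lspci -vv excerpts in this message.
DELL = "BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB"
HP = "BAR 0: current size: 4GB, supported: 256MB 512MB 1GB 2GB 4GB"

if __name__ == "__main__":
    for name, text in (("Dell", DELL), ("HP", HP)):
        for bar in parse_resizable_bars(text):
            print(name, bar)
```

This only reports whether the BAR is at its largest supported size. It doesn't
say anything about why the size differs or whether that matters for the flip
failures.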
Both those systems are running Debian/Bullseye (Stable). The problems on the
T320 started occurring about 4-8 weeks ago with no deliberate changes that
seem relevant. Of course there were new versions of Chrome and Chromium
(which certainly had other changes than just bug fixes) and bug fixes to
various Debian packages which shouldn't break things (but computers are
complex).
My current test is to deny BOINC access to X and see if that makes things more
reliable. While running BOINC with only the CPU would be OK, I'd really like
to get the GPU working as without BOINC it's entirely idle 16 hours a day.
Any ideas?
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/