[Linux-aus] strange ATI video issue

Russell Coker russell at coker.com.au
Fri Feb 18 16:42:16 AEDT 2022


I have a system running typical KDE desktop stuff with Chrome and Thunderbird.  
After running for a while it gets the following in /var/log/Xorg.0.log:

[928485.680] (WW) RADEON(0): flip queue failed: Device or resource busy
[928485.680] (WW) RADEON(0): Page flip failed: Device or resource busy
[928485.680] (EE) RADEON(0): present flip failed
[928485.890] (WW) RADEON(0): flip queue failed: Device or resource busy
[928485.890] (WW) RADEON(0): Page flip failed: Device or resource busy
[928485.890] (EE) RADEON(0): present flip failed

Then after that it gives messages like the following:

[928541.507] (WW) RADEON(0): flip queue failed: Cannot allocate memory
[928541.507] (WW) RADEON(0): Page flip failed: Cannot allocate memory
[928541.507] (EE) RADEON(0): present flip failed
[928542.008] (WW) RADEON(0): flip queue failed: Cannot allocate memory
[928542.008] (WW) RADEON(0): Page flip failed: Cannot allocate memory
[928542.008] (EE) RADEON(0): present flip failed
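
To see when the errors change from "Device or resource busy" to "Cannot allocate memory", the log can be summarised per failure reason. The following is a minimal sketch, assuming the "flip queue failed" lines always have the form shown above. The function name summarize_flip_failures is made up for illustration and is not part of any X.org tool.

```python
import re

# Matches lines such as:
# [928485.680] (WW) RADEON(0): flip queue failed: Device or resource busy
FLIP_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+\(WW\)\s+RADEON\(\d+\): flip queue failed: (.+?)\s*$"
)


def summarize_flip_failures(lines):
    """Group 'flip queue failed' warnings by reason.

    Returns a dict mapping each reason string to its count and the
    first and last Xorg timestamps (seconds) at which it was seen.
    """
    summary = {}
    for line in lines:
        m = FLIP_RE.match(line)
        if not m:
            continue  # skip "Page flip failed", "present flip failed", etc.
        ts, reason = float(m.group(1)), m.group(2)
        entry = summary.setdefault(reason, {"count": 0, "first": ts, "last": ts})
        entry["count"] += 1
        entry["last"] = ts
    return summary
```

Passing an open file object for /var/log/Xorg.0.log as lines gives one entry per reason. Only the "flip queue failed" lines are counted, so each failed flip is counted once rather than three times.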

At that point normal X operations start failing.  If the system is in use the 
window manager blocks; if it isn't in use it becomes impossible to unlock the 
screen (it doesn't give a password prompt and doesn't respond to 
"loginctl unlock-session" commands).

This seems to be correlated with BOINC accessing the GPU when the system is 
idle (I haven't yet done sufficient testing to prove this), but other systems 
don't have such problems.  Chrome also seems to be involved, but I don't know 
if it's causing the problem or just making it noticeable.

The system with the problem is a Dell PowerEdge T320 with 96G of RAM and the 
following video card according to lspci running at 2560x1440 resolution:
0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XTX [Radeon R7 260X/360]

A system without that problem is a HP Proliant ML110 Gen 9 with 64G of RAM and 
the following video card running at 3840x2160 resolution:
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (rev e5)

The Dell has this in the Radeon section of lspci -vv output:
        Capabilities: [200 v1] Physical Resizable BAR
                BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB

The HP has this in the lspci -vv output:
        Capabilities: [200 v1] Physical Resizable BAR
                BAR 0: current size: 4GB, supported: 256MB 512MB 1GB 2GB 4GB
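
For comparing more than two machines, the same information can be pulled out of saved lspci -vv output. This is a rough sketch, assuming the "BAR n: current size: ..., supported: ..." line format shown above. The names parse_resizable_bars and size_to_bytes are hypothetical helpers, not part of pciutils.

```python
import re

# Matches lines such as:
# BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB
BAR_RE = re.compile(r"BAR (\d+): current size: (\S+), supported: (.+)")
UNITS = {"KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}


def size_to_bytes(text):
    """Convert a size such as '256MB' or '4GB' to a number of bytes."""
    m = re.fullmatch(r"(\d+)(KB|MB|GB|TB)", text)
    if not m:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(m.group(1)) * UNITS[m.group(2)]


def parse_resizable_bars(lspci_text):
    """Return one dict per resizable BAR found in lspci -vv output."""
    bars = []
    for m in BAR_RE.finditer(lspci_text):
        supported = [size_to_bytes(s) for s in m.group(3).split()]
        bars.append({
            "bar": int(m.group(1)),
            "current": size_to_bytes(m.group(2)),
            "largest": max(supported),
        })
    return bars
```

A BAR whose "current" value is below "largest" is one that the firmware or kernel has left smaller than the card supports, as on the Dell above.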

Could a mere 256M of buffer memory be contributing to video card problems or 
be a symptom of some deeper problem?

Here's an article on the resizable BAR:
https://www.tomshardware.com/news/geforce-driver-465-89-resizable-bar-support

Both those systems are running Debian/Bullseye (Stable).  The problems on the 
T320 started occurring about 4-8 weeks ago with no deliberate changes that 
seem relevant.  Of course there were new versions of Chrome and Chromium 
(which certainly had other changes than just bug fixes) and bug fixes to 
various Debian packages which shouldn't break things (but computers are 
complex).

My current test is to deny BOINC access to X and see if that makes things more 
reliable.  While running BOINC with only the CPU would be OK, I'd really like 
to get the GPU working as without BOINC it's entirely idle 16 hours a day.

Any ideas?

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/


