Resolved
It was a hardware issue. My replacement is running perfectly.
EDIT 3: OK… I think it’s resolved with a combination of a Mesa PPA and the PipeWire upstream PPA (the latter of which I think was the one that truly fixed it).
EDIT 2: Now the issue is that the displays don’t wake if I leave them alone (not asleep). Also, using Ubuntu’s 5.13 kernel as opposed to Pop!_OS’s 5.15 seems to have fixed it, but I’ll keep an eye on it and update this post.
There’s a weird issue with my 6900XT where it doesn’t display anything after being asleep for a few hours.
I took a nap earlier, then had dinner (so about 6 hours asleep).
If I simply press the reset button, the display doesn’t come back. The only thing that shows up is the boot screen with the Gigabyte Aorus logo, then nothing. I have to turn it off, then turn it back on.
Here’s my hardware list:
PCPartPicker Part List
My OS:
Operating System: Kubuntu 21.10
KDE Plasma Version: 5.23.5
KDE Frameworks Version: 5.90.0
Qt Version: 5.15.2
Kernel Version: 5.15.11-76051511-generic (64-bit)
Graphics Platform: X11
Processors: 24 × AMD Ryzen 9 5900X 12-Core Processor
Memory: 62.7 GiB of RAM
Graphics Processor: AMD Radeon RX 6900 XT
Has this always happened or just started recently?
Ulfnic
January 17, 2022, 9:13pm
3
Just a side note… things like sleep and hibernate have always been a weird place in Linux. I’ve personally given up on both.
I just built this PC last week.
Although it seems to have been fixed when I switched to Ubuntu’s 5.13 kernel as opposed to Pop!_OS’s 5.15, I’ll keep an eye on it.
Ethanol
January 18, 2022, 12:24am
5
Suspend has always been OK for me. Sometimes kernel updates mess with it, but usually it’s a quick fix.
It also depends on what the term actually means, since every desktop uses a different name for those functions, and it varies by language too. I call it standby (hibernate), and it has always worked.
“Sleep” usually refers to saving to RAM nowadays. I don’t think Hibernate is an option unless the user enables it.
Also, it seems to have been fixed when I switched to Ubuntu’s 5.13 kernel from Pop!_OS’s 5.15. Yesterday, after I switched back to Pop!_OS’s 5.15, it didn’t happen again, but it kept on waking up. I found it awake twice: (1) after I got home from work (so around 11 hours later), and (2) after I took a nap (5 hours later).
That is what I mean: in German it has the other meaning. So I actually always prefer the laptop to suspend to RAM, and that works. Suspend to disk is the tricky one: it usually takes more time than rebooting the machine, and sometimes you get weird glitches. And true, Ubuntu disabled Hibernate as an option, or better said, you need real swap to use it.
Sorry. Didn’t realize your reply was meant for someone else.
Totally missed that.
Any idea what this error is:
(alsa_output.pci-0000_0f_00.4.iec958-stereo-41) XRun! rate:256/48000 count:1 time:30425802 delay:8874055 max:8874055
It happens either just before the desktop reboots on its own or when the display crashes.
Here’s the output when display crashes:
Jan 22 06:17:30 Y4M1-II kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
Jan 22 06:17:30 Y4M1-II kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
Jan 22 06:17:30 Y4M1-II kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
Jan 22 06:17:30 Y4M1-II kernel: [drm] PSP is resuming...
Jan 22 06:17:30 Y4M1-II kernel: [drm] VRAM is lost due to GPU reset!
Jan 22 06:17:30 Y4M1-II kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000753000).
Jan 22 06:17:30 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset succeeded, trying to resume
Jan 22 06:17:26 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 06:17:19 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU smu mode1 reset
Jan 22 06:17:19 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset
Jan 22 06:17:19 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: MODE1 reset
Jan 22 06:17:19 Y4M1-II kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <psp> failed -22
Jan 22 06:17:19 Y4M1-II kernel: [drm:psp_suspend [amdgpu]] *ERROR* Failed to terminate ras ta
Jan 22 06:17:19 Y4M1-II kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x0)
Jan 22 06:17:16 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 06:17:16 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 06:17:15 Y4M1-II kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:950
Jan 22 06:17:15 Y4M1-II kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:950
Jan 22 06:17:15 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to disable gfxoff!
Jan 22 06:17:15 Y4M1-II kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:80:crtc-1] flip_done timed out
Jan 22 06:17:15 Y4M1-II kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:77:crtc-0] flip_done timed out
Jan 22 06:17:10 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: Bailing on TDR for s_job:18e3f, as another already in progress
Jan 22 06:17:10 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1688 thread Xorg:cs0 pid 1731
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=112513, emitted seq=112515
Jan 22 06:17:10 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=7058, emitted seq=7059
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
Jan 22 06:17:05 Y4M1-II kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
I reset Pipewire, in case it was the cause.
Woke up just now to find my system not responding.
Jan 22 08:07:58 Y4M1-II kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 22 08:07:58 Y4M1-II kernel: Tainted: G OE 5.15.11-76051511-generic #202112220937~1640185481~21.10~b3a2c21
Jan 22 08:07:58 Y4M1-II kernel: INFO: task Xorg:1692 blocked for more than 120 seconds.
Jan 22 08:07:58 Y4M1-II kernel: </TASK>
Jan 22 08:07:58 Y4M1-II kernel: ret_from_fork+0x22/0x30
Jan 22 08:07:58 Y4M1-II kernel: ? set_kthread_struct+0x50/0x50
Jan 22 08:07:58 Y4M1-II kernel: ? process_one_work+0x3d0/0x3d0
Jan 22 08:07:58 Y4M1-II kernel: kthread+0x11e/0x140
Jan 22 08:07:58 Y4M1-II kernel: worker_thread+0x53/0x420
Jan 22 08:07:58 Y4M1-II kernel: process_one_work+0x22b/0x3d0
Jan 22 08:07:58 Y4M1-II kernel: drm_sched_job_timedout+0x6f/0x110 [gpu_sched]
Jan 22 08:07:58 Y4M1-II kernel: amdgpu_job_timedout+0x14f/0x170 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: amdgpu_device_gpu_recover.cold+0x6ec/0x8f8 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: ? drm_fb_helper_set_suspend_unlocked+0x33/0xa0 [drm_kms_helper]
Jan 22 08:07:58 Y4M1-II kernel: amdgpu_device_pre_asic_reset+0xdd/0x480 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: amdgpu_device_ip_suspend+0x21/0x70 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: amdgpu_device_ip_suspend_phase1+0xa3/0x180 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: ? amdgpu_device_set_cg_state+0x12f/0x280 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: ? nv_common_set_clockgating_state+0x9f/0xb0 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: dm_suspend+0xaa/0x270 [amdgpu]
Jan 22 08:07:58 Y4M1-II kernel: mutex_lock+0x34/0x40
Jan 22 08:07:58 Y4M1-II kernel: __mutex_lock_slowpath+0x13/0x20
Jan 22 08:07:58 Y4M1-II kernel: __mutex_lock.constprop.0+0x263/0x490
Jan 22 08:07:58 Y4M1-II kernel: schedule_preempt_disabled+0xe/0x10
Jan 22 08:07:58 Y4M1-II kernel: schedule+0x4e/0xb0
Jan 22 08:07:58 Y4M1-II kernel: __schedule+0x23d/0x590
Jan 22 08:07:58 Y4M1-II kernel: <TASK>
Jan 22 08:07:58 Y4M1-II kernel: Call Trace:
Jan 22 08:07:58 Y4M1-II kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Jan 22 08:07:58 Y4M1-II kernel: task:kworker/12:1 state:D stack: 0 pid: 246 ppid: 2 flags:0x00004000
Jan 22 08:07:58 Y4M1-II kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 22 08:07:58 Y4M1-II kernel: Tainted: G OE 5.15.11-76051511-generic #202112220937~1640185481~21.10~b3a2c21
Jan 22 08:07:58 Y4M1-II kernel: INFO: task kworker/12:1:246 blocked for more than 120 seconds.
Jan 22 08:05:24 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: Bailing on TDR for s_job:1123, as another already in progress
Jan 22 08:05:24 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: Bailing on TDR for s_job:43c, as another already in progress
Jan 22 08:05:24 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 08:05:24 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 08:05:24 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 08:05:24 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Jan 22 08:05:24 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Jan 22 08:05:24 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Jan 22 08:05:24 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=4303, emitted seq=4305
Jan 22 08:05:24 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma3 timeout, signaled seq=1084, emitted seq=1086
Jan 22 08:05:24 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma2 timeout, signaled seq=4379, emitted seq=4381
Jan 22 08:05:20 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:20 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:19 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:19 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:19 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:19 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:19 Y4M1-II kernel: amdgpu_cs_ioctl: 59 callbacks suppressed
Jan 22 08:05:14 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset end with ret = -62
Jan 22 08:05:14 Y4M1-II kernel: snd_hda_intel 0000:0c:00.1: CORB reset timeout#2, CORBRP = 65535
Jan 22 08:05:14 Y4M1-II kernel: snd_hda_intel 0000:0c:00.1: refused to change power state from D3hot to D0
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
...
Jan 22 08:05:14 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset(2) failed
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm] Skip scheduling IBs!
Jan 22 08:05:14 Y4M1-II kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
Jan 22 08:05:14 Y4M1-II kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
Jan 22 08:05:14 Y4M1-II kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
Jan 22 08:05:14 Y4M1-II kernel: [drm] PSP is resuming...
Jan 22 08:05:14 Y4M1-II kernel: [drm] VRAM is lost due to GPU reset!
Jan 22 08:05:14 Y4M1-II kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000753000).
Jan 22 08:05:14 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset succeeded, trying to resume
Jan 22 08:05:03 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: ASIC reset failed with error, -62 for drm dev, 0000:0c:00.0
Jan 22 08:05:03 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset failed
Jan 22 08:05:03 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command!
Jan 22 08:04:58 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU smu mode1 reset
Jan 22 08:04:58 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset
Jan 22 08:04:58 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: MODE1 reset
Jan 22 08:04:58 Y4M1-II kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <psp> failed -22
Jan 22 08:04:58 Y4M1-II kernel: [drm:psp_suspend [amdgpu]] *ERROR* Failed to terminate ras ta
Jan 22 08:04:58 Y4M1-II kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x0)
Jan 22 08:04:56 Y4M1-II kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
Jan 22 08:04:56 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: Fail to disable dpm features!
Jan 22 08:04:56 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to disable smu features.
Jan 22 08:04:51 Y4M1-II kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Jan 22 08:04:51 Y4M1-II kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jan 22 08:04:50 Y4M1-II kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Jan 22 08:04:50 Y4M1-II kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jan 22 08:04:50 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 08:04:50 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1692 thread Xorg:cs0 pid 1745
Jan 22 08:04:50 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=570767, emitted seq=570769
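Both excerpts above are pasted newest-first, so the first failure is at the bottom of each block. A rough Python sketch that flips a paste into oldest-first order and counts the `*ERROR*` lines per reporting function may make the sequence easier to follow. The `summarize` name and the regular expression are my own assumptions based on the line format shown here. The sample reuses four lines from the first excerpt.

```python
import re
from collections import Counter

# Matches the "[drm:function_name [module]] *ERROR*" prefix seen in the excerpts.
ERROR_RE = re.compile(r"\[drm:(\w+) \[\w+\]\] \*ERROR\*")

def summarize(journal_text):
    """Return (lines oldest-first, Counter of *ERROR* lines per function)."""
    lines = [ln for ln in journal_text.splitlines() if ln.strip()]
    lines.reverse()  # the paste is newest-first, so reversing gives oldest-first
    counts = Counter()
    for ln in lines:
        match = ERROR_RE.search(ln)
        if match:
            counts[match.group(1)] += 1
    return lines, counts

sample = """\
Jan 22 06:17:10 Y4M1-II kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=7058, emitted seq=7059
Jan 22 06:17:10 Y4M1-II kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
Jan 22 06:17:05 Y4M1-II kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
"""

ordered, counts = summarize(sample)
print(ordered[0])            # earliest line in the paste
print(counts.most_common())  # which functions reported errors, and how often
```

In both excerpts the earliest lines are fence or ring timeouts, and the reset and PSP resume failures only come afterwards.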
When I ssh’ed into this PC, I couldn’t reboot through it. I had to manually shut it off.
And I set Power Supply Idle Control to “Typical Current Idle”.
Ethanol
January 22, 2022, 7:45pm
12
I wonder if adding the amdgpu.dc=0 kernel parameter might help. That’s usually a solution for older cards though…
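After adding a parameter like this and rebooting, it is worth confirming that the kernel actually received it. A small Python sketch that pulls any `amdgpu.*` options out of a kernel command line string is below. The `amdgpu_params` helper and the sample command line are hypothetical. On a running Linux system the real string can be read from /proc/cmdline.

```python
def amdgpu_params(cmdline):
    """Return a dict of amdgpu.* options found in a kernel command line."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            key, value = token[len("amdgpu."):].split("=", 1)
            params[key] = value
    return params

# Hypothetical command line for illustration only.
example = "ro quiet splash amdgpu.dc=0"
print(amdgpu_params(example))

# On the affected machine, read the real value instead:
# with open("/proc/cmdline") as f:
#     print(amdgpu_params(f.read()))
```

An empty result would mean the parameter never reached the kernel, for example because the bootloader configuration was not regenerated.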
That made my system not display anything. I fixed it by SSH’ing to it.
Found a Bugzilla report that resembles my issue.
Ethanol
January 23, 2022, 5:44am
14
Oh, sorry about that. I guess that’s strictly for older cards.
I read through that Bugzilla and, interestingly, it sounds like some people there are having the same problem I am. I was pretty sure my issue was RAM-related, and I just ran memtest86 on my system to rule that out.
My issue is random desktop lockups, sometimes hours, sometimes seconds after boot. I’m trying amdgpu.dpm=0 as a boot parameter to see if that affects things. A couple of people in the Bugzilla comments noted that it helped them.
OK… I think it’s resolved. A combination of a Mesa PPA and the PipeWire upstream PPA, the latter of which seems to have truly fixed it.
Never mind. It still happens occasionally.
Created bug reports:
I think the replacement 6900XT resolved it. Guess it was defective, thank God.