Date: Wed, 3 Feb 2021 15:27:02 +0100
From: Daniel Vetter
To: Alex Deucher
Cc: Christian König, Alex Deucher, Daniel Gomez, amd-gfx list,
 dri-devel, Linux Kernel Mailing List
Subject: Re: [amdgpu] deadlock
References: <58e41b62-b8e0-b036-c87d-a84d53f5a26e@amd.com>
 <538682ee-3e12-a345-2205-2c0f16b496ff@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 03, 2021 at 08:56:17AM -0500, Alex Deucher wrote:
> On Wed, Feb 3, 2021 at 7:30 AM Christian König wrote:
> >
> > Am 03.02.21 um 13:24 schrieb Daniel Vetter:
> > > On Wed, Feb 03, 2021 at 01:21:20PM +0100, Christian König wrote:
> > >> Am 03.02.21 um 12:45 schrieb Daniel Gomez:
> > >>> On Wed, 3 Feb 2021 at 10:47, Daniel Gomez wrote:
> > >>>> On Wed, 3 Feb 2021 at 10:17, Daniel Vetter wrote:
> > >>>>> On Wed,
> > >>>>> Feb 3, 2021 at 9:51 AM Christian König wrote:
> > >>>>>> Am 03.02.21 um 09:48 schrieb Daniel Vetter:
> > >>>>>>> On Wed, Feb 3, 2021 at 9:36 AM Christian König wrote:
> > >>>>>>>> Hi Daniel,
> > >>>>>>>>
> > >>>>>>>> this is not a deadlock, but rather a hardware lockup.
> > >>>>>>> Are you sure? Ime getting stuck in dma_fence_wait has generally good
> > >>>>>>> chance of being a dma_fence deadlock. GPU hang should never result in
> > >>>>>>> a forever stuck dma_fence.
> > >>>>>> Yes, I'm pretty sure. Otherwise the hardware clocks wouldn't go up like
> > >>>>>> this.
> > >>>>> Maybe clarifying, could be both. TDR should notice and get us out of
> > >>>>> this, but if there's a dma_fence deadlock and we can't re-emit or
> > >>>>> force complete the pending things, then we're stuck for good.
> > >>>>> -Daniel
> > >>>>>
> > >>>>>> Question is rather why we end up in the userptr handling for GFX? Our
> > >>>>>> ROCm OpenCL stack shouldn't use this.
> > >>>>>>
> > >>>>>>> Daniel, can you pls re-hang your machine and then dump backtraces of
> > >>>>>>> all tasks into dmesg with sysrq-t, and then attach that? Without all
> > >>>>>>> the backtraces it's tricky to construct the full dependency chain of
> > >>>>>>> what's going on. Also is this plain -rc6, not some more patches on
> > >>>>>>> top?
> > >>>>>> Yeah, that's still a good idea to have.
> > >>>> Here the full backtrace dmesg logs after the hang:
> > >>>> https://pastebin.com/raw/kzivm2L3
> > >>>>
> > >>>> This is another dmesg log with the backtraces after SIGKILL of the matrix process
> > >>>> (I didn't have sysrq enabled at the time):
> > >>>> https://pastebin.com/raw/pRBwGcj1
> > >>> I've now removed all our v4l2 patches and did the same test with the 'plain'
> > >>> mainline version (-rc6).
> > >>>
> > >>> Reference: 3aaf0a27ffc29b19a62314edd684b9bc6346f9a8
> > >>>
> > >>> Same error, same behaviour. Full dmesg log attached:
> > >>> https://pastebin.com/raw/KgaEf7Y1
> > >>> Note:
> > >>> dmesg with sysrq-t before running the test starts in [ 122.016502] sysrq: Show State
> > >>> dmesg with sysrq-t after the test starts in: [ 495.587671] sysrq: Show State
> > >> There is nothing amdgpu related in there except for waiting for the
> > >> hardware.
> > > Yeah, but there's also no other driver that could cause a stuck dma_fence,
> > > so why is reset not cleaning up the mess here? Irrespective of why the gpu
> > > is stuck, the kernel should at least complete all the dma_fences even if
> > > the gpu for some reason is terminally ill ...
> >
> > That's a good question as well. I'm digging into this.
> >
> > My best theory is that the amdgpu packages disabled GPU reset for some
> > reason.
>
> The timeout for compute queues is infinite because of long running
> compute kernels. You can override with the amdgpu.lockup_timeout
> parameter.

Uh, that doesn't work. If you want infinite compute queues you need the
amdkfd model with preempt-ctx dma_fence. If you allow normal cs ioctl to
run forever, you just hang the kernel whenever userspace feels like. Not
just the gpu, the kernel (anything that allocates memory, irrespective
of process, can hang). That's no good.
-Daniel

> Alex
>
> >
> > But the much more interesting question is why we end up in this call
> > path. I've pinged internally, but east coast is not awake yet :)
> >
> > Christian.
> >
> > > -Daniel
> > >
> > >> This is a pretty standard hardware lockup, but I'm still waiting for an
> > >> explanation why we end up in this call path in the first place.
> > >>
> > >> Christian.
> > >>
> > >>>
> > >>>>>> Christian.
> > >>>>>>
> > >>>>>>> -Daniel
> > >>>>>>>
> > >>>>>>>> Which OpenCl stack are you using?
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Christian.
> > >>>>>>>>
> > >>>>>>>> Am 03.02.21 um 09:33 schrieb Daniel Gomez:
> > >>>>>>>>> Hi all,
> > >>>>>>>>>
> > >>>>>>>>> I have a deadlock with the amdgpu mainline driver when running in parallel two
> > >>>>>>>>> OpenCL applications. So far, we've been able to replicate it easily by executing
> > >>>>>>>>> clinfo and MatrixMultiplication (from AMD opencl-samples).
> > >>>>>>>>> The opencl-samples are quite old so, if you have any other suggestion
> > >>>>>>>>> for testing, I'd be very happy to test it as well.
> > >>>>>>>>>
> > >>>>>>>>> How to replicate the issue:
> > >>>>>>>>>
> > >>>>>>>>> # while true; do /usr/bin/MatrixMultiplication --device gpu \
> > >>>>>>>>>   --deviceId 0 -x 1000 -y 1000 -z 1000 -q -t -i 50; done
> > >>>>>>>>> # while true; do clinfo; done
> > >>>>>>>>>
> > >>>>>>>>> Output:
> > >>>>>>>>>
> > >>>>>>>>> After a minute or less (sometimes could be more) I can see that
> > >>>>>>>>> MatrixMultiplication and clinfo hang. In addition, with radeontop you can see
> > >>>>>>>>> how the Graphics pipe goes from ~50% to 100%. Also the shader clocks
> > >>>>>>>>> go up from ~35% to ~96%.
> > >>>>>>>>>
> > >>>>>>>>> clinfo keeps printing:
> > >>>>>>>>> ioctl(7, DRM_IOCTL_SYNCOBJ_WAIT, 0x7ffe46e5f950) = -1 ETIME (Timer expired)
> > >>>>>>>>>
> > >>>>>>>>> And MatrixMultiplication prints the following (strace) if you try to
> > >>>>>>>>> kill the process:
> > >>>>>>>>>
> > >>>>>>>>> sched_yield() = 0
> > >>>>>>>>> futex(0x557e945343b8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0,
> > >>>>>>>>> NULL, FUTEX_BITSET_MATCH_ANYstrace: Process 651 detached
> > >>>>>>>>>
> > >>>>>>>>> After this, the gpu is not functional at all and you'd need a power cycle reset
> > >>>>>>>>> to restore the system.
> > >>>>>>>>>
> > >>>>>>>>> Hardware info:
> > >>>>>>>>> CPU: AMD Ryzen Embedded V1605B with Radeon Vega Gfx (8) @ 2.000GHz
> > >>>>>>>>> GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series
> > >>>>>>>>>
> > >>>>>>>>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> > >>>>>>>>> [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series]
> > >>>>>>>>> (rev 83)
> > >>>>>>>>> DeviceName: Broadcom 5762
> > >>>>>>>>> Subsystem: Advanced Micro Devices, Inc.
> > >>>>>>>>> [AMD/ATI] Raven Ridge
> > >>>>>>>>> [Radeon Vega Series / Radeon Vega Mobile Series]
> > >>>>>>>>> Kernel driver in use: amdgpu
> > >>>>>>>>> Kernel modules: amdgpu
> > >>>>>>>>>
> > >>>>>>>>> Linux kernel info:
> > >>>>>>>>>
> > >>>>>>>>> root@qt5222:~# uname -a
> > >>>>>>>>> Linux qt5222 5.11.0-rc6-qtec-standard #2 SMP Tue Feb 2 09:41:46 UTC
> > >>>>>>>>> 2021 x86_64 x86_64 x86_64 GNU/Linux
> > >>>>>>>>>
> > >>>>>>>>> By enabling the kernel lock stats I could see that MatrixMultiplication is
> > >>>>>>>>> hung in the amdgpu_mn_invalidate_gfx function:
> > >>>>>>>>>
> > >>>>>>>>> [ 738.359202] 1 lock held by MatrixMultiplic/653:
> > >>>>>>>>> [ 738.359206] #0: ffff88810e364fe0
> > >>>>>>>>> (&adev->notifier_lock){+.+.}-{3:3}, at:
> > >>>>>>>>> amdgpu_mn_invalidate_gfx+0x34/0xa0 [amdgpu]
> > >>>>>>>>>
> > >>>>>>>>> I can see in the amdgpu_mn_invalidate_gfx function that
> > >>>>>>>>> dma_resv_wait_timeout_rcu uses wait_all (fences) and MAX_SCHEDULE_TIMEOUT so, I
> > >>>>>>>>> guess the code gets stuck there waiting forever. According to the
> > >>>>>>>>> documentation: "When somebody tries to invalidate the page tables we block the
> > >>>>>>>>> update until all operations on the pages in question are completed, then those
> > >>>>>>>>> pages are marked as accessed and also dirty if it wasn't a read only access."
> > >>>>>>>>> Looks like the fences are deadlocked and therefore, it never returns. Could it
> > >>>>>>>>> be possible? Any hint on where I can look to fix this?
> > >>>>>>>>>
> > >>>>>>>>> Thank you in advance.
> > >>>>>>>>>
> > >>>>>>>>> Here the full dmesg output:
> > >>>>>>>>>
> > >>>>>>>>> [ 738.337726] INFO: task MatrixMultiplic:653 blocked for more than 122 seconds.
> > >>>>>>>>> [ 738.344937]       Not tainted 5.11.0-rc6-qtec-standard #2
> > >>>>>>>>> [ 738.350384] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > >>>>>>>>> disables this message.
> > >>>>>>>>> [ 738.358240] task:MatrixMultiplic state:D stack:    0 pid:  653
> > >>>>>>>>> ppid:     1 flags:0x00004000
> > >>>>>>>>> [ 738.358254] Call Trace:
> > >>>>>>>>> [ 738.358261]  ? dma_fence_default_wait+0x1eb/0x230
> > >>>>>>>>> [ 738.358276]  __schedule+0x370/0x960
> > >>>>>>>>> [ 738.358291]  ? dma_fence_default_wait+0x117/0x230
> > >>>>>>>>> [ 738.358297]  ? dma_fence_default_wait+0x1eb/0x230
> > >>>>>>>>> [ 738.358305]  schedule+0x51/0xc0
> > >>>>>>>>> [ 738.358312]  schedule_timeout+0x275/0x380
> > >>>>>>>>> [ 738.358324]  ? dma_fence_default_wait+0x1eb/0x230
> > >>>>>>>>> [ 738.358332]  ? mark_held_locks+0x4f/0x70
> > >>>>>>>>> [ 738.358341]  ? dma_fence_default_wait+0x117/0x230
> > >>>>>>>>> [ 738.358347]  ? lockdep_hardirqs_on_prepare+0xd4/0x180
> > >>>>>>>>> [ 738.358353]  ? _raw_spin_unlock_irqrestore+0x39/0x40
> > >>>>>>>>> [ 738.358362]  ? dma_fence_default_wait+0x117/0x230
> > >>>>>>>>> [ 738.358370]  ? dma_fence_default_wait+0x1eb/0x230
> > >>>>>>>>> [ 738.358375]  dma_fence_default_wait+0x214/0x230
> > >>>>>>>>> [ 738.358384]  ? dma_fence_release+0x1a0/0x1a0
> > >>>>>>>>> [ 738.358396]  dma_fence_wait_timeout+0x105/0x200
> > >>>>>>>>> [ 738.358405]  dma_resv_wait_timeout_rcu+0x1aa/0x5e0
> > >>>>>>>>> [ 738.358421]  amdgpu_mn_invalidate_gfx+0x55/0xa0 [amdgpu]
> > >>>>>>>>> [ 738.358688]  __mmu_notifier_release+0x1bb/0x210
> > >>>>>>>>> [ 738.358710]  exit_mmap+0x2f/0x1e0
> > >>>>>>>>> [ 738.358723]  ? find_held_lock+0x34/0xa0
> > >>>>>>>>> [ 738.358746]  mmput+0x39/0xe0
> > >>>>>>>>> [ 738.358756]  do_exit+0x5c3/0xc00
> > >>>>>>>>> [ 738.358763]  ? find_held_lock+0x34/0xa0
> > >>>>>>>>> [ 738.358780]  do_group_exit+0x47/0xb0
> > >>>>>>>>> [ 738.358791]  get_signal+0x15b/0xc50
> > >>>>>>>>> [ 738.358807]  arch_do_signal_or_restart+0xaf/0x710
> > >>>>>>>>> [ 738.358816]  ? lockdep_hardirqs_on_prepare+0xd4/0x180
> > >>>>>>>>> [ 738.358822]  ? _raw_spin_unlock_irqrestore+0x39/0x40
> > >>>>>>>>> [ 738.358831]  ? ktime_get_mono_fast_ns+0x50/0xa0
> > >>>>>>>>> [ 738.358844]  ?
> > >>>>>>>>> amdgpu_drm_ioctl+0x6b/0x80 [amdgpu]
> > >>>>>>>>> [ 738.359044]  exit_to_user_mode_prepare+0xf2/0x1b0
> > >>>>>>>>> [ 738.359054]  syscall_exit_to_user_mode+0x19/0x60
> > >>>>>>>>> [ 738.359062]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >>>>>>>>> [ 738.359069] RIP: 0033:0x7f6b89a51887
> > >>>>>>>>> [ 738.359076] RSP: 002b:00007f6b82b54b18 EFLAGS: 00000246 ORIG_RAX:
> > >>>>>>>>> 0000000000000010
> > >>>>>>>>> [ 738.359086] RAX: fffffffffffffe00 RBX: 00007f6b82b54b50 RCX: 00007f6b89a51887
> > >>>>>>>>> [ 738.359091] RDX: 00007f6b82b54b50 RSI: 00000000c02064c3 RDI: 0000000000000007
> > >>>>>>>>> [ 738.359096] RBP: 00000000c02064c3 R08: 0000000000000003 R09: 00007f6b82b54bbc
> > >>>>>>>>> [ 738.359101] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000165a0bc00
> > >>>>>>>>> [ 738.359106] R13: 0000000000000007 R14: 0000000000000001 R15: 0000000000000000
> > >>>>>>>>> [ 738.359129]
> > >>>>>>>>> Showing all locks held in the system:
> > >>>>>>>>> [ 738.359141] 1 lock held by khungtaskd/54:
> > >>>>>>>>> [ 738.359148] #0: ffffffff829f6840 (rcu_read_lock){....}-{1:2}, at:
> > >>>>>>>>> debug_show_all_locks+0x15/0x183
> > >>>>>>>>> [ 738.359187] 1 lock held by systemd-journal/174:
> > >>>>>>>>> [ 738.359202] 1 lock held by MatrixMultiplic/653:
> > >>>>>>>>> [ 738.359206] #0: ffff88810e364fe0
> > >>>>>>>>> (&adev->notifier_lock){+.+.}-{3:3}, at:
> > >>>>>>>>> amdgpu_mn_invalidate_gfx+0x34/0xa0 [amdgpu]
> > >>>>>>>>>
> > >>>>>>>>> Daniel
> > >>>>>>>> _______________________________________________
> > >>>>>>>> dri-devel mailing list
> > >>>>>>>> dri-devel@lists.freedesktop.org
> > >>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >>>>> --
> > >>>>> Daniel Vetter
> > >>>>> Software Engineer, Intel Corporation
> > >>>>> http://blog.ffwll.ch
> > >>> _______________________________________________
> > >>> amd-gfx mailing list
> > >>> amd-gfx@lists.freedesktop.org
> > >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
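PS: For anyone trying to capture the sysrq-t dump requested above, a minimal sequence, assuming the standard procfs SysRq interfaces and root access (the output filename is just an example):

```shell
# Enable all SysRq functions (0 disables; a bitmask enables a subset)
echo 1 > /proc/sys/kernel/sysrq

# Dump the backtrace of every task into the kernel log (sysrq-t)
echo t > /proc/sysrq-trigger

# Save the log for attaching to the report
dmesg > backtraces.txt
```

Doing the `echo 1` before reproducing the hang avoids the "didn't have sysrq enabled" problem mentioned earlier in the thread.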
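PPS: A sketch of the override Alex mentions, for completeness. The parameter takes a timeout in milliseconds; the 10000 value here is purely illustrative, not a recommendation (and per the discussion above, a finite timeout only papers over the problem if a dma_fence is genuinely deadlocked):

```shell
# Illustrative only: give queues a finite 10 s lockup timeout instead of
# the infinite compute-queue default, so timeout-based recovery can fire.
# At module load time:
modprobe amdgpu lockup_timeout=10000

# Or persistently, on the kernel command line:
#   amdgpu.lockup_timeout=10000
```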