Message-ID: <393a440f-5f82-432c-bc24-e8de33e29d75@I-love.SAKURA.ne.jp>
Date: Mon, 13 Feb 2023 22:49:21 +0900
Subject: Re: [PATCH v3] locking/lockdep: add debug_show_all_lock_holders()
From: Tetsuo Handa
To: Peter Zijlstra
Cc: Ingo Molnar, Waiman Long, Will Deacon, Boqun Feng, Andrew Morton,
    LKML, Linus Torvalds
References: <274adab4-9922-1586-7593-08d9db5479a1@I-love.SAKURA.ne.jp>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/02/13 21:49, Peter Zijlstra wrote:
>>> And sched_show_task() being an utter piece of crap that will basically
>>> print garbage for anything that's running (it doesn't have much
>>> options).
>>>
>>> Should we try and do better? dump_cpu_task() prefers
>>> trigger_single_cpu_backtrace(), which sends an interrupt in order to get
>>> active registers for the CPU.
>>
>> What is the intent of using trigger_single_cpu_backtrace() here?
>> check_hung_uninterruptible_tasks() is calling trigger_all_cpu_backtrace()
>> if sysctl_hung_task_all_cpu_backtrace is set.
>
> Then have that also print the held locks for those tasks. And skip over
> them again later.
>
>> Locks held and kernel backtrace are helpful for describing deadlock
>> situation, but registers values are not.
>
> Register state is required to start the unwind. You can't unwind a
> running task out of thin-air.

Excuse me. There are two types of TASK_RUNNING tasks, aren't there? One
is a thread that is actually running on some CPU, and the other is a
thread that is waiting for a CPU to become available.

lockdep_print_held_locks() does not show the locks held even when a
thread is merely waiting for a CPU to become available, does it?

But sched_show_task() can show a backtrace even when a thread is waiting
for a CPU to become available, can't it?

Therefore, calling sched_show_task() helps us understand what that
thread is doing when lockdep_print_held_locks() does not show the locks
held.

>
>> What is important is that tasks which are not on CPUs are reported,
>> for when a task is reported as hung, that task must be sleeping.
>> Therefore, I think sched_show_task() is fine.
>
> The backtraces generated by sched_show_task() for a running task are
> absolutely worthless, might as well not print them.

Which of these does "running task" mean: "a thread actually running on
some CPU", or "a thread waiting for a CPU to become available"?

>
> And if I read your Changelog right, you explicitly wanted useful
> backtraces for the running tasks -- such that you could see what they
> were doing while holding the lock the other tasks were blocked on.

Yes, we can get useful backtraces for threads that are waiting for a CPU
to become available. That's why sched_show_task() was chosen.

>
> The only way to do that is to send an interrupt, the interrupt will have
> the register state for the interrupted task -- including the stack
> pointer. By virtue of running the interrupt handler we know the stack
> won't shrink, so we can then safely traverse the stack starting from the
> given stack pointer.

But trigger_single_cpu_backtrace() is for a thread actually running on
some CPU, isn't it? While it would be helpful to get a backtrace of a
thread that is actually running on some CPU, it would be unhelpful to
lose the backtrace of a thread that is waiting for a CPU to become
available.

We can later get backtraces of threads actually running on some CPU
using trigger_all_cpu_backtrace() via the
sysctl_hung_task_all_cpu_backtrace setting, though I seldom find useful
backtraces via trigger_all_cpu_backtrace(); it is likely that the
khungtaskd thread and some random workqueue thread (which are irrelevant
to the hung task) are reported via trigger_all_cpu_backtrace()...
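
To make the distinction discussed above concrete, here is a minimal
user-space sketch. It is not kernel code and not the patch itself: the
model_task type, the model_state and model_report enums, and
choose_report() are all hypothetical names invented for illustration. It
only models the argument that a thread currently executing on a CPU
needs an interrupt to capture its live registers, while a thread that is
runnable but queued, or sleeping, has already been switched out and can
be unwound from its saved context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of where a thread is; not the kernel's task state. */
enum model_state {
	MODEL_ON_CPU,          /* TASK_RUNNING and executing right now */
	MODEL_RUNNABLE_QUEUED, /* TASK_RUNNING but waiting for a CPU */
	MODEL_SLEEPING         /* blocked, e.g. waiting for a lock */
};

/* Hypothetical reporting strategies, named after the two approaches. */
enum model_report {
	REPORT_IPI_BACKTRACE,  /* interrupt the CPU, unwind from live regs */
	REPORT_SAVED_CONTEXT   /* unwind from the switched-out context */
};

struct model_task {
	const char *comm;
	enum model_state state;
	int cpu;
};

/* Choose how to obtain a backtrace for one lock holder (assumed policy). */
static enum model_report choose_report(const struct model_task *t)
{
	assert(t != NULL);

	/* Only a thread executing right now has registers that are not
	 * saved anywhere, so its CPU must be interrupted to read them. */
	if (t->state == MODEL_ON_CPU)
		return REPORT_IPI_BACKTRACE;

	/* Queued and sleeping threads were switched out, so their saved
	 * context is a stable starting point for the unwind. */
	return REPORT_SAVED_CONTEXT;
}
```

Under this assumed policy, only the first of the two TASK_RUNNING cases
would need the interrupt-based path; the second is handled the same way
as a sleeping thread.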