From: Andy Shevchenko
Date: Thu, 4 Apr 2024 01:22:01 +0300
To: liu.yec@h3c.com
Cc: daniel.thompson@linaro.org, dianders@chromium.org, gregkh@linuxfoundation.org, jason.wessel@windriver.com, jirislaby@kernel.org, kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org, linux-serial@vger.kernel.org
Subject: Re: [PATCH V8] kdb: Fix the deadlock issue in KDB debugging.
References: <20240402125802.GC25200@aspen.lan> <20240403061109.3142580-1-liu.yec@h3c.com>
In-Reply-To: <20240403061109.3142580-1-liu.yec@h3c.com>

Wed, Apr 03, 2024 at 02:11:09PM +0800, liu.yec@h3c.com wrote:
> From: LiuYe
>
> Currently, if CONFIG_KDB_KEYBOARD is enabled, then kgdboc will
> attempt to use schedule_work() to provoke a keyboard reset when
> transitioning out of the debugger and back to normal operation.
> This can cause deadlock because schedule_work() is not NMI-safe.
>
> The stack trace below shows an example of the problem. In this
> case the master cpu is not running from NMI but it has parked
> the slave CPUs using an NMI and the parked CPUs is holding
> spinlocks needed by schedule_work().
> example:
> BUG: spinlock lockup suspected on CPU#0, namex/10450
> lock: 0xffff881ffe823980, .magic: dead4ead, .owner: namexx/21888, .owner_cpu: 1
> ffff881741d00000 ffff881741c01000 0000000000000000 0000000000000000
> ffff881740f58e78 ffff881741cffdd0 ffffffff8147a7fc ffff881740f58f20
> Call Trace:
> [] ? __schedule+0x16d/0xac0
> [] ? schedule+0x3c/0x90
> [] ? schedule_hrtimeout_range_clock+0x10a/0x120
> [] ? mutex_unlock+0xe/0x10
> [] ? ep_scan_ready_list+0x1db/0x1e0
> [] ? schedule_hrtimeout_range+0x13/0x20
> [] ? ep_poll+0x27a/0x3b0
> [] ? wake_up_q+0x70/0x70
> [] ? SyS_epoll_wait+0xb8/0xd0
> [] ? entry_SYSCALL_64_fastpath+0x12/0x75
> CPU: 0 PID: 10450 Comm: namex Tainted: G O 4.4.65 #1
> Hardware name: Insyde Purley/Type2 - Board Product Name1, BIOS 05.21.51.0036 07/19/2019
> 0000000000000000 ffff881ffe813c10 ffffffff8124e883 ffff881741c01000
> ffff881ffe823980 ffff881ffe813c38 ffffffff810a7f7f ffff881ffe823980
> 000000007d2b7cd0 0000000000000001 ffff881ffe813c68 ffffffff810a80e0
> Call Trace:
> <#DB> [] dump_stack+0x85/0xc2
> [] spin_dump+0x7f/0x100
> [] do_raw_spin_lock+0xa0/0x150
> [] _raw_spin_lock+0x15/0x20
> [] try_to_wake_up+0x176/0x3d0
> [] wake_up_process+0x15/0x20
> [] insert_work+0x81/0xc0
> [] __queue_work+0x135/0x390
> [] queue_work_on+0x46/0x90
> [] kgdboc_post_exp_handler+0x48/0x70
> [] kgdb_cpu_enter+0x598/0x610
> [] kgdb_handle_exception+0xf2/0x1f0
> [] __kgdb_notify+0x71/0xd0
> [] kgdb_notify+0x35/0x70
> [] notifier_call_chain+0x4a/0x70
> [] notify_die+0x3d/0x50
> [] do_int3+0x89/0x120
> [] int3+0x44/0x80

Ouch. Please read
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages
and modify the commit message accordingly.

> We fix the problem by using irq_work to call schedule_work()
> instead of calling it directly. This is because we cannot
> resynchronize the keyboard state from the hardirq context
> provided by irq_work. This must be done from the task context
> in order to call the input subsystem.
>
> Therefore, we have to defer the work twice. First, safely
> switch from the debug trap context (similar to NMI) to the
> hardirq, and then switch from the hardirq to the system work queue.
> Signed-off-by: LiuYe
> Co-authored-by: Daniel Thompson

The correct tag is Co-developed-by. By the way, it is described in the same
document I linked a few lines above.

...

> --- a/drivers/tty/serial/kgdboc.c
> +++ b/drivers/tty/serial/kgdboc.c
> @@ -22,6 +22,7 @@
> #include
> #include
> #include
> +#include

Please keep the includes ordered (given the visible context, this one should
go at least before module.h).
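
For reference, the double deferral described in the quoted commit message can
be modeled in plain userspace C. This is only a sketch under stated
assumptions: every name below (model_run, model_post_exception, and so on) is
hypothetical, and the single-slot queues merely stand in for the kernel's
irq_work_queue() and schedule_work(). It is not the kgdboc code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical execution contexts, named for this model only. */
enum ctx { CTX_DEBUG_TRAP, CTX_HARDIRQ, CTX_TASK };

typedef void (*deferred_fn)(enum ctx);

/* One pending slot per stage stands in for irq_work and the workqueue. */
static deferred_fn pending_irq_work;
static deferred_fn pending_work;
static int keyboard_resets;

/* Final step: only legal in task context (it would call the input subsystem). */
static void model_restore_input(enum ctx c)
{
	assert(c == CTX_TASK);
	keyboard_resets++;
}

/* Second hop: runs in hardirq context, where queueing work is allowed. */
static void model_irq_work_handler(enum ctx c)
{
	assert(c == CTX_HARDIRQ);
	pending_work = model_restore_input;	/* stands in for schedule_work() */
}

/* First hop: leaving the debugger, so nothing here may take scheduler locks. */
static void model_post_exception(enum ctx c)
{
	assert(c == CTX_DEBUG_TRAP);
	pending_irq_work = model_irq_work_handler;	/* stands in for irq_work_queue() */
}

/* Run whatever is queued in a slot, in the given context, then clear it. */
static void run_pending(deferred_fn *slot, enum ctx c)
{
	deferred_fn fn = *slot;

	*slot = NULL;
	if (fn)
		fn(c);
}

/* Walk through all three contexts; returns the number of keyboard resets. */
int model_run(void)
{
	keyboard_resets = 0;
	model_post_exception(CTX_DEBUG_TRAP);
	run_pending(&pending_irq_work, CTX_HARDIRQ);
	run_pending(&pending_work, CTX_TASK);
	return keyboard_resets;
}
```

Calling model_run() walks the three contexts in order and returns 1, the
number of keyboard resets performed. Calling model_restore_input() directly
from the trap context would trip its assertion, which is the reason for the
two hops.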

-- 
With Best Regards,
Andy Shevchenko