Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp4747837ybl; Mon, 26 Aug 2019 15:27:24 -0700 (PDT) X-Google-Smtp-Source: APXvYqxLAeb20tE+HNrxwyVANjoi6EPBK6Svd1vsXLlQrZPKL9JuyfI3qS3e4qB7EEQUx+mA/YkE X-Received: by 2002:a63:6eca:: with SMTP id j193mr18117533pgc.74.1566858444374; Mon, 26 Aug 2019 15:27:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1566858444; cv=none; d=google.com; s=arc-20160816; b=BkTPunDg0PvC+6loSnZDkM0p9CALLDqxU5VJy+K869VlpxxrUgKimftqdr+pjpW2i2 p6NDjF7EMnQ0kUwxJ6Gyz+FxCWlzJ3FWJ0BVjbAAbopldecPlDpG1Ztk11lwh2VHJk6M h0ds2J8vEmRmVW9nlQwq+IqKtwwCrXsRRmN9s+fDYm9An2uEEIKbj6hvXBLMluLycWy3 yv3vWvqTUKnQJLLovGQs1QurKhe1thO/pnKK0X/B0qUUj3HQt5OorIVAOaKDPtumb/Xy cVT8dFsOtkqoXW1wVA1fDgK5ZRX2YNmonaWVHM/JlUtyQUxJLb89rW7MuXth7iqh7mtq udrw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=ygGU8DXWO27Q7hU4Gq61cURNY6ZFP912S21n8nKf1Hs=; b=OvutA2G2+ihixwKWsUlxCVxwa4vNRbBlxB8bjmNaPAsNSMp7gnQbH4HihIcn2EvRtV 04tDK+EBgWXkxWQEud8wnfolH5Cvl8JPk6+wBxXn+L39J6nJifesCu/AFNWqXiETG20A tYtkvAi4tzD3gvWSL5IQC1uTFxEGuU6ytShjroSPiuhfRCuWT7QZ3jWDxKOlg3gFs6HH puxxUA8g7ePzaPpXPXVUkpkmszbK5wA7BUX/8PUvSFeJKSS3fXZ1IWyWFcRsUwAALxll xUP9XQk/b+25+3de+hU7sKg46A9UnywGyi7Q2hKr/w/4Q+CikPa+YYYNOufxRNKno5Mz dpWw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=UtMoWumb; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u3si9933220pgm.122.2019.08.26.15.27.08; Mon, 26 Aug 2019 15:27:24 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=UtMoWumb; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727014AbfHZWZ4 (ORCPT + 99 others); Mon, 26 Aug 2019 18:25:56 -0400 Received: from mail-io1-f66.google.com ([209.85.166.66]:45861 "EHLO mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726552AbfHZWZ4 (ORCPT ); Mon, 26 Aug 2019 18:25:56 -0400 Received: by mail-io1-f66.google.com with SMTP id t3so41304753ioj.12 for ; Mon, 26 Aug 2019 15:25:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ygGU8DXWO27Q7hU4Gq61cURNY6ZFP912S21n8nKf1Hs=; b=UtMoWumbCJ634fip4dFc90v0ojimVkA6Sm7sxgejzU4PsxHukts3eRWIhXUMFiGfrV 8sn596xpxuu10d6PAtHfgCUwGei4KF8Trtt/UwFilKk+pYsyGBh0BPkD/AXTvFfhkDRj Iu8Vpk9ycW0DyzQMHJFuDaaP7bxxkkTjqzUY0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ygGU8DXWO27Q7hU4Gq61cURNY6ZFP912S21n8nKf1Hs=; b=Egkms+QD4ciJzloYUdCGHQsvkgMMZwbw5zHtx1ldhP5o22rDedlSA7CubNs695nZtd WO871hc9NfYJ+JfgIfd0Ch7qk2nVV4rPN0gXSQhGQRmj5Eq4jU/5HEKOgy9bYec1x0q0 +dwJUc3NGIQRelcy3qLprgY4YtD0hSqVEjufdGXRdeovmLo2X4/vGaGjZxnubKlZnyoI 0ZfHzK7NCi/oy9azA7DXqr4X+BzSN67sliumabDqOiAteFy+L26Z4nbi4mag8IAxvhkJ G5m1/MMiHItS18w0/yW3kdInWsF706Okm1vdEvpQJDJT1VJggSIo1dVxCRZRMrQYYtH1 Q3Bg== X-Gm-Message-State: APjAAAVQXNp45MHJgu+xCowS6RuiqNmGDyvhbr1xr0abuEjfGnYr48l+ U0MRGcU6Zl+RRqX0aQwn97EujUj2R4Q= X-Received: by 2002:a02:c00c:: with SMTP id y12mr6447148jai.65.1566858355059; Mon, 26 Aug 2019 15:25:55 -0700 (PDT) Received: from mail-io1-f54.google.com (mail-io1-f54.google.com. [209.85.166.54]) by smtp.gmail.com with ESMTPSA id s6sm10669455ioe.6.2019.08.26.15.25.53 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 26 Aug 2019 15:25:53 -0700 (PDT) Received: by mail-io1-f54.google.com with SMTP id j5so41368610ioj.8 for ; Mon, 26 Aug 2019 15:25:53 -0700 (PDT) X-Received: by 2002:a02:8481:: with SMTP id f1mr20019902jai.112.1566858352806; Mon, 26 Aug 2019 15:25:52 -0700 (PDT) MIME-Version: 1.0 References: <20190731183732.178134-1-dianders@chromium.org> In-Reply-To: <20190731183732.178134-1-dianders@chromium.org> From: Doug Anderson Date: Mon, 26 Aug 2019 15:25:43 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v2] kdb: Fix stack crawling on 'running' CPUs that aren't the master To: Jason Wessel , Daniel Thompson Cc: kgdb-bugreport@lists.sourceforge.net, Christophe Leroy , LKML , Catalin Marinas , Will Deacon Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Jason / Daniel, On Wed, Jul 31, 2019 at 11:38 AM Douglas Anderson wrote: > > In kdb when you do 'btc' (back trace on CPU) it doesn't necessarily > give you the right info. Specifically on many architectures > (including arm64, where I tested) you can't dump the stack of a > "running" process that isn't the process running on the current CPU. > This can be seen by this: > > echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT > # wait 2 seconds > g > > Here's what I see now on rk3399-gru-kevin. I see the stack crawl for > the CPU that handled the sysrq but everything else just shows me stuck > in __switch_to() which is bogus: > > ====== > > [0]kdb> btc > btc: cpu status: Currently on cpu 0 > Available cpus: 0, 1-3(I), 4, 5(I) > Stack traceback for pid 0 > 0xffffff801101a9c0 0 0 1 0 R 0xffffff801101b3b0 *swapper/0 > Call trace: > dump_backtrace+0x0/0x138 > ... > kgdb_compiled_brk_fn+0x34/0x44 > ... > sysrq_handle_dbg+0x34/0x5c > Stack traceback for pid 0 > 0xffffffc0f175a040 0 0 1 1 I 0xffffffc0f175aa30 swapper/1 > Call trace: > __switch_to+0x1e4/0x240 > 0xffffffc0f65616c0 > Stack traceback for pid 0 > 0xffffffc0f175d040 0 0 1 2 I 0xffffffc0f175da30 swapper/2 > Call trace: > __switch_to+0x1e4/0x240 > 0xffffffc0f65806c0 > Stack traceback for pid 0 > 0xffffffc0f175b040 0 0 1 3 I 0xffffffc0f175ba30 swapper/3 > Call trace: > __switch_to+0x1e4/0x240 > 0xffffffc0f659f6c0 > Stack traceback for pid 1474 > 0xffffffc0dde8b040 1474 727 1 4 R 0xffffffc0dde8ba30 bash > Call trace: > __switch_to+0x1e4/0x240 > __schedule+0x464/0x618 > 0xffffffc0dde8b040 > Stack traceback for pid 0 > 0xffffffc0f17b0040 0 0 1 5 I 0xffffffc0f17b0a30 swapper/5 > Call trace: > __switch_to+0x1e4/0x240 > 0xffffffc0f65dd6c0 > > === > > The problem is that 'btc' eventually boils down to > show_stack(task_struct, NULL); > > ...and show_stack() doesn't work for "running" CPUs because their > registers haven't been stashed. > > On x86 things might work better (I haven't tested) because kdb has a > special case for x86 in kdb_show_stack() where it passes the stack > pointer to show_stack(). This wouldn't work on arm64 where the stack > crawling function seems needs the "fp" and "pc", not the "sp" which is > presumably why arm64's show_stack() function totally ignores the "sp" > parameter. > > NOTE: we _can_ get a good stack dump for all the cpus if we manually > switch each one to the kdb master and do a back trace. AKA: > cpu 4 > bt > ...will give the expected trace. That's because now arm64's > dump_backtrace will now see that "tsk == current" and go through a > different path. > > In this patch I fix the problems by catching a request to stack crawl > a task that's running on a CPU and then I ask that CPU to do the stack > crawl. > > NOTE: this will (presumably) change what stack crawls are printed for > x86 machines. Now kdb functions will show up in the stack crawl. > Presumably this is OK but if it's not we can go back and add a special > case for x86 again. > > Signed-off-by: Douglas Anderson > --- > > Changes in v2: > - Totally new approach; now arch agnostic. > > kernel/debug/debug_core.c | 5 +++++ > kernel/debug/debug_core.h | 1 + > kernel/debug/kdb/kdb_bt.c | 44 ++++++++++++++++++++++++++++++--------- > 3 files changed, 40 insertions(+), 10 deletions(-) Did either of you have thoughts on this patch? -Doug