Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp5188363ybl; Tue, 27 Aug 2019 00:25:50 -0700 (PDT) X-Google-Smtp-Source: APXvYqzxY4Tvx4ZRvfXnLiWhUU3fpw8Gdf5V04mIr8CHYgVTi8af6Hf9SOXjOtgyIBT41LTRfD6V X-Received: by 2002:a17:902:a714:: with SMTP id w20mr22591295plq.135.1566890750200; Tue, 27 Aug 2019 00:25:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1566890750; cv=none; d=google.com; s=arc-20160816; b=Izm4G3HYSXkiBeiwqxzViyzrqx0+daDQtBvYMswQZigylMB3oXf+DEryUN6jzFz37l 5I9RoJMKoBk0gFl7xMuK6+SUqZ01MNd54wEN770MKsnja3FB8gJS/JUjSmjnzMM5/U3Q Byq8eODL84ksMvVfkOPNJpJKZGu45nBYB5m7IeXDr4Qglm87aiY5EmcfG2DJogSiBYhs 6Ow2MeDcjxgmMLbeM1+shpiLXaLA8HDMhZ9RjkPDawMnDujFZSrCJzIvkjDZWhLFAIIf TW+AZk+LQbr8AgpzTtGbJvVCcFor4dTVOWE/voGqLVysBB7WknchlRIV1m4UoVV5jctd riDA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=UdTlWBq67pFf3BFWbVXOJiSHrcyQJ0PaYrdJV+R7zcU=; b=JQKW+VjN5fixSdIIxAtEJ1hwEgpr7wx3uWDHPwBKqTA7VadVm/sjrqNMD9ZRX2Ed8H 0i66VnfnzIPECwR3gqS5TQeFwgOkEax2Z9cQOx+iqz/8GbO1w/F0fzpdChg3iBAB3wt0 FZajqOdZ9VOz5XsA+F5YmEbMFdzKnVDh9x7z8XxeDyRGQMe8BikQOWhtaRFS8+JL1Bst yG9B1rmxHuBTeQ0s5rIqlgyx6Xt5/S/0qJSZsP/IYCA5SywMGbpAaaI5wRCf7SsyFDw+ Q1dIATKxYwuDnAQmzax+sjZtvcQuPp5GJOpSeFUoD2247Gp/hFccN7ovcZ+++svE8chK dluQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=c2sKfjCE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a141si13408218pfd.49.2019.08.27.00.25.34; Tue, 27 Aug 2019 00:25:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=c2sKfjCE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726333AbfH0HYn (ORCPT + 99 others); Tue, 27 Aug 2019 03:24:43 -0400 Received: from mail-wm1-f66.google.com ([209.85.128.66]:52366 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725825AbfH0HYn (ORCPT ); Tue, 27 Aug 2019 03:24:43 -0400 Received: by mail-wm1-f66.google.com with SMTP id o4so1903394wmh.2 for ; Tue, 27 Aug 2019 00:24:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=UdTlWBq67pFf3BFWbVXOJiSHrcyQJ0PaYrdJV+R7zcU=; b=c2sKfjCE0fItXLJHeMv+GFYS7uvSIGQkogDsakG8O19i0NlYaODm94EwmQ50o38LQV mA1kQOnvXiyJA1mPxiSFVX4Bv3lBUKvphBE7sdggGArrhBMi9oi0IcaEwxEw1XPhn+b1 0zFEjRx0AmbyZHeENOesuUaV9+5nCkMBca0RiN6su84zLI7dK5FecyQJg+7tP9T07fzZ SC1h37BQEd/JC2HWLiDjeC4CBIt2vM2Wkogf/2ZzHjUZ4Zg3maxR2QnfpyVb+EX6Zv06 tr8TbSn7OTqoWAk9sgTp2GvblzPAqcudIgCUAtfjHhPouWeNAWyZl61dM0JrV4fjmrmW RqnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=UdTlWBq67pFf3BFWbVXOJiSHrcyQJ0PaYrdJV+R7zcU=; b=CbbpoHNUOstskjROCmWRcdJJwpgKTmKwD6Jc8yrvItA87vTzLQWNWno1sLbXlMkX8G 3P10OIbgaNLRSoUL1IyO0lhHLwdknivvCk1RV/IW9NPqQLh/sOMjhWHc0uAkkDct+YZE DacbdVG/cvmxJr5m30at+yvbdzM2LpnwcRnwk7R42ORmhJhVZ1nNfTFEVNDPOLcPlnvT yCjNDjzEC+n7VzBALdUOXNDXfNGbjIJ4r6NJ1/ShfFaQjg8/OCJN3OQ/ifz2xCm8crV4 O5lnOVdlN9BI7isr1PEdfJFpiQXy73NXJFK1HQUo4G/o2K0vnyjxA5nnUNWJSmKa4/ym WEzA== X-Gm-Message-State: APjAAAW7YNp5hShZy6FamRbQIXokGBPbmhp2sCEa0ug+z8lL1c1knWDx qWl2bucxiSPczDkL4l57TGG4HQ== X-Received: by 2002:a1c:dd82:: with SMTP id u124mr26802396wmg.89.1566890680428; Tue, 27 Aug 2019 00:24:40 -0700 (PDT) Received: from holly.lan (cpc141214-aztw34-2-0-cust773.18-1.cable.virginm.net. [86.9.19.6]) by smtp.gmail.com with ESMTPSA id 16sm2352759wmx.45.2019.08.27.00.24.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Aug 2019 00:24:39 -0700 (PDT) Date: Tue, 27 Aug 2019 08:24:38 +0100 From: Daniel Thompson To: Doug Anderson Cc: Jason Wessel , kgdb-bugreport@lists.sourceforge.net, Christophe Leroy , LKML , Catalin Marinas , Will Deacon Subject: Re: [PATCH v2] kdb: Fix stack crawling on 'running' CPUs that aren't the master Message-ID: <20190827072438.fwgggambxp34onid@holly.lan> References: <20190731183732.178134-1-dianders@chromium.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 26, 2019 at 03:25:43PM -0700, Doug Anderson wrote: > Jason / Daniel, > > On Wed, Jul 31, 2019 at 11:38 AM Douglas Anderson wrote: > > > > In kdb when you do 'btc' (back trace on CPU) it doesn't necessarily > > give you the right info. Specifically on many architectures > > (including arm64, where I tested) you can't dump the stack of a > > "running" process that isn't the process running on the current CPU. > > This can be seen by this: > > > > echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT > > # wait 2 seconds > > g > > > > Here's what I see now on rk3399-gru-kevin. I see the stack crawl for > > the CPU that handled the sysrq but everything else just shows me stuck > > in __switch_to() which is bogus: > > > > ====== > > > > [0]kdb> btc > > btc: cpu status: Currently on cpu 0 > > Available cpus: 0, 1-3(I), 4, 5(I) > > Stack traceback for pid 0 > > 0xffffff801101a9c0 0 0 1 0 R 0xffffff801101b3b0 *swapper/0 > > Call trace: > > dump_backtrace+0x0/0x138 > > ... > > kgdb_compiled_brk_fn+0x34/0x44 > > ... > > sysrq_handle_dbg+0x34/0x5c > > Stack traceback for pid 0 > > 0xffffffc0f175a040 0 0 1 1 I 0xffffffc0f175aa30 swapper/1 > > Call trace: > > __switch_to+0x1e4/0x240 > > 0xffffffc0f65616c0 > > Stack traceback for pid 0 > > 0xffffffc0f175d040 0 0 1 2 I 0xffffffc0f175da30 swapper/2 > > Call trace: > > __switch_to+0x1e4/0x240 > > 0xffffffc0f65806c0 > > Stack traceback for pid 0 > > 0xffffffc0f175b040 0 0 1 3 I 0xffffffc0f175ba30 swapper/3 > > Call trace: > > __switch_to+0x1e4/0x240 > > 0xffffffc0f659f6c0 > > Stack traceback for pid 1474 > > 0xffffffc0dde8b040 1474 727 1 4 R 0xffffffc0dde8ba30 bash > > Call trace: > > __switch_to+0x1e4/0x240 > > __schedule+0x464/0x618 > > 0xffffffc0dde8b040 > > Stack traceback for pid 0 > > 0xffffffc0f17b0040 0 0 1 5 I 0xffffffc0f17b0a30 swapper/5 > > Call trace: > > __switch_to+0x1e4/0x240 > > 0xffffffc0f65dd6c0 > > > > === > > > > The problem is that 'btc' eventually boils down to > > show_stack(task_struct, NULL); > > > > ...and show_stack() doesn't work for "running" CPUs because their > > registers haven't been stashed. > > > > On x86 things might work better (I haven't tested) because kdb has a > > special case for x86 in kdb_show_stack() where it passes the stack > > pointer to show_stack(). This wouldn't work on arm64 where the stack > > crawling function seems needs the "fp" and "pc", not the "sp" which is > > presumably why arm64's show_stack() function totally ignores the "sp" > > parameter. > > > > NOTE: we _can_ get a good stack dump for all the cpus if we manually > > switch each one to the kdb master and do a back trace. AKA: > > cpu 4 > > bt > > ...will give the expected trace. That's because now arm64's > > dump_backtrace will now see that "tsk == current" and go through a > > different path. > > > > In this patch I fix the problems by catching a request to stack crawl > > a task that's running on a CPU and then I ask that CPU to do the stack > > crawl. > > > > NOTE: this will (presumably) change what stack crawls are printed for > > x86 machines. Now kdb functions will show up in the stack crawl. > > Presumably this is OK but if it's not we can go back and add a special > > case for x86 again. > > > > Signed-off-by: Douglas Anderson > > --- > > > > Changes in v2: > > - Totally new approach; now arch agnostic. > > > > kernel/debug/debug_core.c | 5 +++++ > > kernel/debug/debug_core.h | 1 + > > kernel/debug/kdb/kdb_bt.c | 44 ++++++++++++++++++++++++++++++--------- > > 3 files changed, 40 insertions(+), 10 deletions(-) > > Did either of you have thoughts on this patch? Hi Doug Sorry about this. It got backlogged during a recent holiday... it's still on the list. I took a quick look a week or so ago but at this point I haven't yet tested out the behaviour on x86 and I wanted to do a closer review to check I am happy with the barriering. Daniel. > > -Doug