From: Nicolai Stange
To: Joe Lawrence
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    live-patching@vger.kernel.org, Torsten Duwe, Michael Ellerman,
    Jiri Kosina, Balbir Singh
Subject: Re: ppc64le reliable stack unwinder and scheduled tasks
Date: Fri, 11 Jan 2019 01:00:38 +0100
Message-ID: <87bm4ocbbt.fsf@suse.de>
In-Reply-To: <7f468285-b149-37e2-e782-c9e538b997a9@redhat.com>
References: <7f468285-b149-37e2-e782-c9e538b997a9@redhat.com>

Hi Joe,

Joe Lawrence writes:

> tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks
> about?

If I'm reading the code in _switch() correctly, the first frame is
completely uninitialized except for the pointer back to the caller's
stack frame.

For completeness: _switch() saves the return address, i.e. the link
register, into its parent's stack frame, as is mandated by the ABI and
consistent with your findings below: it's always the second stack frame
where the return address into __switch_to() is kept.

>
>
> Example 1 (RHEL-7)
> ==================
>
> crash> struct task_struct.thread c00000022fd015c0 | grep ksp
>     ksp = 0xc0000000288af9c0
>
> crash> rd 0xc0000000288af9c0 -e 0xc0000000288b0000
>
>  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
>     sp[0]:
>
>     c0000000288af9c0:  c0000000288afb90 0000000000dd0000   ...(............
>     c0000000288af9d0:  c000000000002a94 c000000001c60a00   .*..............
>
> crash> sym c000000000002a94
> c000000000002a94 (T) hardware_interrupt_common+0x114

So that c000000000002a94 certainly wasn't stored by _switch(). I think
what might have happened is that the switching frame aliased with some
prior interrupt frame as set up by hardware_interrupt_common().

The interrupt and switching frames seem to share a common layout as far
as the lower STACK_FRAME_OVERHEAD + sizeof(struct pt_regs) bytes are
concerned.

That address into hardware_interrupt_common() could have been written by
the do_IRQ() called from there.

>     c0000000288af9e0:  c000000001c60a80 0000000000000000   ................
>     c0000000288af9f0:  c0000000288afbc0 0000000000dd0000   ...(............
>     c0000000288afa00:  c0000000014322e0 c000000001c60a00   ."C.............
>     c0000000288afa10:  c0000002303ae380 c0000002303ae380   ..:0......:0....
>     c0000000288afa20:  7265677368657265 0000000000002200   erehsger."......
>
> Uh-oh...
>
> /* Mark stacktraces with exception frames as unreliable.
> */
> stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER

Aliasing of the switching stack frame with some prior interrupt stack
frame would explain why that STACK_FRAME_REGS_MARKER is still found on
the stack, i.e. it's a leftover.

For testing, could you try whether clearing the word at
STACK_FRAME_MARKER from _switch() helps?

I.e. something like (completely untested):

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 435927f549c4..b747d0647ec4 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -596,6 +596,10 @@ _GLOBAL(_switch)
 	SAVE_8GPRS(14, r1)
 	SAVE_10GPRS(22, r1)
 	std	r0,_NIP(r1)	/* Return to switch caller */
+
+	li	r23,0
+	std	r23,96(r1)	/* 96 == STACK_FRAME_MARKER * sizeof(long) */
+
 	mfcr	r23
 	std	r23,_CCR(r1)
 	std	r1,KSP(r3)	/* Set old stack pointer */

>
> save_stack_trace_tsk_reliable
> =============================
>
> arch/powerpc/kernel/stacktrace.c :: save_stack_trace_tsk_reliable() does
> take into account the first stackframe, but only to verify that the link
> register is indeed pointing at kernel code address.

It's actually the other way around:

	if (!firstframe && !__kernel_text_address(ip))
		return 1;

So the address gets sanitized only if it's _not_ coming from the first
frame.

Thanks,

Nicolai

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
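
The aliasing theory above can be modelled in plain userspace C. This is
only a toy sketch under stated assumptions, not kernel code:
interrupt_entry(), switch_frame(), frame_looks_like_exception(),
stale_marker_survives() and FRAME_WORDS are hypothetical names made up
for illustration. STACK_FRAME_MARKER == 12 follows from the 96-byte
offset in the patch, and the marker value is the one visible in the
dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* From the thread: 96 == STACK_FRAME_MARKER * sizeof(long) on ppc64,
 * and the dump shows the marker word 0x7265677368657265. */
#define STACK_FRAME_MARKER      12
#define STACK_FRAME_REGS_MARKER UINT64_C(0x7265677368657265)

/* Hypothetical: just enough 64-bit words to cover the marker slot. */
#define FRAME_WORDS 16

/* Hypothetical stand-in for interrupt entry: tags the frame as an
 * exception frame by writing the marker word. */
static void interrupt_entry(uint64_t *frame)
{
	frame[STACK_FRAME_MARKER] = STACK_FRAME_REGS_MARKER;
}

/* Hypothetical stand-in for _switch(): only the back chain is written.
 * With clear_marker set, it also zeroes the marker word, which is what
 * the untested patch proposes. */
static void switch_frame(uint64_t *frame, uint64_t back_chain,
			 bool clear_marker)
{
	frame[0] = back_chain;
	if (clear_marker)
		frame[STACK_FRAME_MARKER] = 0;
}

/* Same comparison as the quoted unwinder check. */
static bool frame_looks_like_exception(const uint64_t *frame)
{
	return frame[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER;
}

/* Reuse one frame address first for an interrupt, then for a switch,
 * and report whether the frame is still flagged as an exception frame. */
static bool stale_marker_survives(bool clear_marker)
{
	uint64_t stack[FRAME_WORDS];

	memset(stack, 0, sizeof(stack));
	interrupt_entry(stack);
	switch_frame(stack, UINT64_C(0xc0000000288afb90), clear_marker);
	return frame_looks_like_exception(stack);
}
```

Without the clearing store, the leftover marker keeps the reused frame
looking like an exception frame; with it, the check no longer fires.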