From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jiri Slaby, Ingo Molnar, Sasha Levin
Subject: [PATCH 4.14 014/125] x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels
Date: Tue, 3 Nov 2020 21:36:31 +0100
Message-Id: <20201103203158.817761247@linuxfoundation.org>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201103203156.372184213@linuxfoundation.org>
References: <20201103203156.372184213@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Jiri Slaby

[ Upstream commit f2ac57a4c49d40409c21c82d23b5706df9b438af ]

GCC 10 optimizes the scheduler code differently than its predecessors.

When CONFIG_DEBUG_SECTION_MISMATCH=y, the Makefile forces GCC not
to inline some functions (-fno-inline-functions-called-once). Before GCC
10, "no-inlined" __schedule() starts with the usual prologue:

  push %bp
  mov %sp, %bp

So the ORC unwinder simply picks stack pointer from %bp and
unwinds from __schedule() just perfectly:

  $ cat /proc/1/stack
  [<0>] ep_poll+0x3e9/0x450
  [<0>] do_epoll_wait+0xaa/0xc0
  [<0>] __x64_sys_epoll_wait+0x1a/0x20
  [<0>] do_syscall_64+0x33/0x40
  [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

But now, with GCC 10, there is no %bp prologue in __schedule():

  $ cat /proc/1/stack
  <nothing>

The ORC entry of the point in __schedule() is:

  sp:sp+88 bp:last_sp-48 type:call end:0

In this case, nobody subtracts sizeof "struct inactive_task_frame" in
__unwind_start(). The struct is put on the stack by __switch_to_asm() and
only then __switch_to_asm() stores %sp to task->thread.sp. But we start
unwinding from a point in __schedule() (stored in frame->ret_addr by
'call') and not in __switch_to_asm().

So for these example values in __unwind_start():

  sp=ffff94b50001fdc8 bp=ffff8e1f41d29340 ip=__schedule+0x1f0

The stack is:

  ffff94b50001fdc8: ffff8e1f41578000 # struct inactive_task_frame
  ffff94b50001fdd0: 0000000000000000
  ffff94b50001fdd8: ffff8e1f41d29340
  ffff94b50001fde0: ffff8e1f41611d40 # ...
  ffff94b50001fde8: ffffffff93c41920 # bx
  ffff94b50001fdf0: ffff8e1f41d29340 # bp
  ffff94b50001fdf8: ffffffff9376cad0 # ret_addr (and end of the struct)

0xffffffff9376cad0 is __schedule+0x1f0 (after the call to
__switch_to_asm).
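
To make the dump above easier to follow, here is a minimal user-space
sketch (not kernel code) of the frame layout. It assumes the frame is
seven 64-bit slots, as the seven dumped words suggest, with the first
four taken to be r15..r12; the struct and helper names are hypothetical
and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model of the frame in the dump above.
 * ASSUMPTION: seven 8-byte slots ending in bx, bp, ret_addr, as annotated
 * in the dump; the first four slots are assumed to be r15..r12.
 */
struct inactive_task_frame_model {
	uint64_t r15;
	uint64_t r14;
	uint64_t r13;
	uint64_t r12;
	uint64_t bx;
	uint64_t bp;
	uint64_t ret_addr;
};

/* Address of the ret_addr slot for a given saved thread.sp. */
static uint64_t frame_ret_addr_slot(uint64_t thread_sp)
{
	return thread_sp + offsetof(struct inactive_task_frame_model, ret_addr);
}

/* First address past the frame, where __schedule()'s own stack data begins. */
static uint64_t frame_end(uint64_t thread_sp)
{
	return thread_sp + sizeof(struct inactive_task_frame_model);
}
```

With thread.sp = 0xffff94b50001fdc8, frame_ret_addr_slot() gives
0xffff94b50001fdf8 (the slot holding __schedule+0x1f0) and frame_end()
gives 0xffff94b50001fe00, the first address that belongs to __schedule()
itself.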

Now follow those 88 bytes from the ORC entry (sp+88). The entry is
correct, __schedule() really pushes 48 bytes (8*6) + 32 bytes via subq
to store some local values (like 4U below). So to unwind, look at the
offset 88-sizeof(long) = 0x50 from here:

  ffff94b50001fe00: ffff8e1f41578618
  ffff94b50001fe08: 00000cc000000255
  ffff94b50001fe10: 0000000500000004
  ffff94b50001fe18: 7793fab6956b2d00 # NOTE (see below)
  ffff94b50001fe20: ffff8e1f41578000
  ffff94b50001fe28: ffff8e1f41578000
  ffff94b50001fe30: ffff8e1f41578000
  ffff94b50001fe38: ffff8e1f41578000
  ffff94b50001fe40: ffff94b50001fed8
  ffff94b50001fe48: ffff8e1f41577ff0
  ffff94b50001fe50: ffffffff9376cf12

Here                ^^^^^^^^^^^^^^^^ is the correct ret addr from
__schedule(). It translates to schedule+0x42 (insn after a call to
__schedule()).

BUT, unwind_next_frame() tries to take the address starting from
0xffff94b50001fdc8. That is exactly from thread.sp+88-sizeof(long) =
0xffff94b50001fdc8+88-8 = 0xffff94b50001fe18, which is garbage marked as
NOTE above. So this quits the unwinding as 7793fab6956b2d00 is obviously
not a kernel address.

There was a fix to skip 'struct inactive_task_frame' in
unwind_get_return_address_ptr in the following commit:

  187b96db5ca7 ("x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks")

But we need to skip the struct already in the unwinder proper. So
subtract the size (increase the stack pointer) of the structure in
__unwind_start() directly. This allows for removal of the code added by
commit 187b96db5ca7 completely, as the address is now at
'(unsigned long *)state->sp - 1', the same as in the generic case.

[ mingo: Cleaned up the changelog a bit, for better readability. ]

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Bug: https://bugzilla.suse.com/show_bug.cgi?id=1176907
Signed-off-by: Jiri Slaby
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20201014053051.24199-1-jslaby@suse.cz
Signed-off-by: Sasha Levin
---
 arch/x86/kernel/unwind_orc.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index a5e2ce931f692..e64c5b78fbfd3 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -255,19 +255,12 @@ EXPORT_SYMBOL_GPL(unwind_get_return_address);
 
 unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
 {
-	struct task_struct *task = state->task;
-
 	if (unwind_done(state))
 		return NULL;
 
 	if (state->regs)
 		return &state->regs->ip;
 
-	if (task != current && state->sp == task->thread.sp) {
-		struct inactive_task_frame *frame = (void *)task->thread.sp;
-		return &frame->ret_addr;
-	}
-
 	if (state->sp)
 		return (unsigned long *)state->sp - 1;
 
@@ -550,7 +543,7 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task,
 	} else {
 		struct inactive_task_frame *frame = (void *)task->thread.sp;
 
-		state->sp = task->thread.sp;
+		state->sp = task->thread.sp + sizeof(*frame);
 		state->bp = READ_ONCE_NOCHECK(frame->bp);
 		state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
 		state->signal = (void *)state->ip == ret_from_fork;
-- 
2.27.0
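
For readers who want to check the arithmetic in the changelog, the
following user-space sketch (not kernel code) reproduces the
return-address lookup with and without the __unwind_start() change. The
constants come from the example values in the changelog; the 56-byte
frame size is an assumption based on the seven words in the dump, and
all helper names are hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: struct inactive_task_frame is 7 slots * 8 bytes in the dump. */
#define FRAME_SIZE_MODEL 56u

/* From the ORC entry quoted in the changelog: "sp:sp+88". */
#define ORC_SP_OFFSET 88u

/*
 * For a 'call'-type ORC entry, the previous stack pointer is
 * state_sp + sp_offset and the return address sits one long below it.
 */
static uint64_t ret_addr_slot(uint64_t state_sp, uint64_t orc_sp_offset)
{
	return state_sp + orc_sp_offset - sizeof(uint64_t);
}

/* Starting stack pointer before the patch: the raw thread.sp. */
static uint64_t start_sp_before_fix(uint64_t thread_sp)
{
	return thread_sp;
}

/* Starting stack pointer after the patch: thread.sp plus the frame size. */
static uint64_t start_sp_after_fix(uint64_t thread_sp)
{
	return thread_sp + FRAME_SIZE_MODEL;
}
```

With thread.sp = 0xffff94b50001fdc8, the pre-patch start yields slot
0xffff94b50001fe18 (the value marked NOTE), while the post-patch start
yields 0xffff94b50001fe50, which holds schedule+0x42.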