Date: Wed, 3 Feb 2021 21:44:48 -0500
From: Steven Rostedt
To: Ivan Babrou
Cc: kernel-team, Ignat Korchagin, Hailong liu, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrew Morton, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, "H.
Peter Anvin", Josh Poimboeuf, Miroslav Benes, "Peter Zijlstra (Intel)", Julien Thierry, Jiri Slaby, kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel, Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh, Robert Richter, "Joel Fernandes (Google)", Mathieu Desnoyers, Linux Kernel Network Developers, bpf@vger.kernel.org
Subject: Re: BUG: KASAN: stack-out-of-bounds in unwind_next_frame+0x1df5/0x2650
Message-ID: <20210203214448.2703930e@oasis.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2 Feb 2021 19:09:44 -0800
Ivan Babrou wrote:

> On Thu, Jan 28, 2021 at 7:35 PM Ivan Babrou wrote:
> >
> > Hello,
> >
> > We've noticed the following regression in Linux 5.10 branch:
> >
> > [ 128.367231][ C0]
> > ==================================================================
> > [ 128.368523][ C0] BUG: KASAN: stack-out-of-bounds in
> > unwind_next_frame (arch/x86/kernel/unwind_orc.c:371

The bug is a stack-out-of-bounds error in unwind_orc.c, right?

> > arch/x86/kernel/unwind_orc.c:544)
> > [ 128.369744][ C0] Read of size 8 at addr ffff88802fceede0 by task
> > kworker/u2:2/591
> > [ 128.370916][ C0]
> > [ 128.371269][ C0] CPU: 0 PID: 591 Comm: kworker/u2:2 Not tainted
> > 5.10.11-cloudflare-kasan-2021.1.15 #1
> > [ 128.372626][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX,
> > 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > [ 128.374346][ C0] Workqueue: writeback wb_workfn (flush-254:0)
> > [ 128.375275][ C0] Call Trace:
> > [ 128.375763][ C0]
> > [ 128.376221][ C0] dump_stack+0x7d/0xa3
> > [ 128.376843][ C0] print_address_description.constprop.0+0x1c/0x210

[ snip ? results ]

> > (arch/x86/kernel/unwind_orc.c:371 arch/x86/kernel/unwind_orc.c:544)

[ snip ]

> > [ 128.381736][ C0] kasan_report.cold+0x1f/0x37

[ snip ]

> > [ 128.383192][ C0] unwind_next_frame+0x1df5/0x2650

[ snip ]

> > [ 128.391550][ C0] arch_stack_walk+0x8d/0xf0

[ snip ]

> > [ 128.392807][ C0] stack_trace_save+0x96/0xd0

[ snip ]

> > arch/x86/include/asm/irq_stack.h:77 arch/x86/kernel/irq_64.c:77)

[ snip ]

> > [ 128.399759][ C0] kasan_save_stack+0x20/0x50

[ snip ]

> > [ 128.427691][ C0] kasan_set_track+0x1c/0x30
> > [ 128.428366][ C0] kasan_set_free_info+0x1b/0x30
> > [ 128.429113][ C0] __kasan_slab_free+0x110/0x150
> > [ 128.429838][ C0] slab_free_freelist_hook+0x66/0x120
> > [ 128.430628][ C0] kfree+0xbf/0x4d0

[ snip the rest ]

> > [ 128.441287][ C0] RIP: 0010:skcipher_walk_next
> > (crypto/skcipher.c:322 crypto/skcipher.c:384)

Why do we have an RIP in skcipher_walk_next, if it's the unwinder that
had a bug? Or are they related?

Or did skcipher_walk_next trigger something in KASAN which did a stack
walk via the unwinder, and that caused another issue?

Looking at the unwinder code in question, we have:

static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
			     unsigned long *ip, unsigned long *sp)
{
	struct pt_regs *regs = (struct pt_regs *)addr;

	/* x86-32 support will be more complicated due to the &regs->sp hack */
	BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_32));

	if (!stack_access_ok(state, addr, sizeof(struct pt_regs)))
		return false;

	*ip = regs->ip;
	*sp = regs->sp; <- pointer to here
	return true;
}

and the caller of the above static function:

	case UNWIND_HINT_TYPE_REGS:
		if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) {
			orc_warn_current("can't access registers at %pB\n",
					 (void *)orig_ip);
			goto err;
		}

Could it possibly be that there's some magic canary on the stack that
causes KASAN to trigger if you read it?
For example, there's this in the stack tracer:

kernel/trace/trace_stack.c: check_stack()

	while (i < stack_trace_nr_entries) {
		int found = 0;

		stack_trace_index[x] = this_size;
		p = start;

		for (; p < top && i < stack_trace_nr_entries; p++) {
			/*
			 * The READ_ONCE_NOCHECK is used to let KASAN know that
			 * this is not a stack-out-of-bounds error.
			 */
			if ((READ_ONCE_NOCHECK(*p)) == stack_dump_trace[i]) {
				stack_dump_trace[x] = stack_dump_trace[i++];
				this_size = stack_trace_index[x++] =
					(top - p) * sizeof(unsigned long);
				found = 1;

That is because I read the entire stack frame looking for values, and I
know where the top of the stack is, and will not go past it. But it too
triggered a stack-out-of-bounds error, which required the above
READ_ONCE_NOCHECK() to quiet KASAN.

Not to mention there are already some READ_ONCE_NOCHECK() calls in the
unwinder. Maybe this too is required?

Would this work?

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 73f800100066..22eaf3683c2a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
 	if (!stack_access_ok(state, addr, sizeof(struct pt_regs)))
 		return false;
 
-	*ip = regs->ip;
-	*sp = regs->sp;
+	*ip = READ_ONCE_NOCHECK(regs->ip);
+	*sp = READ_ONCE_NOCHECK(regs->sp);
 	return true;
 }
 

-- Steve
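
For readers who want to try the idea outside the kernel, here is a rough userspace sketch of the pattern the patch relies on: the raw stack-word load goes through one small helper that the address sanitizer does not instrument, while the caller stays responsible for the bounds. This is only an analogy. The names read_word_nocheck() and count_matches_nocheck() are invented for this example and are not kernel APIs, and the kernel's READ_ONCE_NOCHECK() has its own implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for READ_ONCE_NOCHECK(): one volatile
 * load, done in a function the address sanitizer is told not to
 * instrument (attribute supported by GCC and Clang).
 */
__attribute__((no_sanitize_address, noinline))
static unsigned long read_word_nocheck(const unsigned long *p)
{
	return *(const volatile unsigned long *)p;
}

/*
 * Scan the words in [start, top) and count those equal to 'value'.
 * The caller guarantees the bounds, much like check_stack() relies on
 * knowing where 'top' is and never reading past it.
 */
static size_t count_matches_nocheck(const unsigned long *start,
				    const unsigned long *top,
				    unsigned long value)
{
	size_t found = 0;

	assert(start != NULL && top != NULL && start <= top);

	for (const unsigned long *p = start; p < top; p++) {
		if (read_word_nocheck(p) == value)
			found++;
	}
	return found;
}
```

Keeping the uninstrumented part down to a single load means every other access in the scan is still checked, which is presumably why the proposed patch wraps only the two regs reads rather than the whole function.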