Received: by 2002:a05:7412:40d:b0:e2:908c:2ebd with SMTP id 13csp526273rdf; Tue, 21 Nov 2023 08:54:42 -0800 (PST) X-Google-Smtp-Source: AGHT+IFTS+YWjevV62fb0JBe/79Jzzdr3bhIr+s8cyPhR6nTlGCzbEJUnK37AMXyUvz/GXRHjSL4 X-Received: by 2002:a05:6a21:3384:b0:187:f7d3:fe4 with SMTP id yy4-20020a056a21338400b00187f7d30fe4mr9349752pzb.11.1700585682153; Tue, 21 Nov 2023 08:54:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700585682; cv=none; d=google.com; s=arc-20160816; b=d841UI4gkNgxOq12dvzY94CteFEjqPRfXJaXqNNrn22kE4Tlg7DZtjgqWe9LtQP+Ll xQRdmeKjEUqAppsKTVAJKy4WL4cpDqR+y91Ooo9AnMZa+e9evCyuzO/6V+37bGDiq9Rx bKBsd1f95j9ES3BaemFjQB9gx+aZvD/FcKN4kmHnkbXil9tEFPPw9ikaGnkKxYgugERV dRjKtM7GI9bHWHMwe5M8qcgW3kBwCwfVzokyOoJA+nY1+KFbJanK22mbkkzZ0JFO0VLt Erkze4PBgKYtqaqH0WKL15X767UX777ZAx57Y2uJYZuMI7d+UkhYXHSf+66ljgssQG9g ubbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=En4UnZqozE7b8mvHdKrVO8R6ObLgjNaox2sC+SuNUH8=; fh=H96IRDLeBZh4CbvCU+/os/Gxo1aS5vf+zy4tsH7m93w=; b=A+VOA7T81oTczbu9hwmJMpCLPFeoiQB6ajHXhvGeDvtRNwsLVfcSoh4NI4T8HbexDh Rgn/2iM2QLjL0wghRt5nwcZCQW3FkHEpt9IrW+Yh3+yqG06HUnrNZ8UCCOVYsxQIuV9X rboqhDtBSg83hcN8rjLmoJaTGEIMNgrLUFp0khzdp9oDkd6iHB7DxBvDKY2hTHINxRVB tk3ezh5QqwvZ2sByYwyIPv8p1eGDKxSL2IpvGWPER4UweNwQAcmEkbH5P5G5u/JfHRsZ o09lCmFqp8Bq9jj07vLc2W76t8IZWN1TwshYi7Qosucs88WI94ISiigOg9O9Lx3w9buW pgPQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=NeTiS0S0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id y8-20020a056a00180800b006cb6cc056f8si6933962pfa.334.2023.11.21.08.54.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Nov 2023 08:54:42 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=NeTiS0S0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 30910801D143; Tue, 21 Nov 2023 08:54:21 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233504AbjKUQyC (ORCPT + 99 others); Tue, 21 Nov 2023 11:54:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230340AbjKUQyB (ORCPT ); Tue, 21 Nov 2023 11:54:01 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 378B71A3 for ; Tue, 21 Nov 2023 08:53:57 -0800 (PST) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 15F2FC433C7; Tue, 21 Nov 2023 16:53:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1700585636; bh=8Wh9k1lYQvJfipCVzoypNryh9Xz8F8SP6BIc+Oy13+A=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=NeTiS0S0Yhb8TJaxoKQRvVtc0hEmgZTquvYXnmHKZ514rphvbbJW8g0zXUuZ++s2/ zYTny8iPfnZVaPsYLN3OP8wTcEhrU5DHN+L8cJDovIVYqxbKmE4do078ctqyM0K0VF 0rDNPIHEHn9uDVwE3EafNb2E9RfT3nCvh8j9PEESw0XKesPnlqTLUEmXt2qmkyjL5I EhXRZc9ek0T62U3SFPtJVsz0TzadLFKDOe4tgquPG0OSEbTrRCvW13LyPE2Q+vk3VL brekgYmZ8fk7ggD3klAd0PyLfG0PRKhChmkdTcbkZG2MP89r3lNQ+61iLaw0oNmjxx KYYT+XWw8Z+Jw== Date: Tue, 21 Nov 2023 16:53:51 +0000 From: Lee Jones To: Jiri Olsa Cc: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , x86@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: Re: [REPORT] BPF: Reproducible triggering of BUG() from userspace PoC Message-ID: <20231121165351.GF173820@google.com> References: <20231108154626.GB8909@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Tue, 21 Nov 2023 08:54:21 -0800 (PST) On Thu, 09 Nov 2023, Jiri Olsa wrote: > On Wed, Nov 08, 2023 at 03:46:26PM +0000, Lee Jones wrote: > > Good afternoon, > > > > After coming across a recent Syzkaller report [0] I thought I'd take > > some time to firstly reproduce the issue, then see if there was a > > trivial way to mitigate it. The report suggests that a BUG() in > > prog_array_map_poke_run() [1] can be trivially and reliably triggered > > from userspace using the PoC provided [2]. > > > > ret = bpf_arch_text_poke(poke->tailcall_bypass, > > BPF_MOD_JUMP, > > old_bypass_addr, > > poke->bypass_addr); > > BUG_ON(ret < 0 && ret != -EINVAL); > > > > Indeed the PoC does seem to be able to consistently trigger the BUG(), > > not only on the reported kernel (v6.1), but also on linux-next. I went > > to the trouble of checking LORE, but failed to find any patches which > > may be attempting to fix this. > > > > kernel BUG at kernel/bpf/arraymap.c:1094! > > invalid opcode: 0000 [#1] PREEMPT SMP KASAN > > CPU: 5 PID: 45 Comm: kworker/5:0 Not tainted 6.6.0-rc3-next-20230929-dirty #74 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 > > Workqueue: events prog_array_map_clear_deferred > > RIP: 0010:prog_array_map_poke_run+0x6b4/0x6d0 > > Code: ff 0f 0b e8 1e 27 e1 ff 48 c7 c7 60 80 93 85 48 c7 c6 00 7f 93 85 48 c7 c2 bb c2 39 86 b9 45 04 00 00 45 89 f8 e8 9c 890 > > RSP: 0018:ffffc9000036fb50 EFLAGS: 00010246 > > RAX: 0000000000000044 RBX: ffff88811f337490 RCX: 63af48a1314f9900 > > RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 > > RBP: ffffc9000036fbe8 R08: ffffffff815c23c5 R09: 1ffff11084c14eba > > R10: dfffe91084c14ebc R11: ffffed1084c14ebb R12: ffff888116517800 > > R13: dffffc0000000000 R14: ffff888125a1a400 R15: 00000000fffffff0 > > FS: 0000000000000000(0000) GS:ffff888426080000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000000004ab678 CR3: 0000000122ac4000 CR4: 0000000000350eb0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > > > ? __die_body+0x92/0xf0 > > ? die+0xa2/0xe0 > > ? do_trap+0x12f/0x370 > > ? handle_invalid_op+0xa6/0x140 > > ? handle_invalid_op+0xdf/0x140 > > ? prog_array_map_poke_run+0x6b4/0x6d0 > > ? prog_array_map_poke_run+0x6b4/0x6d0 > > ? exc_invalid_op+0x32/0x50 > > ? asm_exc_invalid_op+0x1b/0x20 > > ? __wake_up_klogd+0xd5/0x110 > > ? prog_array_map_poke_run+0x6b4/0x6d0 > > ? bpf_prog_6781ebc2dae4bad9+0xb/0x53 > > fd_array_map_delete_elem+0x152/0x250 > > prog_array_map_clear_deferred+0xf6/0x210 > > ? __bpf_array_map_seq_show+0xa40/0xa40 > > ? kick_pool+0x164/0x350 > > ? process_one_work+0x57a/0xd00 > > process_one_work+0x5e4/0xd00 > > worker_thread+0x9cf/0xea0 > > kthread+0x2b4/0x350 > > ? pr_cont_work+0x580/0x580 > > ? kthread_blkcg+0xd0/0xd0 > > ret_from_fork+0x4a/0x80 > > ? kthread_blkcg+0xd0/0xd0 > > ret_from_fork_asm+0x11/0x20 > > > > Modules linked in: > > ---[ end trace 0000000000000000 ]--- > > > > However, with my very limited BPF subsystem knowledge I was unable to > > trivially fix the issue. Hopefully some knowledgable person would be > > kind enough to provide me with some pointers. > > > > bpf_arch_text_poke() seems to be returning -EBUSY due to a negative > > memcmp() result from [3]. > > > > ret = -EBUSY; > > mutex_lock(&text_mutex); > > if (memcmp(ip, old_insn, X86_PATCH_SIZE)) { > > goto out; > > [...] > > > > When spitting out the memory at those locations, this is the result: > > > > ip: e9 06 00 00 00 > > old_insn: 0f 1f 44 00 00 > > nop_insn: 0f 1f 44 00 00 > > > > As you can see, the information stored in 'ip' does not match that of > > the data stored in 'old_insn', causing bpf_arch_text_poke() to return > > early with the error -EBUSY, suggesting that the data pointed to by > > 'old_insn', and by extension 'prog' should have been changed when > > emit_call()ing, to the value of 'ip', but wasn't. > > hi, > thanks for the report.. I can reproduce that easily with [2] > > AFAICS it looks like previous update fails because we use bpf_arch_text_poke, > which can't find poke->tailcall_bypass value as bpf program symbol and fails > with -EINVAL > > then the following update fails to find expected jmp/nop because it was never > updated.. I think we should use __bpf_arch_text_poke like we do in > bpf_tail_call_direct_fixup and skip the bpf symbol check > > with the patch below I can't reproduce the issue anymore, I'll do some more > checking though The patch below seems to be working for me also: Tested-by: Lee Jones > --- > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > index 8c10d9abc239..35c2988caf29 100644 > --- a/arch/x86/net/bpf_jit_comp.c > +++ b/arch/x86/net/bpf_jit_comp.c > @@ -391,8 +391,8 @@ static int emit_jump(u8 **pprog, void *func, void *ip) > return emit_patch(pprog, func, ip, 0xE9); > } > > -static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, > - void *old_addr, void *new_addr) > +int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, > + void *old_addr, void *new_addr) > { > const u8 *nop_insn = x86_nops[5]; > u8 old_insn[X86_PATCH_SIZE]; > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > index eb84caf133df..0d7b8311fada 100644 > --- a/include/linux/bpf.h > +++ b/include/linux/bpf.h > @@ -3172,6 +3172,8 @@ enum bpf_text_poke_type { > > int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, > void *addr1, void *addr2); > +int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, > + void *old_addr, void *new_addr); > > void *bpf_arch_text_copy(void *dst, void *src, size_t len); > int bpf_arch_text_invalidate(void *dst, size_t len); > diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c > index 2058e89b5ddd..4ab5864746ce 100644 > --- a/kernel/bpf/arraymap.c > +++ b/kernel/bpf/arraymap.c > @@ -1073,33 +1073,33 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key, > new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL; > > if (new) { > - ret = bpf_arch_text_poke(poke->tailcall_target, > + ret = __bpf_arch_text_poke(poke->tailcall_target, > BPF_MOD_JUMP, > old_addr, new_addr); > BUG_ON(ret < 0 && ret != -EINVAL); > if (!old) { > - ret = bpf_arch_text_poke(poke->tailcall_bypass, > + ret = __bpf_arch_text_poke(poke->tailcall_bypass, > BPF_MOD_JUMP, > poke->bypass_addr, > NULL); > - BUG_ON(ret < 0 && ret != -EINVAL); > + BUG_ON(ret < 0); > } > } else { > - ret = bpf_arch_text_poke(poke->tailcall_bypass, > + ret = __bpf_arch_text_poke(poke->tailcall_bypass, > BPF_MOD_JUMP, > old_bypass_addr, > poke->bypass_addr); > - BUG_ON(ret < 0 && ret != -EINVAL); > + BUG_ON(ret < 0); > /* let other CPUs finish the execution of program > * so that it will not possible to expose them > * to invalid nop, stack unwind, nop state > */ > if (!ret) > synchronize_rcu(); > - ret = bpf_arch_text_poke(poke->tailcall_target, > + ret = __bpf_arch_text_poke(poke->tailcall_target, > BPF_MOD_JUMP, > old_addr, NULL); > - BUG_ON(ret < 0 && ret != -EINVAL); > + BUG_ON(ret < 0); > } > } > } -- Lee Jones [李琼斯]