Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp2997331imw; Wed, 6 Jul 2022 15:37:10 -0700 (PDT) X-Google-Smtp-Source: AGRyM1vWaSnwC7f0rE64KsIfXDZCB+YGax4TPp76EC2BQqYU0+10TUrs+x7eSBJqymPnH/aBgN0q X-Received: by 2002:a17:907:6d8b:b0:72a:b3bc:2ab2 with SMTP id sb11-20020a1709076d8b00b0072ab3bc2ab2mr24037349ejc.67.1657147030651; Wed, 06 Jul 2022 15:37:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657147030; cv=none; d=google.com; s=arc-20160816; b=HCDdg8rcg50Eh9jLrGhoYXqGEk2VBVj5yYhgGc7cXyrpq6OZxnS+qw5sACk/dQDEbL 9fHXmOdwtOi0q3RAQiv7NO3PDJOmGzoV67bAbhpHGWMZIjY6EgIxe5GfyYhKcrubEydV l3h5geMEfUePdc7WTubZjHlD4drJdg9Cc86EQuqiNnuKUcvREgmtjzE3yhxg1oFCCYJZ 51OFAiKIQZE+pNGHfTUI5ZQA3tKcU+oKxM8B6HO2ev/8VzaUaMdXv1dTRTLIW10O65Qe 70rcpjFS5FZuR/szY+HF0MgH3k6S/ekzplD2LDctWahTg9/KicWFjiXuXgRr32u17GZ5 lTIg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=Rf0x/VRJnaSHQggYfjcAgUxHxC7YKoZLKCfwyX9B/Cg=; b=h14U6+u+GSapiSMSLMU7/nnF0otjDTXuIRe3BT9Id1rjyaoxIz7LzCKMm7CQs2zhF0 5c4JTN/gqnBrLxOb8XOFX9RxYHUkGY5HOeyDekxEWz+NxiUDr+h8/R3FGKhaenystLe9 ZJNE+3Xceet2EfgdZL2qu8bLVsqkmtDjG7J5A3rX5CNba0R74qN+rpCaNHzyNCp7zbNh mQTCzab3s0aAeo1vS2KUaBD3Uba0V1BB0v48pXAvWUoQ7FITSbi7RdKGV/XeMnE0iYUi dmdQMgo62kKoFjpD/HsdHifbqC6j8WmnNMK1kHJa/gntN17wxuxtP24pQ6zUpZD/LVkq i7GA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id cr19-20020a170906d55300b0072ab5030cd5si1945765ejc.367.2022.07.06.15.36.44; Wed, 06 Jul 2022 15:37:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233857AbiGFW3h (ORCPT + 99 others); Wed, 6 Jul 2022 18:29:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35678 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230412AbiGFW3g (ORCPT ); Wed, 6 Jul 2022 18:29:36 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 52E5614D17; Wed, 6 Jul 2022 15:29:35 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E175B61CFA; Wed, 6 Jul 2022 22:29:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2DE56C3411C; Wed, 6 Jul 2022 22:29:33 +0000 (UTC) Date: Wed, 6 Jul 2022 18:29:31 -0400 From: Steven Rostedt To: Song Liu Cc: Song Liu , Networking , bpf , lkml , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , "jolsa@kernel.org" , "mhiramat@kernel.org" Subject: Re: [PATCH v2 bpf-next 5/5] bpf: trampoline: support FTRACE_OPS_FL_SHARE_IPMODIFY Message-ID: <20220706182931.06cb0e20@gandalf.local.home> In-Reply-To: References: <20220602193706.2607681-1-song@kernel.org> <20220602193706.2607681-6-song@kernel.org> <20220706153843.37584b5b@gandalf.local.home> <20220706174049.6c60250f@gandalf.local.home> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 6 Jul 2022 22:15:47 +0000 Song Liu wrote: > > On Jul 6, 2022, at 2:40 PM, Steven Rostedt wrote: > > > > On Wed, 6 Jul 2022 21:37:52 +0000 > > Song Liu wrote: > > > >>> Can you comment here that returning -EAGAIN will not cause this to repeat. > >>> That it will change things where the next try will not return -EGAIN? > >> > >> Hmm.. this is not the guarantee here. This conflict is a real race condition > >> that an IPMODIFY function (i.e. livepatch) is being registered at the same time > >> when something else, for example bpftrace, is updating the BPF trampoline. > >> > >> This EAGAIN will propagate to the user of the IPMODIFY function (i.e. livepatch), > >> and we need to retry there. In the case of livepatch, the retry is initiated > >> from user space. > > > > We need to be careful here then. If there's a userspace application that > > runs at real-time and does a: > > > > do { > > errno = 0; > > regsiter_bpf(); > > } while (errno != -EAGAIN); > > Actually, do you mean: > > do { > errno = 0; > regsiter_bpf(); > } while (errno == -EAGAIN); > > (== -EAGAIN) here? Yeah, of course. > > In this specific race condition, register_bpf() will succeed, as it already > got tr->mutex. But the IPMODIFY (livepatch) side will fail and retry. What else takes the tr->mutex ? If it preempts anything else taking that mutex, when this runs, then it needs to be careful. You said this can happen when the live patch came first. This isn't racing against live patch, it's racing against anything that takes the tr->mutex and then adds a bpf trampoline to a location that has a live patch. > > Since both livepatch and bpf trampoline changes are rare operations, I think > the chance of the race condition is low enough. > > Does this make sense? > It's low, and if it is also a privileged operation then there's less to be concern about. As if it is not, then we could have a way to deadlock the system. I'm more concerned that this will lead to a CVE than it just happening randomly. In other words, it only takes something that can run at a real-time priority to connect to a live patch location, and something that runs at a low priority to take a tr->mutex. If an attacker has both, then it can pin both to a CPU and then cause the deadlock to the system. One hack to fix this is to add a msleep(1) in the failed case of the trylock. This will at least give the owner of the lock a millisecond to release it. This was what the RT patch use to do with spin_trylock() that was converted to a mutex (and we worked hard to remove all of them). -- Steve