Received: by 2002:a05:6358:53a8:b0:117:f937:c515 with SMTP id z40csp4726085rwe; Mon, 17 Apr 2023 18:06:33 -0700 (PDT) X-Google-Smtp-Source: AKy350apZHkh7Sx3VZSdZr146K1ejvFLdk6cFS0wmRybRkzooHwBSsftfDYqsW8YMIefi0X8SANm X-Received: by 2002:a17:902:e5cb:b0:1a6:e564:6044 with SMTP id u11-20020a170902e5cb00b001a6e5646044mr464833plf.7.1681779993358; Mon, 17 Apr 2023 18:06:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681779993; cv=none; d=google.com; s=arc-20160816; b=EuDYApbyV6xnt34YI2T2rSeAOfEaApAY7y9QjL3ZPEpvLDnHqg60eBX0xZo7mpauf5 JR+C8qOQ+jJ/vXoX/o+NlhOyO/kFylB4Zj3+S33NTJkqWhJ4+BJTMiXGuzEXC6gDr//r Esjywa0yLltL/lyVeXW651tIEjvRF4CbFDv2C28n/ICyZ49T7WEsJRa6fVY0nQqFe4Lx FhFopFNnRbfi+rI61huaBZLa3ehkpzbyFJPNlrcbrAxB0aN3jsJwe4a9obfd4C2hm3gv KkLzX013Q0ni5yDJqjORCh1q/VuBL8nsIVDblSBQCQU6Op2njbABBCIYZLKHSYWofMwm +ilg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:mime-version:date :dkim-signature:message-id; bh=F6L4XXjJ7DAIawwAHN6exDhyLIgvinWJ7u4alQvnINk=; b=jGUwKrMXycP2WpQ3qVkmcoCfvjxx9Byo51Z2bYz4Zu2Fe/h/vMZc/U9ZAoYbi8Gt+2 qUorCDt20q9TvwsoZunj/RgIt9kc1BhAB+MSZG69NMu3fwxM5L/9GIA/2SSVTFfQGeSW 2I4XHoOEqUCe4WlvKoEyFNtrZZJFxnXokdfVK1lg0j41QbvIjlmkeLkF2rWvvnzdixD3 2OeHyjUpjg1mIctE4KC3KQQtfWBlmBvRiABI28J7QF/T/MvT4jHnGq6G8gauoqb4mnSx G52YHoIfJjNUVFkdfowOvmbnMhPm8sagM77z+xnJN2CG0RxsWHpZir2vMvmj8uPson3J cQ0A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=lrWoVAvN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id a18-20020a170902b59200b001a50dcd10c2si11609862pls.247.2023.04.17.18.06.09; Mon, 17 Apr 2023 18:06:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=lrWoVAvN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229662AbjDRBDt (ORCPT + 99 others); Mon, 17 Apr 2023 21:03:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35922 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229589AbjDRBDp (ORCPT ); Mon, 17 Apr 2023 21:03:45 -0400 Received: from out-1.mta1.migadu.com (out-1.mta1.migadu.com [95.215.58.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54098422D for ; Mon, 17 Apr 2023 18:03:43 -0700 (PDT) Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1681779821; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=F6L4XXjJ7DAIawwAHN6exDhyLIgvinWJ7u4alQvnINk=; b=lrWoVAvN2dp4uPEXlR77OB9JFoLC9kWgFsZ9/4nEg+fkkX22rpYNjZ5BF5XaMVYgdbk15j yhdXAjHD9Ey+WsBnri+bIiW4dwJXCnhqcJgCMf+C4tMtcF2XvUj3AqxgOBy6x/dojejDj/ Ae2yiGOgVqeW9s9fpj36dJzyQNjz8Es= Date: Mon, 17 Apr 2023 18:03:35 -0700 MIME-Version: 1.0 Subject: handling unsupported optlen in cgroup bpf getsockopt: (was [PATCH net-next v4 2/4] net: socket: add sockopts blacklist for BPF cgroup hook) Content-Language: en-US To: Stanislav Fomichev Cc: Eric Dumazet , davem@davemloft.net, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, daniel@iogearbox.net, Jakub Kicinski , Paolo Abeni , Leon Romanovsky , David Ahern , Arnd Bergmann , Kees Cook , Christian Brauner , Kuniyuki Iwashima , Lennart Poettering , linux-arch@vger.kernel.org, Aleksandr Mikhalitsyn , bpf References: <20230413133355.350571-1-aleksandr.mikhalitsyn@canonical.com> <20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Martin KaFai Lau In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/14/23 6:55 PM, Stanislav Fomichev wrote: > On 04/13, Stanislav Fomichev wrote: >> On Thu, Apr 13, 2023 at 7:38 AM Aleksandr Mikhalitsyn >> wrote: >>> >>> On Thu, Apr 13, 2023 at 4:22 PM Eric Dumazet wrote: >>>> >>>> On Thu, Apr 13, 2023 at 3:35 PM Alexander Mikhalitsyn >>>> wrote: >>>>> >>>>> During work on SO_PEERPIDFD, it was discovered (thanks to Christian), >>>>> that bpf cgroup hook can cause FD leaks when used with sockopts which >>>>> install FDs into the process fdtable. >>>>> >>>>> After some offlist discussion it was proposed to add a blacklist of >>>> >>>> We try to replace this word by either denylist or blocklist, even in changelogs. >>> >>> Hi Eric, >>> >>> Oh, I'm sorry about that. :( Sure. >>> >>>> >>>>> socket options those can cause troubles when BPF cgroup hook is enabled. >>>>> >>>> >>>> Can we find the appropriate Fixes: tag to help stable teams ? >>> >>> Sure, I will add next time. >>> >>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") >>> >>> I think it's better to add Stanislav Fomichev to CC. >> >> Can we use 'struct proto' bpf_bypass_getsockopt instead? We already >> use it for tcp zerocopy, I'm assuming it should work in this case as >> well? > > Jakub reminded me of the other things I wanted to ask here bug forgot: > > - setsockopt is probably not needed, right? setsockopt hook triggers > before the kernel and shouldn't leak anything > - for getsockopt, instead of bypassing bpf completely, should we instead > ignore the error from the bpf program? that would still preserve > the observability aspect stealing this thread to discuss the optlen issue which may make sense to bypass also. There has been issue with optlen. Other than this older post related to optlen > PAGE_SIZE: https://lore.kernel.org/bpf/5c8b7d59-1f28-2284-f7b9-49d946f2e982@linux.dev/, the recent one related to optlen that we have seen is NETLINK_LIST_MEMBERSHIPS. The userspace passed in optlen == 0 and the kernel put the expected optlen (> 0) and 'return 0;' to userspace. The userspace intention is to learn the expected optlen. This makes 'ctx.optlen > max_optlen' and __cgroup_bpf_run_filter_getsockopt() ends up returning -EFAULT to the userspace even the bpf prog has not changed anything. Does it make sense to also bypass the bpf prog when 'ctx.optlen > max_optlen' for now (and this can use a separate patch which as usual requires a bpf selftests)? In the future, does it make sense to have a specific cgroup-bpf-prog (a specific attach type?) that only uses bpf_dynptr kfunc to access the optval such that it can enforce read-only for some optname and potentially also track if bpf-prog has written a new optval? The bpf-prog can only return 1 (OK) and only allows using bpf_set_retval() instead. Likely there is still holes but could be a seed of thought to continue polishing the idea. > - or maybe we can even have a per-proto bpf_getsockopt_cleanup call that > gets called whenever bpf returns an error to make sure protocols have > a chance to handle that condition (and free the fd) >