From: Yafang Shao
Date: Sat, 8 Apr 2023 00:21:56 +0800
Subject: Re: [RFC PATCH bpf-next 00/13] bpf: Introduce BPF namespace
To: Alexei Starovoitov
Cc: Andrii Nakryiko, Song Liu, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
    John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
    bpf, LKML

On Sat, Apr 8, 2023 at 12:05 AM Alexei Starovoitov wrote:
>
> On Fri, Apr 7, 2023 at 8:59 AM Andrii Nakryiko
> wrote:
> >
> > On Thu, Apr 6, 2023 at 6:44 PM Alexei Starovoitov
> > wrote:
> > >
> > > On Thu, Apr 06, 2023 at 01:22:26PM -0700, Andrii Nakryiko wrote:
> > > > On Wed, Apr 5, 2023 at 10:44 PM Yafang Shao wrote:
> > > > >
> > > > > On Thu, Apr 6, 2023 at 12:24 PM Alexei Starovoitov
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Apr 5, 2023 at 8:22 PM Yafang Shao wrote:
> > > > > > >
> > > > > > > On Thu, Apr 6, 2023 at 11:06 AM Alexei Starovoitov
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Wed, Apr 5, 2023 at 7:55 PM Yafang Shao wrote:
> > > > > > > > >
> > > > > > > > > It seems that I didn't describe the issue clearly.
> > > > > > > > > The container doesn't have CAP_SYS_ADMIN, but the CAP_SYS_ADMIN is
> > > > > > > > > required to run bpftool, so the bpftool running in the container
> > > > > > > > > can't get the ID of bpf objects or convert IDs to FDs.
> > > > > > > > > Is there something that I missed ?
> > > > > > > >
> > > > > > > > Nothing. This is by design. bpftool needs sudo. That's all.
> > > > > > > >
> > > > > > >
> > > > > > > Hmm, what I'm trying to do is make bpftool run without sudo.
> > > > > >
> > > > > > This is not a task that is worth solving.
> > > > > >
> > > > >
> > > > > Then the container with CAP_BPF enabled can't even iterate its bpf progs ...
> > > >
> > > > I'll leave the BPF namespace discussion aside (I agree that it needs
> > > > way more thought).
> > > >
> > > > I am a bit surprised that we require CAP_SYS_ADMIN for GET_NEXT_ID
> > > > operations. GET_FD_BY_ID is definitely CAP_SYS_ADMIN, as they allow
> > > > you to take over someone else's link and stuff like this. But just
> > > > iterating IDs seems like a pretty innocent functionality, so maybe we
> > > > should remove CAP_SYS_ADMIN for GET_NEXT_ID?
> > > >
> > > > By itself GET_NEXT_ID is relatively useless without capabilities, but
> > > > we've been floating the idea of providing GET_INFO_BY_ID (not by FD)
> > > > for a while now, and that seems useful in itself, as it would indeed
> > > > help tools like bpftool to get *some* information even without
> > > > privileges. Whether those GET_INFO_BY_ID operations should return same
> > > > full bpf_{prog,map,link,btf}_info or some trimmed down version of them
> > > > would be up to discussion, but I think getting some info without
> > > > creating an FD seems useful in itself.
> > > >
> > > > Would it be worth discussing and solving this separately from
> > > > namespacing issues?
> > >
> > > Iteration of IDs itself is fine. The set of IDs is not security sensitive,
> > > but GET_NEXT_BY_ID has to be carefully restricted.
> > > It returns xlated, jited, BTF, line info, etc
> > > and with all the restrictions it would need something like
> > > CAP_SYS_PTRACE and CAP_PERFMON to be useful.
> > > And with that we're not far from CAP_SYS_ADMIN.
> > > Why bother then?
> >
> > You probably meant that GET_INFO_BY_ID should be carefully restricted?
>
> yes.
>
> > So yeah, that's what I said that this would have to be discussed
> > further. I agree that returning func/line info, program dump, etc is
> > probably a privileged part. But there is plenty of useful info besides
> > that (e.g., prog name, insns cnt, run stats, etc) that would be useful
> > for unpriv applications to monitor their own apps that they opened
> > from BPF FS, or just some observability daemons.
> >
> > There is a lot of useful information in bpf_map_info and bpf_link_info
> > that's way less privileged. I think bpf_link_info is good as is. Same
> > for bpf_map_info.
> >
> > Either way, I'm not insisting, just something that seems pretty simple
> > to add and useful in some scenarios. We can reuse existing code and
> > types for GET_INFO_BY_FD and just zero-out (or prevent filling out)
> > those privileged fields you mentioned. Anyway, something to put on the
> > backburner, perhaps.
>
> Sorry, but I only see negatives. It's an extra code in the kernel
> that has to be carefully reviewed when initially submitted and
> then every patch that touches get_info_by_id would have to go
> through a microscope every time to avoid introducing a security issue.
> And for what? So that CAP_BPF application can read prog name and run stats?

In my experience, observability is a very important part of a project.
If users can't observe the objects they created themselves, they will
worry about or even mistrust them.
However, I won't insist on it either if you think we shouldn't do it.

-- 
Regards
Yafang