Received: by 2002:ac2:464d:0:0:0:0:0 with SMTP id s13csp193576lfo; Tue, 17 May 2022 21:58:17 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxzhxHEXVpphXXLgdN/5H9+xFhqLhRzgC7kHCxXoU2UK5BR7TJZwkV7jEVHPCInYcwkxxc6 X-Received: by 2002:a65:52cd:0:b0:3f5:f3fb:6780 with SMTP id z13-20020a6552cd000000b003f5f3fb6780mr2288349pgp.150.1652849897662; Tue, 17 May 2022 21:58:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1652849897; cv=none; d=google.com; s=arc-20160816; b=dzny6hmEsdLcnsPnc3k36UvARLpdR0tZ2PQ4gpkpak4NuCnT6lYaJQtBeBsKBPO777 Gz3iuVcRJa2KAlOBbKUrreGaz86cNzXN392oSzBm0ISO9U1gpBR1AFD07enmwEzkMWlu NxshNQd8fRbL57wHq5vBViOT524YhvkLM6T/mZqSzt5ZYonJ03pLIx7i6G5Fjg9Td5fa Vl2yRaRtCKw0utFgEVshSfr7CbojfGRkpL/lVUn9CDpuMUOEJOWrMb0+LgOxHwhr4UiY zznDgh4dNIywmxaaiEl1Jl6Rzn2zncTLxvHadU32qAf0rLUjfVFxdMilMwIzTt/uvIMO rKQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=CRUVZxUiPFa++e0erHqoMtj4bXpVQNsrlJ20yGYjhvc=; b=FjvE20DVXHHZo39u3SolTotVKMBRFaeohRqmENlffn0smdmDea5q7KgOpx6q23/ytQ 5HBuhwNpnxzGj38hHUlfFY7m7S9NX24Mlce67jIxst7LYaizPbRtODG6BUOY9hgOXGHL SdHVzxVB9lAv2YBcxaN5vXW/ykLFerKKv9GP0QBKsfQqb4kButLuMHerClOu8RFKoHr+ aN8ueooMjSWCIYHAJstr9KOgxEXknPSZVsjxtUCmQ9EA88tj8DDLT8vJ246sft94K32a qNvyxGenRZ/tPSnVyTNM6oUpLNB7hjRiOKjFHiexkc7bE4a/srzpxo+kOlaaoypyWTMv xP4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=UwonYpJJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id j21-20020a17090aeb1500b001bed39e61cesi1213410pjz.23.2022.05.17.21.58.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 May 2022 21:58:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=UwonYpJJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id AA3677890B; Tue, 17 May 2022 21:06:05 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229652AbiERDb4 (ORCPT + 99 others); Tue, 17 May 2022 23:31:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229446AbiERDbx (ORCPT ); Tue, 17 May 2022 23:31:53 -0400 Received: from mail-ej1-x641.google.com (mail-ej1-x641.google.com [IPv6:2a00:1450:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C8FEB814B0; Tue, 17 May 2022 20:31:50 -0700 (PDT) Received: by mail-ej1-x641.google.com with SMTP id y13so546402eje.2; Tue, 17 May 2022 20:31:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=CRUVZxUiPFa++e0erHqoMtj4bXpVQNsrlJ20yGYjhvc=; b=UwonYpJJCOJV4Rl7KWzUFXn0d60C4PgqizzRx0Ah9Shb0qiH5aqP83U/u+FcVa5Wfq O/hMSFCfmDK31AJ9JXapppG2PLQBbVA6nFgL0Y+z2AUvXvfw8xzLsHhVxx7sDmLfTLrx FIb9S/lQLvRvSCRBJ3CSgRw56hvNYE1DYskMaNFJSgLaEQuqsP17dWNUri3tmaHzzQH5 sOkr1XsYK0vRGLqcoihOHwP6JVEqWhwvjgVHeV+w0BQHD3uFtrcyzdoJMq8YrRF/AMLg dWj4rUi0c2iEgqxCLv6d1am3njz7IVZP7jrt+jtALdyHMcHGoHA5fp8vpUe8x7V9+6LT iT7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=CRUVZxUiPFa++e0erHqoMtj4bXpVQNsrlJ20yGYjhvc=; b=vUl3XH4yEAAEFL0eON/Vb1oTDXF9wzPoIc3YN8SPgjcFkA7x8RjPrks0q4qMx2+UON j9B5iiV/cwti9GDkT1lUNSe/+LdqFh4WkF8dR0rirG3FVyFM19ph9rV45iBq3lzAzYRM IcQI/lRv659xfXl5xhQCcHtggF4PUdfyJj3B3xvx9Q3QKrUzLJhC9lMJE0/2aQtVWZnc jLjQdU3Wgs+p4YPyfkVqB7YGSrIfcjsqu+KNxHkoxjWASPhVyHU/ih2XmESt1k5KViCB gwKlf/9zhD1V/9P147FQTqrKqQEm8UlvvNtyUhTE0VOAYl8CCFJ3626CN5JaF+UwWsFc G8EQ== X-Gm-Message-State: AOAM530z6yNQgPjLhnMP//29vY75DBdYtOk9Jxm8xbwGfUOULB+nAgNU yqxTwsLHEW09mD7xVHcSQQEXt6toZmILPofLnW4= X-Received: by 2002:a17:907:7ea7:b0:6f4:7a72:da92 with SMTP id qb39-20020a1709077ea700b006f47a72da92mr22032393ejc.348.1652844709009; Tue, 17 May 2022 20:31:49 -0700 (PDT) MIME-Version: 1.0 References: <20220517034107.92194-1-imagedong@tencent.com> In-Reply-To: From: Menglong Dong Date: Wed, 18 May 2022 11:31:37 +0800 Message-ID: Subject: Re: [PATCH] bpf: add access control for map To: Alexei Starovoitov Cc: Alexei Starovoitov , Network Development , bpf , LKML , Menglong Dong , Andrii Nakryiko , Daniel Borkmann , Alan Maguire Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 18, 2022 at 12:58 AM Alexei Starovoitov wrote: > > On Mon, May 16, 2022 at 8:44 PM wrote: > > > > From: Menglong Dong > > > > Hello, > > > > I have a idea about the access control of eBPF map, could you help > > to see if it works? > > > > For now, only users with the CAP_SYS_ADMIN or CAP_BPF have the right > > to access the data in eBPF maps. So I'm thinking, are there any way > > to control the access to the maps, just like what we do to files? > > The bpf objects pinned in bpffs should always be accessible > as files regardless of sysctl or cap-s. > > > Therefore, we can decide who have the right to read the map and who > > can write. > > > > I think it is useful in some case. For example, I made a eBPF-based > > network statistics program, and the information is stored in an array > > map. And I want all users can read the information in the map, without > > changing the capacity of them. As the information is iunsensitive, > > normal users can read it. This make publish-consume mode possible, > > the eBPF program is publisher and the user space program is consumer. > > Right. It is a choice of the bpf prog which data expose in the map. > > > So this aim can be achieve, if we can control the access of maps as a > > file. There are many ways I thought, and I choosed one to implement: > > > > While pining the map, add the inode that is created to a list on the > > map. root can change the permission of the inode through the pin path. > > Therefore, we can try to find the inode corresponding to current user > > namespace in the list, and check whether user have permission to read > > or write. > > > > The steps can be: > > > > 1. create the map with BPF_F_UMODE flags, which imply that enable > > access control in this map. > > 2. load and pin the map on /sys/fs/bpf/xxx. > > 3. change the umode of /sys/fs/bpf/xxx with 'chmod 744 /sys/fs/bpf/xxx', > > therefor all user can read the map. > > This behavior should be available by default. > Only sysctl was preventing it. It's being fixed by > the following patch. Please take a look at: > https://patchwork.kernel.org/project/netdevbpf/patch/1652788780-25520-2-git-send-email-alan.maguire@oracle.com/ > > Does it solve your use case? Yeah, it seems this patch gives another way: give users all access to bpf commands (except map_create and prog_load). Therefore, users that have the access to the pin path have corresponding r/w of the eBPF object. This patch can cover my case. However, this seems to give users too much permission (or the access check is not enough?) I have do a test: 1. load a ebpf program of type cgroup and pin it on /sys/fs/bpf/post_bind as root. 2. give users access to read /sys/fs/bpf/post_bind 3. Now, all users can attach or detach the eBPF program to /sys/fs/cgroup/, who have only read access to the ebpf and the cgroup. I think there are many such cases. Is this fine? > > > @@ -542,14 +557,26 @@ int bpf_obj_get_user(const char __user *pathname, int flags) > > if (IS_ERR(raw)) > > return PTR_ERR(raw); > > > > - if (type == BPF_TYPE_PROG) > > + if (type != BPF_TYPE_MAP && !bpf_capable()) > > + return -EPERM; > > obj_get already implements normal ACL style access to files. > Let's not fragment this security model with extra cap checks. > Yeah, my way is too rough. > > + > > + switch (type) { > > + case BPF_TYPE_PROG: > > ret = bpf_prog_new_fd(raw); > > - else if (type == BPF_TYPE_MAP) > > + break; > > + case BPF_TYPE_MAP: > > + if (bpf_map_permission(raw, f_flags)) { > > + bpf_any_put(raw, type); > > + return -EPERM; > > + } > > bpf_obj_do_get() already does such check. > With the patch you mentioned above, now bpf_obj_do_get() can do this job, as normal users can also get there too. > > +int bpf_map_permission(struct bpf_map *map, int flags) > > +{ > > + struct bpf_map_inode *map_inode; > > + struct user_namespace *ns; > > + > > + if (capable(CAP_SYS_ADMIN)) > > + return 0; > > + > > + if (!(map->map_flags & BPF_F_UMODE)) > > + return -1; > > + > > + rcu_read_lock(); > > + list_for_each_entry_rcu(map_inode, &map->inode_list, list) { > > + ns = map_inode->inode->i_sb->s_user_ns; > > + if (ns == current_user_ns()) > > + goto found; > > + } > > + rcu_read_unlock(); > > + return -1; > > +found: > > + rcu_read_unlock(); > > + return inode_permission(ns, map_inode->inode, ACC_MODE(flags)); > > +} > > See path_permission() in bpf_obj_do_get(). > > > static int bpf_map_get_fd_by_id(const union bpf_attr *attr) > > @@ -3720,9 +3757,6 @@ static int bpf_map_get_fd_by_id(const union bpf_attr *attr) > > attr->open_flags & ~BPF_OBJ_FLAG_MASK) > > return -EINVAL; > > > > - if (!capable(CAP_SYS_ADMIN)) > > - return -EPERM; > > - > > This part we cannot relax. > What you're trying to do is to bypass path checks > by pointing at a map with its ID only. > That contradicts to your official goal in the cover letter. > > bpf_map_get_fd_by_id() has to stay cap_sys_admin only. > Exactly for the reason that bpf subsystem has file ACL style. > fd_by_id is a debug interface used by tools like bpftool and > root admin that needs to see the system as a whole. > Normal tasks/processes need to use bpffs and pin files with > correct permissions to pass maps from one process to another. > Or use FD passing kernel facilities. Yeah, this part is not necessary for me either. Without this part, the current code already can do what I wanted. Thanks Menglong Dong