From: Hao Luo
Date: Mon, 14 Mar 2022 10:07:31 -0700
Subject: Re: [PATCH bpf-next v1 1/9] bpf: Add mkdir, rmdir, unlink syscalls for prog_bpf_syscall
To: Al Viro
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shakeel Butt, Joe Burton, Tejun Heo, joshdon@google.com, sdf@google.com, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20220225234339.2386398-1-haoluo@google.com> <20220225234339.2386398-2-haoluo@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Al,

On Fri, Mar 11, 2022 at 7:46 PM Al Viro wrote:
>
> On Fri, Feb 25, 2022 at 03:43:31PM -0800, Hao Luo wrote:
> > This patch allows bpf_syscall prog to perform
> > some basic filesystem operations: create, remove directories and
> > unlink files. Three bpf helpers are added for this purpose. When
> > combined with the following patches that allow pinning and getting
> > bpf objects from bpf prog, this feature can be used to create
> > directory hierarchy in bpffs that help manage bpf objects purely
> > using bpf progs.
> >
> > The added helpers subject to the same permission checks as their syscall
> > version. For example, one can not write to a read-only file system;
> > The identity of the current process is checked to see whether it has
> > sufficient permission to perform the operations.
> >
> > Only directories and files in bpffs can be created or removed by these
> > helpers. But it won't be too hard to allow these helpers to operate
> > on files in other filesystems, if we want.
>
> In which contexts can those be called?
>

In a sleepable context. The plan is to introduce certain tracepoints as
sleepable; a program that attaches to a sleepable tracepoint is allowed
to call these functions. In particular, the first sleepable tracepoint
introduced in this patchset is one at the end of cgroup_mkdir(). Do you
have any advice?

> > +BPF_CALL_2(bpf_rmdir, const char *, pathname, int, pathname_sz)
> > +{
> > +	struct user_namespace *mnt_userns;
> > +	struct path parent;
> > +	struct dentry *dentry;
> > +	int err;
> > +
> > +	if (pathname_sz <= 1 || pathname[pathname_sz - 1])
> > +		return -EINVAL;
> > +
> > +	err = kern_path(pathname, 0, &parent);
> > +	if (err)
> > +		return err;
> > +
> > +	if (!bpf_path_is_bpf_dir(&parent)) {
> > +		err = -EPERM;
> > +		goto exit1;
> > +	}
> > +
> > +	err = mnt_want_write(parent.mnt);
> > +	if (err)
> > +		goto exit1;
> > +
> > +	dentry = kern_path_locked(pathname, &parent);
>
> This can't be right.  Ever.  There is no promise whatsoever
> that these two lookups will resolve to the same place.
>
> > +BPF_CALL_2(bpf_unlink, const char *, pathname, int, pathname_sz)
> > +{
> > +	struct user_namespace *mnt_userns;
> > +	struct path parent;
> > +	struct dentry *dentry;
> > +	struct inode *inode = NULL;
> > +	int err;
> > +
> > +	if (pathname_sz <= 1 || pathname[pathname_sz - 1])
> > +		return -EINVAL;
> > +
> > +	err = kern_path(pathname, 0, &parent);
> > +	if (err)
> > +		return err;
> > +
> > +	err = mnt_want_write(parent.mnt);
> > +	if (err)
> > +		goto exit1;
> > +
> > +	dentry = kern_path_locked(pathname, &parent);
> > +	if (IS_ERR(dentry)) {
> > +		err = PTR_ERR(dentry);
> > +		goto exit2;
> > +	}
>
> Ditto.  NAK; if you want to poke into fs/namei.c guts, do it right.
> Or at least discuss that on fsdevel.  As it is, it's completely broken.
> It's racy *and* it blatantly leaks both vfsmount and dentry references.
>
> NAKed-by: Al Viro

Thanks Al for taking a look. Actually, there is a simpler approach: can
we export two functions in namei.c that wrap calls to do_mkdirat and
do_unlinkat, but take a kernel string as pathname? Then these two bpf
helpers can use them and don't have to be this complicated. Does this
sound good to you?

Thanks!
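
For illustration only, here is a user-space analogy of the objection
above: the quoted helpers walk the same pathname twice (kern_path() and
then kern_path_locked()), and nothing guarantees that both walks end at
the same place. The sketch below is not kernel code and not the proposed
fix. It only shows the general "resolve once, then operate relative to
what you resolved" pattern using the POSIX *at() calls. The names
check_name(), open_parent(), mkdir_in() and rmdir_in() are hypothetical
and exist only for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper. Same shape of size check as the quoted helpers:
 * the buffer must hold at least one character plus a terminating NUL.
 * Additionally restrict the name to a single path component, so the
 * later *at() call does not walk through further directories.
 */
static int check_name(const char *name, int name_sz)
{
	if (name_sz <= 1 || name[name_sz - 1] != '\0')
		return -EINVAL;
	if (strchr(name, '/'))
		return -EINVAL;
	return 0;
}

/*
 * Resolve the parent directory exactly once. The returned descriptor
 * keeps referring to that directory even if the path is renamed or
 * replaced afterwards. Returns a negative errno value on failure.
 */
static int open_parent(const char *parent_path)
{
	int fd = open(parent_path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);

	return fd < 0 ? -errno : fd;
}

/* Create a directory relative to the already-resolved parent. */
static int mkdir_in(int parent_fd, const char *name, int name_sz, mode_t mode)
{
	int err = check_name(name, name_sz);

	if (err)
		return err;
	return mkdirat(parent_fd, name, mode) ? -errno : 0;
}

/* Remove a directory relative to the already-resolved parent. */
static int rmdir_in(int parent_fd, const char *name, int name_sz)
{
	int err = check_name(name, name_sz);

	if (err)
		return err;
	return unlinkat(parent_fd, name, AT_REMOVEDIR) ? -errno : 0;
}
```

Any check performed on parent_fd (for example, which filesystem it
lives on) then applies to the same directory that mkdirat() or
unlinkat() later operates on, because the path to the parent is walked
only once. The caller is responsible for close()-ing the descriptor,
which is the user-space counterpart of dropping the references that the
review says are leaked.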