Received: by 2002:a05:7412:b10a:b0:f3:1519:9f41 with SMTP id az10csp1237341rdb; Fri, 1 Dec 2023 10:18:16 -0800 (PST) X-Google-Smtp-Source: AGHT+IHGH/AmKb7y061jv61a2DiKUA8g1Ocqrkxslv6HE+phgJEBoh+lfN0N56WpjkwAm9o2Dvse X-Received: by 2002:a17:90b:1d09:b0:280:6d53:f5d2 with SMTP id on9-20020a17090b1d0900b002806d53f5d2mr26374217pjb.11.1701454696073; Fri, 01 Dec 2023 10:18:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701454696; cv=none; d=google.com; s=arc-20160816; b=JxV+BtUsdpGJtdDaHChjc6YAbq8viTSQ+3Lpx4lGvZmnDCwNjVeUY19GJd5hntKXzC SVJLwJZXtJ3ANOKLD8Kl+fYpeHiOut96DiRG3yRBxDZQC3NuMtyuNPU55l9M2syT2fWh iRfftGEO34hIp+1KJ1w8yoE8HvaA39JTOgX7lZCmvVmY7yFrAxxJgUj55qZVCO2nEog9 PzPvTS6ZgVQbhZtXtgTdCzHVGYss01kIjuCro/rute+wKLZV+9aboOVt5X6cQaLxVOLR KyMXHsGSxdaV0lpctsoVbDCVUkBvSFwZ+o/Y2n4n5LdAE17/UiT8nZplD7q9oO9c7n7A S7UA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=FGZWMOtoM0G21o02418RUklmQmFTr6HTTX121kothG4=; fh=7fXGgMwf+//x0JQxo8LE3wiBrkF7IJx/jxDwq0xWsEg=; b=YYFNmW8yEOAiFu9CLlNhtQP03c8LH6r/ghF3Atj6JKz2KFoM9YRg5nptmMW4czbMRy M12thGMDy7XvZqcl8EQIhMcGC6MUPNpSUvOVXf3mJUkDi2BAFKzXDZKcIB7V1diI+mth FfZGy9jJbXMmdItCxmuEDN72SOrNSCFklrTY+6DHOFGg6H5EwjUZRr0/Lc+rVBWOGtCj vI78AkFKdadhOxzWu+OO3wswxRwCeR4yAMQpO5nqRBZW43z8rWFo+3grW6fwfhYkm5Jh 8lE0M+/yZF8BzyYEO9A5rjXBByNC1POoU0NpcXLMdOIgicrtVnuQQLBNRZtlTKBLF0lr 0Ayg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ZGaKbsUp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from fry.vger.email (fry.vger.email. [2620:137:e000::3:8]) by mx.google.com with ESMTPS id oj14-20020a17090b4d8e00b0028588a74a86si6394857pjb.83.2023.12.01.10.18.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 01 Dec 2023 10:18:16 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) client-ip=2620:137:e000::3:8; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ZGaKbsUp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id 5F86A8200CA5; Fri, 1 Dec 2023 10:18:13 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379081AbjLASR5 (ORCPT + 99 others); Fri, 1 Dec 2023 13:17:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52138 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229454AbjLASRz (ORCPT ); Fri, 1 Dec 2023 13:17:55 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1A4EE10D for ; Fri, 1 Dec 2023 10:18:02 -0800 (PST) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85ACAC433C7; Fri, 1 Dec 2023 18:18:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1701454681; bh=ujCjp75X8yi+WvQp0VZeg+iQb5i3JpQWywg/I5QIQBI=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ZGaKbsUpWLmQ+x99zErN9VEnQZkJDK2RXwFBwKt9EkRarBXR4Eq+oRs9CTm7VRogN 7CXUVH2iRK5JimTZJSit/XUgSn0e4pVksPYOo8ft0SLJf16t1yF6roxVv+Zg+CLbIa sSjydc7rreEqs7iTUfOC8jFCVWVmFCYl5UURZVlCmu9pQ0TJbc5ezvtkBajPAVJGWV stUE1Q5SbW+VbUiicRRVuxYvkTKpQattI4R2PoYhqSsmuc+paipASOfjZQuVQOGOBE x0Xa2wjFsNiGAdD11zOBFpltnEO56LzG+qJQ3G4KK1C9ngKquLfHNgx7kXYyxDh8S4 vyLb93E9AgRtw== Date: Fri, 1 Dec 2023 12:18:00 -0600 From: "Seth Forshee (DigitalOcean)" To: Christian Brauner Cc: Serge Hallyn , Paul Moore , Eric Paris , James Morris , Alexander Viro , Miklos Szeredi , Amir Goldstein , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-security-module@vger.kernel.org, audit@vger.kernel.org, linux-unionfs@vger.kernel.org Subject: Re: [PATCH 09/16] fs: add vfs_set_fscaps() Message-ID: References: <20231129-idmap-fscap-refactor-v1-0-da5a26058a5b@kernel.org> <20231129-idmap-fscap-refactor-v1-9-da5a26058a5b@kernel.org> <20231201-reintreten-gehalt-435a960f80ed@brauner> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20231201-reintreten-gehalt-435a960f80ed@brauner> X-Spam-Status: No, score=-1.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Fri, 01 Dec 2023 10:18:13 -0800 (PST) On Fri, Dec 01, 2023 at 06:39:18PM +0100, Christian Brauner wrote: > > +/** > > + * vfs_set_fscaps - set filesystem capabilities > > + * @idmap: idmap of the mount the inode was found from > > + * @dentry: the dentry on which to set filesystem capabilities > > + * @caps: the filesystem capabilities to be written > > + * @flags: setxattr flags to use when writing the capabilities xattr > > + * > > + * This function writes the supplied filesystem capabilities to the dentry. > > + * > > + * Return: 0 on success, a negative errno on error. > > + */ > > +int vfs_set_fscaps(struct mnt_idmap *idmap, struct dentry *dentry, > > + const struct vfs_caps *caps, int flags) > > +{ > > + struct inode *inode = d_inode(dentry); > > + struct inode *delegated_inode = NULL; > > + struct vfs_ns_cap_data nscaps; > > + int size, error; > > + > > + /* > > + * Unfortunately EVM wants to have the raw xattr value to compare to > > + * the on-disk version, so we need to pass the raw xattr to the > > + * security hooks. But we also want to do security checks before > > + * breaking leases, so that means a conversion to the raw xattr here > > + * which will usually be reduntant with the conversion we do for > > + * writing the xattr to disk. > > + */ > > + size = vfs_caps_to_xattr(idmap, i_user_ns(inode), caps, &nscaps, > > + sizeof(nscaps)); > > + if (size < 0) > > + return size; > > Oh right, I remember that. Slight eyeroll. See below though... > > > + > > +retry_deleg: > > + inode_lock(inode); > > + > > + error = xattr_permission(idmap, inode, XATTR_NAME_CAPS, MAY_WRITE); > > + if (error) > > + goto out_inode_unlock; > > + error = security_inode_setxattr(idmap, dentry, XATTR_NAME_CAPS, &nscaps, > > + size, flags); > > + if (error) > > + goto out_inode_unlock; > > For posix acls I added dedicated security hooks that take the struct > posix_acl stuff and then plumb that down into the security modules. You > could do the same thing here and then just force EVM and others to do > their own conversion from in-kernel to xattr format, instead of forcing > the VFS to do this. > > Because right now we make everyone pay the price all the time when > really EVM should pay that price and this whole unpleasantness. Good point, I'll do that. > > > + > > + error = try_break_deleg(inode, &delegated_inode); > > + if (error) > > + goto out_inode_unlock; > > + > > + if (inode->i_opflags & IOP_XATTR) { > > So I'm trying to remember the details how I did this for POSIX ACLs in > commit e499214ce3ef ("acl: don't depend on IOP_XATTR"). I think what you > did here is correct because you need to have an xattr handler for > fscaps currently. IOW, it isn't purely based on inode operations. > > And here starts the hate mail in so far as you'll hate me for asking > this: > > I think I asked this before when we talked about this but how feasible > would it be to move fscaps completely off of xattr handlers and purely > on inode operations for all filesystems? > > Yes, that's a fairly large patchset but it would also be a pretty good > win because we avoid munging this from inode operations through xattr > handlers again which seems a bit ugly and what we really wanted to > avoid desperately with POSIX ACLs. > > If this is feasible and you'd be up for it I wouldn't even mind doing > that in two steps. IOW, merge something like this first and them move > everyone off of their individual xattr handlers. > > Could you quickly remind me whether there would be any issues with this? It's certainly possible to do this. There wouldn't be any issues per se, but there are some tradoffs to consider. First, it's really only overlayfs that needs special handling. It seems pretty unfortunate to make every filesystem provide its own implementations which are virtually identical, which is what we'd need to do if we want to completely avoid the xattr handlers. But we could still provide a generic implementation that uses only __vfs_{get,set}xattr(), and most filesystems could use those in their inode ops. How does that sound? The other drawback I see is needing to duplicate logic from the {get,set}xattr codepaths into the fscaps codepaths and maintain them in parallel. I was trying to avoid that as much as possible, but in the end I had to duplicate some of the logic anyway. And as Amir pointed out I did miss some things I needed to duplicate from the setxattr logic, so I already need to revisit that code and probably pull in more of the setxattr logic, so there may not be as much benefit here as I'd originally hoped.