In-Reply-To: <20231025094224.72858-1-michael.weiss@aisec.fraunhofer.de>
From: Paul Moore
Date: Wed, 25 Oct 2023 09:17:37 -0400
Subject: Re: [RESEND RFC PATCH v2 00/14] device_cgroup: guard mknod for non-initial user namespace
To: Michael Weiß
Cc: Alexander Mikhalitsyn, Christian Brauner, Alexei Starovoitov,
 Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Quentin Monnet, Alexander Viro, Miklos Szeredi,
 Amir Goldstein, "Serge E. Hallyn", bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 gyroidos@aisec.fraunhofer.de, linux-security-module@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 25, 2023 at 5:42 AM Michael Weiß wrote:
>
> Introduce the flag BPF_DEVCG_ACC_MKNOD_UNS for bpf programs of type
> BPF_PROG_TYPE_CGROUP_DEVICE which allows to guard access to mknod
> in non-initial user namespaces.
>
> If a container manager restricts its unprivileged (user namespaced)
> children by a device cgroup, it is not necessary to deny mknod()
> anymore. Thus, user space applications may map devices on different
> locations in the file system by using mknod() inside the container.
>
> A use case for this, we also use in GyroidOS, is to run virsh for
> VMs inside an unprivileged container. virsh creates device nodes,
> e.g., "/var/run/libvirt/qemu/11-fgfg.dev/null" which currently fails
> in a non-initial userns, even if a cgroup device white list with the
> corresponding major, minor of /dev/null exists. Thus, in this case
> the usual bind mounts or pre populated device nodes under /dev are
> not sufficient.
>
> To circumvent this limitation, allow mknod() by checking CAP_MKNOD
> in the userns by implementing the security_inode_mknod_nscap(). The
> hook implementation checks if the corresponding permission flag
> BPF_DEVCG_ACC_MKNOD_UNS is set for the device in the bpf program.
> To avoid to create unusable inodes in user space the hook also
> checks SB_I_NODEV on the corresponding super block.
>
> Further, the security_sb_alloc_userns() hook is implemented using
> cgroup_bpf_current_enabled() to allow usage of device nodes on super
> blocks mounted by a guarded task.
>
> Patch 1 to 3 rework the current devcgroup_inode hooks as an LSM
>
> Patch 4 to 8 rework explicit calls to devcgroup_check_permission
> also as LSM hooks and finalize the conversion of the device_cgroup
> subsystem to a LSM.
>
> Patch 9 and 10 introduce new generic security hooks to be used
> for the actual mknod device guard implementation.
>
> Patch 11 wires up the security hooks in the vfs
>
> Patch 12 and 13 provide helper functions in the bpf cgroup
> subsystem.
>
> Patch 14 finally implement the LSM hooks to grand access
>
> Signed-off-by: Michael Weiß
> ---
> Changes in v2:
> - Integrate this as LSM (Christian, Paul)
> - Switched to a device cgroup specific flag instead of a generic
>   bpf program flag (Christian)
> - do not ignore SB_I_NODEV in fs/namei.c but use LSM hook in
>   sb_alloc_super in fs/super.c
> - Link to v1: https://lore.kernel.org/r/20230814-devcg_guard-v1-0-654971ab88b1@aisec.fraunhofer.de
>
> Michael Weiß (14):
>   device_cgroup: Implement devcgroup hooks as lsm security hooks
>   vfs: Remove explicit devcgroup_inode calls
>   device_cgroup: Remove explicit devcgroup_inode hooks
>   lsm: Add security_dev_permission() hook
>   device_cgroup: Implement dev_permission() hook
>   block: Switch from devcgroup_check_permission to security hook
>   drm/amdkfd: Switch from devcgroup_check_permission to security hook
>   device_cgroup: Hide devcgroup functionality completely in lsm
>   lsm: Add security_inode_mknod_nscap() hook
>   lsm: Add security_sb_alloc_userns() hook
>   vfs: Wire up security hooks for lsm-based device guard in userns
>   bpf: Add flag BPF_DEVCG_ACC_MKNOD_UNS for device access
>   bpf: cgroup: Introduce helper cgroup_bpf_current_enabled()
>   device_cgroup: Allow mknod in non-initial userns if guarded
>
>  block/bdev.c                                 |   9 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h        |   7 +-
>  fs/namei.c                                   |  24 ++--
>  fs/super.c                                   |   6 +-
>  include/linux/bpf-cgroup.h                   |   2 +
>  include/linux/device_cgroup.h                |  67 -----------
>  include/linux/lsm_hook_defs.h                |   4 +
>  include/linux/security.h                     |  18 +++
>  include/uapi/linux/bpf.h                     |   1 +
>  init/Kconfig                                 |   4 +
>  kernel/bpf/cgroup.c                          |  14 +++
>  security/Kconfig                             |   1 +
>  security/Makefile                            |   2 +-
>  security/device_cgroup/Kconfig               |   7 ++
>  security/device_cgroup/Makefile              |   4 +
>  security/{ => device_cgroup}/device_cgroup.c |   3 +-
>  security/device_cgroup/device_cgroup.h       |  20 ++++
>  security/device_cgroup/lsm.c                 | 114 +++++++++++++++++++
>  security/security.c                          |  75 ++++++++++++
>  19 files changed, 294 insertions(+), 88 deletions(-)
>  delete mode 100644 include/linux/device_cgroup.h
>  create mode 100644 security/device_cgroup/Kconfig
>  create mode 100644 security/device_cgroup/Makefile
>  rename security/{ => device_cgroup}/device_cgroup.c (99%)
>  create mode 100644 security/device_cgroup/device_cgroup.h
>  create mode 100644 security/device_cgroup/lsm.c

Hi Michael,

I think this was lost because it wasn't CC'd to the LSM list (see
below).  I've CC'd the list on my reply, but future patch submissions
that involve the LSM must be posted to the LSM list if you would like
them to be considered.

http://vger.kernel.org/vger-lists.html#linux-security-module

-- 
paul-moore.com