From: Alexey Gladkov
To: LKML, Linux Containers
Cc: Andrew Morton,
    Christian Brauner, "Eric W. Biederman", Kees Cook, Manfred Spraul
Subject: [PATCH v2 1/3] sysctl: Allow change system v ipc sysctls inside ipc namespace
Date: Tue, 20 Sep 2022 20:08:20 +0200
Message-Id: <0895bd453013370eb4f9600e26e2a9969ee755de.1663696560.git.legion@kernel.org>

Rootless containers are not allowed to modify kernel IPC parameters.

All default limits are set to such high values that, in practice, there
are no limits at all. None of the limits are inherited; they are
initialized to default values when a new ipc_namespace is created.

For a new ipc_namespace:

size_t       ipc_ns.shm_ctlmax = SHMMAX; // (ULONG_MAX - (1UL << 24))
size_t       ipc_ns.shm_ctlall = SHMALL; // (ULONG_MAX - (1UL << 24))
int          ipc_ns.shm_ctlmni = IPCMNI; // (1 << 15)
int          ipc_ns.shm_rmid_forced = 0;
unsigned int ipc_ns.msg_ctlmax = MSGMAX; // 8192
unsigned int ipc_ns.msg_ctlmni = MSGMNI; // 32000
unsigned int ipc_ns.msg_ctlmnb = MSGMNB; // 16384

The shm_tot counter (total number of shared pages) is also no longer
global: it lives in ipc_namespace and is not inherited from anywhere.

Under these conditions, the limits do not actually limit anything. The
real limiter is cgroups.

If we allow rootless containers to change these parameters, then the
limits can only be reduced.

Signed-off-by: Alexey Gladkov
---
 ipc/ipc_sysctl.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index ef313ecfb53a..a6a9d7f680dd 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -190,25 +190,53 @@ static int set_is_seen(struct ctl_table_set *set)
 	return &current->nsproxy->ipc_ns->ipc_set == set;
 }
 
+static void ipc_set_ownership(struct ctl_table_header *head,
+			      struct ctl_table *table,
+			      kuid_t *uid, kgid_t *gid)
+{
+	struct ipc_namespace *ns =
+		container_of(head->set, struct ipc_namespace, ipc_set);
+
+	kuid_t ns_root_uid = make_kuid(ns->user_ns, 0);
+	kgid_t ns_root_gid = make_kgid(ns->user_ns, 0);
+
+	*uid = uid_valid(ns_root_uid) ? ns_root_uid : GLOBAL_ROOT_UID;
+	*gid = gid_valid(ns_root_gid) ? ns_root_gid : GLOBAL_ROOT_GID;
+}
+
 static int ipc_permissions(struct ctl_table_header *head, struct ctl_table *table)
 {
+	struct ipc_namespace *ns =
+		container_of(head->set, struct ipc_namespace, ipc_set);
 	int mode = table->mode;
+	kuid_t ns_root_uid;
+	kgid_t ns_root_gid;
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
-	struct ipc_namespace *ns = current->nsproxy->ipc_ns;
+	ipc_set_ownership(head, table, &ns_root_uid, &ns_root_gid);
 
+#ifdef CONFIG_CHECKPOINT_RESTORE
 	if (((table->data == &ns->ids[IPC_SEM_IDS].next_id) ||
 	     (table->data == &ns->ids[IPC_MSG_IDS].next_id) ||
 	     (table->data == &ns->ids[IPC_SHM_IDS].next_id)) &&
 	    checkpoint_restore_ns_capable(ns->user_ns))
 		mode = 0666;
+	else
 #endif
-	return mode;
+	if (uid_eq(current_euid(), ns_root_uid))
+		mode >>= 6;
+
+	else if (in_egroup_p(ns_root_gid))
+		mode >>= 3;
+
+	mode &= 7;
+
+	return (mode << 6) | (mode << 3) | mode;
 }
 
 static struct ctl_table_root set_root = {
 	.lookup = set_lookup,
 	.permissions = ipc_permissions,
+	.set_ownership = ipc_set_ownership,
 };
 
 bool setup_ipc_sysctls(struct ipc_namespace *ns)
-- 
2.33.4
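
The mode arithmetic at the end of ipc_permissions() can be tried outside
the kernel. The following is a minimal user-space sketch and is not part
of the patch: effective_mode() is a hypothetical name, and its two boolean
parameters are stand-ins for the results of
uid_eq(current_euid(), ns_root_uid) and in_egroup_p(ns_root_gid). It
assumes the checkpoint/restore branch was not taken.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the mode selection in ipc_permissions().
 *
 * table_mode      - the sysctl entry's mode, e.g. 0644
 * is_ns_root_uid  - stand-in for uid_eq(current_euid(), ns_root_uid)
 * in_ns_root_gid  - stand-in for in_egroup_p(ns_root_gid)
 */
static int effective_mode(int table_mode, bool is_ns_root_uid,
			  bool in_ns_root_gid)
{
	int mode = table_mode;

	/* Only the nine rwx permission bits are modelled here. */
	assert((table_mode & ~0777) == 0);

	if (is_ns_root_uid)
		mode >>= 6;	/* select the owner bits */
	else if (in_ns_root_gid)
		mode >>= 3;	/* select the group bits */
	/* otherwise the low three bits (other) are used unchanged */

	mode &= 7;

	/* Replicate the selected class into owner, group, and other. */
	return (mode << 6) | (mode << 3) | mode;
}
```

For an entry with mode 0644, this model returns 0666 when the caller is
the namespace root user and 0444 otherwise. Replicating the selected three
bits into every class presumably means that a later check which picks
owner, group, or other bits by its own rule sees the same answer whichever
class it chooses; that reading is an inference from the code, not
something the patch description states.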