Received: by 2002:a05:6a10:af89:0:0:0:0 with SMTP id iu9csp1392487pxb; Fri, 21 Jan 2022 17:27:47 -0800 (PST) X-Google-Smtp-Source: ABdhPJyyU4FLK+E3RFjJ4nEeBPbhufRxBqLEhGDOVExFM131RtvQaTWejS/3J4qhvpOpzi81GmAx X-Received: by 2002:a17:90b:3010:: with SMTP id hg16mr3226472pjb.62.1642814867700; Fri, 21 Jan 2022 17:27:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1642814867; cv=none; d=google.com; s=arc-20160816; b=QMaSV2eO+1vY55UC+qPQKLK2b6WuDVxhLX4vIWvpFLvl2uyrOuAYrnCCcY6UVKzUPO wtbTwP1iyLJPEW87ew3Tpw7Z53jco6Y2UEt4Q5KCd2daOgW5QbZFsR9vXbjue4q7tWdi oY7zk9YA/JoXQELf297z8EdnN4ax2r/sbxHMHwOUB+ixXeHnRO7kDK34V4XeVP8PuGt9 bLCXN8LCqgUHQOZHMgw5/fz+hlCK8RlBicUdrBY4kR4gKxUhU+wshnubgITKbyhz3ToV qAZtzpYNxtjnFJ6n8d7FR3d8bwyt+qHaFJOAMcLGFDPcM8G/jFzMTZeozzXzMZw8e0gZ Jr+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=Rc0y2QmrtzSlM9fG48ydj8yDOjMkUGj0zT9Hb7xqWyU=; b=MZj96zjvPIsNUZKe/kBXF9QJRR40KM3SCuvno02X5iropzs0ughVhKTjgDAM8pFa1e /000zmJdQ6pfesX2fsWLJ+2cw//Im2u7xlGIjQmR+L9Aw5QQl/P4jUMSmYLg5DTCFqi6 Znj/3Zguh/6m814h0BrpXQyzwhK1XNv1o21athxHMFtR5le7w6kQ1AZmRwq/2KSpdfCm Fcke4SvaiZTwFIF2SS31U6UGByxP7Lpks8/Oo9/XjzeeAU8c78Plz4gbIi4DiHs4U5a7 VKbs2ATHjQYWBjacilrJaiEjW/CWSSMvKlJcx6D9leFpSWgTjURmHZnSVUHHm40sczfe OuzQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p13si8421426pfh.140.2022.01.21.17.27.35; Fri, 21 Jan 2022 17:27:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1380536AbiAUNKR convert rfc822-to-8bit (ORCPT + 99 others); Fri, 21 Jan 2022 08:10:17 -0500 Received: from us-smtp-delivery-44.mimecast.com ([207.211.30.44]:23857 "EHLO us-smtp-delivery-44.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245103AbiAUNKQ (ORCPT ); Fri, 21 Jan 2022 08:10:16 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-355-2fp-tSzhPpi_DP4pLkh_6Q-1; Fri, 21 Jan 2022 08:10:12 -0500 X-MC-Unique: 2fp-tSzhPpi_DP4pLkh_6Q-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id EDBD983DEE1; Fri, 21 Jan 2022 13:10:10 +0000 (UTC) Received: from comp-core-i7-2640m-0182e6.redhat.com (unknown [10.36.110.3]) by smtp.corp.redhat.com (Postfix) with ESMTP id A985078AA0; Fri, 21 Jan 2022 13:09:11 +0000 (UTC) From: Alexey Gladkov To: LKML , Linux Containers Cc: Alexander Mikhalitsyn , Andrew Morton , Christian Brauner , Daniel Walsh , Davidlohr Bueso , "Eric W . Biederman" , Kirill Tkhai , Manfred Spraul , Serge Hallyn , Varad Gautam , Vasily Averin Subject: [RFC PATCH v3 2/4] ipc: Store ipc sysctls in the ipc namespace Date: Fri, 21 Jan 2022 14:08:39 +0100 Message-Id: <5cee8775a57c18d56701b39d28dfbe9ad8a7cc38.1642769810.git.legion@kernel.org> In-Reply-To: References: <87tuebwo99.fsf@email.froward.int.ebiederm.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=legion@kernel.org X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org Content-Transfer-Encoding: 8BIT Content-Type: text/plain; charset=WINDOWS-1252 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The ipc sysctls are not available for modification inside the user namespace. Following the mqueue sysctls, we changed the implementation to be more userns friendly. So far, the changes do not provide additional access to files. This will be done in a future patch. Signed-off-by: Alexey Gladkov --- include/linux/ipc_namespace.h | 21 ++++ ipc/ipc_sysctl.c | 189 ++++++++++++++++++++++------------ ipc/namespace.c | 4 + 3 files changed, 147 insertions(+), 67 deletions(-) diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h index fa787d97d60a..94af746704fa 100644 --- a/include/linux/ipc_namespace.h +++ b/include/linux/ipc_namespace.h @@ -67,6 +67,9 @@ struct ipc_namespace { struct ctl_table_set mq_set; struct ctl_table_header *mq_sysctls; + struct ctl_table_set set; + struct ctl_table_header *ipc_sysctls; + /* user_ns which owns the ipc ns */ struct user_namespace *user_ns; struct ucounts *ucounts; @@ -188,4 +191,22 @@ static inline bool setup_mq_sysctls(struct ipc_namespace *ns) } #endif /* CONFIG_POSIX_MQUEUE_SYSCTL */ + +#ifdef CONFIG_SYSVIPC_SYSCTL + +bool setup_ipc_sysctls(struct ipc_namespace *ns); +void retire_ipc_sysctls(struct ipc_namespace *ns); + +#else /* CONFIG_SYSVIPC_SYSCTL */ + +static inline void retire_ipc_sysctls(struct ipc_namespace *ns) +{ +} + +static inline bool setup_ipc_sysctls(struct ipc_namespace *ns) +{ + return true; +} + +#endif /* CONFIG_SYSVIPC_SYSCTL */ #endif diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c index f101c171753f..dd87ba12f5e3 100644 --- a/ipc/ipc_sysctl.c +++ b/ipc/ipc_sysctl.c @@ -13,43 +13,22 @@ #include #include #include +#include #include "util.h" -static void *get_ipc(struct ctl_table *table) -{ - char *which = table->data; - struct ipc_namespace *ipc_ns = current->nsproxy->ipc_ns; - which = (which - (char *)&init_ipc_ns) + (char *)ipc_ns; - return which; -} - -static int proc_ipc_dointvec(struct ctl_table *table, int write, - void *buffer, size_t *lenp, loff_t *ppos) -{ - struct ctl_table ipc_table; - - memcpy(&ipc_table, table, sizeof(ipc_table)); - ipc_table.data = get_ipc(table); - - return proc_dointvec(&ipc_table, write, buffer, lenp, ppos); -} - -static int proc_ipc_dointvec_minmax(struct ctl_table *table, int write, +static int proc_ipc_dointvec_minmax_orphans(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { + struct ipc_namespace *ns = table->extra1; struct ctl_table ipc_table; + int err; memcpy(&ipc_table, table, sizeof(ipc_table)); - ipc_table.data = get_ipc(table); - return proc_dointvec_minmax(&ipc_table, write, buffer, lenp, ppos); -} + ipc_table.extra1 = SYSCTL_ZERO; + ipc_table.extra2 = SYSCTL_ONE; -static int proc_ipc_dointvec_minmax_orphans(struct ctl_table *table, int write, - void *buffer, size_t *lenp, loff_t *ppos) -{ - struct ipc_namespace *ns = current->nsproxy->ipc_ns; - int err = proc_ipc_dointvec_minmax(table, write, buffer, lenp, ppos); + err = proc_dointvec_minmax(&ipc_table, write, buffer, lenp, ppos); if (err < 0) return err; @@ -58,17 +37,6 @@ static int proc_ipc_dointvec_minmax_orphans(struct ctl_table *table, int write, return err; } -static int proc_ipc_doulongvec_minmax(struct ctl_table *table, int write, - void *buffer, size_t *lenp, loff_t *ppos) -{ - struct ctl_table ipc_table; - memcpy(&ipc_table, table, sizeof(ipc_table)); - ipc_table.data = get_ipc(table); - - return proc_doulongvec_minmax(&ipc_table, write, buffer, - lenp, ppos); -} - static int proc_ipc_auto_msgmni(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { @@ -87,11 +55,17 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write, static int proc_ipc_sem_dointvec(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { + struct ipc_namespace *ns = table->extra1; + struct ctl_table ipc_table; int ret, semmni; - struct ipc_namespace *ns = current->nsproxy->ipc_ns; + + memcpy(&ipc_table, table, sizeof(ipc_table)); + + ipc_table.extra1 = NULL; + ipc_table.extra2 = NULL; semmni = ns->sem_ctls[3]; - ret = proc_ipc_dointvec(table, write, buffer, lenp, ppos); + ret = proc_dointvec(table, write, buffer, lenp, ppos); if (!ret) ret = sem_check_semmni(current->nsproxy->ipc_ns); @@ -108,12 +82,18 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write, static int proc_ipc_dointvec_minmax_checkpoint_restore(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct user_namespace *user_ns = current->nsproxy->ipc_ns->user_ns; + struct ipc_namespace *ns = table->extra1; + struct ctl_table ipc_table; - if (write && !checkpoint_restore_ns_capable(user_ns)) + if (write && !checkpoint_restore_ns_capable(ns->user_ns)) return -EPERM; - return proc_ipc_dointvec_minmax(table, write, buffer, lenp, ppos); + memcpy(&ipc_table, table, sizeof(ipc_table)); + + ipc_table.extra1 = SYSCTL_ZERO; + ipc_table.extra2 = SYSCTL_INT_MAX; + + return proc_dointvec_minmax(&ipc_table, write, buffer, lenp, ppos); } #endif @@ -121,27 +101,27 @@ int ipc_mni = IPCMNI; int ipc_mni_shift = IPCMNI_SHIFT; int ipc_min_cycle = RADIX_TREE_MAP_SIZE; -static struct ctl_table ipc_kern_table[] = { +static struct ctl_table ipc_sysctls[] = { { .procname = "shmmax", .data = &init_ipc_ns.shm_ctlmax, .maxlen = sizeof(init_ipc_ns.shm_ctlmax), .mode = 0644, - .proc_handler = proc_ipc_doulongvec_minmax, + .proc_handler = proc_doulongvec_minmax, }, { .procname = "shmall", .data = &init_ipc_ns.shm_ctlall, .maxlen = sizeof(init_ipc_ns.shm_ctlall), .mode = 0644, - .proc_handler = proc_ipc_doulongvec_minmax, + .proc_handler = proc_doulongvec_minmax, }, { .procname = "shmmni", .data = &init_ipc_ns.shm_ctlmni, .maxlen = sizeof(init_ipc_ns.shm_ctlmni), .mode = 0644, - .proc_handler = proc_ipc_dointvec_minmax, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = &ipc_mni, }, @@ -151,15 +131,13 @@ static struct ctl_table ipc_kern_table[] = { .maxlen = sizeof(init_ipc_ns.shm_rmid_forced), .mode = 0644, .proc_handler = proc_ipc_dointvec_minmax_orphans, - .extra1 = SYSCTL_ZERO, - .extra2 = SYSCTL_ONE, }, { .procname = "msgmax", .data = &init_ipc_ns.msg_ctlmax, .maxlen = sizeof(init_ipc_ns.msg_ctlmax), .mode = 0644, - .proc_handler = proc_ipc_dointvec_minmax, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_INT_MAX, }, @@ -168,7 +146,7 @@ static struct ctl_table ipc_kern_table[] = { .data = &init_ipc_ns.msg_ctlmni, .maxlen = sizeof(init_ipc_ns.msg_ctlmni), .mode = 0644, - .proc_handler = proc_ipc_dointvec_minmax, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = &ipc_mni, }, @@ -186,7 +164,7 @@ static struct ctl_table ipc_kern_table[] = { .data = &init_ipc_ns.msg_ctlmnb, .maxlen = sizeof(init_ipc_ns.msg_ctlmnb), .mode = 0644, - .proc_handler = proc_ipc_dointvec_minmax, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_INT_MAX, }, @@ -204,8 +182,6 @@ static struct ctl_table ipc_kern_table[] = { .maxlen = sizeof(init_ipc_ns.ids[IPC_SEM_IDS].next_id), .mode = 0666, .proc_handler = proc_ipc_dointvec_minmax_checkpoint_restore, - .extra1 = SYSCTL_ZERO, - .extra2 = SYSCTL_INT_MAX, }, { .procname = "msg_next_id", @@ -213,8 +189,6 @@ static struct ctl_table ipc_kern_table[] = { .maxlen = sizeof(init_ipc_ns.ids[IPC_MSG_IDS].next_id), .mode = 0666, .proc_handler = proc_ipc_dointvec_minmax_checkpoint_restore, - .extra1 = SYSCTL_ZERO, - .extra2 = SYSCTL_INT_MAX, }, { .procname = "shm_next_id", @@ -222,25 +196,106 @@ static struct ctl_table ipc_kern_table[] = { .maxlen = sizeof(init_ipc_ns.ids[IPC_SHM_IDS].next_id), .mode = 0666, .proc_handler = proc_ipc_dointvec_minmax_checkpoint_restore, - .extra1 = SYSCTL_ZERO, - .extra2 = SYSCTL_INT_MAX, }, #endif {} }; -static struct ctl_table ipc_root_table[] = { - { - .procname = "kernel", - .mode = 0555, - .child = ipc_kern_table, - }, - {} +static struct ctl_table_set *set_lookup(struct ctl_table_root *root) +{ + return ¤t->nsproxy->ipc_ns->set; +} + +static int set_is_seen(struct ctl_table_set *set) +{ + return ¤t->nsproxy->ipc_ns->set == set; +} + +static struct ctl_table_root set_root = { + .lookup = set_lookup, }; +bool setup_ipc_sysctls(struct ipc_namespace *ns) +{ + struct ctl_table *tbl; + + setup_sysctl_set(&ns->set, &set_root, set_is_seen); + + tbl = kmemdup(ipc_sysctls, sizeof(ipc_sysctls), GFP_KERNEL); + if (tbl) { + int i; + + for (i = 0; i < ARRAY_SIZE(ipc_sysctls); i++) { + if (tbl[i].data == &init_ipc_ns.shm_ctlmax) { + tbl[i].data = &ns->shm_ctlmax; + + } else if (tbl[i].data == &init_ipc_ns.shm_ctlall) { + tbl[i].data = &ns->shm_ctlall; + + } else if (tbl[i].data == &init_ipc_ns.shm_ctlmni) { + tbl[i].data = &ns->shm_ctlmni; + + } else if (tbl[i].data == &init_ipc_ns.shm_rmid_forced) { + tbl[i].data = &ns->shm_rmid_forced; + tbl[i].extra1 = ns; + + } else if (tbl[i].data == &init_ipc_ns.msg_ctlmax) { + tbl[i].data = &ns->msg_ctlmax; + + } else if (tbl[i].data == &init_ipc_ns.msg_ctlmni) { + tbl[i].data = &ns->msg_ctlmni; + + } else if (tbl[i].data == &init_ipc_ns.msg_ctlmnb) { + tbl[i].data = &ns->msg_ctlmnb; + + } else if (tbl[i].data == &init_ipc_ns.sem_ctls) { + tbl[i].data = &ns->sem_ctls; + tbl[i].extra1 = ns; +#ifdef CONFIG_CHECKPOINT_RESTORE + } else if (tbl[i].data == &init_ipc_ns.ids[IPC_SEM_IDS].next_id) { + tbl[i].data = &ns->ids[IPC_SEM_IDS].next_id; + tbl[i].extra1 = ns; + + } else if (tbl[i].data == &init_ipc_ns.ids[IPC_MSG_IDS].next_id) { + tbl[i].data = &ns->ids[IPC_MSG_IDS].next_id; + tbl[i].extra1 = ns; + + } else if (tbl[i].data == &init_ipc_ns.ids[IPC_SHM_IDS].next_id) { + tbl[i].data = &ns->ids[IPC_SHM_IDS].next_id; + tbl[i].extra1 = ns; +#endif + } else { + tbl[i].data = NULL; + } + } + + ns->ipc_sysctls = __register_sysctl_table(&ns->set, "kernel", tbl); + } + if (!ns->ipc_sysctls) { + kfree(tbl); + retire_sysctl_set(&ns->set); + return false; + } + + return true; +} + +void retire_ipc_sysctls(struct ipc_namespace *ns) +{ + struct ctl_table *tbl; + + tbl = ns->ipc_sysctls->ctl_table_arg; + unregister_sysctl_table(ns->ipc_sysctls); + retire_sysctl_set(&ns->set); + kfree(tbl); +} + static int __init ipc_sysctl_init(void) { - register_sysctl_table(ipc_root_table); + if (!setup_ipc_sysctls(&init_ipc_ns)) { + pr_warn("ipc sysctl registration failed\n"); + return -ENOMEM; + } return 0; } diff --git a/ipc/namespace.c b/ipc/namespace.c index f760243ca685..754f3237194a 100644 --- a/ipc/namespace.c +++ b/ipc/namespace.c @@ -63,6 +63,9 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns, if (!setup_mq_sysctls(ns)) goto fail_put; + if (!setup_ipc_sysctls(ns)) + goto fail_put; + sem_init_ns(ns); msg_init_ns(ns); shm_init_ns(ns); @@ -130,6 +133,7 @@ static void free_ipc_ns(struct ipc_namespace *ns) shm_exit_ns(ns); retire_mq_sysctls(ns); + retire_ipc_sysctls(ns); dec_ipc_namespaces(ns->ucounts); put_user_ns(ns->user_ns); -- 2.33.0