From: Alexey Gladkov
To: LKML, Linux Containers
Cc: Andrew Morton, Christian Brauner, "Eric W . Biederman", Kees Cook, Manfred Spraul
Subject: Re: [PATCH v1] sysctl: Allow change system v ipc sysctls inside ipc namespace
Date: Tue, 16 Aug 2022 17:42:42 +0200
In-Reply-To: <87wnc1i2wo.fsf@email.froward.int.ebiederm.org>
References: <87wnc1i2wo.fsf@email.froward.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 25, 2022 at 11:16:07AM -0500, Eric W. Biederman wrote:
> Alexey Gladkov writes:
>
> > Rootless containers are not allowed to modify kernel IPC parameters such
> > as kernel.msgmnb.
> >
> > It seems to me that we can allow customization of these parameters if
> > the user has CAP_SYS_RESOURCE in that ipc namespace.
> >
> > CAP_SYS_RESOURCE is already needed in order to overcome mqueue limits
> > (msg_max and msgsize_max).
>
>
> For changing the permissions on who can modify the SysV limits, I don't
> think this change is safe. I don't see anything that will prevent abuse
> if anyone can modify these limits. Replacing the ordinary unix DAC
> permission check with ns_capable will allow anyone to modify the limits.

All limits are set to almost the maximum value, ULONG_MAX. Limit values
are not inherited and are counted separately in each ipc namespace
(shm_tot is not global; it is located in ipc_ns).

In effect, the limits are disabled by default. They can only be reduced.

> That said there is RLIMIT_MSGQUEUE that limits the posix messages queues
> so those should be safe to allow anyone to modify their limits.
>
> The code in mqueue_get_inode is where that limiting happens.
>
> For the posix message queues all that should be needed is to change the
> owner of the sysctl files from the global root to the user namespace
> root. There are also two capable calls in ipc/mqueue.c that can
> probably be changed to ns_capable calls.
>
>
> The only posix message queue limit that I don't immediately see
> something that will prevent abuse of is /proc/sys/fs/mqueue/queus_max.
> That probably still runs into RLIMIT_MSGQUEUE somewhere but it was
> not immediately obvious at first glance.

Everything ends up in mqueue_get_inode. In mqueue_create_attr we check
mq_queues_max and then call mqueue_get_inode almost immediately.

I suggest allowing root in the user namespace to change the ipc
namespace limits.

--

Alexey Gladkov (3):
  sysctl: Allow change system v ipc sysctls inside ipc namespace
  sysctl: Allow to change limits for posix messages queues
  docs: Add information about ipc sysctls limitations

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++--
 ipc/ipc_sysctl.c                            | 34 ++++++++++++++++---
 ipc/mq_sysctl.c                             | 36 +++++++++++++++++++++
 3 files changed, 76 insertions(+), 8 deletions(-)

-- 
2.33.4