Date: Mon, 26 Feb 2024 16:45:33 +0100
From: Christian Brauner
To: Tycho Andersen
Cc: Alexander Mikhalitsyn, stgraber@stgraber.org, cyphar@cyphar.com,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] tests/pid_namespace: add pid_max tests
Message-ID: <20240226-bewaffnen-kinokarten-94eb5abf727c@brauner>
References:
 <20240222160915.315255-1-aleksandr.mikhalitsyn@canonical.com>
 <20240222160915.315255-3-aleksandr.mikhalitsyn@canonical.com>
 <20240223-kantholz-knallen-558beba46c62@brauner>
 <20240226-gestrafft-pastinaken-94ff0e993a51@brauner>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

On Mon, Feb 26, 2024 at 08:30:35AM -0700, Tycho Andersen wrote:
> On Mon, Feb 26, 2024 at 09:57:47AM +0100, Christian Brauner wrote:
> > > > > A small quibble, but I wonder about the semantics here. "You can write
> > > > > whatever you want to this file, but we'll ignore it sometimes" seems
> > > > > weird to me. What if someone (CRIU) wants to spawn a pid numbered 450
> > > > > in this case? I suppose they read pid_max first, they'll be able to
> > > > > tell it's impossible and can exit(1), but returning E2BIG from write()
> > > > > might be more useful.
> > > >
> > > > That's a good idea. But it's a bit tricky. The straightforward thing is
> > > > to walk upwards through all ancestor pid namespaces and use the lowest
> > > > pid_max value as the upper bound for the current pid namespace. This
> > > > will guarantee that you get an error when you try to write a value that
> > > > you wouldn't be able to create. The same logic should probably apply to
> > > > ns_last_pid as well.
> > > >
> > > > However, that still leaves cases where the current pid namespace writes
> > > > a pid_max limit that is allowed (IOW, all ancestor pid namespaces are
> > > > above that limit). But then immediately afterwards an ancestor pid
> > > > namespace lowers the pid_max limit. So you can always end up in a
> > > > scenario like this.
> > >
> > > I wonder if we can push edits down too? Or render an .effective file, like
> >
> > I don't think that works in the current design? The pid_max value is per
> > struct pid_namespace.
> > And while there is a 1:1 relationship between a
> > child pid namespace to all of its ancestor pid namespaces there's a 1 to
> > many relationship between a pid namespace and its child pid namespaces.
> > IOW, if you change pid_max in pidns_level_1 then you'd have to go
> > through each of the child pid namespaces on pidns_level_2 which could be
> > thousands. So you could only do this lazily. IOW, compare and possibly
> > update the pid_max value of the child pid namespace every time it's read
> > or written. Maybe that .effective is the way to go; not sure right now.
>
> I wonder then, does it make sense to implement this as a cgroup thing
> instead, which is used to doing this kind of traversal?
>
> Or I suppose not, since the idea is to get legacy software that's
> writing to pid_max to work?

My personal perspective is that this is not so important. The original
motivation for this was legacy workloads that expect pid numbers only up
to a certain size and would otherwise break. And for them it doesn't
matter whether that setting is applied through pid_max or via some
cgroup setting. All that matters is that they don't get pids beyond what
they expect.

So yes, from my POV we could try and make this a cgroup property. But we
should check with Tejun first whether he'd consider this a useful
addition or not.
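
For illustration only, here is a rough userspace model of the two ideas
discussed above: rejecting a write that exceeds the lowest ancestor
pid_max, and computing an "effective" value lazily at read time instead
of pushing changes down to every child. This is a sketch under
assumptions, not kernel code. PidNamespace, ancestor_bound(),
effective_pid_max() and write_pid_max() are hypothetical names, and the
E2BIG return is the behaviour suggested in the thread, not something the
patch is known to implement.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class PidNamespace:
    """Toy stand-in for struct pid_namespace: one pid_max per namespace."""
    pid_max: int
    parent: Optional["PidNamespace"] = None


def ancestor_bound(ns: PidNamespace) -> Optional[int]:
    """Lowest pid_max among all ancestors, or None for the initial namespace."""
    bound = None
    cur = ns.parent
    while cur is not None:  # walk upwards through every ancestor
        bound = cur.pid_max if bound is None else min(bound, cur.pid_max)
        cur = cur.parent
    return bound


def effective_pid_max(ns: PidNamespace) -> int:
    """Lazy '.effective' value: own pid_max clamped by the ancestors."""
    bound = ancestor_bound(ns)
    return ns.pid_max if bound is None else min(ns.pid_max, bound)


def write_pid_max(ns: PidNamespace, value: int) -> None:
    """Assumed policy: fail with E2BIG instead of silently ignoring the write."""
    bound = ancestor_bound(ns)
    if bound is not None and value > bound:
        raise OSError(errno.E2BIG, "pid_max exceeds an ancestor's limit")
    ns.pid_max = value


if __name__ == "__main__":
    init_ns = PidNamespace(pid_max=1000)
    level_1 = PidNamespace(pid_max=500, parent=init_ns)
    level_2 = PidNamespace(pid_max=500, parent=level_1)

    try:
        write_pid_max(level_2, 600)  # above level_1's limit of 500
    except OSError as err:
        print("rejected:", errno.errorcode[err.errno])

    write_pid_max(level_2, 450)  # allowed at the time of the write
    write_pid_max(level_1, 400)  # ancestor lowers its limit afterwards
    print(level_2.pid_max, effective_pid_max(level_2))
```

The last two writes reproduce the scenario described above: level_2
still stores 450, but only the lazily computed value reflects the
ancestor's later change, and no child namespace had to be visited when
level_1 was updated.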