Date: Mon, 21 Sep 2020 16:57:38 +0200
From: Michal Hocko
To: Christian Brauner
Cc: Peter Xu, Tejun Heo, Linus Torvalds, Jason Gunthorpe, John Hubbard,
 Leon Romanovsky, Linux-MM, Linux Kernel Mailing List, "Maya B.
 Gokhale", Yang Shi, Marty Mcfadden, Kirill Shutemov, Oleg Nesterov,
 Jann Horn, Jan Kara, Kirill Tkhai, Andrea Arcangeli, Christoph Hellwig,
 Andrew Morton
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification
Message-ID: <20200921145738.GN12990@dhcp22.suse.cz>
References: <20200916174804.GC8409@ziepe.ca> <20200916184619.GB40154@xz-x1>
 <20200917112538.GD8409@ziepe.ca> <20200917193824.GL8409@ziepe.ca>
 <20200918164032.GA5962@xz-x1> <20200921134200.GK12990@dhcp22.suse.cz>
 <20200921144134.fuvkkv6wgrzpbwnv@wittgenstein>
In-Reply-To: <20200921144134.fuvkkv6wgrzpbwnv@wittgenstein>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 21-09-20 16:41:34, Christian Brauner wrote:
> On Mon, Sep 21, 2020 at 03:42:00PM +0200, Michal Hocko wrote:
> > [Cc Tejun and Christian - this is a part of a larger discussion which is
> > not directly related to this particular question so let me trim the
> > original email to the bare minimum.]
> >
> > On Fri 18-09-20 12:40:32, Peter Xu wrote:
> > [...]
> > > One issue is when we charge for cgroup we probably can't do that onto the new
> > > mm/task, since copy_namespaces() is called after copy_mm(). I don't know
> > > enough about cgroup, I thought the child will inherit the parent's, but I'm not
> > > sure. Or, can we change that order of copy_namespaces() && copy_mm()? I don't
> > > see a problem so far but I'd like to ask first..
> >
> > I suspect you are referring to CLONE_INTO_CGROUP, right? I have only now
> > learned about this feature so I am not deeply familiar with all the
> > details and I might be easily wrong. Normally all the cgroup aware
> > resources are accounted to the parent's cgroup. For memcg that includes
> > all the page tables, early CoW and other allocations with __GFP_ACCOUNT.
> > If I understand CLONE_INTO_CGROUP properly then this hasn't changed as the
> > child is associated to its new cgroup (and memcg) only in
> > cgroup_post_fork(). If that is correct then we might have quite a lot of
> > resources bound to the child's lifetime but accounted to the parent's memcg,
> > which can lead to all sorts of interesting problems (e.g. unreclaimable
> > memory - even by the oom killer).
> >
> > Christian, Tejun, is this the expected semantic or am I just misreading
> > the code?
>
> Hey Michal,
>
> Thanks for the Cc!
>
> If I understand your question correctly, then you are correct. The logic
> is split in three simple parts:
> 1. Child gets created and doesn't live in any cset
>    - This should mean that resources are still charged against the
>      parent's memcg which is what you're asking afaiu.
> 2. cgroup_can_fork()
>    - create new or find existing matching cset for the child
> 3. cgroup_post_fork()
>    - move/attach child to the new or found cset
>
> _Purely from a CLONE_INTO_CGROUP perspective_ you should be ok to
> reverse the order of copy_mm() and copy_namespaces().

Switching the order wouldn't make much of a difference, right. At least
not for memcg, where all the accounted allocations will still go to
current's memcg.
-- 
Michal Hocko
SUSE Labs
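
To make the ordering argument above concrete, here is a toy Python model of the three-step split. It is only a sketch and not kernel code: Memcg, Task, charge(), fork_into_cgroup() and the swap_order flag are hypothetical names invented for illustration, and copy_namespaces() is assumed to charge nothing. The one behaviour it encodes is the one described in the thread: accounted allocations made during the fork go to the memcg of "current" (the parent), and the child is attached to its target cgroup only at the cgroup_post_fork() step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memcg:
    """Toy memory cgroup: a name plus a counter of charged bytes."""
    name: str
    charged: int = 0


@dataclass
class Task:
    """Toy task; memcg stays None until the task is attached to a cgroup."""
    name: str
    memcg: Optional[Memcg] = None


def charge(current: Task, nbytes: int) -> None:
    """Charge an accounted allocation to the memcg of the task doing the work."""
    assert current.memcg is not None
    current.memcg.charged += nbytes


def fork_into_cgroup(parent: Task, target: Memcg, mm_bytes: int,
                     swap_order: bool = False) -> Task:
    """Model a fork that places the child into `target` (CLONE_INTO_CGROUP-like)."""
    # Step 1: the child exists but is not attached to any cgroup yet.
    child = Task(name="child")

    steps = ["copy_mm", "copy_namespaces"]
    if swap_order:
        steps.reverse()
    for step in steps:
        if step == "copy_mm":
            # The parent is "current" while the fork runs, so page tables and
            # early CoW copies are charged to the parent's memcg.
            charge(parent, mm_bytes)
        # copy_namespaces is modelled as charging nothing (an assumption).

    # Step 2 (cgroup_can_fork): the target is passed in already resolved.
    # Step 3 (cgroup_post_fork): only now is the child attached.
    child.memcg = target
    return child


parent_cg, target_cg = Memcg("parent"), Memcg("target")
parent = Task("parent", parent_cg)
for swap in (False, True):
    fork_into_cgroup(parent, target_cg, mm_bytes=4096, swap_order=swap)
print(parent_cg.charged, target_cg.charged)
```

In this model both orderings leave the target memcg with nothing charged, because the attach happens after copy_mm either way. Changing that would require the charge to follow the child's future memcg rather than current's, which is outside what this sketch covers.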