Date: Mon, 21 Sep 2020 16:28:34 +0200
From: Michal Hocko
To: Peter Xu
Cc: Tejun Heo, Christian Brauner, Linus Torvalds, Jason Gunthorpe,
 John Hubbard, Leon Romanovsky, Linux-MM, Linux Kernel Mailing List,
 "Maya B . Gokhale", Yang Shi, Marty Mcfadden, Kirill Shutemov,
 Oleg Nesterov, Jann Horn, Jan Kara, Kirill Tkhai, Andrea Arcangeli,
 Christoph Hellwig, Andrew Morton
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification
Message-ID: <20200921142834.GL12990@dhcp22.suse.cz>
References: <20200916174804.GC8409@ziepe.ca>
 <20200916184619.GB40154@xz-x1>
 <20200917112538.GD8409@ziepe.ca>
 <20200917193824.GL8409@ziepe.ca>
 <20200918164032.GA5962@xz-x1>
 <20200921134200.GK12990@dhcp22.suse.cz>
 <20200921141830.GE5962@xz-x1>
In-Reply-To: <20200921141830.GE5962@xz-x1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 21-09-20 10:18:30, Peter Xu wrote:
> Hi, Michal,
>
> On Mon, Sep 21, 2020 at 03:42:00PM +0200, Michal Hocko wrote:
[...]
> > I have only now
> > learned about this feature so I am not deeply familiar with all the
> > details and I might be easily wrong. Normally all the cgroup aware
> > resources are accounted to the parent's cgroup. For memcg that includes
> > all the page tables, early CoW and other allocations with __GFP_ACCOUNT.
> > IIUC CLONE_INTO_CGROUP properly then this hasn't changed as the child is
> > associated to its new cgroup (and memcg) only in cgroup_post_fork. If
> > that is correct then we might have quite a lot of resources bound to
> > child's lifetime but accounted to the parent's memcg which can lead to
> > all sorts of interesting problems (e.g. unreclaimable memory - even by
> > the oom killer).
>
> Right that's one of the things that I'm confused too, on that if we always
> account to the parent, then when the child quits whether we uncharge them or
> not, and how.. Not sure whether the accounting of the parent could steadily
> grow as it continues the fork()s.
>
> So is it by design that we account all these to the parents?

Let me try to clarify a bit further my concern. Without CLONE_INTO_CGROUP
this makes some sense.
That is because both parent and child will live in the same cgroup. All
the charges are reference counted, so they will be released when the
respective resource gets freed (e.g. page table released or the backing
page dropped) irrespective of the cgroup the owner is currently living
in.

Fundamentally, CLONE_INTO_CGROUP is similar to a regular fork + move to
the target cgroup after the child gets executed. So in principle there
shouldn't be any big difference, except that the move has to be explicit
and the child has to have enough privileges to move itself. I am not
completely sure about the CLONE_INTO_CGROUP model though. According to
the clone(2) man page it seems that O_RDONLY for the target cgroup
directory is sufficient. That seems much more relaxed IIUC, and it would
allow forking into a different cgroup while keeping a lot of resources
charged to the parent's.
-- 
Michal Hocko
SUSE Labs
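
The accounting concern discussed in this message can be illustrated with a toy model. This is a minimal sketch and not kernel code: the names Memcg, Task and fork_into are hypothetical, and it simply assumes the behaviour described above, namely that allocations made during fork are charged to the parent's memcg, that the child is attached to the target cgroup only afterwards (standing in for cgroup_post_fork), and that each charge is released against the memcg it was originally made to.

```python
from dataclasses import dataclass, field


@dataclass
class Memcg:
    """Hypothetical stand-in for a memory cgroup; tracks charged bytes."""
    name: str
    charged: int = 0


@dataclass
class Task:
    """Hypothetical task; each resource remembers the memcg it was charged to."""
    name: str
    memcg: Memcg
    resources: list = field(default_factory=list)

    def alloc(self, size):
        # Charge the memcg the task belongs to at allocation time.
        self.memcg.charged += size
        self.resources.append((size, self.memcg))

    def exit(self):
        # Uncharge the memcg recorded at charge time, irrespective of
        # the cgroup the task lives in now.
        for size, memcg in self.resources:
            memcg.charged -= size
        self.resources.clear()


def fork_into(parent, target, fork_cost):
    """Model a fork with a target cgroup (assumed ordering, see lead-in)."""
    child = Task(parent.name + "-child", parent.memcg)
    child.alloc(fork_cost)   # e.g. page tables, charged to the parent's memcg
    child.memcg = target     # attachment happens only after those charges
    return child
```

In this model, after fork_into(parent, child_cg, fork_cost=4096) followed by child.alloc(8192), the parent's memcg holds 4096 bytes bound to the child's lifetime while the child's memcg holds 8192. Both drop back to zero only once the child exits and its resources are freed.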