Date: Thu, 20 Jul 2023 17:02:22 -0700
From: Roman Gushchin
To: Yosry Ahmed
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Shakeel Butt,
    Muchun Song, "Matthew Wilcox (Oracle)", Tejun Heo, Zefan Li,
    Yu Zhao, Luis Chamberlain, Kees Cook, Iurii Zaikin,
    "T.J. Mercier", Greg Thelen, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, cgroups@vger.kernel.org
Subject: Re: [RFC PATCH 0/8] memory recharging for offline memcgs
In-Reply-To: <20230720070825.992023-1-yosryahmed@google.com>

On Thu, Jul 20, 2023 at 07:08:17AM +0000, Yosry Ahmed wrote:
> This patch series implements the proposal in LSF/MM/BPF 2023 conference
> for reducing offline/zombie memcgs by memory recharging [1]. The main
> difference is that this series focuses on recharging and does not
> include eviction of any memory charged to offline memcgs.
>
> Two methods of recharging are proposed:
>
> (a) Recharging of mapped folios.
>
> When a memcg is offlined, queue an asynchronous worker that will walk
> the lruvec of the offline memcg and try to recharge any mapped folios to
> the memcg of one of the processes mapping the folio. The main assumption
> is that a process mapping the folio is the "rightful" owner of the
> memory.
>
> Currently, this is only supported for evictable folios, as the
> unevictable lru is imaginary and we cannot iterate the folios on it. A
> separate proposal [2] was made to revive the unevictable lru, which
> would allow recharging of unevictable folios.
>
> (b) Deferred recharging of folios.
>
> For folios that are unmapped, or mapped but we fail to recharge them
> with (a), we rely on deferred recharging. Simply put, any time a folio
> is accessed or dirtied by a userspace process, and that folio is charged
> to an offline memcg, we will try to recharge it to the memcg of the
> process accessing the folio. Again, we assume this process should be the
> "rightful" owner of the memory. This is also done asynchronously to avoid
> slowing down the data access path.

Unfortunately I have to agree with Johannes, Tejun and others who are
not big fans of this approach.

Lazy recharging leads to an interesting phenomenon: the memory usage of
a running workload may suddenly go up only because some other workload
has terminated and its memory is now being recharged. I find it
confusing. It also makes it hard to set up limits and/or guarantees.

In general, I don't think we can handle shared memory well without
getting rid of the "whoever allocates a page, pays the full price"
policy and making shared ownership a fully supported concept. Of
course, it's a huge amount of work, and I believe the only way we can
achieve it is to compromise on the granularity of the accounting.
Whether the resulting system will be better in real life is hard to say
in advance.

Thanks!