Date: Thu, 18 Feb 2021 00:56:02 -0500
From: Johannes Weiner
To: Tim Chen
Cc: Andrew Morton, Michal Hocko, Vladimir Davydov, Dave Hansen,
	Ying Huang, linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

On Wed, Feb 17, 2021 at 12:41:36PM -0800, Tim Chen wrote:
> On a per-node basis, the mem cgroup soft limit tree on each node tracks
> how far a cgroup has exceeded its soft memory limit and sorts cgroups
> by their excess usage. On page release, the trees are not updated right
> away; updates are deferred until a batch of pages belonging to the same
> cgroup has been gathered. This reduces how often we update and lock the
> soft limit tree and the associated cgroup.
>
> However, the batch of pages could contain pages from multiple nodes,
> while only the soft limit tree of one node would get updated. Change
> the logic so that we update the tree in batches of pages that all
> belong to the same mem cgroup and memory node. An update is issued for
> the batch of pages of a node collected so far whenever we encounter a
> page belonging to a different node. Note that this same-node batching
> is only relevant for a v1 cgroup that has a memory soft limit.
>
> Reviewed-by: Ying Huang
> Signed-off-by: Tim Chen
> ---
>  mm/memcontrol.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d72449eeb85a..8bddee75f5cb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6804,6 +6804,7 @@ struct uncharge_gather {
>  	unsigned long pgpgout;
>  	unsigned long nr_kmem;
>  	struct page *dummy_page;
> +	int nid;
>  };
>  
>  static inline void uncharge_gather_clear(struct uncharge_gather *ug)
> @@ -6849,7 +6850,13 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  	 * exclusive access to the page.
>  	 */
>  
> -	if (ug->memcg != page_memcg(page)) {
> +	if (ug->memcg != page_memcg(page) ||
> +	    /*
> +	     * Update soft limit tree used in v1 cgroup in page batch for
> +	     * the same node. Relevant only to v1 cgroup with a soft limit.
> +	     */
> +	    (ug->dummy_page && ug->nid != page_to_nid(page) &&
> +	     ug->memcg->soft_limit != PAGE_COUNTER_MAX)) {

Sorry, I used weird phrasing in my last email. Can you please preface
the checks you're adding with a !cgroup_subsys_on_dfl(memory_cgrp_subsys)
check, so that the static branch confines them to cgroup1? The uncharge
path is pretty hot, and this would avoid the runtime overhead on cgroup2
at least, which doesn't have the soft limit.

Also, do we need the ug->dummy_page check? It's only NULL on the first
loop iteration, where ug->memcg is NULL as well, so the branch is taken
anyway.

The soft limit check is also slightly cheaper than the nid check, as
page_to_nid() might be out-of-line, so we should do it first. This?
	/*
	 * Batch-uncharge all pages of the same memcg.
	 *
	 * Unless we're looking at a cgroup1 with a softlimit
	 * set: the soft limit trees are maintained per-node
	 * and updated on uncharge (via dummy_page), so keep
	 * batches confined to a single node as well.
	 */
	if (ug->memcg != page_memcg(page) ||
	    (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
	     ug->memcg->soft_limit != PAGE_COUNTER_MAX &&
	     ug->nid != page_to_nid(page)))
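
For that to work, ug->nid of course has to track the node of the batch.
I assume the insertion not shown in the quoted hunks does that next to
the existing dummy_page bookkeeping at the end of uncharge_page() -
roughly like this (a sketch, not copied from the patch):

	/* remember the page that represents this batch, and its node */
	ug->dummy_page = page;
	ug->nid = page_to_nid(page);	/* assumed: part of the unquoted diff */

A freshly cleared ug (memset to zero) then still works out: ug->memcg
is NULL, so the first comparison opens a new batch and sets ug->nid
before the nid comparison is ever relied upon - same reasoning as for
dropping the dummy_page check above.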