Received: by 2002:a25:868d:0:0:0:0:0 with SMTP id z13csp1980904ybk; Mon, 11 May 2020 08:58:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzvXXhYa8U9ZgnStgEu8L87PskVOD/5N/ZYC3H6CNq7j4WBSjqj4KOt6lR4ScF2fSCGQh94 X-Received: by 2002:aa7:d787:: with SMTP id s7mr2257178edq.104.1589212730108; Mon, 11 May 2020 08:58:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589212730; cv=none; d=google.com; s=arc-20160816; b=xPN+3/lAWgDnS3A4oytyZIiqbtcnHq4J+Za7BebA2LSJa44CcP7FLiohQFGsk5iDrF VbuKahyaqBw6WR3m+B/ruen6UfJ+C+utt/BkKCkR+l0J8wfL7U23/PYMbSo40CMy5O3C MB1vVuhpYfaRB6tiCtfuP4f4Uk4ciJFlv+reSjuE3xGiX1yC4GOgYsx3CojG2paqGJCt HvLx0qe2ZteZtspYCCZLlaAufxxuJcdcn8wNzsb/IcD1ttuSbP07q47PJUOroEglfyRb J9ttVvNdUGM+IqFbA23wUDbLLtzclFZB46AJKem/WKbpskSbzgNBNxb1EFr3ReGLarZQ yj4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=0mkmAQYTPxUwTzaufmQ28FUE0wAOW0XuP/PajjBfBuI=; b=QDI8TWFSKpqgenLVWd1G4sTr/dMKuP3xww/EE2H9BPGFfXXL5dr5JxtkSq/18+ranR 8+hzYjII7KvmBWHe29/UERsaezlf3b/+MCaO0vJMgTMlv50UcwbWdLtnJbEp4xQfqKHf sDoz4nPgJWoPgbIzkkRSOy79uv2/cY9MEQggDOVD+K7Y2pDP/XvmHGI4fvRe9GA+RTZ1 L9zVCPvENdShgFRZ9DnOt0l7MqP98zWuT3i3A5eKiRmwyQRIxSzMSLDXUk/xWFQGGahh M8IyVNkLIPoGgDE+ujODUCNvP3E3Besu5BghgtZFfIdBu7XwB6o15WiORaoDugFrYpX9 IVwQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cmpxchg-org.20150623.gappssmtp.com header.s=20150623 header.b=oj6nl8Et; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j23si5926050ejm.361.2020.05.11.08.58.25; Mon, 11 May 2020 08:58:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@cmpxchg-org.20150623.gappssmtp.com header.s=20150623 header.b=oj6nl8Et; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730365AbgEKP5H (ORCPT + 99 others); Mon, 11 May 2020 11:57:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1726687AbgEKP5G (ORCPT ); Mon, 11 May 2020 11:57:06 -0400 Received: from mail-qt1-x843.google.com (mail-qt1-x843.google.com [IPv6:2607:f8b0:4864:20::843]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA838C061A0C for ; Mon, 11 May 2020 08:57:06 -0700 (PDT) Received: by mail-qt1-x843.google.com with SMTP id z90so8317270qtd.10 for ; Mon, 11 May 2020 08:57:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=0mkmAQYTPxUwTzaufmQ28FUE0wAOW0XuP/PajjBfBuI=; b=oj6nl8Etq+Cp7D5OexjUrLRZfJ6Boh1fr3GR6ETsCeQgMNybVUKDK0SlyrXGit3FGs Kk4Yshjf2ZNzTx3DiZrm/f03A9DD5twUHxDzBbOQtkSX+qhqGRELd8hPWbxYR6XcP7c3 AqZYcLK6CdIjDSO+xG1LoDdPVgEkch+Eb627llInyJOkzFEe2tTFHu4jBztuP60v8bx+ EY2XURnrX3zUqwtOy1Iu5TwM6uN1XpzGxwBv+RXpvyYam2nJ8ocqSqgbtTUnSltdPVTU gEXP2FC/T77cVQ2cXgLbdF4X+1qS94vQvbOEXrkDdjbjGNU59A7GMlhFsW8TmO0UPNM1 4WDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=0mkmAQYTPxUwTzaufmQ28FUE0wAOW0XuP/PajjBfBuI=; b=ualy2RCZn09kEjxhk/k6h/INWexuw+te5/PEtFuGkk6R/PJAe0u5f2uaTOqIAI3tPf rS2tTLY6j+zrZOCbA/bILIiTHsAiwTINWGqfSeVW57HvXixDd+xaGVYuI4vw87WlGBPV rWFZyQi8JjAoYAqAKqhBvtLE159btoKD3QyWwDiRhGbycgzFjaZ+9y8IFV1i/MhfvEao fHoIz7lxu3U1X4H7uVEyu3UUjpUziecfh/B6COBqx3WVVbZCovJWQEzvx7PHy5LCAY8L U8LYDaTu8PoiFPXUkwWgEBRdiVaps8pDTGVPFpm/pQN9SXuEw7gyFBtprzhCnjUADE5i hmkQ== X-Gm-Message-State: AGi0PubxdHgCR+9Ns0U/dM3JwJX5hlI5SUgqzrgDR38+atY+577prB0w M7N638YpNsDb6f5nUb1hTFLVTQ== X-Received: by 2002:ac8:4686:: with SMTP id g6mr17447463qto.144.1589212625970; Mon, 11 May 2020 08:57:05 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::1:2627]) by smtp.gmail.com with ESMTPSA id q17sm8637971qkq.111.2020.05.11.08.57.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 May 2020 08:57:05 -0700 (PDT) Date: Mon, 11 May 2020 11:56:46 -0400 From: Johannes Weiner To: Shakeel Butt Cc: Michal Hocko , Roman Gushchin , Greg Thelen , Andrew Morton , Linux MM , Cgroups , LKML Subject: Re: [PATCH] memcg: effective memory.high reclaim for remote charging Message-ID: <20200511155646.GB306292@cmpxchg.org> References: <20200507163301.229070-1-shakeelb@google.com> <20200507164653.GM6345@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, May 07, 2020 at 10:00:07AM -0700, Shakeel Butt wrote: > On Thu, May 7, 2020 at 9:47 AM Michal Hocko wrote: > > > > On Thu 07-05-20 09:33:01, Shakeel Butt wrote: > > [...] > > > @@ -2600,8 +2596,23 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, > > > schedule_work(&memcg->high_work); > > > break; > > > } > > > - current->memcg_nr_pages_over_high += batch; > > > - set_notify_resume(current); > > > + > > > + if (gfpflags_allow_blocking(gfp_mask)) > > > + reclaim_over_high(memcg, gfp_mask, batch); > > > + > > > + if (page_counter_read(&memcg->memory) <= > > > + READ_ONCE(memcg->high)) > > > + break; > > > > I am half way to a long weekend so bear with me. Shouldn't this be continue? The > > parent memcg might be still in excess even the child got reclaimed, > > right? > > > > The reclaim_high() actually already does this walk up to the root and > reclaim from ones who are still over their high limit. Though having > 'continue' here is correct too. If reclaim was weak and failed to bring the child back in line, we still do set_notify_resume(). We should do that for ancestors too. But it seems we keep adding hierarchy walks and it's getting somewhat convoluted: page_counter does it, then we check high overage recursively, and now we add the call to reclaim which itself is a walk up the ancestor line. Can we hitchhike on the page_counter_try_charge() walk, which already has the concept of identifying counters with overage? Rename the @fail to @limited and return the first counter that is in excess of its high as well, even when the function succeeds? Then we could ditch the entire high checking loop here and simply replace it with done_restock: ... if (*limited) { if (gfpflags_allow_blocking()) reclaim_over_high(memcg_from_counter(limited)); /* Reclaim may not be able to do much, ... */ set_notify_resume(); // or schedule_work() }; In the long-term, the best thing might be to integrate memory.high reclaim with the regular reclaim that try_charge() is already doing. Especially the part where it retries several times - we currently give up on memory.high unnecessarily early. Make page_counter_try_charge() fail on high and max equally, and after several reclaim cycles, instead of invoking the OOM killer, inject the penalty sleep and force the charges. OOM killing and throttling is supposed to be the only difference between the two, anyway, and yet the code diverges far more than that for no apparent reason. But I also appreciate that this is a cleanup beyond the scope of this patch here, so it's up to you how far you want to take it.