Date: Tue, 28 Apr 2020 19:26:47 +0100
From: Chris Down
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Yafang Shao,
    linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] mm, memcg: Avoid stale protection values when cgroup is above protection

From: Yafang Shao

A cgroup can have both memory protection and a memory limit to isolate
it from its siblings in both directions - for example, to prevent it
from being shrunk below 2G under high pressure from outside, but also
from growing beyond 4G under low pressure.

Commit 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
implemented proportional scan pressure so that multiple siblings in
excess of their protection settings don't get reclaimed equally but
instead in accordance with their unprotected portion.
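
As a rough illustration of the proportionality described above, the following userspace sketch scales a scan target by the unprotected share of a cgroup's usage. It is a simplified model, not the kernel's code: the function name `scan_target()`, the `MIN_SCAN` floor, and the page counts are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical minimum scan target so that reclaim always makes progress. */
#define MIN_SCAN 32UL

/*
 * Simplified model of proportional scan pressure: the more of the
 * cgroup's usage is covered by protection, the fewer pages are scanned.
 * All quantities are in pages.
 */
unsigned long scan_target(unsigned long lru_size, unsigned long usage,
			  unsigned long protection)
{
	unsigned long scan;

	/* No protection: apply full pressure. */
	if (!protection)
		return lru_size;

	/* The multiplication below must not overflow in this sketch. */
	assert(lru_size <= ULONG_MAX / protection);

	/* Never divide by less than the protection itself. */
	if (usage < protection)
		usage = protection;

	scan = lru_size - lru_size * protection / usage;
	return scan > MIN_SCAN ? scan : MIN_SCAN;
}
```

With 1000 pages on the LRU, a usage of 4096 pages, and 2048 pages protected, this model scans 500 pages. With a protection value of 0 it scans all 1000.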

During limit reclaim, this proportionality shouldn't apply of course:
there is no competition, all pressure is from within the cgroup and
should be applied as such. Reclaim should operate at full efficiency.

However, mem_cgroup_protected() never expected anybody to look at the
effective protection values when it indicated that the cgroup is above
its protection. As a result, a query during limit reclaim may return
stale protection values that were calculated by a previous reclaim cycle
in which the cgroup did have siblings.

When this happens, reclaim is unnecessarily hesitant and potentially
slow to meet the desired limit. In theory this could lead to premature
OOM kills, although it's not obvious this has occurred in practice.

Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Yafang Shao
Signed-off-by: Chris Down
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Roman Gushchin
[hannes@cmpxchg.org: rework code comment]
[hannes@cmpxchg.org: changelog]
[chris@chrisdown.name: fix store tear]
[chris@chrisdown.name: retitle]
---
 mm/memcontrol.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0be00826b832..b0374be44e9e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6392,8 +6392,19 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 
 	if (!root)
 		root = root_mem_cgroup;
-	if (memcg == root)
+	if (memcg == root) {
+		/*
+		 * The cgroup is the reclaim root in this reclaim
+		 * cycle, and therefore not protected. But it may have
+		 * stale effective protection values from previous
+		 * cycles in which it was not the reclaim root - for
+		 * example, global reclaim followed by limit reclaim.
+		 * Reset these values for mem_cgroup_protection().
+		 */
+		WRITE_ONCE(memcg->memory.emin, 0);
+		WRITE_ONCE(memcg->memory.elow, 0);
 		return MEMCG_PROT_NONE;
+	}
 
 	usage = page_counter_read(&memcg->memory);
 	if (!usage)
-- 
2.26.2
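
The stale-value problem and the reset can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: `struct cg`, `cg_protected()`, and the `reset_on_root` switch are hypothetical names, and the effective protection is reduced to min(low, usage) rather than being derived from the cgroup hierarchy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for a memory cgroup. */
struct cg {
	unsigned long usage;	/* current usage, in pages */
	unsigned long low;	/* configured protection */
	unsigned long elow;	/* cached effective protection */
};

enum prot { PROT_NONE, PROT_LOW };

/*
 * Model of the protection check. When the cgroup is itself the reclaim
 * root it is not protected; reset_on_root selects whether the cached
 * value is also cleared (the behaviour the patch adds) or left as-is.
 */
enum prot cg_protected(const struct cg *root, struct cg *cg,
		       int reset_on_root)
{
	assert(root != NULL && cg != NULL);

	if (cg == root) {
		if (reset_on_root)
			cg->elow = 0;
		return PROT_NONE;
	}

	/* Simplification: effective protection is min(low, usage). */
	cg->elow = cg->low < cg->usage ? cg->low : cg->usage;
	return cg->usage <= cg->elow ? PROT_LOW : PROT_NONE;
}
```

In this model, a reclaim cycle rooted at an outer cgroup caches a non-zero `elow` in the child. A later cycle rooted at the child itself leaves that cached value in place unless `reset_on_root` is set, which is the situation the patch addresses.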