From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Roman Gushchin, Yafang Shao,
 Johannes Weiner, Michal Hocko, Andrew Morton, Linus Torvalds
Subject: [PATCH 5.2 122/143] mm, memcg: partially revert "mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones"
Date: Wed, 4 Sep 2019 19:54:25 +0200
Message-Id:
 <20190904175319.173967594@linuxfoundation.org>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190904175314.206239922@linuxfoundation.org>
References: <20190904175314.206239922@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Roman Gushchin

commit b4c46484dc3fa3721d68fdfae85c1d7b1f6b5472 upstream.

Commit 766a4c19d880 ("mm/memcontrol.c: keep local VM counters in sync
with the hierarchical ones") effectively decreased the precision of the
per-memcg vmstats_local and per-memcg-per-node lruvec percpu counters.

That's good for displaying in memory.stat, but it brings a serious
regression into the reclaim process.

One issue I've discovered and debugged is the following:
lruvec_lru_size() can return 0 instead of the actual number of pages in
the lru list, preventing the kernel from reclaiming the last remaining
pages.  The result is yet another flood of dying memory cgroups.  The
opposite also happens: scanning an empty lru list is a waste of CPU
time.

Also, inactive_list_is_low() can return incorrect values, preventing
the active lru from being scanned and freed.  It can fail both because
the sizes of the active and inactive lists are inaccurate, and because
the number of workingset refaults isn't precise.  In other words, the
result is pretty random.

I'm not sure if using the approximate number of slab pages in
count_shadow_nodes() is acceptable, but the issues described above are
enough to partially revert the patch.

Let's keep the per-memcg vmstat_local counters batched (they are only
used for displaying stats to userspace), but keep the lruvec stats
precise.  This change fixes the dead memcg flooding on my setup.
Link: http://lkml.kernel.org/r/20190817004726.2530670-1-guro@fb.com
Fixes: 766a4c19d880 ("mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones")
Signed-off-by: Roman Gushchin
Acked-by: Yafang Shao
Cc: Johannes Weiner
Cc: Michal Hocko
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

---
 mm/memcontrol.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -748,15 +748,13 @@ void __mod_lruvec_state(struct lruvec *l
 	/* Update memcg */
 	__mod_memcg_state(memcg, idx, val);
 
+	/* Update lruvec */
+	__this_cpu_add(pn->lruvec_stat_local->count[idx], val);
+
 	x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
 	if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
 		struct mem_cgroup_per_node *pi;
 
-		/*
-		 * Batch local counters to keep them in sync with
-		 * the hierarchical ones.
-		 */
-		__this_cpu_add(pn->lruvec_stat_local->count[idx], x);
 		for (pi = pn; pi; pi = parent_nodeinfo(pi, pgdat->node_id))
 			atomic_long_add(x, &pi->lruvec_stat[idx]);
 		x = 0;