Date: Thu, 5 Apr 2018 15:17:51 -0700
From: Andrew Morton
To: Andrey Ryabinin
Cc: Mel Gorman, Tejun Heo, Johannes Weiner, Michal Hocko, Shakeel Butt, Steven Rostedt, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v2 3/4] mm/vmscan: Don't change pgdat state on base of a single LRU list state.
Message-Id: <20180405151751.c07ee14496f9d5b691b49c64@linux-foundation.org>
In-Reply-To: <20180323152029.11084-4-aryabinin@virtuozzo.com>
References: <20180323152029.11084-1-aryabinin@virtuozzo.com> <20180323152029.11084-4-aryabinin@virtuozzo.com>

On Fri, 23 Mar 2018 18:20:28 +0300 Andrey Ryabinin wrote:

> We have a separate LRU list for each memory cgroup. Memory reclaim
> iterates over the cgroups and calls shrink_inactive_list() on every
> inactive LRU list. Based on the state of a single LRU,
> shrink_inactive_list() may flag the whole node as dirty, congested, or
> under writeback. This is obviously wrong and hurtful. It's especially
> hurtful when the system contains even one small congested cgroup: then
> *all* direct reclaims waste time sleeping in wait_iff_congested(). And
> the more memcgs the system has, the longer the memory allocation stall
> is, because wait_iff_congested() is called on each LRU-list scan.
>
> Sum the reclaim stats across all visited LRUs on the node, and flag the
> node as dirty, congested, or under writeback based on that sum. Also
> call congestion_wait()/wait_iff_congested() once per pgdat scan instead
> of once per LRU-list scan.
>
> This only fixes the problem for the global reclaim case. Per-cgroup
> reclaim may alter global pgdat flags too, which is wrong, but that is a
> separate issue and will be addressed in the next patch.
>
> This change will have no effect on systems where all workload is
> concentrated in a single cgroup.

Could we please get this reviewed?
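The aggregation the changelog describes can be sketched in userspace C. This is a simplified, hypothetical model (the struct, field, and function names below are illustrative, not the kernel's actual ones, and the majority threshold is chosen only for demonstration): per-LRU scan results are summed into node-wide totals, and the congestion decision is made once from the totals rather than from whichever single LRU was scanned last.

```c
#include <stdbool.h>

/* Hypothetical node-wide totals, accumulated across every cgroup LRU
 * visited during one pgdat scan. */
struct reclaim_totals {
    unsigned long nr_taken;      /* pages isolated across all LRUs     */
    unsigned long nr_dirty;      /* dirty pages seen across all LRUs   */
    unsigned long nr_congested;  /* pages on congested backing devices */
    unsigned long nr_writeback;  /* pages found under writeback        */
};

/* Hypothetical per-LRU result, as one shrink of one inactive list
 * might report it. */
struct lru_scan_result {
    unsigned long nr_taken;
    unsigned long nr_dirty;
    unsigned long nr_congested;
    unsigned long nr_writeback;
};

/* Accumulate one LRU's scan result into the node-wide totals instead of
 * acting on it immediately. */
static void account_lru_scan(struct reclaim_totals *t,
                             const struct lru_scan_result *r)
{
    t->nr_taken     += r->nr_taken;
    t->nr_dirty     += r->nr_dirty;
    t->nr_congested += r->nr_congested;
    t->nr_writeback += r->nr_writeback;
}

/* Decide once per pgdat scan: treat the node as congested only if a
 * majority of everything isolated on the node was congested (threshold
 * is illustrative). A tiny congested cgroup no longer flags the node. */
static bool pgdat_is_congested(const struct reclaim_totals *t)
{
    return t->nr_taken && t->nr_congested > t->nr_taken / 2;
}
```

With this shape, a small congested cgroup contributes a few congested pages to totals dominated by clean LRUs, so the once-per-scan decision stays false, whereas the old per-LRU decision would have flagged the node the moment that one list was scanned.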