From: Zhaoyang Huang
Date: Tue, 31 Jul 2018 19:58:20 +0800
Subject: Re: [PATCH v2] mm: terminate the reclaim early when direct reclaiming
To: Michal Hocko
Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
    "open list:MEMORY MANAGEMENT", LKML, kernel-patch-test@lists.linaro.org
References: <1533035368-30911-1-git-send-email-zhaoyang.huang@spreadtrum.com>
    <20180731111924.GI4557@dhcp22.suse.cz>
In-Reply-To: <20180731111924.GI4557@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 31, 2018 at 7:19 PM Michal Hocko wrote:
>
> On Tue 31-07-18 19:09:28, Zhaoyang Huang wrote:
> > This patch try to let the direct reclaim finish earlier than it used
> > to be. The problem comes from We observing that the direct reclaim
> > took a long time to finish when memcg is enabled. By debugging, we
> > find that the reason is the softlimit is too low to meet the loop
> > end criteria. So we add two barriers to judge if it has reclaimed
> > enough memory as same criteria as it is in shrink_lruvec:
> > 1. for each memcg softlimit reclaim.
> > 2. before starting the global reclaim in shrink_zone.
>
> Then I would really recommend to not use soft limit at all. It has
> always been aggressive. I have propose to make it less so in the past we
> have decided to go that way because we simply do not know whether
> somebody depends on that behavior. Your changelog doesn't really tell
> the whole story. Why is this a problem all of the sudden? Nothing has
> really changed recently AFAICT. Cgroup v1 interface is mostly for
> backward compatibility, we have much better ways to accomplish
> workloads isolation in cgroup v2.
>
> So why does it matter all of the sudden?
>
> Besides that EXPORT_SYMBOL for such a low level functionality as the
> memory reclaim is a big no-no.
>
> So without a much better explanation and with a low level symbol
> exported NAK from me.
>

My test workload is from an Android system, where the multimedia apps
require many pages. We observed that one thread of the process was
trapped in mem_cgroup_soft_limit_reclaim within direct reclaim and also
blocked other threads in mmap or do_page_fault (by a semaphore?).
Furthermore, we also observed other long-running direct reclaims related
to the soft limit, which are supposed to cause page thrashing, as the
allocator itself is the rightmost node of the rb_tree. Besides, even
without the soft limit, shouldn't direct reclaim check the watermark
first, before shrink_node, since the concurrent kswapd may already have
reclaimed enough pages for the allocation?
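To make the last question concrete, here is a minimal user-space sketch of
"check the watermark before each reclaim pass". It is not kernel code and
not the patch below: struct toy_zone, toy_watermark_ok(),
toy_reclaim_batch() and toy_direct_reclaim() are hypothetical names
invented for illustration, and the watermark test is a simplification of
what the real allocator does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a zone; NOT the kernel's struct zone. */
struct toy_zone {
	unsigned long free_pages;	/* may be raised concurrently, e.g. by kswapd */
	unsigned long min_wmark;	/* level an allocation must not drop below */
};

/* True if a 2^order page allocation would still leave the zone at or above its watermark. */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned int order)
{
	unsigned long need;

	assert(order < 8 * sizeof(unsigned long));
	need = 1UL << order;
	return z->free_pages >= need && z->free_pages - need >= z->min_wmark;
}

/* Pretend to reclaim one batch of pages; returns the number freed. */
static unsigned long toy_reclaim_batch(struct toy_zone *z, unsigned long batch)
{
	z->free_pages += batch;
	return batch;
}

/*
 * Direct-reclaim loop that re-checks the watermark before every pass and
 * stops as soon as it is met, so the caller does no work if someone else
 * has already freed enough. Returns the pages reclaimed by this caller.
 */
static unsigned long toy_direct_reclaim(struct toy_zone *z, unsigned int order,
					unsigned long batch, unsigned int max_loops)
{
	unsigned long reclaimed = 0;
	unsigned int i;

	for (i = 0; i < max_loops; i++) {
		if (toy_watermark_ok(z, order))
			break;
		reclaimed += toy_reclaim_batch(z, batch);
	}
	return reclaimed;
}
```

With free_pages already above the watermark the loop returns 0 without
reclaiming anything, which is the behaviour being asked about.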
> >
> > Signed-off-by: Zhaoyang Huang
> > ---
> >  include/linux/memcontrol.h |  3 ++-
> >  mm/memcontrol.c            |  3 +++
> >  mm/vmscan.c                | 38 +++++++++++++++++++++++++++++++++++++-
> >  3 files changed, 42 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 6c6fb11..a7e82c7 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -325,7 +325,8 @@ void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
> >  void mem_cgroup_uncharge_list(struct list_head *page_list);
> >
> >  void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
> > -
> > +bool direct_reclaim_reach_watermark(pg_data_t *pgdat, unsigned long nr_reclaimed,
> > +			unsigned long nr_scanned, gfp_t gfp_mask, int order);
> >  static struct mem_cgroup_per_node *
> >  mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
> >  {
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 8c0280b..e4efd46 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2577,6 +2577,9 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
> >  			(next_mz == NULL ||
> >  			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
> >  			break;
> > +		if (direct_reclaim_reach_watermark(pgdat, nr_reclaimed,
> > +					*total_scanned, gfp_mask, order))
> > +			break;
> >  	} while (!nr_reclaimed);
> >  	if (next_mz)
> >  		css_put(&next_mz->memcg->css);
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 03822f8..19503f3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2518,6 +2518,34 @@ static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
> >  		(memcg && memcg_congested(pgdat, memcg));
> >  }
> >
> > +bool direct_reclaim_reach_watermark(pg_data_t *pgdat, unsigned long nr_reclaimed,
> > +					unsigned long nr_scanned, gfp_t gfp_mask,
> > +					int order)
> > +{
> > +	struct scan_control sc = {
> > +		.gfp_mask = gfp_mask,
> > +		.order = order,
> > +		.priority = DEF_PRIORITY,
> > +		.nr_reclaimed = nr_reclaimed,
> > +		.nr_scanned = nr_scanned,
> > +	};
> > +	if (!current_is_kswapd())
> > +		return false;
> > +	if (!IS_ENABLED(CONFIG_COMPACTION))
> > +		return false;
> > +	/*
> > +	 * In fact, we add 1 to nr_reclaimed and nr_scanned to let should_continue_reclaim
> > +	 * NOT return by finding they are zero, which means compaction_suitable()
> > +	 * takes effect here to judge if we have reclaimed enough pages for passing
> > +	 * the watermark and no necessary to check other memcg anymore.
> > +	 */
> > +	if (!should_continue_reclaim(pgdat,
> > +		sc.nr_reclaimed + 1, sc.nr_scanned + 1, &sc))
> > +		return true;
> > +	return false;
> > +}
> > +EXPORT_SYMBOL(direct_reclaim_reach_watermark);
> > +
> >  static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
> >  {
> >  	struct reclaim_state *reclaim_state = current->reclaim_state;
> > @@ -2802,7 +2830,15 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> >  			sc->nr_scanned += nr_soft_scanned;
> >  			/* need some check for avoid more shrink_zone() */
> >  		}
> > -
> > +		/*
> > +		 * we maybe have stolen enough pages from soft limit reclaim, so we return
> > +		 * back if we are direct reclaim
> > +		 */
> > +		if (direct_reclaim_reach_watermark(zone->zone_pgdat, sc->nr_reclaimed,
> > +					sc->nr_scanned, sc->gfp_mask, sc->order)) {
> > +			sc->gfp_mask = orig_mask;
> > +			return;
> > +		}
> >  		/* See comment about same check for global reclaim above */
> >  		if (zone->zone_pgdat == last_pgdat)
> >  			continue;
> > --
> > 1.9.1
>
> --
> Michal Hocko
> SUSE Labs