Date: Tue, 31 Jul 2018 14:06:07 +0200
From: Michal Hocko
To: Zhaoyang Huang
Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
    "open list:MEMORY MANAGEMENT", LKML, kernel-patch-test@lists.linaro.org
Subject: Re: [PATCH v2] mm: terminate the reclaim early when direct reclaiming
Message-ID: <20180731120607.GK4557@dhcp22.suse.cz>
References: <1533035368-30911-1-git-send-email-zhaoyang.huang@spreadtrum.com>
 <20180731111924.GI4557@dhcp22.suse.cz>

On Tue 31-07-18 19:58:20, Zhaoyang Huang wrote:
> On Tue, Jul 31, 2018 at 7:19 PM Michal Hocko wrote:
> >
> > On Tue 31-07-18 19:09:28, Zhaoyang Huang wrote:
> > > This patch tries to let direct reclaim finish earlier than it
> > > currently does. The problem comes from our observation that direct
> > > reclaim takes a long time to finish when memcg is enabled. By
> > > debugging, we found that the reason is that the soft limit is too
> > > low for the loop-end criteria to be met. So we add two checks,
> > > using the same criteria as shrink_lruvec, to judge whether enough
> > > memory has been reclaimed:
> > > 1. for each memcg soft-limit reclaim.
> > > 2. before starting the global reclaim in shrink_zone.
> >
> > Then I would really recommend not to use the soft limit at all. It
> > has always been aggressive. I proposed making it less so in the
> > past, but we decided to keep the existing behavior because we simply
> > do not know whether somebody depends on it. Your changelog doesn't
> > really tell the whole story. Why is this a problem all of a sudden?
> > Nothing has really changed recently AFAICT. The cgroup v1 interface
> > is mostly for backward compatibility; we have much better ways to
> > accomplish workload isolation in cgroup v2.
> >
> > So why does it matter all of a sudden?
> >
> > Besides that, EXPORT_SYMBOL for such low-level functionality as
> > memory reclaim is a big no-no.
> >
> > So without a much better explanation, and with a low-level symbol
> > exported, it is a NAK from me.
>
> My test workload is from an Android system, where multimedia apps
> require many pages. We observed that one thread of the process got
> trapped in mem_cgroup_soft_limit_reclaim within direct reclaim and
> also blocked other threads in mmap or do_page_fault (by a semaphore?).

This requires a much more specific analysis.

> Furthermore, we also observed other long direct reclaims related to
> the soft limit, which presumably cause page thrashing, as the
> allocating task's memcg is the rightmost node of the rb_tree.

I do not follow.

> Besides, even without the soft_limit, should direct reclaim check the
> watermark first, before shrink_node? Concurrent kswapd may already
> have reclaimed enough pages for the allocation.

Yes, but direct reclaim is also a way to throttle allocation requests,
and we want them to do at least some work. Making shortcuts here can
easily backfire and allow somebody to run away too quickly. Not that
this wouldn't be possible right now, but adding more heuristics is
surely tricky and far from trivial.
-- 
Michal Hocko
SUSE Labs
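For readers following the thread: the loop-end criterion both sides refer
to is the comparison of sc->nr_reclaimed against sc->nr_to_reclaim that
shrink_node() applies while iterating memcgs. A minimal sketch of such a
check, assuming the struct scan_control of v4.18-era mm/vmscan.c (the
helper name is hypothetical; this is not the submitted patch):

/*
 * Minimal sketch of the early-termination check discussed above,
 * modeled on the criterion shrink_node() already uses per memcg.
 * struct scan_control and its nr_reclaimed/nr_to_reclaim fields are
 * from v4.18-era mm/vmscan.c; the helper itself is hypothetical.
 */
static bool reclaim_goal_met(struct scan_control *sc)
{
	/*
	 * Direct reclaim was asked for sc->nr_to_reclaim pages. Once
	 * that many have been reclaimed, another round of soft-limit
	 * reclaim is wasted work for this particular allocation.
	 */
	return sc->nr_reclaimed >= sc->nr_to_reclaim;
}

As described in the changelog, the patch would apply this test once per
memcg during soft-limit reclaim and once before starting the global
reclaim in shrink_zone.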
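Huang's last suggestion, rechecking the watermarks before shrink_node in
case concurrent kswapd has already freed enough memory, corresponds
roughly to the sketch below. zone_watermark_ok(), min_wmark_pages() and
the zonelist iterator are real kernel APIs of that era; the wrapper
function is a hypothetical illustration, not proposed kernel code.

/*
 * Hypothetical sketch of the watermark recheck suggested above: before
 * doing more direct-reclaim work, see whether kswapd has already
 * brought an eligible zone back above its min watermark.
 */
static bool kswapd_already_freed_enough(struct zonelist *zonelist,
					int order, gfp_t gfp_mask,
					nodemask_t *nodemask)
{
	enum zone_type high_zoneidx = gfp_zone(gfp_mask);
	struct zoneref *z;
	struct zone *zone;

	for_each_zone_zonelist_nodemask(zone, z, zonelist, high_zoneidx,
					nodemask) {
		if (zone_watermark_ok(zone, order, min_wmark_pages(zone),
				      high_zoneidx, 0))
			return true;	/* allocation could now succeed */
	}
	return false;
}

Michal's objection applies exactly here: skipping reclaim whenever such
a check returns true removes the throttling effect of direct reclaim,
letting an aggressive allocator run away without paying any reclaim cost.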