Date: Fri, 27 Jul 2018 15:58:48 -0400
From: Johannes Weiner
To: Zhaoyang Huang
Cc: Steven Rostedt, Ingo Molnar, Michal Hocko, Vladimir Davydov,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    kernel-patch-test@lists.linaro.org
Subject: Re: [PATCH] mm: terminate the reclaim early when direct reclaiming
Message-ID: <20180727195848.GA12399@cmpxchg.org>
References: <1532683165-19416-1-git-send-email-zhaoyang.huang@spreadtrum.com>
In-Reply-To: <1532683165-19416-1-git-send-email-zhaoyang.huang@spreadtrum.com>
User-Agent: Mutt/1.10.0 (2018-05-17)
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Zhaoyang,

On Fri, Jul 27, 2018 at 05:19:25PM +0800, Zhaoyang Huang wrote:
> This patch tries to let direct reclaim finish earlier than it
> currently does. The problem comes from our observation that direct
> reclaim takes a long time to finish when memcg is enabled. By
> debugging, we found that the reason is that the soft limit is too
> low to meet the loop end criteria.
> So we add two barriers to judge whether enough memory has been
> reclaimed, using the same criteria as in shrink_lruvec:
> 1. for each memcg soft limit reclaim.
> 2. before starting the global reclaim in shrink_zone.

Yes, the soft limit reclaim cycle is fairly aggressive and can
introduce quite some allocation latency into the system. Let me say
right up front, though, that we've spent hours in conference sessions
and phone calls trying to fix this and could never agree on anything.

You might have better luck trying cgroup2, which implements memory.low
in a more scalable manner. (Due to the default value of 0 instead of
infinity, it can use a smoother 2-pass reclaim cycle.)

On your patch specifically: should_continue_reclaim() is for
compacting higher-order pages. It assumes you have already made a full
reclaim cycle, and it returns false for most allocations without
checking any sort of reclaim progress. You may end up in a situation
where soft limit reclaim finds nothing, and you still abort without
trying a regular reclaim cycle. That can trigger the OOM killer while
there is still plenty of reclaimable memory in other groups.

So if you want to fix this, you'd have to look for a different
threshold for soft limit reclaim. Maybe something like this already
works:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ee91e8cbeb5a..5b2388fa6bc4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2786,7 +2786,8 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 						&nr_soft_scanned);
 			sc->nr_reclaimed += nr_soft_reclaimed;
 			sc->nr_scanned += nr_soft_scanned;
-			/* need some check for avoid more shrink_zone() */
+			if (nr_soft_reclaimed)
+				continue;
 		}
 
 		/* See comment about same check for global reclaim above */