From: Shakeel Butt
Date: Mon, 13 Jul 2020 07:50:51 -0700
Subject: Re: [PATCH v2 1/2] mm, memcg: reclaim more aggressively before high allocator throttling
To: Chris Down
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Linux MM, Cgroups, LKML, Kernel Team

On Mon, Jul 13, 2020 at 4:42 AM Chris Down wrote:
>
> In Facebook production, we've seen cases where cgroups have been put
> into allocator throttling even when they appear to have a lot of slack
> file caches which should be trivially reclaimable.
>
> Looking more closely, the problem is that we only try a single cgroup
> reclaim walk for each return to usermode before calculating whether or
> not we should throttle. This single attempt doesn't produce enough
> pressure to shrink for cgroups with a rapidly growing amount of file
> caches prior to entering allocator throttling.
>
> As an example, we see that threads in an affected cgroup are stuck in
> allocator throttling:
>
> # for i in $(cat cgroup.threads); do
> > grep over_high "/proc/$i/stack"
> > done
> [<0>] mem_cgroup_handle_over_high+0x10b/0x150
> [<0>] mem_cgroup_handle_over_high+0x10b/0x150
> [<0>] mem_cgroup_handle_over_high+0x10b/0x150
>
> ...however, there is no I/O pressure reported by PSI, despite a lot of
> slack file pages:
>
> # cat memory.pressure
> some avg10=78.50 avg60=84.99 avg300=84.53 total=5702440903
> full avg10=78.50 avg60=84.99 avg300=84.53 total=5702116959
> # cat io.pressure
> some avg10=0.00 avg60=0.00 avg300=0.00 total=78051391
> full avg10=0.00 avg60=0.00 avg300=0.00 total=78049640
> # grep _file memory.stat
> inactive_file 1370939392
> active_file 661635072
>
> This patch changes the behaviour to retry reclaim either until the
> current task goes below the 10ms grace period, or we are making no
> reclaim progress at all. In the latter case, we enter reclaim throttling
> as before.
>
> To a user, there's no intuitive reason for the reclaim behaviour to
> differ when hitting memory.high as part of a new allocation, as opposed
> to hitting memory.high because someone lowered its value. As such this
> also brings an added benefit: it unifies the reclaim behaviour between
> the two.
>
> There's precedent for this behaviour: we already do reclaim retries when
> writing to memory.{high,max}, in max reclaim, and in the page allocator
> itself.
>
> Signed-off-by: Chris Down
> Cc: Andrew Morton
> Cc: Johannes Weiner
> Cc: Tejun Heo
> Cc: Michal Hocko

Reviewed-by: Shakeel Butt