Date: Tue, 14 Jul 2020 11:45:04 -0400
From: Johannes Weiner
To: Chris Down
Cc: Andrew Morton, Michal Hocko, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 1/2] mm, memcg: reclaim more aggressively before high allocator throttling
Message-ID: <20200714154504.GB215857@cmpxchg.org>

On Mon, Jul 13, 2020 at 12:42:35PM +0100, Chris Down wrote:
> In Facebook production, we've seen cases where cgroups have been put
> into allocator throttling even when they appear to have a lot of slack
> file caches which should be trivially reclaimable.
>
> Looking more closely, the problem is that we only try a single cgroup
> reclaim walk for each return to usermode before calculating whether or
> not we should throttle. This single attempt doesn't produce enough
> pressure to shrink for cgroups with a rapidly growing amount of file
> caches prior to entering allocator throttling.
>
> As an example, we see that threads in an affected cgroup are stuck in
> allocator throttling:
>
> # for i in $(cat cgroup.threads); do
> >     grep over_high "/proc/$i/stack"
> > done
> [<0>] mem_cgroup_handle_over_high+0x10b/0x150
> [<0>] mem_cgroup_handle_over_high+0x10b/0x150
> [<0>] mem_cgroup_handle_over_high+0x10b/0x150
>
> ...however, there is no I/O pressure reported by PSI, despite a lot of
> slack file pages:
>
> # cat memory.pressure
> some avg10=78.50 avg60=84.99 avg300=84.53 total=5702440903
> full avg10=78.50 avg60=84.99 avg300=84.53 total=5702116959
> # cat io.pressure
> some avg10=0.00 avg60=0.00 avg300=0.00 total=78051391
> full avg10=0.00 avg60=0.00 avg300=0.00 total=78049640
> # grep _file memory.stat
> inactive_file 1370939392
> active_file 661635072
>
> This patch changes the behaviour to retry reclaim either until the
> current task goes below the 10ms grace period, or we are making no
> reclaim progress at all. In the latter case, we enter reclaim throttling
> as before.
>
> To a user, there's no intuitive reason for the reclaim behaviour to
> differ when hitting memory.high as part of a new allocation, as opposed
> to hitting memory.high because someone lowered its value. As such this
> also brings an added benefit: it unifies the reclaim behaviour between
> the two.
>
> There's precedent for this behaviour: we already do reclaim retries when
> writing to memory.{high,max}, in max reclaim, and in the page allocator
> itself.
>
> Signed-off-by: Chris Down
> Cc: Andrew Morton
> Cc: Johannes Weiner
> Cc: Tejun Heo
> Cc: Michal Hocko

Acked-by: Johannes Weiner