Date: Wed, 14 Aug 2019 10:11:55 +0200
From: Michal Hocko
To: Johannes Weiner
Cc: Andrew Morton, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: vmscan: do not share cgroup iteration between reclaimers
Message-ID: <20190814081155.GQ17933@dhcp22.suse.cz>
References: <20190812192316.13615-1-hannes@cmpxchg.org> <20190813132938.GJ17933@dhcp22.suse.cz> <20190813171237.GA21743@cmpxchg.org>
In-Reply-To: <20190813171237.GA21743@cmpxchg.org>

On Tue 13-08-19 13:12:37, Johannes Weiner wrote:
> On Tue, Aug 13, 2019 at 03:29:38PM +0200, Michal Hocko wrote:
> > On Mon 12-08-19 15:23:16, Johannes Weiner wrote:
[...]
> > > This change completely eliminates the OOM kills on our service, while
> > > showing no signs of overreclaim - no increased scan rates, %sys time,
> > > or abrupt free memory spikes. I tested across 100 machines that have
> > > 64G of RAM and host about 300 cgroups each.
> >
> > What is the usual direct reclaim involvement on those machines?
>
> 80-200 kb/s. In general we try to keep this low to non-existent on our
> hosts due to the latency implications. So it's fair to say that kswapd
> does page reclaim, and direct reclaim is a sign of overload.

Well, there are workloads which are much heavier on direct reclaim. How
much they rely on large memcg trees remains to be seen. Your changelog
should state that the above workload is very light on direct reclaim,
though, because the above paragraph suggests that a risk of longer
stalls is really a non-issue, while I think this is not really all that
clear.
-- 
Michal Hocko
SUSE Labs