Date: Tue, 18 Aug 2020 09:49:00 -0400
From: Johannes Weiner
To: peterz@infradead.org
Cc: Michal Hocko, Waiman Long, Andrew Morton, Vladimir Davydov,
    Jonathan Corbet, Alexey Dobriyan, Ingo Molnar, Juri Lelli,
    Vincent Guittot, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/8] memcg: Enable fine-grained per process memory control
Message-ID: <20200818134900.GA829964@cmpxchg.org>
In-Reply-To: <20200818101844.GO2674@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 18, 2020 at 12:18:44PM +0200, peterz@infradead.org wrote:
> What you need is a feedback loop against the rate of freeing pages, and
> when you near the saturation point, the allocation rate should exactly
> match the freeing rate.

IO throttling solves a slightly different problem.

IO occurs in parallel to the workload's execution stream, and you're
trying to take the workload from dirtying at CPU speed to rate match
to the independent IO stream.

With memory allocations, though, freeing happens from inside the
execution stream of the workload. If you throttle allocations, you're
most likely throttling the freeing rate as well. And you'll slow down
reclaim scanning by the same amount as the page references, so it's
not making reclaim more successful either. The alloc/use/free
(im)balance is an inherent property of the workload, regardless of the
speed you're executing it at.

So the goal here is different. We're not trying to pace the workload
into some form of sustainability. Rather, it's for OOM handling. When
we detect the workload's alloc/use/free pattern is unsustainable given
available memory, we slow it down just enough to allow userspace to
implement OOM policy and job priorities (on containerized hosts these
tend to be too complex to express in the kernel's oom scoring system).

The exponential curve makes it look like we're trying to do some type
of feedback system, but it's really only to let minor infractions pass
and throttle unsustainable expansion ruthlessly. Drop-behind reclaim
can be a bit bumpy because we batch on the allocation side as well as
on the reclaim side, hence the fuzz factor there.
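
The two points above can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: `run_workload`, `throttle_delay`, and every constant in them are hypothetical names and values chosen for illustration, and the penalty formula is not the one the kernel uses. The first function shows that stretching each workload step changes elapsed time but not the net memory growth, because allocation and freeing happen in the same execution stream. The second shows the general shape of a steep penalty curve that lets small overages pass and saturates quickly for large ones.

```python
import math


def run_workload(steps, alloc_per_step, free_per_step, delay_per_step=0.0):
    """Toy model (hypothetical): each step both allocates and frees pages.

    Returns (pages_in_use, elapsed_time).
    """
    pages = 0
    elapsed = 0.0
    for _ in range(steps):
        pages += alloc_per_step          # the step allocates...
        pages -= free_per_step           # ...and frees, in the same stream
        elapsed += 1.0 + delay_per_step  # throttling stretches the whole step
    return pages, elapsed


def throttle_delay(usage, limit, scale=0.001, steepness=20.0, max_delay=2.0):
    """Illustrative penalty curve (assumed constants, not the kernel's).

    Near zero for small overage, growing steeply and capped at max_delay.
    """
    if usage <= limit:
        return 0.0
    overage = (usage - limit) / limit    # fractional overage above the limit
    return min(max_delay, scale * (math.exp(steepness * overage) - 1.0))


if __name__ == "__main__":
    # Same alloc/free imbalance, with and without a per-step delay.
    print(run_workload(100, 3, 2))
    print(run_workload(100, 3, 2, delay_per_step=1.0))
    # Penalty at 1%, 10%, and 100% over the limit.
    for usage in (101, 110, 200):
        print(usage, throttle_delay(usage, 100))
```

In this model the throttled and unthrottled runs end with the same number of pages in use; only the elapsed time differs. That is the sense in which the delay buys time for a userspace OOM policy rather than making the workload sustainable.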