Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp1812067pxb; Mon, 22 Feb 2021 11:30:03 -0800 (PST) X-Google-Smtp-Source: ABdhPJynmwEgWC1qVc7EzenUhhe4ThW9XRa5A5FED/tCe7IOFGFuw9m8lz2WjGwyriqxoot9hasq X-Received: by 2002:a50:cbc7:: with SMTP id l7mr24149618edi.134.1614022203047; Mon, 22 Feb 2021 11:30:03 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1614022203; cv=none; d=google.com; s=arc-20160816; b=w2yuF4ZMgenCNmi7IA69PaA3G/tfJJPAhSca370Xqw7+gMIEfw45K03nczxG2aGzsq zAI4iYnuorB9RXb/uoJ5jhvTWHUKVnfjWlks4gqVxWBh3741ZxqBNvwplFz3Wl48fubb e8aUaACplNEPd6dTH61ZLkeX5x58NEALqOo7Sn05HTKPTmqPyorj7185hydw5VAHW9yU P4JborAEN6/1QCmfdM/zM7IutD0OTJ46ZFdacGQwZjyeGUdu0hQvCF1pNx85cLAG2QzW DwgH6Oa+XCj7xopsBXqwaVyoBZrhWO9jWjnHFFBRMfwD7bG10YzZkf7pqrvvFWpJDU7Q wqsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:ironport-sdr:ironport-sdr; bh=DXnP4HKoMeIGKw938LhjfknpXeP0WYj5djYM/7MkCb4=; b=z80s180l0veRk3mx3Pb0f0sJ98yG5Y3vVOS6Jz3VyOX7bTERDXkJnW/7k1jSHUYpPx Vji4r/KylzKXmZ0edWBISAr613xyQJeroFfB04g1HtajYcOSvVFspmnfzMeieieoR+K9 shYOa78emdB+n9lBcoRakEUCQPAO9FRIyqCENqg5pKWFZD1BChuEwhreI0A+leKO0CWp biPaGVOk0TedjGFXipLPQdXvLlgggs7wzYoIUOWXJgXrr+W/x/zkJWO24Essw62rH5+U DRdFNz3OhrACNgGSPKn2+SEneQVDjh1o1CTvAnaeMMRagAGm6f06z3U1E/HmwAmIC3wC 0gRw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id k13si12488331ejg.567.2021.02.22.11.29.39; Mon, 22 Feb 2021 11:30:03 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233335AbhBVT2z (ORCPT + 99 others); Mon, 22 Feb 2021 14:28:55 -0500 Received: from mga14.intel.com ([192.55.52.115]:41387 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233094AbhBVTZn (ORCPT ); Mon, 22 Feb 2021 14:25:43 -0500 IronPort-SDR: EltXaYKtY04X2/3G4XuYWqwr7L7SVyLWUHR5vzkmF0ihH36CgMp0CcRm8/ewhg7fpIi6RT4pG0 xfpysCpUSbhQ== X-IronPort-AV: E=McAfee;i="6000,8403,9903"; a="183847440" X-IronPort-AV: E=Sophos;i="5.81,198,1610438400"; d="scan'208";a="183847440" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Feb 2021 11:23:52 -0800 IronPort-SDR: RaY9gAVM1BH1UuQYP8Jx5AilW/WXuGS+KiwNANxgZWgz+mUuyJqDSIxVwt4albE5cAyFj+1M5s oPtfz/AT7Khw== X-IronPort-AV: E=Sophos;i="5.81,198,1610438400"; d="scan'208";a="389981629" Received: from schen9-mobl.amr.corp.intel.com ([10.251.12.88]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Feb 2021 11:23:51 -0800 Subject: Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess To: Michal Hocko Cc: Andrew Morton , Johannes Weiner , Vladimir Davydov , Dave Hansen , Ying Huang , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org References: <06f1f92f1f7d4e57c4e20c97f435252c16c60a27.1613584277.git.tim.c.chen@linux.intel.com> <884d7559-e118-3773-351d-84c02642ca96@linux.intel.com> From: Tim Chen Message-ID: <1d8d4ec6-e97d-b7e5-695e-f189404a80fd@linux.intel.com> Date: Mon, 22 Feb 2021 11:23:45 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2/22/21 11:09 AM, Michal Hocko wrote: > On Mon 22-02-21 09:41:00, Tim Chen wrote: >> >> >> On 2/22/21 12:40 AM, Michal Hocko wrote: >>> On Fri 19-02-21 10:59:05, Tim Chen wrote: >> occurrence. >>>>> >>>>> Soft limit is evaluated every THRESHOLDS_EVENTS_TARGET * SOFTLIMIT_EVENTS_TARGET. >>>>> If all events correspond with a newly charged memory and the last event >>>>> was just about the soft limit boundary then we should be bound by 128k >>>>> pages (512M and much more if this were huge pages) which is a lot! >>>>> I haven't realized this was that much. Now I see the problem. This would >>>>> be a useful information for the changelog. >>>>> >>>>> Your fix is focusing on the over-the-limit boundary which will solve the >>>>> problem but wouldn't that lead to to updates happening too often in >>>>> pathological situation when a memcg would get reclaimed immediatelly? >>>> >>>> Not really immediately. The memcg that has the most soft limit excess will >>>> be chosen for page reclaim, which is the way it should be. >>>> It is less likely that a memcg that just exceeded >>>> the soft limit becomes the worst offender immediately. >>> >>> Well this all depends on when the the soft limit reclaim triggeres. In >>> other words how often you see the global memory reclaim. If we have a >>> memcg with a sufficient excess then this will work mostly fine. I was more >>> worried about a case when you have memcgs just slightly over the limit >>> and the global memory pressure is a regular event. You can easily end up >>> bouncing memcgs off and on the tree in a rapid fashion. >>> >> >> If you are concerned about such a case, we can add an excess threshold, >> say 4 MB (or 1024 4K pages), before we trigger a forced update. You think >> that will cover this concern? > > Yes some sort of rate limiting should help. My understanding has been > that this is the main purpose of the even counting threshold. The code > as we have doesn't seem to work properly so there are two ways, either > tune the existing threshold or replace it by something else. Having both > a force update and non-functional threshold is not a great outcome. > The event counting threshold's purpose is to limit the *rate* of event update. The side effect is we will miss some update by the sampling nature. However, we can't afford to let the sampling go to frequent or the overhead will be too much. The forced update makes sure that a needed update happens to put the memcg on the tree happens and we don't allow the memcg to escape page reclaim even if the update happens relatively rarely. Lowering the threshold does not guarantee that the update happens, resulting in a runaway memcg. The forced update and threshold serves different purpose. In my opinion the forced update is necessary. We can keep the update overhead low with low update rate, and yet make sure that a needed update happens. I think both are necessary for proper function. The overhead from the forced update is negligible and I don't a downside adding it. Tim