Subject: Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, Dave Hansen, Ying Huang,
    linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
References: <06f1f92f1f7d4e57c4e20c97f435252c16c60a27.1613584277.git.tim.c.chen@linux.intel.com>
From: Tim Chen
Message-ID: <884d7559-e118-3773-351d-84c02642ca96@linux.intel.com>
Date: Fri, 19 Feb 2021 10:59:05 -0800

On 2/19/21 1:11 AM, Michal Hocko wrote:
> On Wed 17-02-21 12:41:35, Tim Chen wrote:
>> Memory is accessed at a much lower frequency
>> for the second cgroup.  The memcg event update was not triggered for the
>> second cgroup, as the memcg event update didn't happen on the 1024th sample.
>> The second cgroup was not placed on the soft limit tree and we didn't
>> try to reclaim the excess pages.
>>
>> As time went on, we saw that the first cgroup was kept close to its
>> soft limit due to reclaim activities, while the second cgroup's memory
>> usage slowly crept up, as it kept getting missed from the soft limit tree
>> update because the update didn't fall on a modulo-1024 sample.  As a result,
>> the memory usage of the second cgroup kept growing over the soft limit
>> for a long time due to its relatively rare occurrence.
>
> Soft limit is evaluated every THRESHOLDS_EVENTS_TARGET * SOFTLIMIT_EVENTS_TARGET
> events.  If all events correspond with newly charged memory and the last event
> was just about the soft limit boundary, then we should be bound by 128k
> pages (512M, and much more if these were huge pages), which is a lot!
> I hadn't realized it was that much.  Now I see the problem.  This would
> be useful information for the changelog.
>
> Your fix focuses on the over-the-limit boundary, which will solve the
> problem, but wouldn't that lead to updates happening too often in a
> pathological situation where a memcg would get reclaimed immediately?

Not really immediately.  The memcg that has the most soft limit excess will
be chosen for page reclaim, which is the way it should be.  It is less
likely that a memcg that just exceeded the soft limit becomes the worst
offender immediately.  With the fix, we make sure that it is on the bad
guys list, will not be ignored, and will eventually be chosen for reclaim.
It will not be able to sneakily keep increasing its memory usage.

>
> One way around that would be to lower SOFTLIMIT_EVENTS_TARGET.  Have
> you tried that?  Do we even need a separate threshold for the soft limit?
> Why can't we simply update the tree every MEM_CGROUP_TARGET_THRESH events?
>

Lowering the threshold is a band-aid that doesn't really fix the problem.
I found that if the cgroup touches its memory infrequently enough, its
update can still be missed.  And in the meantime, you are updating the tree
a lot more frequently, with added overhead.

Tim
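
P.S. For anyone following along, here is a rough userspace sketch of the
two-level event sampling being argued about and of the behavior change in
the patch.  Everything here is illustrative rather than the kernel code:
memcg_model, event_ratelimit, check_events and the force_on_excess flag are
made-up names, only the two event targets are the real in-tree constants,
and the single global counter ignores the per-CPU bookkeeping the kernel
actually does.

/*
 * Simplified model of memcg event sampling: a soft limit tree update is
 * only considered on a THRESHOLDS_EVENTS_TARGET sample, and normally only
 * when the SOFTLIMIT_EVENTS_TARGET point has also been passed.  The
 * thread's worst-case estimate of ~128k pages (512M, more with huge pages)
 * between tree updates comes from the product of the two targets.
 */
#include <stdbool.h>
#include <stdio.h>

#define THRESHOLDS_EVENTS_TARGET 128   /* matches mm/memcontrol.c */
#define SOFTLIMIT_EVENTS_TARGET 1024   /* matches mm/memcontrol.c */

struct memcg_model {
	unsigned long nr_page_events;  /* page charge/uncharge events   */
	unsigned long next_thresh;     /* next threshold sample point   */
	unsigned long next_softlimit;  /* next soft limit sample point  */
	unsigned long usage;           /* pages currently charged       */
	unsigned long soft_limit;      /* soft limit, in pages          */
	unsigned long tree_updates;    /* stand-in for mem_cgroup_update_tree() */
};

/* Fire once the counter passes the target, then push the target forward. */
static bool event_ratelimit(unsigned long events, unsigned long *next,
			    unsigned long target)
{
	if (events < *next)
		return false;
	*next = events + target;
	return true;
}

/* Called on every charge; force_on_excess models the patch under discussion. */
static void check_events(struct memcg_model *m, bool force_on_excess)
{
	if (!event_ratelimit(m->nr_page_events, &m->next_thresh,
			     THRESHOLDS_EVENTS_TARGET))
		return;

	bool do_softlimit = event_ratelimit(m->nr_page_events,
					    &m->next_softlimit,
					    SOFTLIMIT_EVENTS_TARGET);

	/* The proposed change: if usage already exceeds the soft limit,
	 * refresh the tree now instead of waiting for the rarer sample. */
	if (force_on_excess && m->usage > m->soft_limit)
		do_softlimit = true;

	if (do_softlimit)
		m->tree_updates++;
}

int main(void)
{
	/* A group already over its soft limit keeps charging pages. */
	struct memcg_model plain  = { .soft_limit = 1000, .usage = 2000 };
	struct memcg_model forced = plain;

	for (int i = 0; i < 4096; i++) {
		plain.nr_page_events++;  plain.usage++;
		check_events(&plain, false);

		forced.nr_page_events++; forced.usage++;
		check_events(&forced, true);
	}

	printf("tree updates without forced update: %lu\n", plain.tree_updates);
	printf("tree updates with forced update:    %lu\n", forced.tree_updates);
	return 0;
}

In this single-counter simplification the unforced case refreshes the tree
roughly once per 1024 events, while the forced case refreshes it on every
128-event sample whenever the group is over its limit.  The real per-CPU
counters can stretch the unforced gap considerably further, which is in the
direction of the behavior the changelog describes.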