From: Shakeel Butt
Date: Fri, 10 Jul 2020 07:12:22 -0700
Subject: Re: [PATCH] mm: memcontrol: avoid workload stalls when lowering memory.high
To: Michal Hocko
Cc: Roman Gushchin, Andrew Morton, Johannes Weiner, Linux MM, Kernel Team, LKML, Domas Mituzas, Tejun Heo, Chris Down
References: <20200709194718.189231-1-guro@fb.com> <20200710122917.GB3022@dhcp22.suse.cz>
In-Reply-To: <20200710122917.GB3022@dhcp22.suse.cz>
List-ID: linux-kernel@vger.kernel.org

On Fri, Jul 10, 2020 at 5:29 AM Michal Hocko wrote:
>
> On Thu 09-07-20 12:47:18, Roman Gushchin wrote:
> > The memory.high limit is implemented in a way such that the kernel
> > penalizes all threads which are allocating memory over the limit.
> > Forcing all threads into synchronous reclaim and adding some
> > artificial delays makes it possible to slow down memory consumption
> > and potentially give userspace oom handlers/resource control agents
> > some time to react.
> >
> > It works nicely if the memory usage is hitting the limit from below;
> > however, it works sub-optimally if a user adjusts memory.high to a
> > value way below the current memory usage. It basically forces all
> > workload threads (doing any memory allocations) into synchronous
> > reclaim and sleep. This makes the workload completely unresponsive
> > for a long period of time and can also lead to system-wide
> > contention on LRU locks. It can happen even if the workload is not
> > actually tight on memory and has, for example, a ton of cold
> > pagecache.
> >
> > In the current implementation, writing to memory.high causes an
> > atomic update of the page counter's high value followed by an
> > attempt to reclaim enough memory to fit under the new limit. To fix
> > the problem described above, all we need is to change the order of
> > execution: try to push the memory usage under the limit first, and
> > only then set the new high limit.
>
> Shakeel, would this help with your proactive reclaim use-case? It
> would require resetting the high limit right after the reclaim
> returns, which is quite ugly, but it would at least not require a
> completely new interface. You would simply do
>   high = current - to_reclaim
>   echo $high > memory.high
>   echo infinity > memory.high # To prevent direct reclaim
>                               # allocation stalls
>

This will reduce the chance of stalls, but the interface is still
non-delegatable, i.e. applications cannot change their own memory.high
for use-cases like application-controlled proactive reclaim and uswapd.
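For concreteness, here is a rough sketch of the sequence above as a
privileged resource-control agent (rather than the application itself)
would have to drive it today. The cgroup path and reclaim amount are
illustrative only, and note that cgroup v2 spells the no-limit value
"max":

  # illustrative values; read current usage, drop memory.high below it,
  # then restore it right away so later allocations do not stall
  to_reclaim=$((512 * 1024 * 1024))
  usage=$(cat /sys/fs/cgroup/workload/memory.current)
  echo $((usage - to_reclaim)) > /sys/fs/cgroup/workload/memory.high
  echo max > /sys/fs/cgroup/workload/memory.high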
One more ugly fix would be to add one more layer of cgroup and have the
application use the memory.high of that layer to fulfill such use-cases
(a rough sketch of this is included at the end of this mail). I think
providing a new interface would allow us to have a much cleaner
solution than settling on a bunch of ugly hacks.

> The primary reason to set the high limit in advance was to catch
> potential runaways more effectively, because they would just get
> throttled while memory_high_write is reclaiming. With this change the
> reclaim here might just be playing never-ending catch-up. On the plus
> side, a break out from the reclaim loop would just enforce the limit,
> so if the operation takes too long then the reclaim burden will move
> over to the consumers eventually. So I do not see any real danger.
>
> > Signed-off-by: Roman Gushchin
> > Reported-by: Domas Mituzas
> > Cc: Johannes Weiner
> > Cc: Michal Hocko
> > Cc: Tejun Heo
> > Cc: Shakeel Butt
> > Cc: Chris Down
>
> Acked-by: Michal Hocko

This patch seems reasonable on its own.

Reviewed-by: Shakeel Butt
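For reference, a rough sketch of the extra-layer workaround mentioned
above, assuming /sys/fs/cgroup/workload is a delegation point owned by
the application's user (all names and values are illustrative, not part
of the patch):

  # create an inner level and move the workload into it; the
  # application now owns the inner memory.high and can drive its own
  # proactive reclaim against it without touching the outer limit
  mkdir /sys/fs/cgroup/workload/self
  echo $$ > /sys/fs/cgroup/workload/self/cgroup.procs
  echo $((8 * 1024 * 1024 * 1024)) > /sys/fs/cgroup/workload/self/memory.high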