Date: Mon, 4 Apr 2022 11:25:35 -0700
From: Roman Gushchin
To: Michal Hocko
Cc: Yosry Ahmed, Johannes Weiner, Shakeel Butt, Andrew Morton,
    David Rientjes, Tejun Heo, Zefan Li, cgroups@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Jonathan Corbet, Yu Zhao, Dave Hansen, Wei Xu,
    Greg Thelen
Subject: Re: [PATCH resend] memcg: introduce per-memcg reclaim interface
References: <20220331084151.2600229-1-yosryahmed@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Apr 04, 2022 at 10:44:04AM +0200, Michal Hocko wrote:
> On Fri 01-04-22 09:58:59, Roman Gushchin wrote:
> > On Fri, Apr 01, 2022 at 03:49:19PM +0200, Michal Hocko wrote:
> > > On Thu 31-03-22 10:25:23, Roman Gushchin wrote:
> > > > On Thu, Mar 31, 2022 at 08:41:51AM +0000, Yosry Ahmed wrote:
> > > [...]
> > > > > - A similar per-node interface can also be added to support proactive
> > > > >   reclaim and reclaim-based demotion in systems without memcg.
> > > >
> > > > Maybe an option to specify a timeout? That might simplify the userspace part.
> > >
> > > What do you mean by timeout here? Isn't
> > > timeout $N echo $RECLAIM > ....
> > >
> > > enough?
> >
> > It's nice and simple when it's a bash script, but when it's a complex
> > application trying to do the same, it quickly becomes less simple and
> > likely will require a dedicated thread to avoid blocking the main app
> > for too long and a mechanism to unblock it by timer/when the need arises.
> >
> > In my experience using correctly such semi-blocking interfaces (semi- because
> > it's not clearly defined how much time the syscall can take and whether it
> > makes sense to wait longer) is tricky.
>
> We have the same approach to setting other limits which need to perform
> the reclaim. Have we ever hit that as a limitation that would make
> userspace unnecessarily too complex?

The difference here is that some limits are most likely set once and never
adjusted, e.g. memory.max or memory.low. I do definitely remember some
issues around memory.high, but as I recall, we've fixed them on the kernel
side. We've even had a private memory.high.tmp interface with a value and
a timeout, which was later replaced with a memory.reclaim interface similar
to what we're discussing here.

But with memory.high we set the limit first, so if a user tries to reclaim
a lot of hot memory, it will soon put all processes in the cgroup into
sleep/direct reclaim. So it's not expected to block for too long.

In general it all comes down to the question of how hard the kernel should
try to reclaim the memory before giving up. Userspace might have different
needs in different cases. But if the interface is defined as vaguely as
"it tries for an undefined amount of time and then gives up", it's hard to
use it in a predictable manner.

Thanks!
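
P.S. To illustrate the userspace side, here is a minimal sketch of the
"dedicated thread plus timer" pattern a non-shell application would likely
need. It assumes the proposed interface is a plain file that blocks on
write; the function name reclaim_with_timeout and the cgroup path in the
usage note are hypothetical, not taken from the patch.

```python
import os
import threading


def reclaim_with_timeout(path, amount, timeout_s):
    """Write `amount` to `path` from a worker thread, waiting at most `timeout_s`.

    Hypothetical helper: `path` stands in for a per-memcg reclaim file.
    Returns "done", "timeout", or "error: <reason>".
    """
    result = {}

    def worker():
        try:
            fd = os.open(path, os.O_WRONLY)
            try:
                # Assumption: this write blocks for as long as the kernel
                # keeps trying to reclaim; the duration is not bounded here.
                os.write(fd, amount.encode())
            finally:
                os.close(fd)
            result["status"] = "done"
        except OSError as exc:
            result["status"] = f"error: {exc.strerror}"

    # Daemon thread so a stuck write does not keep the process alive.
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)
    if t.is_alive():
        # The caller stops waiting, but the write itself is NOT cancelled.
        return "timeout"
    return result["status"]
```

A call might look like
reclaim_with_timeout("/sys/fs/cgroup/workload/memory.reclaim", "10M", 2.0),
with the path being an assumption. Note that on "timeout" the worker is
still blocked in the kernel: the application only stops waiting, which is
part of why a kernel-side timeout would be simpler to use.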