From: Shakeel Butt
Date: Tue, 5 May 2020 08:03:17 -0700
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max
To: Michal Hocko
Cc: Johannes Weiner, Roman Gushchin, Greg Thelen, Andrew Morton, Linux MM, Cgroups, LKML
In-Reply-To: <20200505071324.GB16322@dhcp22.suse.cz>
References: <20200430182712.237526-1-shakeelb@google.com>
 <20200504065600.GA22838@dhcp22.suse.cz>
 <20200504141136.GR22838@dhcp22.suse.cz>
 <20200504150052.GT22838@dhcp22.suse.cz>
 <20200504160613.GU22838@dhcp22.suse.cz>
 <20200505071324.GB16322@dhcp22.suse.cz>

On Tue, May 5, 2020 at 12:13 AM Michal Hocko wrote:
>
> On Mon 04-05-20 12:23:51, Shakeel Butt wrote:
> [...]
> > *Potentially* useful for debugging versus actually beneficial for
> > "sweep before tear down" use-case.
>
> I definitely do not want to prevent you from achieving what you
> want/need. Let's get back to your argument on why you cannot use
> memory.high for this purpose and what is the actual difference from
> memory.max on the "sweep before removal". You've said
>
> : Yes that would work but remote charging concerns me.
> : Remote charging can still happen after the memcg is offlined and at the
> : moment, high reclaim does not work for remote memcg and the usage can go
> : till max or global pressure. This is most probably a misconfiguration and
> : we might not receive the warnings in the log ever. Setting memory.max to
> : 0 will definitely give such warnings.
>
> So essentially the only reason you are not using memory.high which would
> effectively achieve the same level of reclaim for your usecase is that
> potential future remote charges could get unnoticed.

Yes.

> I have proposed to
> warn when charging to an offline memcg because that looks like a sign of
> bug to me.

Instead of a bug, I would say misconfiguration but there is at least a
genuine user i.e. buffer_head. It can be allocated in reclaim and
trigger remote charging but it should be fine as the page it is
attached to will possibly get freed soon. So, I don't think we want to
warn for all remote charges to an offlined memcg.

> Having the hard limit clamped to 0 (or some other small
> value) would complain loud by the oom report and no eligible tasks
> message but it will unlikely help to stop such a usage because, well,
> there is nothing reclaimable and we force the charge in that case. So
> you are effectively in the memory.high like situation.

Yes, effectively it will be similar to memory.high but at least we
will get early warnings.

Now rethinking about the remote charging of buffer_head to an offlined
memcg with memory.max=0. It seems like it is special in the sense that
it is using __GFP_NOFAIL and will skip the oom-killer and thus
warnings. Maybe the right approach is, as you suggested, always warn
for charging an offline memcg unless
(__GFP_NOFAIL|__GFP_RETRY_MAYFAIL). Though I am not sure if this is
doable without code duplication.

>
> So instead of potentially removing a useful information can we focus on
> the remote charging side of the problem and deal with it in a sensible
> way?
> That would make memory.high usable for your usecase and I still
> believe that this is what you should be using in the first place.

We talked about this at LSFMM'19 and I think the decision was to not
fix high reclaim for remote memcg until it will be an actual issue. I
suppose now we can treat it as an actual issue.

There are a couple of open questions:

1) Should the remote chargers be throttled and do the high reclaim?
2) There can be multiple remote charges to multiple memcgs in a single
   kernel entry. Should we handle such scenarios?

Shakeel
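
For illustration only, the rule floated above ("always warn for charging an offline memcg unless (__GFP_NOFAIL|__GFP_RETRY_MAYFAIL)") can be modeled as a small predicate. This is a toy sketch in Python, not kernel code. The function name and the flag values are made up for the example (real __GFP_* bit positions differ), and it assumes that either flag alone is enough to suppress the warning.

```python
# Toy model of the proposed warning rule; not kernel code.
# Flag values are placeholders for this sketch, not the real __GFP_* bits.
GFP_NOFAIL = 1 << 0
GFP_RETRY_MAYFAIL = 1 << 1
GFP_KERNEL = 1 << 2  # stand-in for an ordinary allocation context

# Assumption: a charge carrying either of these flags stays quiet.
QUIET_FLAGS = GFP_NOFAIL | GFP_RETRY_MAYFAIL


def should_warn_offline_charge(memcg_online: bool, gfp_mask: int) -> bool:
    """Return True if a charge would warn under the proposed rule.

    Hypothetical helper: warn only when the target memcg is offline and
    the allocation carries neither of the quiet flags.
    """
    if memcg_online:
        return False
    return (gfp_mask & QUIET_FLAGS) == 0


if __name__ == "__main__":
    # buffer_head-like case: offline memcg, __GFP_NOFAIL set, so no warning.
    print(should_warn_offline_charge(False, GFP_KERNEL | GFP_NOFAIL))
    # Ordinary remote charge to an offline memcg: warn.
    print(should_warn_offline_charge(False, GFP_KERNEL))
```

Under this reading, the buffer_head case stays silent while any other remote charge to an offlined memcg is reported. Whether the kernel can express this without duplicating code in the charge path is the open point raised above.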