From: Juha-Matti Tilli
Date: Thu, 11 Apr 2019 09:51:21 +0300
Subject: Re: [PATCH] net: add big honking pfmemalloc OOM warning
To: David Miller
Cc: Eric Dumazet, Juha-Matti Tilli, LKML, netdev, Rafael Aquini,
    Murphy Zhou, Yongcheng Yang, Jianhong Yin
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 10, 2019 at 10:11 PM David Miller wrote:
> > SNMP counters are per netns, and more useful in the modern computing
> > era, where a host is shared by many different containers.
>
> +1 There is no way I am applying this patch.
>
> The kernel should not "big honking" anything in the logs.

Just to check, is the opposition to the patch related to the
expectation that it will log the condition too often despite the rate
limit, if many packets are dropped? Because if it is, that might be
possible to fix.

I think it might be possible to check the SNMP counter value, and if
zero, log the first instance of pfmemalloc drop, and then omit logging
afterwards. There could be race conditions, so in the absolute worst
case, you could have let's say 2 or 3 of these log lines instead of 1,
but I don't see that as an issue, because 99% of the time there would
be just one, and 2 or 3 lines won't fill the logs.

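As a rough illustration of the log-once idea above, here is a small
user-space model in Python. It is not kernel code and not what the patch
does: the class name, the method name and the log text are made up for
the sketch, and the plain integer only stands in for the SNMP counter.

```python
# User-space model only. DropStats, record_drop and the log text are
# hypothetical names for illustration, not taken from the patch.


class DropStats:
    """Counts pfmemalloc drops and logs only the drop that sees zero."""

    def __init__(self):
        self.drops = 0        # stands in for the SNMP counter
        self.log_lines = []   # stands in for the kernel log

    def record_drop(self):
        # The check and the increment are deliberately separate steps.
        # Two contexts that both read zero before either increments
        # would each log once, which is the "2 or 3 lines" worst case.
        was_zero = self.drops == 0
        self.drops += 1
        if was_zero:
            self.log_lines.append(
                "pfmemalloc drop seen; consider raising vm.min_free_kbytes"
            )


stats = DropStats()
for _ in range(1000):
    stats.record_drop()

print(stats.drops, len(stats.log_lines))
```

However many drops follow the first one, the counter keeps counting
while the log stays at the lines written when the counter was still
zero.
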
In our case, the existence of such a log message and the helpful
suggestion to bump up vm.min_free_kbytes would have saved us
approximately one month of debugging (or 2-3 weeks if the SNMP counter
had been there in this kernel version). Even one such log message would
be enough. Our production systems were hanging daily while this
debugging was going on.

In my opinion, the ideal count of pfmemalloc drops is exactly 0, and
the interesting event is the first instance of a pfmemalloc drop
occurring. If there's a bug in the kernel, I think the user should be
notified, so I see this as similar to a WARN_ON line -- which is an
even more "big honking" log event because it's associated with a
backtrace.

BR, Juha-Matti