Received: by 2002:a25:c593:0:0:0:0:0 with SMTP id v141csp1163941ybe; Wed, 4 Sep 2019 13:45:14 -0700 (PDT) X-Google-Smtp-Source: APXvYqw9WSnMmyQ0vbKD7emyZDH2nToTWOcyz1R4Ee+8Wc7QC6kNHZSuMtH7NDlw3LayrnUPQr55 X-Received: by 2002:a62:8683:: with SMTP id x125mr4901727pfd.108.1567629914069; Wed, 04 Sep 2019 13:45:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1567629914; cv=none; d=google.com; s=arc-20160816; b=jAnhUN7xyERBH1zLYn8ZNkfHbxfr+I6D8DFxYA8WI9LcbYxjENK1uYthbzPu1Qv2TU IDu0ycvPWIEK8cfyzfyeZ7PWbMU2gCqHyetjOlVInh/CEwB0nBY32rTvJghFcHl34JjG jxDO2SsH6C4HhhBQAIfek0p+Amwn6boXV8k2Mw8grPFNIdQj45hTbKVBKXuz6jso7apX MkvUpRhfR3AVQO5kUHrZDrr+Y0N+BjuxRxIaDV6lh/DrW2+yiaxfn5L77WkcPPzhZeAp Mn7EEiX9oZF9r4TMQUSxEykrBUMYMITZcryLE6GUjtY0jdnTOIkh9R+w2V7reFrxtSq2 evgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:date:cc:to:from:subject:message-id :dkim-signature; bh=eWOicZgj5Of+wSfvuMxF3nPebfxOImDgPFtP2TA8Hig=; b=zFt8AaOGYjPeaukOva3KQJ6kFepCfPUaWqTpexNu2MKZz/Ncmqr8upPnvM6xx7odvO pFMJCatsdu3ozRfNMwFt1vapMX7QrVc2XNJ+4i2mzQRgsLjtDXqAb0yVzsTOiQ94dY4C M5JRzWlGq9qOf1E/L9NcQXLBeoHEj3ZtVbLIXI3m97A3u7UkjZUUA7z7P7uq/ob3Djv4 dvkREhtJv3Xnuodh1aOwbzFmnq80EikEI949mt0dP+ZJYxgSZXjUaPZ+cfHeMeEG0l4q Lfx7buQVbNYihj9SX1+AVGF+qHgmqgp7phE+MEfnUVFpr4Zc+mjcb3ArhB9sN24QPbrb kmSQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@lca.pw header.s=google header.b=ZGqEH+6G; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p9si18034516plk.386.2019.09.04.13.44.58; Wed, 04 Sep 2019 13:45:14 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@lca.pw header.s=google header.b=ZGqEH+6G; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730214AbfIDUmW (ORCPT + 99 others); Wed, 4 Sep 2019 16:42:22 -0400 Received: from mail-qt1-f193.google.com ([209.85.160.193]:40753 "EHLO mail-qt1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727562AbfIDUmW (ORCPT ); Wed, 4 Sep 2019 16:42:22 -0400 Received: by mail-qt1-f193.google.com with SMTP id g4so62216qtq.7 for ; Wed, 04 Sep 2019 13:42:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=lca.pw; s=google; h=message-id:subject:from:to:cc:date:in-reply-to:references :mime-version:content-transfer-encoding; bh=eWOicZgj5Of+wSfvuMxF3nPebfxOImDgPFtP2TA8Hig=; b=ZGqEH+6G7pAd8jrzjuGDhWkoWvtaL64D3AW2jvc0zj1/oUJplf2IrF6gHKD5s2FsPw Adfze0hKG7/rv+15yzEJjiUX6og3ZyANLD8D4ERpNzpK9YlV6uONlh9xQeKTGv+ZrquI hSKMreZSUi810Y6ZWKEeb9LD1PXnDKImDqVZgjLKigfeMpRxfjBMo9OZoosRbYdu8/79 bC/LvgoUb75NX3rr7WrL9taazuGhCpGAeDkOOoK6nNbcPd9ea3wqJdnrB3QlTAQjQaHv uAIhyBb8pqdsGUBbUJ1GvBMJDm22LpVbJdi/xMfxcFKEw56+qyPT7o3odV4a7QNQbRiw /B9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:mime-version:content-transfer-encoding; bh=eWOicZgj5Of+wSfvuMxF3nPebfxOImDgPFtP2TA8Hig=; b=cTQJwQF+KFvZu7AvV1tB6nsZEgQubyXOWqfBfeK+AYADGH8aMp/Z0A/SzYU8grX8X0 09pvFyXPBymFj3qz45KiMgVlfF7WMYoIV2OccuOhoh6w8u63sUGmEg7S+806OQl+f3rv 9kPP1iZ7jqzFQIZUbI+HfaIDDyqP6WMhM+srm7z3qmvRE3Bz5/KEMEvmWi047RY56N3d ByNskzrOdcx3nl9L2n0hNj8tTDATxeYpzPUl5xpO28auvnyfgQZvJ/2qT5tbrRVThvon PFUTywFoq1OMiguUbuQVGP1GQGVQTwZi+wTIC4ebdiv7Y61GZrV9q3nRBpkDmzq+abVR scDg== X-Gm-Message-State: APjAAAWglMlfp0YmsvOqHDvAxzbj7DlQ1T9kzeGTkZWK3ivFA9F3sP+Q z8adfYWR1F5YHhhIgr9Uifab4w== X-Received: by 2002:a0c:8c0b:: with SMTP id n11mr26275353qvb.66.1567629741114; Wed, 04 Sep 2019 13:42:21 -0700 (PDT) Received: from dhcp-41-57.bos.redhat.com (nat-pool-bos-t.redhat.com. [66.187.233.206]) by smtp.gmail.com with ESMTPSA id z200sm87656qkb.5.2019.09.04.13.42.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Sep 2019 13:42:20 -0700 (PDT) Message-ID: <1567629737.5576.87.camel@lca.pw> Subject: Re: [PATCH] net/skbuff: silence warnings under memory pressure From: Qian Cai To: Sergey Senozhatsky Cc: Sergey Senozhatsky , Michal Hocko , Eric Dumazet , davem@davemloft.net, netdev@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Petr Mladek , Steven Rostedt Date: Wed, 04 Sep 2019 16:42:17 -0400 In-Reply-To: <20190904144850.GA8296@tigerII.localdomain> References: <20190903132231.GC18939@dhcp22.suse.cz> <1567525342.5576.60.camel@lca.pw> <20190903185305.GA14028@dhcp22.suse.cz> <1567546948.5576.68.camel@lca.pw> <20190904061501.GB3838@dhcp22.suse.cz> <20190904064144.GA5487@jagdpanzerIV> <20190904065455.GE3838@dhcp22.suse.cz> <20190904071911.GB11968@jagdpanzerIV> <20190904074312.GA25744@jagdpanzerIV> <1567599263.5576.72.camel@lca.pw> <20190904144850.GA8296@tigerII.localdomain> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.22.6 (3.22.6-10.el7) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2019-09-04 at 23:48 +0900, Sergey Senozhatsky wrote: > On (09/04/19 08:14), Qian Cai wrote: > > > Plus one more check - waitqueue_active(&log_wait). printk() adds > > > pending irq_work only if there is a user-space process sleeping on > > > log_wait and irq_work is not already scheduled. If the syslog is > > > active or there is noone to wakeup then we don't queue irq_work. > > > > Another possibility for this potential livelock is that those printk() from > > warn_alloc(), dump_stack() and show_mem() increase the time it needs to > > process > > build_skb() allocation failures significantly under memory pressure. As the > > result, ksoftirqd() could be rescheduled during that time via a different > > CPU > > (this is a large x86 NUMA system anyway), > > > > [83605.577256][   C31]  run_ksoftirqd+0x1f/0x40 > > [83605.577256][   C31]  smpboot_thread_fn+0x255/0x440 > > [83605.577256][   C31]  kthread+0x1df/0x200 > > [83605.577256][   C31]  ret_from_fork+0x35/0x40 > > Hum hum hum... > > So I can, _probably_, think of several patches. > > First, move wake_up_klogd() back to console_unlock(). > > Second, move `printk_pending' out of per-CPU region and make it global. > So we will have just one printk irq_work scheduled across all CPUs; > currently we have one irq_work per CPU. I think I sent a patch a long > long time ago, but we never discussed it, as far as I remember. > > > In addition, those printk() will deal with console drivers or even a > > networking > > console, so it is probably not unusual that it could call irq_exit()- > > __do_softirq() at one point and then this livelock. > > Do you use netcon? Because this, theoretically, can open up one more > vector. netcon allocates skbs from ->write() path. We call con drivers' > ->write() from printk_safe context, so should netcon skb allocation > warn we will scedule one more irq_work on that CPU to flush per-CPU > printk_safe buffer. > > If this is the case, then we can stop calling console_driver() under > printk_safe. I sent a patch a while ago, but we agreed to keep the > things the way they are, fot the time being. > > Let me think more. To summary, those look to me are all good long-term improvement that would reduce the likelihood of this kind of livelock in general especially for other unknown allocations that happen while processing softirqs, but it is still up to the air if it fixes it 100% in all situations as printk() is going to take more time and could deal with console hardware that involve irq_exit() anyway. On the other hand, adding __GPF_NOWARN in the build_skb() allocation will fix this known NET_TX_SOFTIRQ case which is common when softirqd involved at least in short-term. It even have a benefit to reduce the overall warn_alloc() noise out there. I can resubmit with an update changelog. Does it make any sense?