Date: Wed, 17 Jan 2018 12:12:51 -0500
From: Steven Rostedt
To: Tejun Heo
Cc: Petr Mladek, Sergey Senozhatsky, Sergey Senozhatsky,
    akpm@linux-foundation.org, linux-mm@kvack.org, Cong Wang, Dave Hansen,
    Johannes Weiner, Mel Gorman, Michal Hocko, Vlastimil Babka,
    Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
    Tetsuo Handa, rostedt@home.goodmis.org, Byungchul Park, Pavel Machek,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Message-ID: <20180117121251.7283a56e@gandalf.local.home>
In-Reply-To: <20180117151509.GT3460072@devbig577.frc2.facebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 17 Jan 2018 07:15:09 -0800
Tejun Heo wrote:

> It's great that Steven's patches solve a good number of problems.  It
> is also true that there's a class of problems that it doesn't solve,
> which other approaches do.  The productive thing to do here is trying
> to solve the unsolved one too, especially given that it doesn't seem
> too difficult to do so on top of what's proposed.

OK, let's talk about the other problems, as this is no longer related to
my patch.

From your previous email:

> 1. Console is IPMI emulated serial console.  Super slow.  Also
>    netconsole is in use.
> 2. System runs out of memory, OOM triggers.
> 3. OOM handler is printing out OOM debug info.
> 4. While trying to emit the messages for netconsole, the network stack
>    / driver tries to allocate memory and then fail, which in turn
>    triggers allocation failure or other warning messages.  printk was
>    already flushing, so the messages are queued on the ring.
> 5. OOM handler keeps flushing but 4 repeats and the queue is never
>    shrinking.  Because OOM handler is trapped in printk flushing, it
>    never manages to free memory and no one else can enter OOM path
>    either, so the system is trapped in this state.

From what I gathered, you said an OOM would trigger, then the network
console would fail to allocate memory and trigger a printk of its own,
causing an endless stream of printks.

This could very well be a great place to force offloading. If a printk is
called from within a printk, at the same context (normal, softirq, irq or
NMI), then we should trigger the offloading.

My ftrace ring buffer has a context-level recursion check; we could use
that, and even tie it into my previous patch.

Something like this (not compile tested or anything, and
kick_offload_thread() would still need to be implemented):

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9cb943c90d98..b80b23a0ca13 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2261,6 +2261,64 @@ static int have_callable_console(void)
 
 	return 0;
 }
+/*
+ * Tracks which context the printk is in.
+ *  NMI     = 0
+ *  IRQ     = 1
+ *  SOFTIRQ = 2
+ *  NORMAL  = 3
+ *
+ * Stack ordered, where the lower number can preempt
+ * the higher number: mask &= mask - 1, will only clear
+ * the lowest set bit.
+ */
+enum {
+	CTX_NMI,
+	CTX_IRQ,
+	CTX_SOFTIRQ,
+	CTX_NORMAL,
+};
+
+static DEFINE_PER_CPU(int, recursion_bits);
+
+static bool recursion_check_start(void)
+{
+	unsigned long pc = preempt_count();
+	int val = this_cpu_read(recursion_bits);
+	int bit;
+
+	if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
+		bit = CTX_NORMAL;
+	else
+		bit = pc & NMI_MASK ? CTX_NMI :
+			pc & HARDIRQ_MASK ? CTX_IRQ : CTX_SOFTIRQ;
+
+	if (unlikely(val & (1 << bit)))
+		return true;
+
+	val |= (1 << bit);
+	this_cpu_write(recursion_bits, val);
+	return false;
+}
+
+static void recursion_check_finish(bool offload)
+{
+	int val = this_cpu_read(recursion_bits);
+
+	if (offload)
+		return;
+
+	val &= val - 1;
+	this_cpu_write(recursion_bits, val);
+}
+
+static void kick_offload_thread(void)
+{
+	/*
+	 * Consoles are triggering printks, offload the printks
+	 * to another CPU to hopefully avoid a lockup.
+	 */
+}
 
 /*
  * Can we actually use the console at this time on this cpu?
@@ -2333,6 +2391,7 @@ void console_unlock(void)
 
 	for (;;) {
 		struct printk_log *msg;
+		bool offload;
 		size_t ext_len = 0;
 		size_t len;
 
@@ -2393,15 +2452,20 @@ void console_unlock(void)
 		 * waiter waiting to take over.
 		 */
 		console_lock_spinning_enable();
+		offload = recursion_check_start();
 
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(ext_text, ext_len, text, len);
 		start_critical_timings();
 
+		recursion_check_finish(offload);
+
 		if (console_lock_spinning_disable_and_check()) {
 			printk_safe_exit_irqrestore(flags);
 			return;
 		}
 
+		if (offload)
+			kick_offload_thread();
 		printk_safe_exit_irqrestore(flags);
 
-- Steve
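
The per-context recursion check proposed above can be tried out in user
space. What follows is only a minimal sketch under stated assumptions: the
per-CPU variable is replaced by a single plain int, and the context level is
passed in as an argument (a hypothetical simplification) instead of being
derived from preempt_count(). It is not the kernel code, just the same bit
trick in isolation.

```c
#include <stdbool.h>

/* Context levels, ordered so that a lower number can preempt a higher one. */
enum { CTX_NMI, CTX_IRQ, CTX_SOFTIRQ, CTX_NORMAL };

/* Assumption: one plain int stands in for the per-CPU recursion_bits. */
static int recursion_bits;

/*
 * Mark context `ctx` as active. Returns true if it was already active,
 * meaning we re-entered at the same context level (recursion).
 */
static bool recursion_check_start(int ctx)
{
	if (recursion_bits & (1 << ctx))
		return true;

	recursion_bits |= 1 << ctx;
	return false;
}

/*
 * Leave the innermost active context. Because lower-numbered contexts
 * preempt higher-numbered ones, the innermost context always owns the
 * lowest set bit, and val & (val - 1) clears exactly that bit.
 * A recursed entry never set a bit, so it must not clear one.
 */
static void recursion_check_finish(bool recursed)
{
	if (recursed)
		return;

	recursion_bits &= recursion_bits - 1;
}
```

Entering CTX_NORMAL, then CTX_IRQ, then CTX_IRQ again reports recursion only
on the third call, and unwinding in reverse order leaves recursion_bits at
zero.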