Date: Wed, 24 Jan 2018 01:12:21 +0900
From: Sergey Senozhatsky
To: Tejun Heo
Cc: Steven Rostedt, Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek,
    akpm@linux-foundation.org, linux-mm@kvack.org, Cong Wang, Dave Hansen,
    Johannes Weiner, Mel Gorman, Michal Hocko, Vlastimil Babka,
    Peter Zijlstra, Linus Torvalds, Jan Kara, Mathieu Desnoyers,
    Tetsuo Handa, rostedt@home.goodmis.org, Byungchul Park, Pavel Machek,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Message-ID: <20180123161221.GD429@tigerII.localdomain>
References: <20180117134201.0a9cbbbf@gandalf.local.home>
 <20180119132052.02b89626@gandalf.local.home>
 <20180120071402.GB8371@jagdpanzerIV>
 <20180120104931.1942483e@gandalf.local.home>
 <20180121141521.GA429@tigerII.localdomain>
 <20180123064023.GA492@jagdpanzerIV>
 <20180123095652.5e14da85@gandalf.local.home>
 <20180123152130.GB429@tigerII.localdomain>
 <20180123104121.2ef96d81@gandalf.local.home>
 <20180123154347.GE1771050@devbig577.frc2.facebook.com>
In-Reply-To: <20180123154347.GE1771050@devbig577.frc2.facebook.com>
User-Agent: Mutt/1.9.2 (2017-12-15)

Hello, Tejun

On (01/23/18 07:43), Tejun Heo wrote:
> Hello, Steven.
>
> On Tue, Jan 23, 2018 at 10:41:21AM -0500, Steven Rostedt wrote:
> > > I don't want to have heuristics in print_safe, I don't want to have a magic
> > > number controlled by a user-space visible knob, I don't want to have the
> > > first 3 lines of a lockdep splat.
> >
> > We can have more. But if printk is causing printks, that's a major bug.
> > And work queues are not going to fix it, it will just spread out the
> > pain. Have it be 100 printks, it needs to be fixed if it is happening.
> > And having all printks just generate more printks is not helpful. Even
> > if we slow them down. They will still never end.
>
> So, at least in the case that we were seeing, it isn't that black and
> white. printk keeps causing printks but only because printk buffer
> flushing is preventing the printk'ing context from making forward
> progress. The key problem there is that a flushing context may get
> pinned flushing indefinitely and using a separate context does solve
> the problem.

Would you, as the original bug reporter, be OK if we flush printk_safe
(only printk_safe, not printk_nmi for the time being) via WQ? This
should move that "uncontrolled" flush to a safe context. I don't think
we can easily add kthread offloading to printk at the moment (this will
result in a massive gun fight).

Just in case, below is something like a patch. I think I worked around
the possible wq deadlock scenario. But I haven't tested the patch yet.
It's a bit late here and I guess I need some rest. Will try to look
more at it tomorrow.
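
The core of the workaround, stripped of the printk_safe details, is a
two-stage deferral: the printk-unsafe path only queues an irq_work, and
it is the irq_work callback that queues the WQ work, because pool->lock
is only ever taken with local IRQs disabled, so a CPU that is running an
irq_work callback cannot be holding it. Roughly, as a stand-alone toy
module (untested, all names here are made up for illustration; the real
change is the diff that follows):

/* Illustrative sketch only -- not part of the patch below. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/irq_work.h>
#include <linux/workqueue.h>
#include <linux/printk.h>

static struct irq_work deferred_irq_work;	/* stage 1: queueable from any context */
static struct work_struct deferred_wq_work;	/* stage 2: does the actual flush */

static void deferred_wq_work_fn(struct work_struct *work)
{
	/* Process context: the safe place to do the heavy flushing. */
	pr_info("deferred flush runs here\n");
}

static void deferred_irq_work_fn(struct irq_work *work)
{
	/*
	 * Hard-IRQ context. pool->lock is only taken with local IRQs
	 * disabled, so if this callback is running, this CPU is not
	 * holding pool->lock and queue_work() cannot deadlock on it.
	 */
	queue_work(system_wq, &deferred_wq_work);
}

/* What the printk-unsafe caller does: nothing but queue the irq_work. */
static void request_deferred_flush(void)
{
	irq_work_queue(&deferred_irq_work);
}

static int __init deferral_demo_init(void)
{
	init_irq_work(&deferred_irq_work, deferred_irq_work_fn);
	INIT_WORK(&deferred_wq_work, deferred_wq_work_fn);
	request_deferred_flush();
	return 0;
}

static void __exit deferral_demo_exit(void)
{
	irq_work_sync(&deferred_irq_work);
	cancel_work_sync(&deferred_wq_work);
}

module_init(deferral_demo_init);
module_exit(deferral_demo_exit);
MODULE_LICENSE("GPL");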

From: Sergey Senozhatsky
Subject: [PATCH] printk/safe: split flush works

---
 kernel/printk/printk_safe.c | 75 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 63 insertions(+), 12 deletions(-)

diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c2004bb23..54bc40ce3c34 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -22,6 +22,7 @@
 #include <linux/cpumask.h>
 #include <linux/irq_work.h>
 #include <linux/printk.h>
+#include <linux/workqueue.h>
 
 #include "internal.h"
 
@@ -49,7 +50,10 @@ static int printk_safe_irq_ready __read_mostly;
 
 struct printk_safe_seq_buf {
 	atomic_t		len;	/* length of written data */
 	atomic_t		message_lost;
-	struct irq_work		work;	/* IRQ work that flushes the buffer */
+	/* IRQ work that flushes NMI buffer */
+	struct irq_work		irq_flush_work;
+	/* WQ work that flushes SAFE buffer */
+	struct work_struct	wq_flush_work;
 	unsigned char		buffer[SAFE_LOG_BUF_LEN];
 };
@@ -61,10 +65,18 @@ static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
 #endif
 
 /* Get flushed in a more safe context. */
-static void queue_flush_work(struct printk_safe_seq_buf *s)
+static void queue_irq_flush_work(struct printk_safe_seq_buf *s)
 {
 	if (printk_safe_irq_ready)
-		irq_work_queue(&s->work);
+		irq_work_queue(&s->irq_flush_work);
+}
+
+static void queue_wq_flush_work(struct printk_safe_seq_buf *s)
+{
+	if (printk_safe_irq_ready)
+		queue_work_on(smp_processor_id(),
+			      system_wq,
+			      &s->wq_flush_work);
 }
 
 /*
@@ -89,7 +101,7 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	/* The trailing '\0' is not counted into len. */
 	if (len >= sizeof(s->buffer) - 1) {
 		atomic_inc(&s->message_lost);
-		queue_flush_work(s);
+		queue_irq_flush_work(s);
 		return 0;
 	}
 
@@ -112,7 +124,7 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	if (atomic_cmpxchg(&s->len, len, len + add) != len)
 		goto again;
 
-	queue_flush_work(s);
+	queue_irq_flush_work(s);
 	return add;
 }
 
@@ -186,12 +198,10 @@ static void report_message_lost(struct printk_safe_seq_buf *s)
  * Flush data from the associated per-CPU buffer. The function
  * can be called either via IRQ work or independently.
  */
-static void __printk_safe_flush(struct irq_work *work)
+static void __printk_safe_flush(struct printk_safe_seq_buf *s)
 {
 	static raw_spinlock_t read_lock =
 		__RAW_SPIN_LOCK_INITIALIZER(read_lock);
-	struct printk_safe_seq_buf *s =
-		container_of(work, struct printk_safe_seq_buf, work);
 	unsigned long flags;
 	size_t len;
 	int i;
@@ -243,6 +253,46 @@ static void __printk_safe_flush(struct irq_work *work)
 	raw_spin_unlock_irqrestore(&read_lock, flags);
 }
 
+static void irq_flush_work_fn(struct irq_work *work)
+{
+	struct printk_safe_seq_buf *s =
+		container_of(work, struct printk_safe_seq_buf, irq_flush_work);
+
+	__printk_safe_flush(s);
+}
+
+/*
+ * We can't queue wq work directly from vprintk_safe(), because we can
+ * deadlock. For instance:
+ *
+ *	queue_work()
+ *	 spin_lock(pool->lock)
+ *	  printk()
+ *	   call_console_drivers()
+ *	    vprintk_safe()
+ *	     queue_work()
+ *	      spin_lock(pool->lock)
+ *
+ * So we use irq_work, from which we queue wq work. WQ disables local IRQs
+ * while it works with pool, so if we have irq_work on that CPU then we can
+ * expect that pool->lock is not locked.
+ */
+static void irq_to_wq_flush_work_fn(struct irq_work *work)
+{
+	struct printk_safe_seq_buf *s =
+		container_of(work, struct printk_safe_seq_buf, irq_flush_work);
+
+	queue_wq_flush_work(s);
+}
+
+static void wq_flush_work_fn(struct work_struct *work)
+{
+	struct printk_safe_seq_buf *s =
+		container_of(work, struct printk_safe_seq_buf, wq_flush_work);
+
+	__printk_safe_flush(s);
+}
+
 /**
  * printk_safe_flush - flush all per-cpu nmi buffers.
  *
@@ -256,9 +306,9 @@ void printk_safe_flush(void)
 
 	for_each_possible_cpu(cpu) {
 #ifdef CONFIG_PRINTK_NMI
-		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
+		__printk_safe_flush(per_cpu_ptr(&nmi_print_seq, cpu));
 #endif
-		__printk_safe_flush(&per_cpu(safe_print_seq, cpu).work);
+		__printk_safe_flush(per_cpu_ptr(&safe_print_seq, cpu));
 	}
 }
 
@@ -387,11 +437,12 @@ void __init printk_safe_init(void)
 		struct printk_safe_seq_buf *s;
 
 		s = &per_cpu(safe_print_seq, cpu);
-		init_irq_work(&s->work, __printk_safe_flush);
+		init_irq_work(&s->irq_flush_work, irq_to_wq_flush_work_fn);
+		INIT_WORK(&s->wq_flush_work, wq_flush_work_fn);
 
 #ifdef CONFIG_PRINTK_NMI
 		s = &per_cpu(nmi_print_seq, cpu);
-		init_irq_work(&s->work, __printk_safe_flush);
+		init_irq_work(&s->irq_flush_work, irq_flush_work_fn);
 #endif
 	}
 
--
2.16.1