From: Petr Mladek
To: Jason Wessel
Cc: Daniel Thompson, Peter Zijlstra, Andrew Morton, Sergey Senozhatsky,
	linux-kernel@vger.kernel.org, Petr Mladek
Subject: [PATCH 1/2] kdb: Properly synchronize vkdb_printf() calls with other CPUs
Date: Fri, 21 Oct 2016 14:50:34 +0200
Message-Id: <1477054235-1624-2-git-send-email-pmladek@suse.com>
In-Reply-To: <1477054235-1624-1-git-send-email-pmladek@suse.com>
References: <1477054235-1624-1-git-send-email-pmladek@suse.com>

kdb_printf_lock does not prevent other CPUs from entering the critical
section because it is ignored when KDB_STATE_PRINTF_LOCK is set.

The problematic situation might look like:

CPU0                                    CPU1

vkdb_printf()
  if (!KDB_STATE(PRINTF_LOCK))
    KDB_STATE_SET(PRINTF_LOCK);
    spin_lock_irqsave(&kdb_printf_lock, flags);

                                        vkdb_printf()
                                          if (!KDB_STATE(PRINTF_LOCK))

BANG: The PRINTF_LOCK state is set and CPU1 is entering the critical
section without spinning on the lock.

The problem is that the code tries to implement locking using two state
variables that are not handled atomically. We still need custom locking
because we want to allow reentering the critical section on the very
same CPU.

Let's use the solution from Peter Zijlstra that was proposed for
a similar scenario, see
https://lkml.kernel.org/r/20161018171513.734367391@infradead.org

This patch uses the same trick with cmpxchg(). The only difference is
that we want to handle only recursion from the same context and
therefore we disable interrupts.

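For illustration only, and not part of the patch: a minimal userspace
sketch of the owner-tracking lock described above, written with C11
atomics instead of the kernel's cmpxchg() and smp_store_release(). The
names printf_owner, owner_lock() and owner_unlock() are hypothetical,
and the caller-supplied id stands in for smp_processor_id(). The sketch
does not model interrupt disabling, and it assumes acquire ordering is
enough on the compare-exchange, whereas the kernel's cmpxchg() is fully
ordered.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)

/* Hypothetical stand-in for kdb_printf_cpu: id of the current owner. */
static atomic_int printf_owner = NO_OWNER;

/*
 * Take the lock for "this_id" (a stand-in for smp_processor_id()).
 * Returns the previous owner: NO_OWNER for a first-level entry,
 * this_id for a nested entry by the same owner.
 */
static int owner_lock(int this_id)
{
	for (;;) {
		int old = NO_OWNER;

		/* Claim the lock only if nobody owns it. */
		if (atomic_compare_exchange_strong_explicit(&printf_owner,
				&old, this_id,
				memory_order_acquire, memory_order_relaxed))
			return NO_OWNER;

		/* On failure "old" holds the current owner. */
		if (old == this_id)
			return this_id;

		/* Someone else owns it: spin (the kernel calls cpu_relax()). */
	}
}

/* Restore the owner seen by the matching owner_lock() call. */
static void owner_unlock(int this_id, int old)
{
	assert(atomic_load_explicit(&printf_owner,
				    memory_order_relaxed) == this_id);
	atomic_store_explicit(&printf_owner, old, memory_order_release);
}
```

A first-level caller gets NO_OWNER back and restores it on release.
A nested caller with the same id gets its own id back, so its release
leaves the lock held for the outer caller.
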
In addition, we want to be on the safe side and update "kdb_event"
outside of the critical section. Therefore we need an extra barrier
before it gets decremented. Note that "kdb_event" is now always
incremented and decremented. This should not cause any harm because
the only thing that matters is whether it is zero or not.

Finally, KDB_STATE_PRINTF_LOCK is removed. In fact, we are not able to
set it in a non-racy way.

Signed-off-by: Petr Mladek
---
 kernel/debug/kdb/kdb_io.c      | 36 +++++++++++++++++-------------------
 kernel/debug/kdb/kdb_private.h |  1 -
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index fc1ef736253c..227b59ec7dbe 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -555,16 +555,16 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	int colcount;
 	int logging, saved_loglevel = 0;
 	int saved_trap_printk;
-	int got_printf_lock = 0;
 	int retlen = 0;
 	int fnd, len;
+	int this_cpu, old_cpu;
+	static int kdb_printf_cpu = -1;
 	char *cp, *cp2, *cphold = NULL, replaced_byte = ' ';
 	char *moreprompt = "more> ";
 	struct console *c = console_drivers;
-	static DEFINE_SPINLOCK(kdb_printf_lock);
 	unsigned long uninitialized_var(flags);

-	preempt_disable();
+	local_irq_save(flags);
 	saved_trap_printk = kdb_trap_printk;
 	kdb_trap_printk = 0;

@@ -572,13 +572,14 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	 * But if any cpu goes recursive in kdb, just print the output,
 	 * even if it is interleaved with any other text.
 	 */
-	if (!KDB_STATE(PRINTF_LOCK)) {
-		KDB_STATE_SET(PRINTF_LOCK);
-		spin_lock_irqsave(&kdb_printf_lock, flags);
-		got_printf_lock = 1;
-		atomic_inc(&kdb_event);
-	} else {
-		__acquire(kdb_printf_lock);
+	this_cpu = smp_processor_id();
+	atomic_inc(&kdb_event);
+	for (;;) {
+		old_cpu = cmpxchg(&kdb_printf_cpu, -1, this_cpu);
+		if (old_cpu == -1 || old_cpu == this_cpu)
+			break;
+
+		cpu_relax();
 	}

 	diag = kdbgetintenv("LINES", &linecount);
@@ -847,16 +848,13 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
 	suspend_grep = 0; /* end of what may have been a recursive call */
 	if (logging)
 		console_loglevel = saved_loglevel;
-	if (KDB_STATE(PRINTF_LOCK) && got_printf_lock) {
-		got_printf_lock = 0;
-		spin_unlock_irqrestore(&kdb_printf_lock, flags);
-		KDB_STATE_CLEAR(PRINTF_LOCK);
-		atomic_dec(&kdb_event);
-	} else {
-		__release(kdb_printf_lock);
-	}
+	/* kdb_printf_cpu locked the code above. */
+	smp_store_release(&kdb_printf_cpu, old_cpu);
+	/* Update kdb_event around the locked section. */
+	smp_mb__before_atomic();
+	atomic_dec(&kdb_event);
 	kdb_trap_printk = saved_trap_printk;
-	preempt_enable();
+	local_irq_restore(flags);
 	return retlen;
 }

diff --git a/kernel/debug/kdb/kdb_private.h b/kernel/debug/kdb/kdb_private.h
index 75014d7f4568..fc224fbcf954 100644
--- a/kernel/debug/kdb/kdb_private.h
+++ b/kernel/debug/kdb/kdb_private.h
@@ -132,7 +132,6 @@ extern int kdbgetaddrarg(int, const char **, int*, unsigned long *,
 #define KDB_STATE_PAGER		0x00000400	/* pager is available */
 #define KDB_STATE_GO_SWITCH	0x00000800	/* go is switching
 						 * back to initial cpu */
-#define KDB_STATE_PRINTF_LOCK	0x00001000	/* Holds kdb_printf lock */
 #define KDB_STATE_WAIT_IPI	0x00002000	/* Waiting for kdb_ipi() NMI */
 #define KDB_STATE_RECURSE	0x00004000	/* Recursive entry to kdb */
 #define KDB_STATE_IP_ADJUSTED	0x00008000	/* Restart IP has been
-- 
1.8.5.6