Date: Tue, 22 Nov 2016 13:45:16 +0100
From: Petr Mladek
To: Daniel Thompson
Cc: Jason Wessel, Peter Zijlstra, Andrew Morton, Sergey Senozhatsky, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] kdb: Call vkdb_printf() from vprintk_default() only when wanted
Message-ID: <20161122124516.GD8220@pathway.suse.cz>
References: <1477054235-1624-1-git-send-email-pmladek@suse.com> <1477054235-1624-3-git-send-email-pmladek@suse.com> <73b8fe23-4fc7-9d56-ed78-a3d6b398ad74@linaro.org>
In-Reply-To: <73b8fe23-4fc7-9d56-ed78-a3d6b398ad74@linaro.org>

On Mon 2016-11-07 10:24:22, Daniel Thompson wrote:
> On 21/10/16 13:50, Petr Mladek wrote:
> >kdb_trap_printk allows passing normal printk() messages to kdb via
> >vkdb_printf(). For example, it is used to get backtrace using
> >the classic show_stack(), see kdb_show_stack().
> >
> >vkdb_printf() tries to avoid a potential infinite loop by disabling
> >the trap. But this approach is racy, for example:
> >
> >CPU1					CPU2
> >
> >vkdb_printf()
> >  // assume that kdb_trap_printk == 0
> >  saved_trap_printk = kdb_trap_printk;
> >  kdb_trap_printk = 0;
> >
> >					kdb_show_stack()
> >					  kdb_trap_printk++;
>
> When kdb is running any of the commands that use kdb_trap_printk
> there is a single active CPU and the other CPUs should be in a
> holding pen inside kgdb_cpu_enter().
>
> The only time this is violated is when there is a timeout waiting
> for the other CPUs to report to the holding pen.

It means that the race window is small but it is there.
Do I get it correctly, please?

Thanks a lot for the explanation. I was not sure how exactly
this worked. I only saw the games with kdb_printf_cpu
in vkdb_printf(). Therefore I expected that some parallelism
was possible.

> >diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> >index d5e397315473..db73e33811e7 100644
> >--- a/kernel/printk/printk.c
> >+++ b/kernel/printk/printk.c
> >@@ -1941,7 +1941,9 @@ int vprintk_default(const char *fmt, va_list args)
> > 	int r;
> >
> > #ifdef CONFIG_KGDB_KDB
> >-	if (unlikely(kdb_trap_printk)) {
> >+	/* Allow to pass printk() to kdb but avoid a recursion. */
> >+	if (unlikely(kdb_trap_printk &&
> >+		     kdb_printf_cpu != smp_processor_id())) {
>
> Firstly, why !=?
>
> Secondly, if kdb_trap_printk is set and the "wrong" CPU calls printk
> then we have an opportunity to trap a rogue processor in the holding
> pen meaning the test should probably be part of vkdb_printf()
> anyway.

I agree that it is confusing:

On one hand, vkdb_printf() explicitly allows recursion on the same CPU.
See the handling of kdb_printf_lock before the 1st patch from this
series. It is also mentioned in the comment:

	/* Serialize kdb_printf if multiple cpus try to write at once.
	 * But if any cpu goes recursive in kdb, just print the output,
	 * even if it is interleaved with any other text.
	 */

On the other hand, the lines

	saved_trap_printk = kdb_trap_printk;
	kdb_trap_printk = 0;

mean that someone wanted to explicitly disable recursion via
the generic printk().

This is the reason why I used "!=" and why I added this check
into vprintk_default().

In other words, we allow recursion caused by kdb internal messages
that are printed directly by kdb_printf(). But we disable recursion
caused by all other messages that are printed using the generic
printk(). This patch keeps the logic.
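
To make the rule above concrete, here is a minimal user-space sketch.
It is not kernel code: model_vprintk_default(), model_vkdb_printf(),
current_cpu and the two counters are hypothetical stand-ins, and
the initial value -1 for kdb_printf_cpu is an assumption. Only
the condition itself is taken from the hunk quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for the kernel state. */
static int kdb_trap_printk;		/* non-zero: printk() is passed to kdb */
static int kdb_printf_cpu = -1;		/* CPU inside vkdb_printf(); -1 is assumed to mean "none" */
static int current_cpu;			/* stand-in for smp_processor_id() */

static int trapped;			/* messages routed to kdb */
static int regular;			/* messages that took the normal printk path */

static void model_vprintk_default(const char *msg);

/* Model of vkdb_printf(): remembers the owning CPU while it prints. */
static void model_vkdb_printf(const char *msg, bool nested_printk)
{
	int saved_cpu = kdb_printf_cpu;

	assert(msg != NULL);
	kdb_printf_cpu = current_cpu;
	trapped++;
	printf("kdb: %s\n", msg);

	/* Simulate a generic printk() called while kdb is printing. */
	if (nested_printk)
		model_vprintk_default("nested message");

	kdb_printf_cpu = saved_cpu;
}

/* Model of vprintk_default() with the check from the patch. */
static void model_vprintk_default(const char *msg)
{
	/* Pass the message to kdb, but not when this CPU is already in kdb output. */
	if (kdb_trap_printk && kdb_printf_cpu != current_cpu) {
		model_vkdb_printf(msg, true);
		return;
	}

	regular++;
	printf("log: %s\n", msg);
}
```

In this model, the first message is trapped and the nested one falls
through to the normal path. Without the "!=" test, the nested call
would enter model_vkdb_printf() again and never terminate.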

It might make some sense. But it is hard for me to judge.

Best Regards,
Petr