Subject: Re: [PATCH v2] debug: More properly delay for secondary CPUs
To: Douglas Anderson <dianders@chromium.org>,
        Jason Wessel <jason.wessel@windriver.com>
References: <1477091361-2039-1-git-send-email-dianders@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, briannorris@chromium.org,
        kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org
From: Daniel Thompson <daniel.thompson@linaro.org>
Message-ID: <edefd1a4-8ffd-aa49-b695-485febcdf306@linaro.org>
Date: Mon, 24 Oct 2016 10:15:44 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.3.0
MIME-Version: 1.0
In-Reply-To: <1477091361-2039-1-git-send-email-dianders@chromium.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2391
Lines: 71

On 22/10/16 00:09, Douglas Anderson wrote:
> We've got a delay loop waiting for secondary CPUs.  That loop uses
> loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
> many tight loops make up a jiffy on all architectures.  It is quite
> common to see things like this in the boot log:
>   Calibrating delay loop (skipped), value calculated using timer
>   frequency.. 48.00 BogoMIPS (lpj=24000)
>
> In my case I was seeing lots of cases where other CPUs timed out
> entering the debugger only to print their stack crawls shortly after the
> kdb> prompt was written.
>
> Elsewhere in kgdb we already use udelay(), so that should be safe enough
> to use to implement our timeout.  We'll delay 1 ms for 1000 times, which
> should give us a full second of delay (just like the old code wanted)
> but allow us to notice that we're done every 1 ms.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> Changes in v2:
> - Use udelay, not __delay
>
>  kernel/debug/debug_core.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 0874e2edd275..85a246feb442 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -61,6 +61,8 @@
>
>  #include "debug_core.h"
>
> +#define WAIT_CPUS_STOP_MS	1000
> +
>  static int kgdb_break_asap;
>
>  struct debuggerinfo_struct kgdb_info[NR_CPUS];
> @@ -598,11 +600,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
>  	/*
>  	 * Wait for the other CPUs to be notified and be waiting for us:
>  	 */
> -	time_left = loops_per_jiffy * HZ;
> +	time_left = WAIT_CPUS_STOP_MS;

Might be nit picking but a named constant used only one, with 500 lines 
between defn and use and with a slightly cryptic name doesn't make the 
code easier to read.

Perhaps just:

time_left = MSEC_PER_SEC;

>  	while (kgdb_do_roundup && --time_left &&
>  	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
>  		   online_cpus)
> -		cpu_relax();
> +		udelay(1000);

I guess mdelay(1) might be read better but I don't care especially much 
so with the first change and with your preference on the second:

Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>

Also I think this is arguably a regression (sorry) so it might also be 
worth adding:

Cc: stable@vger.kernel.org # v4.0+


Daniel.