Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S938708AbcJXJPu (ORCPT ); Mon, 24 Oct 2016 05:15:50 -0400 Received: from mail-lf0-f53.google.com ([209.85.215.53]:33925 "EHLO mail-lf0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S936395AbcJXJPt (ORCPT ); Mon, 24 Oct 2016 05:15:49 -0400 Subject: Re: [PATCH v2] debug: More properly delay for secondary CPUs To: Douglas Anderson , Jason Wessel References: <1477091361-2039-1-git-send-email-dianders@chromium.org> Cc: Andrew Morton , briannorris@chromium.org, kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org From: Daniel Thompson Message-ID: Date: Mon, 24 Oct 2016 10:15:44 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0 MIME-Version: 1.0 In-Reply-To: <1477091361-2039-1-git-send-email-dianders@chromium.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2391 Lines: 71 On 22/10/16 00:09, Douglas Anderson wrote: > We've got a delay loop waiting for secondary CPUs. That loop uses > loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how > many tight loops make up a jiffy on all architectures. It is quite > common to see things like this in the boot log: > Calibrating delay loop (skipped), value calculated using timer > frequency.. 48.00 BogoMIPS (lpj=24000) > > In my case I was seeing lots of cases where other CPUs timed out > entering the debugger only to print their stack crawls shortly after the > kdb> prompt was written. > > Elsewhere in kgdb we already use udelay(), so that should be safe enough > to use to implement our timeout. We'll delay 1 ms for 1000 times, which > should give us a full second of delay (just like the old code wanted) > but allow us to notice that we're done every 1 ms. > > Signed-off-by: Douglas Anderson > --- > Changes in v2: > - Use udelay, not __delay > > kernel/debug/debug_core.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c > index 0874e2edd275..85a246feb442 100644 > --- a/kernel/debug/debug_core.c > +++ b/kernel/debug/debug_core.c > @@ -61,6 +61,8 @@ > > #include "debug_core.h" > > +#define WAIT_CPUS_STOP_MS 1000 > + > static int kgdb_break_asap; > > struct debuggerinfo_struct kgdb_info[NR_CPUS]; > @@ -598,11 +600,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs, > /* > * Wait for the other CPUs to be notified and be waiting for us: > */ > - time_left = loops_per_jiffy * HZ; > + time_left = WAIT_CPUS_STOP_MS; Might be nit picking but a named constant used only one, with 500 lines between defn and use and with a slightly cryptic name doesn't make the code easier to read. Perhaps just: time_left = MSEC_PER_SEC; > while (kgdb_do_roundup && --time_left && > (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) != > online_cpus) > - cpu_relax(); > + udelay(1000); I guess mdelay(1) might be read better but I don't care especially much so with the first change and with your preference on the second: Reviewed-by: Daniel Thompson Also I think this is arguably a regression (sorry) so it might also be worth adding: Cc: stable@vger.kernel.org # v4.0+ Daniel.