2016-10-21 23:09:30

by Doug Anderson

[permalink] [raw]
Subject: [PATCH v2] debug: More properly delay for secondary CPUs

We've got a delay loop waiting for secondary CPUs. That loop uses
loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures. It is quite
common to see things like this in the boot log:
Calibrating delay loop (skipped), value calculated using timer
frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.

Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout. We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.

Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v2:
- Use udelay, not __delay

kernel/debug/debug_core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 0874e2edd275..85a246feb442 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -61,6 +61,8 @@

#include "debug_core.h"

+#define WAIT_CPUS_STOP_MS 1000
+
static int kgdb_break_asap;

struct debuggerinfo_struct kgdb_info[NR_CPUS];
@@ -598,11 +600,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
/*
* Wait for the other CPUs to be notified and be waiting for us:
*/
- time_left = loops_per_jiffy * HZ;
+ time_left = WAIT_CPUS_STOP_MS;
while (kgdb_do_roundup && --time_left &&
(atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
online_cpus)
- cpu_relax();
+ udelay(1000);
if (!time_left)
pr_crit("Timed out waiting for secondary CPUs.\n");

--
2.8.0.rc3.226.g39d4020


2016-10-24 09:15:50

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH v2] debug: More properly delay for secondary CPUs

On 22/10/16 00:09, Douglas Anderson wrote:
> We've got a delay loop waiting for secondary CPUs. That loop uses
> loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how
> many tight loops make up a jiffy on all architectures. It is quite
> common to see things like this in the boot log:
> Calibrating delay loop (skipped), value calculated using timer
> frequency.. 48.00 BogoMIPS (lpj=24000)
>
> In my case I was seeing lots of cases where other CPUs timed out
> entering the debugger only to print their stack crawls shortly after the
> kdb> prompt was written.
>
> Elsewhere in kgdb we already use udelay(), so that should be safe enough
> to use to implement our timeout. We'll delay 1 ms for 1000 times, which
> should give us a full second of delay (just like the old code wanted)
> but allow us to notice that we're done every 1 ms.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
> Changes in v2:
> - Use udelay, not __delay
>
> kernel/debug/debug_core.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 0874e2edd275..85a246feb442 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -61,6 +61,8 @@
>
> #include "debug_core.h"
>
> +#define WAIT_CPUS_STOP_MS 1000
> +
> static int kgdb_break_asap;
>
> struct debuggerinfo_struct kgdb_info[NR_CPUS];
> @@ -598,11 +600,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
> /*
> * Wait for the other CPUs to be notified and be waiting for us:
> */
> - time_left = loops_per_jiffy * HZ;
> + time_left = WAIT_CPUS_STOP_MS;

Might be nit picking but a named constant used only one, with 500 lines
between defn and use and with a slightly cryptic name doesn't make the
code easier to read.

Perhaps just:

time_left = MSEC_PER_SEC;

> while (kgdb_do_roundup && --time_left &&
> (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
> online_cpus)
> - cpu_relax();
> + udelay(1000);

I guess mdelay(1) might be read better but I don't care especially much
so with the first change and with your preference on the second:

Reviewed-by: Daniel Thompson <[email protected]>

Also I think this is arguably a regression (sorry) so it might also be
worth adding:

Cc: [email protected] # v4.0+


Daniel.