From: Douglas Anderson <dianders@chromium.org>
To: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
        Andrew Morton <akpm@linux-foundation.org>, briannorris@chromium.org,
        Douglas Anderson <dianders@chromium.org>,
        kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: [PATCH] debug: More properly delay for secondary CPUs
Date: Fri, 14 Oct 2016 11:41:21 -0700
Message-Id: <1476470481-4879-1-git-send-email-dianders@chromium.org>
Sender: linux-kernel-owner@vger.kernel.org

We've got a delay loop waiting for secondary CPUs.  That loop uses
loops_per_jiffy.  However, on some architectures loops_per_jiffy is
not the number of tight loop iterations that fit in a jiffy.  It is
quite common to see things like this in the boot log:
  Calibrating delay loop (skipped), value calculated using timer
  frequency.. 48.00 BogoMIPS (lpj=24000)

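When loops_per_jiffy is derived from a timer frequency like this, a
tight cpu_relax() loop can burn through loops_per_jiffy * HZ iterations
in well under the intended one second.  In my case I was seeing lots of
cases where other CPUs timed out entering the debugger, only to print
their stack crawls shortly after the kdb> prompt was written.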

It appears that other code with similar loops (like __spin_lock_debug)
adds an extra __delay(1) to each iteration, which makes it work better.
Presumably the __delay(1) is very safe.  At least on modern ARM/ARM64
systems it will just do a CP15 instruction, which should be safe.  On
older ARM systems it will fall back to an actual delay loop, or perhaps
another memory-mapped timer.  On other platforms it must be safe too or
it wouldn't be used in __spin_lock_debug.

Note that we use __delay(100) instead of __delay(1) so that the
__delay() calls, rather than the loop overhead around them, dominate
each iteration.  That gets us closer to an accurate timeout on systems
where __delay() is backed by a timer.  The only penalty is that we
might wait an extra 99 "loops" before we enter the debugger.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
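For reference, here is a rough user-space sketch of the timeout
arithmetic described above.  It assumes HZ=1000 (not stated in the boot
log, but consistent with 48.00 BogoMIPS at lpj=24000) and that a
timer-backed __delay(n) waits for about n timer ticks.  The helper
names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Same rounding as the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: initial loop budget before the patch, where each
 * iteration only calls cpu_relax().
 */
static uint64_t old_iterations(uint64_t lpj, uint64_t hz)
{
	return lpj * hz;
}

/*
 * Hypothetical helper: initial loop budget after the patch, where each
 * iteration calls __delay(step).
 */
static uint64_t new_iterations(uint64_t lpj, uint64_t hz, uint64_t step)
{
	assert(step > 0);
	return DIV_ROUND_UP(lpj * hz, step);
}

/*
 * Hypothetical helper: total delay units requested over the whole loop
 * after the patch.  Under the assumption above, on a timer-backed
 * __delay() this is roughly the number of timer ticks waited.
 */
static uint64_t new_total_delay_units(uint64_t lpj, uint64_t hz, uint64_t step)
{
	return new_iterations(lpj, hz, step) * step;
}
```

With lpj=24000 and the assumed HZ=1000, the old budget is 24,000,000
cpu_relax() iterations, which only amounts to one second if each
iteration takes a full timer tick.  The new budget is 240,000
iterations of __delay(100), or about 24,000,000 ticks in total, which
is roughly one second if the timer runs at lpj * HZ ticks per second.
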
 kernel/debug/debug_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 0874e2edd275..454150d98dbc 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -598,11 +598,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 	/*
 	 * Wait for the other CPUs to be notified and be waiting for us:
 	 */
-	time_left = loops_per_jiffy * HZ;
+	time_left = DIV_ROUND_UP(loops_per_jiffy * HZ, 100);
 	while (kgdb_do_roundup && --time_left &&
 	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
 		   online_cpus)
-		cpu_relax();
+		__delay(100);
 	if (!time_left)
 		pr_crit("Timed out waiting for secondary CPUs.\n");
 
-- 
2.8.0.rc3.226.g39d4020