Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756113AbcJNTPL (ORCPT ); Fri, 14 Oct 2016 15:15:11 -0400 Received: from mail-pf0-f180.google.com ([209.85.192.180]:32855 "EHLO mail-pf0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755482AbcJNTPC (ORCPT ); Fri, 14 Oct 2016 15:15:02 -0400 From: Douglas Anderson To: Jason Wessel Cc: Daniel Thompson , Andrew Morton , briannorris@chromium.org, Douglas Anderson , kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org Subject: [PATCH] debug: More properly delay for secondary CPUs Date: Fri, 14 Oct 2016 11:41:21 -0700 Message-Id: <1476470481-4879-1-git-send-email-dianders@chromium.org> X-Mailer: git-send-email 2.8.0.rc3.226.g39d4020 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2132 Lines: 50 We've got a delay loop waiting for secondary CPUs. That loop uses loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how many tight loops make up a jiffy on all architectures. It is quite common to see things like this in the boot log: Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=24000) In my case I was seeing lots of cases where other CPUs timed out entering the debugging only to print their stack crawls shortly after the kdb> prompt was written. It appears that other code with similar loops (like __spin_lock_debug) adds an extra __delay(1) in there which makes it work better. Presumably the __delay(1) is very safe. At least on modern ARM/ARM64 systems it will just do a CP15 instruction, which should be safe. On older ARM systems it will fall back to an actual delay loop, or perhaps another memory-mapped timer. On other platforms it must be safe too or it wouldn't be used in __spin_lock_debug. Note that we use __delay(100) instead of __delay(1) so we can get a little closer to a more accurate timeout on systems where __delay() is backed by a timer. It's better to have a more accurate timeout and the only penalty is that we might wait an extra 99 "loops" before we enter the debugger. Signed-off-by: Douglas Anderson --- kernel/debug/debug_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c index 0874e2edd275..454150d98dbc 100644 --- a/kernel/debug/debug_core.c +++ b/kernel/debug/debug_core.c @@ -598,11 +598,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs, /* * Wait for the other CPUs to be notified and be waiting for us: */ - time_left = loops_per_jiffy * HZ; + time_left = DIV_ROUND_UP(loops_per_jiffy * HZ, 100); while (kgdb_do_roundup && --time_left && (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) != online_cpus) - cpu_relax(); + __delay(100); if (!time_left) pr_crit("Timed out waiting for secondary CPUs.\n"); -- 2.8.0.rc3.226.g39d4020