From: Thomas Gleixner
To: Waiman Long, Robert Richter
Cc: Andrew Morton, Ingo Molnar, "linux-kernel@vger.kernel.org", Mike Rapoport, Kees Cook, Catalin Marinas, Will Deacon, Peter Zijlstra
Subject: Re: [PATCH v2] watchdog: Fix possible soft lockup warning at bootup
In-Reply-To: <9ae2ee4d-7b67-50ff-e736-1d51753c5ccd@redhat.com>
Date: Thu, 16 Jan 2020 20:10:16 +0100
Message-ID: <87ftgffc9z.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Waiman Long writes:
> On 1/16/20 11:57 AM, Thomas Gleixner wrote:
>>> So your theory the MONOTONIC clock runs differently/wrongly could
>>> explain that (assuming this drives the sched clock). Though, I am
>> No. sched_clock() is separate. It uses a raw timestamp (in your case
>> from the ARM arch timer) and converts it to something which is close to
>> proper time. So my assumption was based on the printout Waiman had:
>>
>> [ 1... ] CPU.... watchdog_fn now 170000000
>> [ 25.. ] CPU.... watchdog_fn now 4170000000
>>
>> I assumed that now comes from ktime_get() or something like
>> that. Waiman?
>
> I printed out the now parameter of the __hrtimer_run_queues() call.

Yes. That's clock MONOTONIC.

> So from the timer perspective, it is losing time. For watchdog, the soft
> expiry time is 4s. The watchdog function won't be called until the
> timer's time advances 4s or more. That corresponds to about 24s in
> timestamp time for that particular class of systems.

Right. And assuming that the firmware call is the culprit, this has an
explanation.

Could you please take sched_clock() timestamps before and after the
firmware call which kicks the secondary CPUs into life to verify that?

The deltas should sum up to the amount of time which gets lost across
smp_init().

Thanks,

        tglx
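
The accounting requested above can be sketched in plain userspace C. This
is only an illustration of the arithmetic, not kernel code: the struct and
the function names are hypothetical, and the timestamps are assumed to be
nanosecond values of the kind sched_clock() returns, taken right before
and right after each firmware call that starts a secondary CPU.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One hypothetical measurement: sched_clock()-style nanosecond timestamps
 * taken immediately before and after the firmware call that brings up one
 * secondary CPU. Names are illustrative, not taken from the kernel.
 */
struct fw_call_sample {
	uint64_t before_ns;
	uint64_t after_ns;
};

/* Sum the time spent inside the firmware calls over all samples. */
uint64_t fw_time_total_ns(const struct fw_call_sample *s, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += s[i].after_ns - s[i].before_ns;
	return total;
}

/*
 * Time that clock MONOTONIC did not account for over an interval in which
 * the sched_clock()-based timestamps advanced by sched_elapsed_ns.
 */
uint64_t lost_ns(uint64_t sched_elapsed_ns, uint64_t mono_elapsed_ns)
{
	if (sched_elapsed_ns <= mono_elapsed_ns)
		return 0;
	return sched_elapsed_ns - mono_elapsed_ns;
}

/* Return 1 if the firmware time matches the lost time within tolerance_ns. */
int fw_explains_loss(uint64_t fw_total_ns, uint64_t lost, uint64_t tolerance_ns)
{
	uint64_t diff = fw_total_ns > lost ? fw_total_ns - lost
					   : lost - fw_total_ns;

	return diff <= tolerance_ns;
}
```

With the figures from this thread (about 24s of timestamp time against 4s
of clock MONOTONIC), lost_ns() yields roughly 20s. If the summed before and
after deltas come out close to that, the firmware call would be the likely
culprit. The tolerance is left to the caller because the thread gives no
figure for it.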