Date: Wed, 5 Aug 2015 10:22:52 +0200
From: Uwe Kleine-König
To: Guenter Roeck
Cc: Timo Kokkonen, linux-watchdog@vger.kernel.org, Jonathan Corbet, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Wim Van Sebroeck, kernel@pengutronix.de
Subject: Re: [PATCH 2/8] watchdog: Introduce hardware maximum timeout in watchdog core
Message-ID: <20150805082252.GX9999@pengutronix.de>
References: <1438654414-29259-1-git-send-email-linux@roeck-us.net> <1438654414-29259-3-git-send-email-linux@roeck-us.net> <20150804121816.GM9999@pengutronix.de> <55C0DADF.9050505@roeck-us.net> <20150804155220.GV9999@pengutronix.de> <55C0E24F.5020802@roeck-us.net>
In-Reply-To: <55C0E24F.5020802@roeck-us.net>

Hello Guenter,

On Tue, Aug 04, 2015 at 09:03:27AM -0700, Guenter Roeck wrote:
> On 08/04/2015 08:52 AM, Uwe Kleine-König wrote:
> >On Tue, Aug 04, 2015 at 08:31:43AM -0700, Guenter Roeck wrote:
> >>On 08/04/2015 05:18 AM, Uwe Kleine-König wrote:
> >>>On Mon, Aug 03, 2015 at 07:13:28PM -0700, Guenter Roeck wrote:
> >>>>structure.
> >>>>If the configured timeout exceeds half the value of the
> >>>>maximum hardware timeout, the watchdog core enables a timer function
> >>>>to assist sending keepalive requests to the watchdog driver.
> >>>I don't understand why you want to halve the maximum hw-timeout. If my
> >>>watchdog has hw-max-timeout = 5s and userspace sets it to 3s there
> >>>should be no need for assistance?! I think the implementation is the
> >>>other way round?
> >>>
> >>It is supposed to reflect the _maximum_ timeout. That is different to
> >>the time between heartbeats, which is supposed to be less; using half
> >>the value of the maximum hardware timeout seemed to be a safe number.
> >Right, I got that. With hw-max-timeout = 5s the machine resets after 5s
> >not caring for the device. And so pinging repeatedly after 2.5s is fine.
> >But if userspace sets a timeout of 3s (probably with the intention to
> >ping with a frequency of 1/1.5s) there is no need for worker-assistance,
> >because the pings coming in each 1.5s provided by userspace are good
> >enough.
> >
> Yes, that is how it is supposed to work.

So for the changelog you want:

	If the configured timeout exceeds the maximum hardware timeout
	the watchdog core enables a timer function ...

right?

> >>>>+static inline bool watchdog_need_worker(struct watchdog_device *wdd)
> >>>>+{
> >>>>+	unsigned int hm = wdd->max_hw_timeout_ms;
> >>>>+	unsigned int m = wdd->max_timeout * 1000;
> >>>>+
> >>>>+	return watchdog_active(wdd) && hm && hm != m &&
> >>>>+		wdd->timeout * 500 > hm;

One problem I see with the worker is that it will probably delay the
reset. Suppose userspace sets timeout = 10 s because something dangerous
can happen if the main application doesn't work for 12 s. (Consider a
guillotine where the blade can only be held up for 12 s when not locked.
:-) Now if the hw-max-timeout is 9 s you set up a timer to ping at
$last_keepalive + 4.5 s and $last_keepalive + 9 s (not taking timer and
system latency into account). That means the system only resets 18 s
after the last userspace ping. Oops.

So ideally you send the last auto-ping at
$last_keepalive + $configured_timeout - $hw-max-timeout (assuming the
hardware is configured for $hw-max-timeout).

> >>>I don't understand what max_timeout is now that there is max_hw_timeout.
> >>>So I don't understand why you need hm != m either.
> >>>
> >>
> >>Backward compatibility. A driver which does not set max_hw_timeout_ms,
> >>or sets both to the same value, by definition expects to handle everything
> >>internally, and thus no worker is configured.
> >And a driver that does
> >
> >	max_timeout = 5
> >	max_hw_timeout = 5125
> >
> >falls through the cracks.
> >
> Hmm - not that this configuration makes any sense, but you are right.
> I'll make it "hm < m".

It does not? What do you expect max_timeout to be set to if the maximal
hw-timeout is 5125 ms? 0 would work, but IMHO you need some more
documentation then.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
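
PS: The arithmetic behind the delayed-reset concern (timeout = 10 s, hw-max-timeout = 9 s) can be checked with a small standalone C sketch. This is not kernel code and not taken from the patch. The function names are made up for illustration, and the "naive" model simply assumes the worker pings every hw-max-timeout / 2 for as long as the configured timeout has not yet expired, which is my reading of the schedule described above.

```c
#include <assert.h>

/*
 * Assumed model of the worker as discussed above (hypothetical helper,
 * not the patch's code): ping every hw_max_ms / 2 after the last
 * userspace keepalive, as long as the next ping still falls within the
 * configured timeout. Returns the time (ms after the last userspace
 * keepalive) at which the hardware finally resets the machine.
 */
unsigned int reset_time_naive_ms(unsigned int timeout_ms,
				 unsigned int hw_max_ms)
{
	unsigned int interval = hw_max_ms / 2;
	unsigned int last_ping = 0;

	assert(interval > 0);		/* avoid an endless loop */
	assert(timeout_ms > hw_max_ms);	/* otherwise no worker is needed */

	while (last_ping + interval <= timeout_ms)
		last_ping += interval;

	/* The hardware fires hw_max_ms after the last ping it received. */
	return last_ping + hw_max_ms;
}

/*
 * Suggested alternative: place the last automatic ping at
 * configured timeout - hw-max-timeout after the last userspace keepalive.
 */
unsigned int last_ping_ideal_ms(unsigned int timeout_ms,
				unsigned int hw_max_ms)
{
	assert(timeout_ms > hw_max_ms);
	return timeout_ms - hw_max_ms;
}

/* With that last ping, the hardware resets exactly at the configured timeout. */
unsigned int reset_time_ideal_ms(unsigned int timeout_ms,
				 unsigned int hw_max_ms)
{
	return last_ping_ideal_ms(timeout_ms, hw_max_ms) + hw_max_ms;
}
```

For timeout_ms = 10000 and hw_max_ms = 9000 the naive schedule pings at 4500 ms and 9000 ms, so the reset only happens at 18000 ms. With the last ping placed at 1000 ms the reset happens at 10000 ms as requested. The sketch ignores timer and system latency, and it does not model the intermediate pings that are needed when the configured timeout is more than twice the hardware maximum.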