Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp2045527imm; Sun, 23 Sep 2018 18:48:13 -0700 (PDT) X-Google-Smtp-Source: ACcGV626iz7undW73nguLcfgNcXX0nDTO4snTxUXPCGaVtq+CQFgfqIHurgtPO+WEf7yvmxWVSDC X-Received: by 2002:a65:6086:: with SMTP id t6-v6mr7596798pgu.424.1537753693925; Sun, 23 Sep 2018 18:48:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1537753693; cv=none; d=google.com; s=arc-20160816; b=FekItj2O09E3As6U29w7qA4nrvblZrU6rFOpYdvRkhrDELXNgGk9Jwi084ktXoXEZx QtDKKPIIqBlHcdFdJj/UvGqYGtlQ8JZMObIgNXtqgNcgC/xlMyayGM5bcaSr8CrWXFRd Kmj/QRTfMOTMuVa7j7eq2I+8ViCxEt7sXoCHo02viazsT7SpPVLL57ZM06n+WpJJ1VJ6 yfqRUv3RWXqgp0yRHsn8zugkAVN7oxwMKpYmcBZSi/5Im/EGvNYh+jHJCdsYh1s5jZMV jnmaEQ5mIodl3hPnsQtxspdnfrdI7gC7RfASZzEukSgpTFkSD2OE6VAXiTmmnPjIkq26 hMMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:reply-to:message-id :subject:cc:to:from:date; bh=knRhRwoZBPllDm1PZgEU45DYp6mnQHSMkR0Xc0BWCwE=; b=Zv3aQ89DYj7ckJlntCDKP3YXzGcb2piCORXcFILEgrnKZVzmCVlTIhZ8AiRd1QOabt eR8Oqo+wTo6bswpK5vJ48NiY08c7VjVsh+owWlBUGx9FmqE6at7y4pcW7kFb+WKjo3U0 yRc6qInfjZHD9GeOWSfT0+kU6/y1e4fNo1hAVu6qhKS7d8ROTrg+Cw3alL/f3kx6BBVD MVhYklQJBtzxOzL+mAfmnpjhWwBbUtOSFAAL5S93+BvLQk0rvZ2PNnFXTFLYZhuw8nBC nCSWCZAvNq9mD4AhmL8GWifMauFJjuXWr4HgwWIDXvhPAGt3dPoMucT487KsbO0Gcw6p 1/Ag== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r25-v6si34995644pga.489.2018.09.23.18.47.57; Sun, 23 Sep 2018 18:48:13 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727338AbeIXHri (ORCPT + 99 others); Mon, 24 Sep 2018 03:47:38 -0400 Received: from g9t5009.houston.hpe.com ([15.241.48.73]:37482 "EHLO g9t5009.houston.hpe.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726053AbeIXHri (ORCPT ); Mon, 24 Sep 2018 03:47:38 -0400 Received: from g9t2301.houston.hpecorp.net (g9t2301.houston.hpecorp.net [16.220.97.129]) by g9t5009.houston.hpe.com (Postfix) with ESMTP id 7EAAD28B; Mon, 24 Sep 2018 01:47:51 +0000 (UTC) Received: from anatevka.americas.hpqcorp.net (anatevka.americas.hpqcorp.net [10.34.81.6]) by g9t2301.houston.hpecorp.net (Postfix) with ESMTP id 002FF4B; Mon, 24 Sep 2018 01:47:50 +0000 (UTC) Date: Sun, 23 Sep 2018 19:47:50 -0600 From: Jerry Hoemann To: Shuah Khan Cc: erosca@de.adit-jv.com, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] selftests: watchdog: Add gettimeout and get|set pretimeout Message-ID: <20180924014750.GA22296@anatevka.americas.hpqcorp.net> Reply-To: Jerry.Hoemann@hpe.com References: <1537570526-65241-1-git-send-email-jerry.hoemann@hpe.com> <0c6376ec-a3f4-7798-8f30-829480c41f79@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0c6376ec-a3f4-7798-8f30-829480c41f79@kernel.org> User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Sep 21, 2018 at 05:42:00PM -0600, Shuah Khan wrote: > Hi Jerry, > > Thanks for the patch. A few comments below: Replies inline. > > On 09/21/2018 04:55 PM, Jerry Hoemann wrote: > > Add command line arguments to call ioctl WDIOC_GETTIMEOUT, > > WDIOC_GETPRETIMEOUT and WDIOC_SETPRETIMEOUT. > > > > Signed-off-by: Jerry Hoemann > > --- > > tools/testing/selftests/watchdog/watchdog-test.c | 30 +++++++++++++++++++++++- > > 1 file changed, 29 insertions(+), 1 deletion(-) > > > > diff --git a/tools/testing/selftests/watchdog/watchdog-test.c b/tools/testing/selftests/watchdog/watchdog-test.c > > index 6e29087..4861e2c 100644 > > --- a/tools/testing/selftests/watchdog/watchdog-test.c > > +++ b/tools/testing/selftests/watchdog/watchdog-test.c > > @@ -19,7 +19,7 @@ > > > > int fd; > > const char v = 'V'; > > -static const char sopts[] = "bdehp:t:"; > > +static const char sopts[] = "bdehp:t:Tn:N"; > > static const struct option lopts[] = { > > {"bootstatus", no_argument, NULL, 'b'}, > > {"disable", no_argument, NULL, 'd'}, > > @@ -27,6 +27,9 @@ > > {"help", no_argument, NULL, 'h'}, > > {"pingrate", required_argument, NULL, 'p'}, > > {"timeout", required_argument, NULL, 't'}, > > + {"gettimeout", no_argument, NULL, 'T'}, > > + {"pretimeout", required_argument, NULL, 'n'}, > > + {"getpretimeout", no_argument, NULL, 'N'}, > > {NULL, no_argument, NULL, 0x0} > > }; > > > > @@ -71,6 +74,9 @@ static void usage(char *progname) > > printf(" -h, --help Print the help message\n"); > > printf(" -p, --pingrate=P Set ping rate to P seconds (default %d)\n", DEFAULT_PING_RATE); > > printf(" -t, --timeout=T Set timeout to T seconds\n"); > > + printf(" -T, --gettimeout Get the timeout\n"); > > + printf(" -n, --pretimeout Set the pretimeout to T seconds\n"); > > + printf(" -N, --getpretimeout Get the pretimeout\n"); > > How are the new arguments used? I forgot the param. Should be: + printf(" -n, --pretimeout=T Set the pretimeout to T seconds\n"); I'll update in v2. Is this what you mean? Or did I misunderstand? > > > printf("\n"); > > printf("Parameters are parsed left-to-right in real-time.\n"); > > printf("Example: %s -d -t 10 -p 5 -e\n", progname); > > Please add an example usage for each of these new arguments. Will do. > > > @@ -135,6 +141,28 @@ int main(int argc, char *argv[]) > > else > > printf("WDIOC_SETTIMEOUT errno '%s'\n", strerror(errno)); > > break; > > + case 'T': > > + ret = ioctl(fd, WDIOC_GETTIMEOUT, &flags); > > + if (!ret) > > + printf("Watchdog timeout set to %u seconds.\n", flags); > > It would good to make this message different from the WDIOC_SETTIMEOUT message. > Please update it to reflect that this is the result of a WDIOC_GETTIMEOUT. Will update message to make distinct. > > What would user intend to do with this GETTIMEOUT? Shouldn't this be the case that it > prints the current value and exits instead of the same logic as SETTIMEOUT option? Are you suggesting setting the "oneshot" flag so the test app doesn't actually go into the while(1) keep_alive loop? Watchdog drivers may adjust the requested value to match hardware constraints. Callers of set timeout (and set pretimeout) should call get timeout to see what value was actually set. B/c of above, I just got into the habit of specifying both flags: first set, then get to make sure value set was what I intended. But I can make the "Get" a one shot. Just let me know if this is your preference. > > > + else > > + printf("WDIOC_GETTIMEOUT errno '%s'\n", strerror(errno)) > > Shouldn't this error be an exit condition? Hmmm, I don't see this error path much different than the error path for the other failing ioctl. Am I missing something? But, If we make the "GET" a one shot, then we wouldn't really need to special case the failure case as we wouldn't go into the keep_alive loop in either case. > > > + break; > > + case 'n': > > + flags = strtoul(optarg, NULL, 0); > > + ret = ioctl(fd, WDIOC_SETPRETIMEOUT, &flags); > > + if (!ret) > > + printf("Watchdog pretimeout set to %u seconds.\n", flags); > > + else > > + printf("WDIOC_SETPRETIMEOUT errno '%s'\n", strerror(errno)); > > + break; > > + case 'N': > > + ret = ioctl(fd, WDIOC_GETPRETIMEOUT, &flags); > > + if (!ret) > > + printf("Watchdog pretimeout set to %u seconds.\n", flags); > > It would good to make this message different from the WDIOC_GETPRETIMEOUT message. > Please update it to reflect that this is the result of a WDIOC_GETPRETIMEOUT will do. > > What would user intend to do with this GETTIMEOUT? Shouldn't this be the case that it > prints the current value and exits instead of the same logic as WDIOC_SETPRETIMEOUT? I think you're just asking me to set the "oneshot" flag on this, which I can certainly do. But, some background on pretimeout that (I think) is interesting: The underling HW for the watchdog on proliants allows for the pre-timeout to be enabled or disabled. But if the pretimeout is enabled, the value of the pretimeout is hard coded by HW (9 seconds.) The hpwdt driver allows for setting pretimeout by passing in a value 0 < pretimeout < timeout to enable a pretimeout. The user then needs to call get pretimeout to determine the actual value. Failure to take into account the pretimeout when pinging the WD can lead to unexpected system crashes. I've handled the following issue multiple times: A user wants to set the timeout to value T and ping the WD every T/2 seconds. He fails to take into account the pretimeout value of P. The system crashes with the pretimeout NMI when (T/2) < P. The basic misunderstanding is that to prevent the WD from acting, the WD only needs to be pinged at least once every T seconds, when in actuality the WD needs to be pinged at least once every (T-P) seconds. Specifically for Proliants, I've seen people set the timeout to 10 seconds thinking they had plenty of time to ping the WD only to be surprised when the pretimeout NMI takes the system down 1 second later. Note: a WD doesn't need to support the pretimeout feature. > > > + else > > + printf("WDIOC_GETPRETIMEOUT errno '%s'\n", strerror(errno)); > > Shouldn't this error be an exit condition? Similar to above. I can make GETPRETIMEOUT a "oneshot" to handle both the success/failing case of the ioctl call. > > > + break; > > default: > > usage(argv[0]); > > goto end; > > > > Also can you run this test as normal user? No. Must be run as root to open /dev/watchdog. When /dev/watchdog is opened, the WD is started and if not updated properly, the system will crash. "cat /dev/watchdog" is one of my favorite ways to crash a system. :) :) > > thanks, > -- Shuah -- ----------------------------------------------------------------------------- Jerry Hoemann Software Engineer Hewlett Packard Enterprise -----------------------------------------------------------------------------