2015-06-02 04:06:10

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 0/7] Watchdog: introduce ARM SBSA watchdog driver

From: Fu Wei <[email protected]>

This patchset:
(1)Introduce Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
for FDT info of SBSA Generic Watchdog, and give two examples of
adding SBSA Generic Watchdog device node into the dts files:
foundation-v8.dts and amd-seattle-soc.dtsi.

(2)Introduce ACPI GTDT parser: drivers/acpi/gtdt.c
Parse SBSA Generic Watchdog Structure in GTDT table of ACPI,
and create a platform device with that information.
This platform device can be used by This Watchdog driver.
drivers/clocksource/arm_arch_timer.c is simplified by this GTDT support.

(3)Introduce "pretimeout" into the watchdog framework, and update
Documentation/watchdog/watchdog-kernel-api.txt to introduce:
(1)the new elements in the watchdog_device and watchdog_ops struct;
(2)the new API "watchdog_init_timeouts".

(4)Introduce ARM SBSA watchdog driver:
a.Use linux kernel watchdog framework;
b.Work with FDT on ARM64;
c.Use "pretimeout" in watchdog framework;
d.In first timeout, do panic to save system context;
e.Support getting timeout and pretimeout from parameter and FDT
at the driver init stage.

This patchset has been tested with watchdog daemon
(ACPI/FDT, module/build-in) on the following platforms:
(1)ARM Foundation v8 model

Changelog:
v4: Refactor GTDT support code: remove it from arch/arm64/kernel/acpi.c,
put it into drivers/acpi/gtdt.c file.
Integrate the GTDT code of drivers/clocksource/arm_arch_timer.c into
drivers/acpi/gtdt.c.
Improve pretimeout support, fix "pretimeout == 0" problem.
Simplify sbsa_gwdt driver:
(1)timeout/pretimeout limits setup;
(2)keepalive function;
(3)delete "clk == 0" check;
(4)delete WS0 status bit check in interrupt routine;
(5)sbsa_gwdt_set_wcv function.

v3: Delete "export arch_timer_get_rate" patch.
Driver back to use arch_timer_get_cntfrq.
Improve watchdog_init_timeouts function and update relevant documentation.
Improve watchdog_timeout_invalid and watchdog_pretimeout_invalid.
Improve foundation-v8.dts: delete the unnecessary tag of device node.
Remove "ARM64 || COMPILE_TEST" from Kconfig.
Add comments in arch/arm64/kernel/acpi.c
Fix typoes and incorrect comments.

v2: Improve watchdog-kernel-api.txt documentation for pretimeout support.
Export "arch_timer_get_rate" in arm_arch_timer.c.
Add watchdog_init_timeouts API for pretimeout support in framework.
Improve suspend and resume foundation in driver
Improve timeout/pretimeout values init code in driver.
Delete unnecessary items of the sbsa_gwdt struct and #define.
Delete all unnecessary debug info in driver.
Fix 64bit division bug.
Use the arch_timer interface to get watchdog clock rate.
Add MODULE_DEVICE_TABLE for platform device id.
Fix typoes.

v1: The first version upstream patchset to linux mailing list.

Fu Wei (7):
Documentation: add sbsa-gwdt.txt documentation
ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts
ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi
Watchdog: introdouce "pretimeout" into framework
Watchdog: introduce ARM SBSA watchdog driver
ACPI: add GTDT table parse driver into ACPI driver
clocksource: simplify ACPI code in arm_arch_timer.c

.../devicetree/bindings/watchdog/sbsa-gwdt.txt | 36 ++
Documentation/watchdog/watchdog-kernel-api.txt | 47 ++-
arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi | 11 +
arch/arm64/boot/dts/arm/foundation-v8.dts | 10 +
arch/arm64/kernel/time.c | 4 +-
drivers/acpi/Kconfig | 9 +
drivers/acpi/Makefile | 1 +
drivers/acpi/gtdt.c | 180 +++++++++
drivers/clocksource/Kconfig | 1 +
drivers/clocksource/arm_arch_timer.c | 60 +--
drivers/watchdog/Kconfig | 12 +
drivers/watchdog/Makefile | 1 +
drivers/watchdog/sbsa_gwdt.c | 426 +++++++++++++++++++++
drivers/watchdog/watchdog_core.c | 115 ++++--
drivers/watchdog/watchdog_dev.c | 53 +++
include/clocksource/arm_arch_timer.h | 8 +
include/linux/acpi.h | 5 +
include/linux/clocksource.h | 4 +-
include/linux/watchdog.h | 33 +-
19 files changed, 927 insertions(+), 89 deletions(-)
create mode 100644 Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
create mode 100644 drivers/acpi/gtdt.c
create mode 100644 drivers/watchdog/sbsa_gwdt.c

--
1.8.3.1


2015-06-02 04:06:31

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 1/7] Documentation: add sbsa-gwdt.txt documentation

From: Fu Wei <[email protected]>

The sbsa-gwdt.txt documentation in devicetree/bindings/watchdog is for
introducing SBSA(Server Base System Architecture) Generic Watchdog
device node info into FDT.

Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Fu Wei <[email protected]>
---
.../devicetree/bindings/watchdog/sbsa-gwdt.txt | 36 ++++++++++++++++++++++
1 file changed, 36 insertions(+)
create mode 100644 Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt

diff --git a/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt b/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
new file mode 100644
index 0000000..010e5c4
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/sbsa-gwdt.txt
@@ -0,0 +1,36 @@
+* SBSA(Server Base System Architecture) Generic Watchdog
+
+The SBSA Generic Watchdog Timer is used for resetting the system after
+two stages of timeout.
+More details: ARM-DEN-0029 - Server Base System Architecture (SBSA)
+
+Required properties:
+- compatible : Should at least contain "arm,sbsa-gwdt".
+
+- reg : base physical address of the frames and length of memory mapped region.
+
+- reg-names : Should contain the resource reg names to show the order of
+ the values in "reg".
+ Must include the following entries : "refresh", "control".
+
+- interrupts : Should at least contain WS0 interrupt,
+ the WS1 Signal is optional.
+
+- interrupt-names : Should contain the resource interrupt names.
+ Must include the following entries : "ws0". "ws1" is optional.
+
+Optional properties
+- timeout-sec : Watchdog pre-timeout and timeout values (in seconds).
+ The first is timeout values, then pre-timeout.
+
+Example for FVP Foundation Model v8:
+
+watchdog@2a440000 {
+ compatible = "arm,sbsa-gwdt";
+ reg = <0x0 0x2a440000 0 0x10000>,
+ <0x0 0x2a450000 0 0x10000>;
+ reg-names = "control", "refresh";
+ interrupts = <0 27 4>;
+ interrupt-names = "ws0";
+ timeout-sec = <10 5>;
+};
--
1.9.1

2015-06-02 04:06:37

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 2/7] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts

From: Fu Wei <[email protected]>

This can be a example of adding SBSA Generic Watchdog device node
into some dts files for the Soc which contains SBSA Generic Watchdog.

Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Fu Wei <[email protected]>
---
arch/arm64/boot/dts/arm/foundation-v8.dts | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/arm/foundation-v8.dts b/arch/arm64/boot/dts/arm/foundation-v8.dts
index 4eac8dc..962a07e 100644
--- a/arch/arm64/boot/dts/arm/foundation-v8.dts
+++ b/arch/arm64/boot/dts/arm/foundation-v8.dts
@@ -237,4 +237,14 @@
};
};
};
+ watchdog@2a440000 {
+ compatible = "arm,sbsa-gwdt";
+ reg = <0x0 0x2a440000 0 0x10000>,
+ <0x0 0x2a450000 0 0x10000>;
+ reg-names = "control",
+ "refresh";
+ interrupts = <0 27 4>;
+ interrupt-names = "ws0";
+ timeout-sec = <10 5>;
+ };
};
--
1.9.1

2015-06-02 04:08:18

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 3/7] ARM64: add SBSA Generic Watchdog device node in amd-seattle-soc.dtsi

From: Fu Wei <[email protected]>

This can be a example of adding SBSA Generic Watchdog device node
into some dts files for the Soc which contains SBSA Generic Watchdog.

Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Suravee Suthikulpanit <[email protected]>
Tested-by: Suravee Suthikulpanit <[email protected]>
Signed-off-by: Fu Wei <[email protected]>
---
arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
index 2874d92..95994eb 100644
--- a/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
+++ b/arch/arm64/boot/dts/amd/amd-seattle-soc.dtsi
@@ -84,6 +84,17 @@
clock-names = "uartclk", "apb_pclk";
};

+ watchdog0: watchdog@e0bb0000 {
+ compatible = "arm,sbsa-gwdt";
+ reg = <0x0 0xe0bb0000 0 0x10000>,
+ <0x0 0xe0bc0000 0 0x10000>;
+ reg-names = "refresh",
+ "control";
+ interrupts = <0 337 4>;
+ interrupt-names = "ws0";
+ timeout-sec = <10 5>;
+ };
+
spi0: ssp@e1020000 {
status = "disabled";
compatible = "arm,pl022", "arm,primecell";
--
1.9.1

2015-06-02 04:06:51

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 4/7] Watchdog: introdouce "pretimeout" into framework

From: Fu Wei <[email protected]>

Also update Documentation/watchdog/watchdog-kernel-api.txt to
introduce:
(1)the new elements in the watchdog_device and watchdog_ops struct;
(2)the new API "watchdog_init_timeouts"

Reasons:
(1)kernel already has two watchdog drivers are using "pretimeout":
drivers/char/ipmi/ipmi_watchdog.c
drivers/watchdog/kempld_wdt.c(but the definition is different)
(2)some other drivers are going to use this: ARM SBSA Generic Watchdog

Signed-off-by: Fu Wei <[email protected]>
---
Documentation/watchdog/watchdog-kernel-api.txt | 47 ++++++++--
drivers/watchdog/watchdog_core.c | 115 +++++++++++++++++++------
drivers/watchdog/watchdog_dev.c | 53 ++++++++++++
include/linux/watchdog.h | 33 ++++++-
4 files changed, 212 insertions(+), 36 deletions(-)

diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
index a0438f3..95b355d 100644
--- a/Documentation/watchdog/watchdog-kernel-api.txt
+++ b/Documentation/watchdog/watchdog-kernel-api.txt
@@ -49,6 +49,9 @@ struct watchdog_device {
unsigned int timeout;
unsigned int min_timeout;
unsigned int max_timeout;
+ unsigned int pretimeout;
+ unsigned int min_pretimeout;
+ unsigned int max_pretimeout;
void *driver_data;
struct mutex lock;
unsigned long status;
@@ -70,6 +73,9 @@ It contains following fields:
* timeout: the watchdog timer's timeout value (in seconds).
* min_timeout: the watchdog timer's minimum timeout value (in seconds).
* max_timeout: the watchdog timer's maximum timeout value (in seconds).
+* pretimeout: the watchdog timer's pretimeout value (in seconds).
+* min_pretimeout: the watchdog timer's minimum pretimeout value (in seconds).
+* max_pretimeout: the watchdog timer's maximum pretimeout value (in seconds).
* bootstatus: status of the device after booting (reported with watchdog
WDIOF_* status bits).
* driver_data: a pointer to the drivers private data of a watchdog device.
@@ -92,6 +98,7 @@ struct watchdog_ops {
int (*ping)(struct watchdog_device *);
unsigned int (*status)(struct watchdog_device *);
int (*set_timeout)(struct watchdog_device *, unsigned int);
+ int (*set_pretimeout)(struct watchdog_device *, unsigned int);
unsigned int (*get_timeleft)(struct watchdog_device *);
void (*ref)(struct watchdog_device *);
void (*unref)(struct watchdog_device *);
@@ -153,9 +160,19 @@ they are supported. These optional routines/operations are:
and -EIO for "could not write value to the watchdog". On success this
routine should set the timeout value of the watchdog_device to the
achieved timeout value (which may be different from the requested one
- because the watchdog does not necessarily has a 1 second resolution).
+ because the watchdog does not necessarily has a 1 second resolution;
+ If the driver supports pretimeout, then the timeout value must be greater
+ than that).
(Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
watchdog's info structure).
+* set_pretimeout: this routine checks and changes the pretimeout of the
+ watchdog timer device. It returns 0 on success, -EINVAL for "parameter out of
+ range" and -EIO for "could not write value to the watchdog". On success this
+ routine should set the pretimeout value of the watchdog_device to the
+ achieved pretimeout value (which may be different from the requested one
+ because the watchdog does not necessarily has a 1 second resolution).
+ (Note: the WDIOF_PRETIMEOUT needs to be set in the options field of the
+ watchdog's info structure).
* get_timeleft: this routines returns the time that's left before a reset.
* ref: the operation that calls kref_get on the kref of a dynamically
allocated watchdog_device struct.
@@ -219,8 +236,28 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,
unsigned int timeout_parm, struct device *dev);

The watchdog_init_timeout function allows you to initialize the timeout field
-using the module timeout parameter or by retrieving the timeout-sec property from
-the device tree (if the module timeout parameter is invalid). Best practice is
-to set the default timeout value as timeout value in the watchdog_device and
-then use this function to set the user "preferred" timeout value.
+using the module timeout parameter or by retrieving the first element of
+the timeout-sec property from the device tree (if the module timeout parameter
+is invalid). Best practice is to set the default timeout value as timeout value
+in the watchdog_device and then use this function to set the user "preferred"
+timeout value.
+This routine returns zero on success and a negative errno code for failure.
+
+Some watchdog timers have two stage of timeouts(timeout and pretimeout),
+to initialize the timeout and pretimeout fields at the same time, the following
+function can be used:
+
+extern int watchdog_init_timeouts(struct watchdog_device *wdd,
+ unsigned int pretimeout_parm,
+ unsigned int timeout_parm,
+ struct device *dev);
+
+The watchdog_init_timeouts function allows you to initialize the pretimeout and
+timeout fields using the module pretimeout and timeout parameter or by
+retrieving the elements in the timeout-sec property(the first element is for
+timeout, the second one is for pretimeout) from the device tree(if the module
+pretimeout and timeout parameter are invalid).
+Best practice is to set the default pretimeout and timeout value as pretimeout
+and timeout value in the watchdog_device and then use this function to set the
+user "preferred" pretimeout value.
This routine returns zero on success and a negative errno code for failure.
diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index cec9b55..ff18db3 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -43,60 +43,119 @@
static DEFINE_IDA(watchdog_ida);
static struct class *watchdog_class;

-static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
+static void watchdog_check_min_max_timeouts(struct watchdog_device *wdd)
{
/*
- * Check that we have valid min and max timeout values, if
- * not reset them both to 0 (=not used or unknown)
+ * Check that we have valid min and max pretimeout and timeout values,
+ * if not reset them both to 0 (=not used or unknown)
*/
+ if (wdd->min_pretimeout > wdd->max_pretimeout) {
+ pr_info("Invalid min and max pretimeout, resetting to 0\n");
+ wdd->min_pretimeout = 0;
+ wdd->max_pretimeout = 0;
+ }
if (wdd->min_timeout > wdd->max_timeout) {
pr_info("Invalid min and max timeout values, resetting to 0!\n");
wdd->min_timeout = 0;
wdd->max_timeout = 0;
}
+ /*
+ * Check that we have valid min timeout and max pretimeout values,
+ * if not reset them.
+ */
+ if (wdd->min_pretimeout && wdd->min_timeout < wdd->min_pretimeout) {
+ pr_info("Invalid min_timeout, resetting to min_pretimeout+1\n");
+ wdd->min_timeout = wdd->min_pretimeout + 1;
+ }
+ if (wdd->max_pretimeout && wdd->max_timeout < wdd->max_pretimeout) {
+ pr_info("Invalid max_pretimeout, resetting to max_timeout-1\n");
+ wdd->max_pretimeout = wdd->max_timeout - 1;
+ }
}

/**
- * watchdog_init_timeout() - initialize the timeout field
+ * watchdog_init_timeouts() - initialize the pretimeout and timeout field
+ * @pretimeout_parm: pretimeout module parameter
* @timeout_parm: timeout module parameter
* @dev: Device that stores the timeout-sec property
*
- * Initialize the timeout field of the watchdog_device struct with either the
- * timeout module parameter (if it is valid value) or the timeout-sec property
- * (only if it is a valid value and the timeout_parm is out of bounds).
- * If none of them are valid then we keep the old value (which should normally
- * be the default timeout value.
+ * Initialize the pretimeout and timeout field of the watchdog_device struct
+ * with either the pretimeout and timeout module parameter (if it is valid
+ * value) or the timeout-sec property (only if it is a valid value and the
+ * pretimeout_parm and timeout_parm is out of bounds). If none of them are
+ * valid, then we keep the old value (which should normally be the default
+ * timeout value).
*
* A zero is returned on success and -EINVAL for failure.
*/
-int watchdog_init_timeout(struct watchdog_device *wdd,
- unsigned int timeout_parm, struct device *dev)
+int watchdog_init_timeouts(struct watchdog_device *wdd,
+ unsigned int pretimeout_parm,
+ unsigned int timeout_parm,
+ struct device *dev)
{
- unsigned int t = 0;
- int ret = 0;
+ int ret = 0, length = 0;
+ u32 timeouts[2] = {0};

- watchdog_check_min_max_timeout(wdd);
+ watchdog_check_min_max_timeouts(wdd);

- /* try to get the timeout module parameter first */
- if (!watchdog_timeout_invalid(wdd, timeout_parm) && timeout_parm) {
- wdd->timeout = timeout_parm;
- return ret;
+ /*
+ * Try to get the pretimeout module parameter first,
+ * but it can be "0", that means we don't use pretimeout.
+ */
+ if (pretimeout_parm) {
+ if (!watchdog_pretimeout_invalid(wdd, pretimeout_parm))
+ timeouts[1] = pretimeout_parm;
+ ret = -EINVAL; /* pretimeout_parm is not "0", and invalid */
}
- if (timeout_parm)
+
+ /*
+ * Try to get the timeout module parameter,
+ * if it's valid and pretimeout is ont invalid(ret == 0),
+ * assignment and return zero. Otherwise, try dtb.
+ */
+ if (timeout_parm) {
+ if (!watchdog_timeout_invalid(wdd, timeout_parm) && !ret) {
+ wdd->timeout = timeout_parm;
+ wdd->pretimeout = timeouts[1];
+ return 0;
+ }
ret = -EINVAL;
+ }

- /* try to get the timeout_sec property */
+ /*
+ * We get here, the situation is one of them:
+ * (1)at least one of parameters is invalid(ret = -EINVAL);
+ * (2)both of them are 0.(ret = 0)
+ * So give up the module parameter(at least drop the invalid one),
+ * try to get the timeout_sec property from dtb.
+ */
if (dev == NULL || dev->of_node == NULL)
return ret;
- of_property_read_u32(dev->of_node, "timeout-sec", &t);
- if (!watchdog_timeout_invalid(wdd, t) && t)
- wdd->timeout = t;
- else
- ret = -EINVAL;

- return ret;
+ /*
+ * We get here, that means maybe we can get timeouts from dtb,
+ * if "timeout-sec" is available and the data is valid.
+ */
+ of_find_property(dev->of_node, "timeout-sec", &length);
+ if (length > 0 && length <= sizeof(u32) * 2) {
+ of_property_read_u32_array(dev->of_node,
+ "timeout-sec", timeouts,
+ length / sizeof(u32));
+ if (length == sizeof(u32) * 2 && timeouts[1] &&
+ watchdog_pretimeout_invalid(wdd, timeouts[1]))
+ return -EINVAL; /* pretimeout is invalid */
+
+ if (!watchdog_timeout_invalid(wdd, timeouts[0]) &&
+ timeouts[0]) {
+ wdd->timeout = timeouts[0];
+ wdd->pretimeout = timeouts[1];
+ return 0;
+ }
+ }
+
+ return -EINVAL;
}
-EXPORT_SYMBOL_GPL(watchdog_init_timeout);
+EXPORT_SYMBOL_GPL(watchdog_init_timeouts);

/**
* watchdog_register_device() - register a watchdog device
@@ -119,7 +178,7 @@ int watchdog_register_device(struct watchdog_device *wdd)
if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
return -EINVAL;

- watchdog_check_min_max_timeout(wdd);
+ watchdog_check_min_max_timeouts(wdd);

/*
* Note: now that all watchdog_device data has been verified, we
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 6aaefba..af0777e 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -218,6 +218,38 @@ out_timeout:
}

/*
+ * watchdog_set_pretimeout: set the watchdog timer pretimeout
+ * @wddev: the watchdog device to set the timeout for
+ * @pretimeout: pretimeout to set in seconds
+ */
+
+static int watchdog_set_pretimeout(struct watchdog_device *wddev,
+ unsigned int pretimeout)
+{
+ int err;
+
+ if (!wddev->ops->set_pretimeout ||
+ !(wddev->info->options & WDIOF_PRETIMEOUT))
+ return -EOPNOTSUPP;
+
+ if (watchdog_pretimeout_invalid(wddev, pretimeout))
+ return -EINVAL;
+
+ mutex_lock(&wddev->lock);
+
+ if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+ err = -ENODEV;
+ goto out_pretimeout;
+ }
+
+ err = wddev->ops->set_pretimeout(wddev, pretimeout);
+
+out_pretimeout:
+ mutex_unlock(&wddev->lock);
+ return err;
+}
+
+/*
* watchdog_get_timeleft: wrapper to get the time left before a reboot
* @wddev: the watchdog device to get the remaining time from
* @timeleft: the time that's left
@@ -388,6 +420,27 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd,
if (wdd->timeout == 0)
return -EOPNOTSUPP;
return put_user(wdd->timeout, p);
+ case WDIOC_SETPRETIMEOUT:
+ /* check if we support the pretimeout */
+ if (!(wdd->info->options & WDIOF_PRETIMEOUT))
+ return -EOPNOTSUPP;
+ if (get_user(val, p))
+ return -EFAULT;
+ err = watchdog_set_pretimeout(wdd, val);
+ if (err < 0)
+ return err;
+ /*
+ * If the watchdog is active then we send a keepalive ping
+ * to make sure that the watchdog keeps running (and if
+ * possible that it takes the new pretimeout)
+ */
+ watchdog_ping(wdd);
+ /* Fall */
+ case WDIOC_GETPRETIMEOUT:
+ /* check if we support the pretimeout */
+ if (wdd->info->options & WDIOF_PRETIMEOUT)
+ return put_user(wdd->pretimeout, p);
+ return -EOPNOTSUPP;
case WDIOC_GETTIMELEFT:
err = watchdog_get_timeleft(wdd, &val);
if (err)
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index a746bf5..b1e325d 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -25,6 +25,7 @@ struct watchdog_device;
* @ping: The routine that sends a keepalive ping to the watchdog device.
* @status: The routine that shows the status of the watchdog device.
* @set_timeout:The routine for setting the watchdog devices timeout value.
+ * @set_pretimeout:The routine for setting the watchdog devices pretimeout value
* @get_timeleft:The routine that get's the time that's left before a reset.
* @ref: The ref operation for dyn. allocated watchdog_device structs
* @unref: The unref operation for dyn. allocated watchdog_device structs
@@ -44,6 +45,7 @@ struct watchdog_ops {
int (*ping)(struct watchdog_device *);
unsigned int (*status)(struct watchdog_device *);
int (*set_timeout)(struct watchdog_device *, unsigned int);
+ int (*set_pretimeout)(struct watchdog_device *, unsigned int);
unsigned int (*get_timeleft)(struct watchdog_device *);
void (*ref)(struct watchdog_device *);
void (*unref)(struct watchdog_device *);
@@ -62,6 +64,9 @@ struct watchdog_ops {
* @timeout: The watchdog devices timeout value.
* @min_timeout:The watchdog devices minimum timeout value.
* @max_timeout:The watchdog devices maximum timeout value.
+ * @pretimeout: The watchdog devices pretimeout value.
+ * @min_pretimeout:The watchdog devices minimum pretimeout value.
+ * @max_pretimeout:The watchdog devices maximum pretimeout value.
* @driver-data:Pointer to the drivers private data.
* @lock: Lock for watchdog core internal use only.
* @status: Field that contains the devices internal status bits.
@@ -86,6 +91,9 @@ struct watchdog_device {
unsigned int timeout;
unsigned int min_timeout;
unsigned int max_timeout;
+ unsigned int pretimeout;
+ unsigned int min_pretimeout;
+ unsigned int max_pretimeout;
void *driver_data;
struct mutex lock;
unsigned long status;
@@ -117,7 +125,17 @@ static inline void watchdog_set_nowayout(struct watchdog_device *wdd, bool noway
static inline bool watchdog_timeout_invalid(struct watchdog_device *wdd, unsigned int t)
{
return ((wdd->max_timeout != 0) &&
- (t < wdd->min_timeout || t > wdd->max_timeout));
+ (t < wdd->min_timeout || t > wdd->max_timeout ||
+ t <= wdd->pretimeout));
+}
+
+/* Use the following function to check if a pretimeout value is invalid */
+static inline bool watchdog_pretimeout_invalid(struct watchdog_device *wdd,
+ unsigned int t)
+{
+ return (wdd->max_pretimeout != 0 &&
+ (t < wdd->min_pretimeout || t > wdd->max_pretimeout ||
+ (wdd->timeout != 0 && t >= wdd->timeout)));
}

/* Use the following functions to manipulate watchdog driver specific data */
@@ -132,11 +150,20 @@ static inline void *watchdog_get_drvdata(struct watchdog_device *wdd)
}

/* drivers/watchdog/watchdog_core.c */
-extern int watchdog_init_timeout(struct watchdog_device *wdd,
- unsigned int timeout_parm, struct device *dev);
+extern int watchdog_init_timeouts(struct watchdog_device *wdd,
+ unsigned int pretimeout_parm,
+ unsigned int timeout_parm,
+ struct device *dev);
extern int watchdog_register_device(struct watchdog_device *);
extern void watchdog_unregister_device(struct watchdog_device *);

+static inline int watchdog_init_timeout(struct watchdog_device *wdd,
+ unsigned int timeout_parm,
+ struct device *dev)
+{
+ return watchdog_init_timeouts(wdd, 0, timeout_parm, dev);
+}
+
#ifdef CONFIG_HARDLOCKUP_DETECTOR
void watchdog_nmi_disable_all(void);
void watchdog_nmi_enable_all(void);
--
1.9.1

2015-06-02 04:07:12

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

From: Fu Wei <[email protected]>

This driver bases on linux kernel watchdog framework, and
use "pretimeout" in the framework. It supports getting timeout and
pretimeout from parameter and FDT at the driver init stage.
In first timeout, the interrupt routine run panic to save
system context.

Signed-off-by: Fu Wei <[email protected]>
---
drivers/watchdog/Kconfig | 12 ++
drivers/watchdog/Makefile | 1 +
drivers/watchdog/sbsa_gwdt.c | 426 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 439 insertions(+)
create mode 100644 drivers/watchdog/sbsa_gwdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index e5e7c55..20f9980 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -152,6 +152,18 @@ config ARM_SP805_WATCHDOG
ARM Primecell SP805 Watchdog timer. This will reboot your system when
the timeout is reached.

+config ARM_SBSA_WATCHDOG
+ tristate "ARM SBSA Generic Watchdog"
+ depends on ARM64
+ depends on ARM_ARCH_TIMER
+ select WATCHDOG_CORE
+ select ACPI_GTDT if ACPI
+ help
+ ARM SBSA Generic Watchdog. This watchdog has two Watchdog timeouts.
+ The first timeout will trigger a panic; the second timeout will
+ trigger a system reset.
+ More details: ARM DEN0029B - Server Base System Architecture (SBSA)
+
config AT91RM9200_WATCHDOG
tristate "AT91RM9200 watchdog"
depends on SOC_AT91RM9200 && MFD_SYSCON
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index 5c19294..471f1b7c 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -30,6 +30,7 @@ obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb.o

# ARM Architecture
obj-$(CONFIG_ARM_SP805_WATCHDOG) += sp805_wdt.o
+obj-$(CONFIG_ARM_SBSA_WATCHDOG) += sbsa_gwdt.o
obj-$(CONFIG_AT91RM9200_WATCHDOG) += at91rm9200_wdt.o
obj-$(CONFIG_AT91SAM9X_WATCHDOG) += at91sam9_wdt.o
obj-$(CONFIG_CADENCE_WATCHDOG) += cadence_wdt.o
diff --git a/drivers/watchdog/sbsa_gwdt.c b/drivers/watchdog/sbsa_gwdt.c
new file mode 100644
index 0000000..76b6763
--- /dev/null
+++ b/drivers/watchdog/sbsa_gwdt.c
@@ -0,0 +1,426 @@
+/*
+ * SBSA(Server Base System Architecture) Generic Watchdog driver
+ *
+ * Copyright (c) 2015, Linaro Ltd.
+ * Author: Fu Wei <[email protected]>
+ * Suravee Suthikulpanit <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Note: This SBSA Generic watchdog driver is compatible with
+ * the pretimeout concept of Linux kernel.
+ * The timeout and pretimeout are set by the different REGs.
+ * The first watch period is set by writing WCV directly,
+ * that can support more than 10s timeout at the maximum
+ * system counter frequency.
+ * The second watch period is set by WOR(32bit) which will be loaded
+ * automatically by hardware, when WS0 is triggered.
+ * This gives a maximum watch period of around 10s at the maximum
+ * system counter frequency.
+ * The System Counter shall run at maximum of 400MHz.
+ * More details: ARM DEN0029B - Server Base System Architecture (SBSA)
+ *
+ * Kernel/API: P---------| pretimeout
+ * |-------------------------------T timeout
+ * SBSA GWDT: P--WOR---WS1 pretimeout
+ * |-------WCV----------WS0~~~~~~~~T timeout
+ */
+
+#include <linux/io.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/uaccess.h>
+#include <linux/watchdog.h>
+#include <asm/arch_timer.h>
+
+/* SBSA Generic Watchdog register definitions */
+/* refresh frame */
+#define SBSA_GWDT_WRR 0x000
+
+/* control frame */
+#define SBSA_GWDT_WCS 0x000
+#define SBSA_GWDT_WOR 0x008
+#define SBSA_GWDT_WCV_LO 0x010
+#define SBSA_GWDT_WCV_HI 0x014
+
+/* refresh/control frame */
+#define SBSA_GWDT_W_IIDR 0xfcc
+#define SBSA_GWDT_IDR 0xfd0
+
+/* Watchdog Control and Status Register */
+#define SBSA_GWDT_WCS_EN BIT(0)
+#define SBSA_GWDT_WCS_WS0 BIT(1)
+#define SBSA_GWDT_WCS_WS1 BIT(2)
+
+/**
+ * struct sbsa_gwdt - Internal representation of the SBSA GWDT
+ * @wdd: kernel watchdog_device structure
+ * @clk: store the System Counter clock frequency, in Hz.
+ * @refresh_base: Virtual address of the watchdog refresh frame
+ * @control_base: Virtual address of the watchdog control frame
+ */
+struct sbsa_gwdt {
+ struct watchdog_device wdd;
+ u32 clk;
+ void __iomem *refresh_base;
+ void __iomem *control_base;
+};
+
+#define to_sbsa_gwdt(e) container_of(e, struct sbsa_gwdt, wdd)
+
+#define DEFAULT_TIMEOUT 30 /* seconds, the 1st + 2nd watch periods*/
+#define DEFAULT_PRETIMEOUT 10 /* seconds, the 2nd watch period*/
+
+static unsigned int timeout;
+module_param(timeout, uint, 0);
+MODULE_PARM_DESC(timeout,
+ "Watchdog timeout in seconds. (>=0, default="
+ __MODULE_STRING(DEFAULT_TIMEOUT) ")");
+
+static unsigned int pretimeout;
+module_param(pretimeout, uint, 0);
+MODULE_PARM_DESC(pretimeout,
+ "Watchdog pretimeout in seconds. (>=0, default="
+ __MODULE_STRING(DEFAULT_PRETIMEOUT) ")");
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+module_param(nowayout, bool, S_IRUGO);
+MODULE_PARM_DESC(nowayout,
+ "Watchdog cannot be stopped once started (default="
+ __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+
+/*
+ * help functions for accessing 32bit registers of SBSA Generic Watchdog
+ */
+static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
+ struct watchdog_device *wdd)
+{
+ struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
+
+ writel_relaxed(val, gwdt->control_base + reg);
+}
+
+static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
+ struct watchdog_device *wdd)
+{
+ struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
+
+ writel_relaxed(val, gwdt->refresh_base + reg);
+}
+
+static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device *wdd)
+{
+ struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
+
+ return readl_relaxed(gwdt->control_base + reg);
+}
+
+/*
+ * help functions for accessing 64bit WCV register
+ */
+static u64 sbsa_gwdt_get_wcv(struct watchdog_device *wdd)
+{
+ u32 wcv_lo, wcv_hi;
+
+ do {
+ wcv_hi = sbsa_gwdt_cf_read(SBSA_GWDT_WCV_HI, wdd);
+ wcv_lo = sbsa_gwdt_cf_read(SBSA_GWDT_WCV_LO, wdd);
+ } while (wcv_hi != sbsa_gwdt_cf_read(SBSA_GWDT_WCV_HI, wdd));
+
+ return (((u64)wcv_hi << 32) | wcv_lo);
+}
+
+static void sbsa_gwdt_set_wcv(struct watchdog_device *wdd, u64 value)
+{
+ sbsa_gwdt_cf_write(SBSA_GWDT_WCV_HI, upper_32_bits(value), wdd);
+ sbsa_gwdt_cf_write(SBSA_GWDT_WCV_LO, lower_32_bits(value), wdd);
+}
+
+static void reload_timeout_to_wcv(struct watchdog_device *wdd)
+{
+ struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
+ u64 wcv;
+
+ wcv = arch_counter_get_cntvct() +
+ (u64)(wdd->timeout - wdd->pretimeout) * gwdt->clk;
+
+ sbsa_gwdt_set_wcv(wdd, wcv);
+}
+
+static int sbsa_gwdt_set_timeout(struct watchdog_device *wdd,
+ unsigned int timeout)
+{
+ wdd->timeout = timeout;
+
+ return 0;
+}
+
+static int sbsa_gwdt_set_pretimeout(struct watchdog_device *wdd,
+ unsigned int pretimeout)
+{
+ struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
+ u32 wor;
+
+ wdd->pretimeout = pretimeout;
+
+ if (!pretimeout)
+ /*
+ * If pretimeout is 0, it gives driver a timeslot(1s)
+ * to update WCV after an explicit refresh(sbsa_gwdt_start)
+ */
+ wor = gwdt->clk;
+ else
+ wor = pretimeout * gwdt->clk;
+
+ /* refresh the WOR, that will cause an explicit watchdog refresh */
+ sbsa_gwdt_cf_write(SBSA_GWDT_WOR, wor, wdd);
+
+ return 0;
+}
+
+static unsigned int sbsa_gwdt_get_timeleft(struct watchdog_device *wdd)
+{
+ struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
+ u64 timeleft = sbsa_gwdt_get_wcv(wdd) - arch_counter_get_cntvct();
+
+ do_div(timeleft, gwdt->clk);
+
+ return timeleft;
+}
+
+static int sbsa_gwdt_keepalive(struct watchdog_device *wdd)
+{
+ reload_timeout_to_wcv(wdd);
+
+ return 0;
+}
+
+static int sbsa_gwdt_start(struct watchdog_device *wdd)
+{
+ /*
+ * Writing WRR for an explicit watchdog refresh.
+ * You can write anyting(like 0xc0ffee).
+ * Force refresh due to hardware bug found in certain Soc.
+ */
+ sbsa_gwdt_rf_write(SBSA_GWDT_WRR, 0xc0ffee, wdd);
+ /* writing WCS will cause an explicit watchdog refresh */
+ sbsa_gwdt_cf_write(SBSA_GWDT_WCS, SBSA_GWDT_WCS_EN, wdd);
+
+ return sbsa_gwdt_keepalive(wdd);
+}
+
+static int sbsa_gwdt_stop(struct watchdog_device *wdd)
+{
+ sbsa_gwdt_cf_write(SBSA_GWDT_WCS, 0, wdd);
+
+ return 0;
+}
+
+static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
+{
+ struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
+ struct watchdog_device *wdd = &gwdt->wdd;
+
+ if (wdd->pretimeout)
+ /* The pretimeout is valid, go panic */
+ panic("SBSA Watchdog pre-timeout");
+ else
+ /* We don't use pretimeout, trigger WS1 now*/
+ sbsa_gwdt_set_wcv(wdd, 0);
+
+ return IRQ_HANDLED;
+}
+
+static struct watchdog_info sbsa_gwdt_info = {
+ .identity = "SBSA Generic Watchdog",
+ .options = WDIOF_SETTIMEOUT |
+ WDIOF_KEEPALIVEPING |
+ WDIOF_MAGICCLOSE |
+ WDIOF_PRETIMEOUT |
+ WDIOF_CARDRESET,
+};
+
+static struct watchdog_ops sbsa_gwdt_ops = {
+ .owner = THIS_MODULE,
+ .start = sbsa_gwdt_start,
+ .stop = sbsa_gwdt_stop,
+ .ping = sbsa_gwdt_keepalive,
+ .set_timeout = sbsa_gwdt_set_timeout,
+ .set_pretimeout = sbsa_gwdt_set_pretimeout,
+ .get_timeleft = sbsa_gwdt_get_timeleft,
+};
+
+static int sbsa_gwdt_probe(struct platform_device *pdev)
+{
+ u64 first_period_max = U64_MAX;
+ struct device *dev = &pdev->dev;
+ struct watchdog_device *wdd;
+ struct sbsa_gwdt *gwdt;
+ struct resource *res;
+ void *rf_base, *cf_base;
+ int ret, irq;
+ u32 status;
+
+ gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL);
+ if (!gwdt)
+ return -ENOMEM;
+ platform_set_drvdata(pdev, gwdt);
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "refresh");
+ rf_base = devm_ioremap_resource(dev, res);
+ if (IS_ERR(rf_base))
+ return PTR_ERR(rf_base);
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "control");
+ cf_base = devm_ioremap_resource(dev, res);
+ if (IS_ERR(cf_base))
+ return PTR_ERR(cf_base);
+
+ irq = platform_get_irq_byname(pdev, "ws0");
+ if (irq < 0) {
+ dev_err(dev, "unable to get ws0 interrupt.\n");
+ return irq;
+ }
+
+ /*
+ * Get the frequency of system counter from the cp15 interface of ARM
+ * Generic timer. We don't need to check it, because if it returns "0",
+ * system would panic in very early stage.
+ */
+ gwdt->clk = arch_timer_get_cntfrq();
+ gwdt->refresh_base = rf_base;
+ gwdt->control_base = cf_base;
+
+ wdd = &gwdt->wdd;
+ wdd->parent = dev;
+ wdd->info = &sbsa_gwdt_info;
+ wdd->ops = &sbsa_gwdt_ops;
+ watchdog_set_drvdata(wdd, gwdt);
+ watchdog_set_nowayout(wdd, nowayout);
+
+ wdd->min_pretimeout = 0;
+ wdd->max_pretimeout = U32_MAX / gwdt->clk;
+ wdd->min_timeout = 1;
+ do_div(first_period_max, gwdt->clk);
+ wdd->max_timeout = first_period_max;
+
+ wdd->pretimeout = DEFAULT_PRETIMEOUT;
+ wdd->timeout = DEFAULT_TIMEOUT;
+ watchdog_init_timeouts(wdd, pretimeout, timeout, dev);
+
+ status = sbsa_gwdt_cf_read(SBSA_GWDT_WCS, wdd);
+ if (status & SBSA_GWDT_WCS_WS1) {
+ dev_warn(dev, "System reset by WDT(WCV: %llx)\n",
+ sbsa_gwdt_get_wcv(wdd));
+ wdd->bootstatus |= WDIOF_CARDRESET;
+ }
+ /* Check if watchdog is already enabled */
+ if (status & SBSA_GWDT_WCS_EN) {
+ dev_warn(dev, "already enabled!\n");
+ sbsa_gwdt_keepalive(wdd);
+ }
+
+ /* update pretimeout to WOR */
+ sbsa_gwdt_set_pretimeout(wdd, wdd->pretimeout);
+
+ ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0,
+ pdev->name, gwdt);
+ if (ret) {
+ dev_err(dev, "unable to request IRQ %d\n", irq);
+ return ret;
+ }
+
+ ret = watchdog_register_device(wdd);
+ if (ret)
+ return ret;
+
+ dev_info(dev, "Initialized with %ds timeout, %ds pretimeout @ %u Hz\n",
+ wdd->timeout, wdd->pretimeout, gwdt->clk);
+
+ return 0;
+}
+
+static void sbsa_gwdt_shutdown(struct platform_device *pdev)
+{
+ struct sbsa_gwdt *gwdt = platform_get_drvdata(pdev);
+
+ sbsa_gwdt_stop(&gwdt->wdd);
+}
+
+static int sbsa_gwdt_remove(struct platform_device *pdev)
+{
+ struct sbsa_gwdt *gwdt = platform_get_drvdata(pdev);
+
+ watchdog_unregister_device(&gwdt->wdd);
+
+ return 0;
+}
+
+/* Disable watchdog if it is active during suspend */
+static int __maybe_unused sbsa_gwdt_suspend(struct device *dev)
+{
+ struct sbsa_gwdt *gwdt = dev_get_drvdata(dev);
+
+ if (watchdog_active(&gwdt->wdd))
+ sbsa_gwdt_stop(&gwdt->wdd);
+
+ return 0;
+}
+
+/* Enable watchdog and configure it if necessary */
+static int __maybe_unused sbsa_gwdt_resume(struct device *dev)
+{
+ struct sbsa_gwdt *gwdt = dev_get_drvdata(dev);
+
+ if (watchdog_active(&gwdt->wdd))
+ sbsa_gwdt_start(&gwdt->wdd);
+
+ return 0;
+}
+
+static const struct dev_pm_ops sbsa_gwdt_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(sbsa_gwdt_suspend, sbsa_gwdt_resume)
+};
+
+static const struct of_device_id sbsa_gwdt_of_match[] = {
+ { .compatible = "arm,sbsa-gwdt", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, sbsa_gwdt_of_match);
+
+static const struct platform_device_id sbsa_gwdt_pdev_match[] = {
+ { .name = "sbsa-gwdt", },
+ {},
+};
+MODULE_DEVICE_TABLE(platform, sbsa_gwdt_pdev_match);
+
+static struct platform_driver sbsa_gwdt_driver = {
+ .driver = {
+ .name = "sbsa-gwdt",
+ .pm = &sbsa_gwdt_pm_ops,
+ .of_match_table = sbsa_gwdt_of_match,
+ },
+ .probe = sbsa_gwdt_probe,
+ .remove = sbsa_gwdt_remove,
+ .shutdown = sbsa_gwdt_shutdown,
+ .id_table = sbsa_gwdt_pdev_match,
+};
+
+module_platform_driver(sbsa_gwdt_driver);
+
+MODULE_DESCRIPTION("SBSA Generic Watchdog Driver");
+MODULE_VERSION("v1.0");
+MODULE_AUTHOR("Fu Wei <[email protected]>");
+MODULE_AUTHOR("Suravee Suthikulpanit <[email protected]>");
+MODULE_LICENSE("GPL v2");
--
1.9.1

2015-06-02 04:07:38

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 6/7] ACPI: add GTDT table parse driver into ACPI driver

From: Fu Wei <[email protected]>

This driver adds support for parsing SBSA Generic Watchdog
Structure in GTDT, and creating a platform device with that
information. This allows the operating system to obtain device
data from the resource of platform device.

The platform device named "sbsa-gwdt" can be used by the
ARM SBSA Generic Watchdog driver.

Signed-off-by: Fu Wei <[email protected]>
Signed-off-by: Hanjun Guo <[email protected]>
---
drivers/acpi/Kconfig | 9 ++++
drivers/acpi/Makefile | 1 +
drivers/acpi/gtdt.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 147 insertions(+)
create mode 100644 drivers/acpi/gtdt.c

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 1bbaa3d..e125698 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -433,4 +433,13 @@ config XPOWER_PMIC_OPREGION

endif

+config ACPI_GTDT
+ bool "ACPI GTDT Support"
+ depends on ARM64
+ help
+ GTDT (Generic Timer Description Table) provides information
+ for per-processor timers and Platform (memory-mapped) timers
+ for ARM platforms. Select this option to provide information
+ needed for the timers init.
+
endif # ACPI
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 431e587..2c5a194 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -96,3 +96,4 @@ obj-$(CONFIG_ACPI_EXTLOG) += acpi_extlog.o
obj-$(CONFIG_PMIC_OPREGION) += pmic/intel_pmic.o
obj-$(CONFIG_CRC_PMIC_OPREGION) += pmic/intel_pmic_crc.o
obj-$(CONFIG_XPOWER_PMIC_OPREGION) += pmic/intel_pmic_xpower.o
+obj-$(CONFIG_ACPI_GTDT) += gtdt.o
diff --git a/drivers/acpi/gtdt.c b/drivers/acpi/gtdt.c
new file mode 100644
index 0000000..a92abf2
--- /dev/null
+++ b/drivers/acpi/gtdt.c
@@ -0,0 +1,137 @@
+/*
+ * ARM Specific GTDT table Support
+ *
+ * Copyright (C) 2015, Linaro Ltd.
+ * Author: Fu Wei <[email protected]>
+ * Hanjun Guo <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+
+static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
+{
+ int trigger, polarity;
+
+ if (!interrupt)
+ return 0;
+
+ trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
+ : ACPI_LEVEL_SENSITIVE;
+
+ polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
+ : ACPI_ACTIVE_HIGH;
+
+ return acpi_register_gsi(NULL, interrupt, trigger, polarity);
+}
+
+/*
+ * Initialize a SBSA generic Watchdog platform device info from GTDT
+ * According to SBSA specification the size of refresh and control
+ * frames of SBSA Generic Watchdog is SZ_4K(Offset 0x000 – 0xFFF).
+ */
+static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
+ int index)
+{
+ struct platform_device *pdev;
+ int irq = map_generic_timer_interrupt(wd->timer_interrupt,
+ wd->timer_flags);
+ struct resource res[] = {
+ DEFINE_RES_IRQ_NAMED(irq, "ws0"),
+ DEFINE_RES_MEM_NAMED(wd->refresh_frame_address, SZ_4K,
+ "refresh"),
+ DEFINE_RES_MEM_NAMED(wd->control_frame_address, SZ_4K,
+ "control"),
+ };
+
+ pr_debug("GTDT: a Watchdog GT(0x%llx/0x%llx gsi:%u flags:0x%x)\n",
+ wd->refresh_frame_address, wd->control_frame_address,
+ wd->timer_interrupt, wd->timer_flags);
+
+ if (!(wd->refresh_frame_address &&
+ wd->control_frame_address &&
+ wd->timer_interrupt)) {
+ pr_err("GTDT: failed geting the device info.\n");
+ return -EINVAL;
+ }
+
+ if (irq < 0) {
+ pr_err("GTDT: failed to register GSI of the Watchdog GT.\n");
+ return -EINVAL;
+ }
+
+ /*
+ * Add a platform device named "sbsa-gwdt" to match the platform driver.
+ * "sbsa-gwdt": SBSA(Server Base System Architecture) Generic Watchdog
+ * The platform driver (like drivers/watchdog/sbsa_gwdt.c)can get device
+ * info below by matching this name.
+ */
+ pdev = platform_device_register_simple("sbsa-gwdt", index, res,
+ ARRAY_SIZE(res));
+ if (IS_ERR(pdev)) {
+ acpi_unregister_gsi(wd->timer_interrupt);
+ return PTR_ERR(pdev);
+ }
+
+ return 0;
+}
+
+static int __init gtdt_platform_timer_parse(struct acpi_table_header *table)
+{
+ struct acpi_gtdt_header *header;
+ struct acpi_table_gtdt *gtdt;
+ void *gtdt_subtable;
+ int i, gwdt_index;
+ int ret = 0;
+
+ if (table->revision < 2) {
+ pr_warn("GTDT: Revision:%d doesn't support Platform Timers.\n",
+ table->revision);
+ return 0;
+ }
+
+ gtdt = container_of(table, struct acpi_table_gtdt, header);
+ if (!gtdt->platform_timer_count) {
+ pr_info("GTDT: No Platform Timer structures.\n");
+ return 0;
+ }
+
+ gtdt_subtable = (void *)gtdt + gtdt->platform_timer_offset;
+
+ for (i = 0, gwdt_index = 0; i < gtdt->platform_timer_count; i++) {
+ if (gtdt_subtable > (void *)table + table->length) {
+ pr_err("GTDT: subtable pointer overflows, bad table\n");
+ return -EINVAL;
+ }
+ header = (struct acpi_gtdt_header *)gtdt_subtable;
+ if (header->type == ACPI_GTDT_TYPE_WATCHDOG) {
+ ret = gtdt_import_sbsa_gwdt(gtdt_subtable, gwdt_index);
+ if (ret)
+ pr_err("GTDT: failed to import subtable %d\n",
+ i);
+ else
+ gwdt_index++;
+ }
+ gtdt_subtable += header->length;
+ }
+
+ return ret;
+}
+
+static int __init gtdt_platform_timer_init(void)
+{
+ if (acpi_disabled)
+ return 0;
+
+ return acpi_table_parse(ACPI_SIG_GTDT, gtdt_platform_timer_parse);
+}
+
+device_initcall(gtdt_platform_timer_init);
--
1.9.1

2015-06-02 04:07:27

by Fu Wei

[permalink] [raw]
Subject: [PATCH v4 7/7] clocksource: simplify ACPI code in arm_arch_timer.c

From: Fu Wei <[email protected]>

The patch update arm_arch_timer driver to use the function
provided by the new GTDT driver of ACPI.
By this way, arm_arch_timer.c can be simplified, and separate
all the ACPI GTDT knowledge from this timer driver.

Signed-off-by: Fu Wei <[email protected]>
Signed-off-by: Hanjun Guo <[email protected]>
---
arch/arm64/kernel/time.c | 4 +--
drivers/acpi/gtdt.c | 43 ++++++++++++++++++++++++++
drivers/clocksource/Kconfig | 1 +
drivers/clocksource/arm_arch_timer.c | 60 +++++++-----------------------------
include/clocksource/arm_arch_timer.h | 8 +++++
include/linux/acpi.h | 5 +++
include/linux/clocksource.h | 4 +--
7 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
index 42f9195..2cabea6 100644
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -75,9 +75,9 @@ void __init time_init(void)

/*
* Since ACPI or FDT will only one be available in the system,
- * we can use acpi_generic_timer_init() here safely
+ * we can use arch_timer_acpi_init() here safely
*/
- acpi_generic_timer_init();
+ arch_timer_acpi_init();

arch_timer_rate = arch_timer_get_rate();
if (!arch_timer_rate)
diff --git a/drivers/acpi/gtdt.c b/drivers/acpi/gtdt.c
index a92abf2..699c9fd 100644
--- a/drivers/acpi/gtdt.c
+++ b/drivers/acpi/gtdt.c
@@ -17,6 +17,8 @@
#include <linux/module.h>
#include <linux/platform_device.h>

+#include <clocksource/arm_arch_timer.h>
+
static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
{
int trigger, polarity;
@@ -33,6 +35,47 @@ static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
return acpi_register_gsi(NULL, interrupt, trigger, polarity);
}

+struct arch_timer_data __initdata *arch_timer_data_p;
+
+static int __init arch_timer_data_init(struct acpi_table_header *table)
+{
+ struct acpi_table_gtdt *gtdt;
+
+ gtdt = container_of(table, struct acpi_table_gtdt, header);
+
+ arch_timer_data_p->phys_secure_ppi =
+ map_generic_timer_interrupt(gtdt->secure_el1_interrupt,
+ gtdt->secure_el1_flags);
+
+ arch_timer_data_p->phys_nonsecure_ppi =
+ map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt,
+ gtdt->non_secure_el1_flags);
+
+ arch_timer_data_p->virt_ppi =
+ map_generic_timer_interrupt(gtdt->virtual_timer_interrupt,
+ gtdt->virtual_timer_flags);
+
+ arch_timer_data_p->hyp_ppi =
+ map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt,
+ gtdt->non_secure_el2_flags);
+
+ arch_timer_data_p->c3stop = !(gtdt->non_secure_el1_flags &
+ ACPI_GTDT_ALWAYS_ON);
+
+ return 0;
+}
+
+/* Initialize the arch_timer_data struct for arm_arch_timer by GTDT info */
+int __init gtdt_arch_timer_data_init(struct arch_timer_data *data)
+{
+ if (acpi_disabled || !data)
+ return -EINVAL;
+
+ arch_timer_data_p = data;
+
+ return acpi_table_parse(ACPI_SIG_GTDT, arch_timer_data_init);
+}
+
/*
* Initialize a SBSA generic Watchdog platform device info from GTDT
* According to SBSA specification the size of refresh and control
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 51d7865f..ea94a6f 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -109,6 +109,7 @@ config CLKSRC_EFM32
config ARM_ARCH_TIMER
bool
select CLKSRC_OF if OF
+ select ACPI_GTDT if ACPI

config ARM_ARCH_TIMER_EVTSTREAM
bool "Support for ARM architected timer event stream generation"
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 0aa135d..99505bb 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -817,68 +817,30 @@ CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
arch_timer_mem_init);

#ifdef CONFIG_ACPI
-static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
-{
- int trigger, polarity;
-
- if (!interrupt)
- return 0;
-
- trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
- : ACPI_LEVEL_SENSITIVE;
-
- polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
- : ACPI_ACTIVE_HIGH;
-
- return acpi_register_gsi(NULL, interrupt, trigger, polarity);
-}
-
/* Initialize per-processor generic timer */
-static int __init arch_timer_acpi_init(struct acpi_table_header *table)
+void __init arch_timer_acpi_init(void)
{
- struct acpi_table_gtdt *gtdt;
+ struct arch_timer_data data;

if (arch_timers_present & ARCH_CP15_TIMER) {
pr_warn("arch_timer: already initialized, skipping\n");
- return -EINVAL;
+ return;
}

- gtdt = container_of(table, struct acpi_table_gtdt, header);
-
arch_timers_present |= ARCH_CP15_TIMER;

- arch_timer_ppi[PHYS_SECURE_PPI] =
- map_generic_timer_interrupt(gtdt->secure_el1_interrupt,
- gtdt->secure_el1_flags);
-
- arch_timer_ppi[PHYS_NONSECURE_PPI] =
- map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt,
- gtdt->non_secure_el1_flags);
-
- arch_timer_ppi[VIRT_PPI] =
- map_generic_timer_interrupt(gtdt->virtual_timer_interrupt,
- gtdt->virtual_timer_flags);
+ if (gtdt_arch_timer_data_init(&data))
+ return;

- arch_timer_ppi[HYP_PPI] =
- map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt,
- gtdt->non_secure_el2_flags);
+ arch_timer_ppi[PHYS_SECURE_PPI] = data.phys_secure_ppi;
+ arch_timer_ppi[PHYS_NONSECURE_PPI] = data.phys_nonsecure_ppi;
+ arch_timer_ppi[VIRT_PPI] = data.virt_ppi;
+ arch_timer_ppi[HYP_PPI] = data.hyp_ppi;
+ /* Always-on capability */
+ arch_timer_c3stop = data.c3stop;

/* Get the frequency from CNTFRQ */
arch_timer_detect_rate(NULL, NULL);
-
- /* Always-on capability */
- arch_timer_c3stop = !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON);
-
arch_timer_init();
- return 0;
-}
-
-/* Initialize all the generic timers presented in GTDT */
-void __init acpi_generic_timer_init(void)
-{
- if (acpi_disabled)
- return;
-
- acpi_table_parse(ACPI_SIG_GTDT, arch_timer_acpi_init);
}
#endif
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 9916d0e..5deed9d 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -43,6 +43,14 @@ enum arch_timer_reg {

#define ARCH_TIMER_EVT_STREAM_FREQ 10000 /* 100us */

+struct arch_timer_data {
+ int phys_secure_ppi;
+ int phys_nonsecure_ppi;
+ int virt_ppi;
+ int hyp_ppi;
+ bool c3stop;
+};
+
#ifdef CONFIG_ARM_ARCH_TIMER

extern u32 arch_timer_get_rate(void);
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 598f0f1..7213c0d 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -458,6 +458,11 @@ void acpi_walk_dep_device_list(acpi_handle handle);
struct platform_device *acpi_create_platform_device(struct acpi_device *);
#define ACPI_PTR(_ptr) (_ptr)

+#ifdef CONFIG_ACPI_GTDT
+struct arch_timer_data;
+int __init gtdt_arch_timer_data_init(struct arch_timer_data *data);
+#endif
+
#else /* !CONFIG_ACPI */

#define acpi_disabled 1
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index d27d015..264e553 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -254,9 +254,9 @@ static inline void clocksource_of_init(void) {}
#endif

#ifdef CONFIG_ACPI
-void acpi_generic_timer_init(void);
+void arch_timer_acpi_init(void);
#else
-static inline void acpi_generic_timer_init(void) { }
+static inline void arch_timer_acpi_init(void) { }
#endif

#endif /* _LINUX_CLOCKSOURCE_H */
--
1.9.1

2015-06-02 15:32:49

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/01/2015 11:05 PM, [email protected] wrote:

> +/*
> + * help functions for accessing 32bit registers of SBSA Generic Watchdog
> + */
> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
> + struct watchdog_device *wdd)
> +{
> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
> +
> + writel_relaxed(val, gwdt->control_base + reg);
> +}
> +
> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
> + struct watchdog_device *wdd)
> +{
> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
> +
> + writel_relaxed(val, gwdt->refresh_base + reg);
> +}
> +
> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device *wdd)
> +{
> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
> +
> + return readl_relaxed(gwdt->control_base + reg);
> +}

I still think you should get rid of these functions and just call
readl_relaxed() and writel_relaxed() every time, but I won't complain
again if you keep them.

> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
> +{
> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
> + struct watchdog_device *wdd = &gwdt->wdd;
> +
> + if (wdd->pretimeout)
> + /* The pretimeout is valid, go panic */
> + panic("SBSA Watchdog pre-timeout");
> + else
> + /* We don't use pretimeout, trigger WS1 now*/
> + sbsa_gwdt_set_wcv(wdd, 0);

I don't like this. The triggering of the hardware reset should never
depend on an interrupt being handled properly. You should always
program WCV correctly in advance. This is especially true since
pre-timeout will probably rarely be used. So what happens if the CPU is
totally hung and this interrupt handler is never called? When will the
timeout occur?

> +static int sbsa_gwdt_probe(struct platform_device *pdev)
> +{
> + u64 first_period_max = U64_MAX;
> + struct device *dev = &pdev->dev;
> + struct watchdog_device *wdd;
> + struct sbsa_gwdt *gwdt;
> + struct resource *res;
> + void *rf_base, *cf_base;
> + int ret, irq;
> + u32 status;
> +
> + gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL);
> + if (!gwdt)
> + return -ENOMEM;
> + platform_set_drvdata(pdev, gwdt);

You should probably do this *after* calling platform_get_irq_byname().

> +
> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "refresh");
> + rf_base = devm_ioremap_resource(dev, res);
> + if (IS_ERR(rf_base))
> + return PTR_ERR(rf_base);
> +
> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "control");
> + cf_base = devm_ioremap_resource(dev, res);
> + if (IS_ERR(cf_base))
> + return PTR_ERR(cf_base);
> +
> + irq = platform_get_irq_byname(pdev, "ws0");
> + if (irq < 0) {
> + dev_err(dev, "unable to get ws0 interrupt.\n");
> + return irq;
> + }
> +
> + /*
> + * Get the frequency of system counter from the cp15 interface of ARM
> + * Generic timer. We don't need to check it, because if it returns "0",
> + * system would panic in very early stage.
> + */
> + gwdt->clk = arch_timer_get_cntfrq();
> + gwdt->refresh_base = rf_base;
> + gwdt->control_base = cf_base;
> +
> + wdd = &gwdt->wdd;
> + wdd->parent = dev;
> + wdd->info = &sbsa_gwdt_info;
> + wdd->ops = &sbsa_gwdt_ops;
> + watchdog_set_drvdata(wdd, gwdt);
> + watchdog_set_nowayout(wdd, nowayout);
> +
> + wdd->min_pretimeout = 0;
> + wdd->max_pretimeout = U32_MAX / gwdt->clk;
> + wdd->min_timeout = 1;
> + do_div(first_period_max, gwdt->clk);
> + wdd->max_timeout = first_period_max;
> +
> + wdd->pretimeout = DEFAULT_PRETIMEOUT;
> + wdd->timeout = DEFAULT_TIMEOUT;
> + watchdog_init_timeouts(wdd, pretimeout, timeout, dev);
> +
> + status = sbsa_gwdt_cf_read(SBSA_GWDT_WCS, wdd);
> + if (status & SBSA_GWDT_WCS_WS1) {
> + dev_warn(dev, "System reset by WDT(WCV: %llx)\n",
> + sbsa_gwdt_get_wcv(wdd));

"System was previously reset via watchdog" is much clearer.

> + wdd->bootstatus |= WDIOF_CARDRESET;
> + }
> + /* Check if watchdog is already enabled */
> + if (status & SBSA_GWDT_WCS_EN) {
> + dev_warn(dev, "already enabled!\n");

"watchdog is already enabled". Never use exclamation marks in kernel
messages.

> + sbsa_gwdt_keepalive(wdd);
> + }
> +
> + /* update pretimeout to WOR */
> + sbsa_gwdt_set_pretimeout(wdd, wdd->pretimeout);
> +
> + ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0,
> + pdev->name, gwdt);
> + if (ret) {
> + dev_err(dev, "unable to request IRQ %d\n", irq);
> + return ret;
> + }
> +
> + ret = watchdog_register_device(wdd);
> + if (ret)
> + return ret;
> +
> + dev_info(dev, "Initialized with %ds timeout, %ds pretimeout @ %u Hz\n",
> + wdd->timeout, wdd->pretimeout, gwdt->clk);

if (wdd->pretimeout)
"watchdog initialized to %us timeout and %us pre-timeout at %u Hz\n",
wdd->timeout, wdd->pretimeout, gwdt->clk
else
"watchdog initialized to %us timeout at %u Hz\n", wdd->timeout, gwdt->clk

I think it's unlikely that users will use pre-timeout, so your code
should treat pre-timeout as a special case, not the normal usage.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

2015-06-02 15:38:06

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/02/2015 08:32 AM, Timur Tabi wrote:
> On 06/01/2015 11:05 PM, [email protected] wrote:
>
>> +/*
>> + * help functions for accessing 32bit registers of SBSA Generic Watchdog
>> + */
>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>> + struct watchdog_device *wdd)
>> +{
>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>> +
>> + writel_relaxed(val, gwdt->control_base + reg);
>> +}
>> +
>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>> + struct watchdog_device *wdd)
>> +{
>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>> +
>> + writel_relaxed(val, gwdt->refresh_base + reg);
>> +}
>> +
>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device *wdd)
>> +{
>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>> +
>> + return readl_relaxed(gwdt->control_base + reg);
>> +}
>
> I still think you should get rid of these functions and just call readl_relaxed() and writel_relaxed() every time, but I won't complain again if you keep them.
>
>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>> +{
>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>> + struct watchdog_device *wdd = &gwdt->wdd;
>> +
>> + if (wdd->pretimeout)
>> + /* The pretimeout is valid, go panic */
>> + panic("SBSA Watchdog pre-timeout");
>> + else
>> + /* We don't use pretimeout, trigger WS1 now*/
>> + sbsa_gwdt_set_wcv(wdd, 0);
>
> I don't like this. The triggering of the hardware reset should never depend on an interrupt being handled properly. You should always program WCV correctly in advance. This is especially true since pre-timeout will probably rarely be used. So what happens if the CPU is totally hung and this interrupt handler is never called? When will the timeout occur?
>
>> +static int sbsa_gwdt_probe(struct platform_device *pdev)
>> +{
>> + u64 first_period_max = U64_MAX;
>> + struct device *dev = &pdev->dev;
>> + struct watchdog_device *wdd;
>> + struct sbsa_gwdt *gwdt;
>> + struct resource *res;
>> + void *rf_base, *cf_base;
>> + int ret, irq;
>> + u32 status;
>> +
>> + gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL);
>> + if (!gwdt)
>> + return -ENOMEM;
>> + platform_set_drvdata(pdev, gwdt);
>
> You should probably do this *after* calling platform_get_irq_byname().
>
>> +
>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "refresh");
>> + rf_base = devm_ioremap_resource(dev, res);
>> + if (IS_ERR(rf_base))
>> + return PTR_ERR(rf_base);
>> +
>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "control");
>> + cf_base = devm_ioremap_resource(dev, res);
>> + if (IS_ERR(cf_base))
>> + return PTR_ERR(cf_base);
>> +
>> + irq = platform_get_irq_byname(pdev, "ws0");
>> + if (irq < 0) {
>> + dev_err(dev, "unable to get ws0 interrupt.\n");
>> + return irq;
>> + }
>> +
>> + /*
>> + * Get the frequency of system counter from the cp15 interface of ARM
>> + * Generic timer. We don't need to check it, because if it returns "0",
>> + * system would panic in very early stage.
>> + */
>> + gwdt->clk = arch_timer_get_cntfrq();
>> + gwdt->refresh_base = rf_base;
>> + gwdt->control_base = cf_base;
>> +
>> + wdd = &gwdt->wdd;
>> + wdd->parent = dev;
>> + wdd->info = &sbsa_gwdt_info;
>> + wdd->ops = &sbsa_gwdt_ops;
>> + watchdog_set_drvdata(wdd, gwdt);
>> + watchdog_set_nowayout(wdd, nowayout);
>> +
>> + wdd->min_pretimeout = 0;
>> + wdd->max_pretimeout = U32_MAX / gwdt->clk;
>> + wdd->min_timeout = 1;
>> + do_div(first_period_max, gwdt->clk);
>> + wdd->max_timeout = first_period_max;
>> +
>> + wdd->pretimeout = DEFAULT_PRETIMEOUT;
>> + wdd->timeout = DEFAULT_TIMEOUT;
>> + watchdog_init_timeouts(wdd, pretimeout, timeout, dev);
>> +
>> + status = sbsa_gwdt_cf_read(SBSA_GWDT_WCS, wdd);
>> + if (status & SBSA_GWDT_WCS_WS1) {
>> + dev_warn(dev, "System reset by WDT(WCV: %llx)\n",
>> + sbsa_gwdt_get_wcv(wdd));
>
> "System was previously reset via watchdog" is much clearer.
>
>> + wdd->bootstatus |= WDIOF_CARDRESET;
>> + }
>> + /* Check if watchdog is already enabled */
>> + if (status & SBSA_GWDT_WCS_EN) {
>> + dev_warn(dev, "already enabled!\n");
>
> "watchdog is already enabled". Never use exclamation marks in kernel messages.
>
>> + sbsa_gwdt_keepalive(wdd);
>> + }
>> +
>> + /* update pretimeout to WOR */
>> + sbsa_gwdt_set_pretimeout(wdd, wdd->pretimeout);
>> +
>> + ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0,
>> + pdev->name, gwdt);
>> + if (ret) {
>> + dev_err(dev, "unable to request IRQ %d\n", irq);
>> + return ret;
>> + }
>> +
>> + ret = watchdog_register_device(wdd);
>> + if (ret)
>> + return ret;
>> +
>> + dev_info(dev, "Initialized with %ds timeout, %ds pretimeout @ %u Hz\n",
>> + wdd->timeout, wdd->pretimeout, gwdt->clk);
>
> if (wdd->pretimeout)
> "watchdog initialized to %us timeout and %us pre-timeout at %u Hz\n", wdd->timeout, wdd->pretimeout, gwdt->clk
> else
> "watchdog initialized to %us timeout at %u Hz\n", wdd->timeout, gwdt->clk
>
> I think it's unlikely that users will use pre-timeout, so your code should treat pre-timeout as a special case, not the normal usage.
>
+1 to all your comments.

Guenter

2015-06-02 16:12:34

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 4/7] Watchdog: introdouce "pretimeout" into framework

On 06/01/2015 09:05 PM, [email protected] wrote:
> From: Fu Wei <[email protected]>
>
> Also update Documentation/watchdog/watchdog-kernel-api.txt to
> introduce:
> (1)the new elements in the watchdog_device and watchdog_ops struct;
> (2)the new API "watchdog_init_timeouts"
>
> Reasons:
> (1)kernel already has two watchdog drivers are using "pretimeout":
> drivers/char/ipmi/ipmi_watchdog.c
> drivers/watchdog/kempld_wdt.c(but the definition is different)
> (2)some other drivers are going to use this: ARM SBSA Generic Watchdog
>
> Signed-off-by: Fu Wei <[email protected]>
> ---
> Documentation/watchdog/watchdog-kernel-api.txt | 47 ++++++++--
> drivers/watchdog/watchdog_core.c | 115 +++++++++++++++++++------
> drivers/watchdog/watchdog_dev.c | 53 ++++++++++++
> include/linux/watchdog.h | 33 ++++++-
> 4 files changed, 212 insertions(+), 36 deletions(-)
>
> diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
> index a0438f3..95b355d 100644
> --- a/Documentation/watchdog/watchdog-kernel-api.txt
> +++ b/Documentation/watchdog/watchdog-kernel-api.txt
> @@ -49,6 +49,9 @@ struct watchdog_device {
> unsigned int timeout;
> unsigned int min_timeout;
> unsigned int max_timeout;
> + unsigned int pretimeout;
> + unsigned int min_pretimeout;
> + unsigned int max_pretimeout;
> void *driver_data;
> struct mutex lock;
> unsigned long status;
> @@ -70,6 +73,9 @@ It contains following fields:
> * timeout: the watchdog timer's timeout value (in seconds).
> * min_timeout: the watchdog timer's minimum timeout value (in seconds).
> * max_timeout: the watchdog timer's maximum timeout value (in seconds).
> +* pretimeout: the watchdog timer's pretimeout value (in seconds).
> +* min_pretimeout: the watchdog timer's minimum pretimeout value (in seconds).
> +* max_pretimeout: the watchdog timer's maximum pretimeout value (in seconds).
> * bootstatus: status of the device after booting (reported with watchdog
> WDIOF_* status bits).
> * driver_data: a pointer to the drivers private data of a watchdog device.
> @@ -92,6 +98,7 @@ struct watchdog_ops {
> int (*ping)(struct watchdog_device *);
> unsigned int (*status)(struct watchdog_device *);
> int (*set_timeout)(struct watchdog_device *, unsigned int);
> + int (*set_pretimeout)(struct watchdog_device *, unsigned int);
> unsigned int (*get_timeleft)(struct watchdog_device *);
> void (*ref)(struct watchdog_device *);
> void (*unref)(struct watchdog_device *);
> @@ -153,9 +160,19 @@ they are supported. These optional routines/operations are:
> and -EIO for "could not write value to the watchdog". On success this
> routine should set the timeout value of the watchdog_device to the
> achieved timeout value (which may be different from the requested one
> - because the watchdog does not necessarily has a 1 second resolution).
> + because the watchdog does not necessarily has a 1 second resolution;
> + If the driver supports pretimeout, then the timeout value must be greater
> + than that).
> (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
> watchdog's info structure).
> +* set_pretimeout: this routine checks and changes the pretimeout of the
> + watchdog timer device. It returns 0 on success, -EINVAL for "parameter out of
> + range" and -EIO for "could not write value to the watchdog". On success this
> + routine should set the pretimeout value of the watchdog_device to the
> + achieved pretimeout value (which may be different from the requested one
> + because the watchdog does not necessarily has a 1 second resolution).
> + (Note: the WDIOF_PRETIMEOUT needs to be set in the options field of the
> + watchdog's info structure).
> * get_timeleft: this routines returns the time that's left before a reset.
> * ref: the operation that calls kref_get on the kref of a dynamically
> allocated watchdog_device struct.
> @@ -219,8 +236,28 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,
> unsigned int timeout_parm, struct device *dev);
>
> The watchdog_init_timeout function allows you to initialize the timeout field
> -using the module timeout parameter or by retrieving the timeout-sec property from
> -the device tree (if the module timeout parameter is invalid). Best practice is
> -to set the default timeout value as timeout value in the watchdog_device and
> -then use this function to set the user "preferred" timeout value.
> +using the module timeout parameter or by retrieving the first element of
> +the timeout-sec property from the device tree (if the module timeout parameter
> +is invalid). Best practice is to set the default timeout value as timeout value
> +in the watchdog_device and then use this function to set the user "preferred"
> +timeout value.
> +This routine returns zero on success and a negative errno code for failure.
> +
> +Some watchdog timers have two stage of timeouts(timeout and pretimeout),
> +to initialize the timeout and pretimeout fields at the same time, the following
> +function can be used:
> +
> +extern int watchdog_init_timeouts(struct watchdog_device *wdd,
> + unsigned int pretimeout_parm,
> + unsigned int timeout_parm,
> + struct device *dev);
> +
> +The watchdog_init_timeouts function allows you to initialize the pretimeout and
> +timeout fields using the module pretimeout and timeout parameter or by
> +retrieving the elements in the timeout-sec property(the first element is for
> +timeout, the second one is for pretimeout) from the device tree(if the module
> +pretimeout and timeout parameter are invalid).
> +Best practice is to set the default pretimeout and timeout value as pretimeout
> +and timeout value in the watchdog_device and then use this function to set the
> +user "preferred" pretimeout value.
> This routine returns zero on success and a negative errno code for failure.
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index cec9b55..ff18db3 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -43,60 +43,119 @@
> static DEFINE_IDA(watchdog_ida);
> static struct class *watchdog_class;
>
> -static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
> +static void watchdog_check_min_max_timeouts(struct watchdog_device *wdd)
> {
> /*
> - * Check that we have valid min and max timeout values, if
> - * not reset them both to 0 (=not used or unknown)
> + * Check that we have valid min and max pretimeout and timeout values,
> + * if not reset them both to 0 (=not used or unknown)
> */
> + if (wdd->min_pretimeout > wdd->max_pretimeout) {
> + pr_info("Invalid min and max pretimeout, resetting to 0\n");
> + wdd->min_pretimeout = 0;
> + wdd->max_pretimeout = 0;
> + }
> if (wdd->min_timeout > wdd->max_timeout) {
> pr_info("Invalid min and max timeout values, resetting to 0!\n");
> wdd->min_timeout = 0;
> wdd->max_timeout = 0;
> }
> + /*
> + * Check that we have valid min timeout and max pretimeout values,
> + * if not reset them.
> + */
> + if (wdd->min_pretimeout && wdd->min_timeout < wdd->min_pretimeout) {
> + pr_info("Invalid min_timeout, resetting to min_pretimeout+1\n");
> + wdd->min_timeout = wdd->min_pretimeout + 1;
> + }

min_timeout = 10
max_timeout = 20
min_pretimeout = 30
max_pretimeout = 40

will result in min_timeout set to 31.

> + if (wdd->max_pretimeout && wdd->max_timeout < wdd->max_pretimeout) {
> + pr_info("Invalid max_pretimeout, resetting to max_timeout-1\n");
> + wdd->max_pretimeout = wdd->max_timeout - 1;

and then you set max_pretimeout to 19.

So we'll end up with

min_timeout = 31
max_timeout = 20
min_pretimeout = 30
max_pretimeout = 19

Another example: max_pretimeout = 10, max_timeout = 0 -> max_pretimeout = -1.

> + }

You'll need to determine valid ranges and then enforce those.
Maybe something like
min_pretimeout < min_timeout < max_timeout
and
min_pretimeout < max_pretimeout < max_timeout

Not sure if we should adjust anything to a value different to 0.
Seems to me this is just asking for trouble.

> }
>
> /**
> - * watchdog_init_timeout() - initialize the timeout field
> + * watchdog_init_timeouts() - initialize the pretimeout and timeout field
> + * @pretimeout_parm: pretimeout module parameter
> * @timeout_parm: timeout module parameter
> * @dev: Device that stores the timeout-sec property
> *
> - * Initialize the timeout field of the watchdog_device struct with either the
> - * timeout module parameter (if it is valid value) or the timeout-sec property
> - * (only if it is a valid value and the timeout_parm is out of bounds).
> - * If none of them are valid then we keep the old value (which should normally
> - * be the default timeout value.
> + * Initialize the pretimeout and timeout field of the watchdog_device struct
> + * with either the pretimeout and timeout module parameter (if it is valid
> + * value) or the timeout-sec property (only if it is a valid value and the

just 'if it is valid', or 'if it is a valid value' in both cases.

> + * pretimeout_parm and timeout_parm is out of bounds). If none of them are

s/and/or/

> + * valid, then we keep the old value (which should normally be the default
> + * timeout value).
> *
> * A zero is returned on success and -EINVAL for failure.
> */
> -int watchdog_init_timeout(struct watchdog_device *wdd,
> - unsigned int timeout_parm, struct device *dev)
> +int watchdog_init_timeouts(struct watchdog_device *wdd,
> + unsigned int pretimeout_parm,
> + unsigned int timeout_parm,
> + struct device *dev)
> {
> - unsigned int t = 0;
> - int ret = 0;
> + int ret = 0, length = 0;
> + u32 timeouts[2] = {0};
>
> - watchdog_check_min_max_timeout(wdd);
> + watchdog_check_min_max_timeouts(wdd);
>
> - /* try to get the timeout module parameter first */
> - if (!watchdog_timeout_invalid(wdd, timeout_parm) && timeout_parm) {
> - wdd->timeout = timeout_parm;
> - return ret;
> + /*
> + * Try to get the pretimeout module parameter first,
> + * but it can be "0", that means we don't use pretimeout.
> + */
> + if (pretimeout_parm) {
> + if (!watchdog_pretimeout_invalid(wdd, pretimeout_parm))
> + timeouts[1] = pretimeout_parm;
> + ret = -EINVAL; /* pretimeout_parm is not "0", and invalid */

pretimeout_parm is always invalid ? Why ?

> }
> - if (timeout_parm)
> +
> + /*
> + * Try to get the timeout module parameter,
> + * if it's valid and pretimeout is ont invalid(ret == 0),

s/ont/not/

> + * assignment and return zero. Otherwise, try dtb.
> + */
> + if (timeout_parm) {
> + if (!watchdog_timeout_invalid(wdd, timeout_parm) && !ret) {

Unless I am missing something, you'll never get here if the pretimeout module
parameter is set.

> + wdd->timeout = timeout_parm;
> + wdd->pretimeout = timeouts[1];
> + return 0;
> + }
> ret = -EINVAL;
> + }
>
> - /* try to get the timeout_sec property */
> + /*
> + * We get here, the situation is one of them:
> + * (1)at least one of parameters is invalid(ret = -EINVAL);
> + * (2)both of them are 0.(ret = 0)
> + * So give up the module parameter(at least drop the invalid one),
> + * try to get the timeout_sec property from dtb.

Please simplify the comment.

Either at least one of the module parameters is invalid,
or both of them are 0. Try to get the timeout_sec property.

> + */
> if (dev == NULL || dev->of_node == NULL)
> return ret;
> - of_property_read_u32(dev->of_node, "timeout-sec", &t);
> - if (!watchdog_timeout_invalid(wdd, t) && t)
> - wdd->timeout = t;
> - else
> - ret = -EINVAL;
>
> - return ret;
> + /*
> + * We get here, that means maybe we can get timeouts from dtb,
> + * if "timeout-sec" is available and the data is valid.
> + */

That comment is not needed; it is obvious.

> + of_find_property(dev->of_node, "timeout-sec", &length);
> + if (length > 0 && length <= sizeof(u32) * 2) {

You need to check the return value of of_find_property() instead of assuming
that length will be 0 if it is not found.

> + of_property_read_u32_array(dev->of_node,
> + "timeout-sec", timeouts,
> + length / sizeof(u32));
> + if (length == sizeof(u32) * 2 && timeouts[1] &&
> + watchdog_pretimeout_invalid(wdd, timeouts[1]))
> + return -EINVAL; /* pretimeout is invalid */

Obvious comment.

> +
> + if (!watchdog_timeout_invalid(wdd, timeouts[0]) &&
> + timeouts[0]) {
> + wdd->timeout = timeouts[0];
> + wdd->pretimeout = timeouts[1];
> + return 0;
> + }
> + }
> +
> + return -EINVAL;
> }
> -EXPORT_SYMBOL_GPL(watchdog_init_timeout);
> +EXPORT_SYMBOL_GPL(watchdog_init_timeouts);
>
> /**
> * watchdog_register_device() - register a watchdog device
> @@ -119,7 +178,7 @@ int watchdog_register_device(struct watchdog_device *wdd)
> if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
> return -EINVAL;
>
> - watchdog_check_min_max_timeout(wdd);
> + watchdog_check_min_max_timeouts(wdd);
>
> /*
> * Note: now that all watchdog_device data has been verified, we
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 6aaefba..af0777e 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -218,6 +218,38 @@ out_timeout:
> }
>
> /*
> + * watchdog_set_pretimeout: set the watchdog timer pretimeout
> + * @wddev: the watchdog device to set the timeout for
> + * @pretimeout: pretimeout to set in seconds
> + */
> +
> +static int watchdog_set_pretimeout(struct watchdog_device *wddev,
> + unsigned int pretimeout)
> +{
> + int err;
> +
> + if (!wddev->ops->set_pretimeout ||
> + !(wddev->info->options & WDIOF_PRETIMEOUT))
> + return -EOPNOTSUPP;
> +
> + if (watchdog_pretimeout_invalid(wddev, pretimeout))
> + return -EINVAL;
> +
> + mutex_lock(&wddev->lock);
> +
> + if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
> + err = -ENODEV;
> + goto out_pretimeout;
> + }
> +
> + err = wddev->ops->set_pretimeout(wddev, pretimeout);
> +
> +out_pretimeout:
> + mutex_unlock(&wddev->lock);
> + return err;
> +}
> +
> +/*
> * watchdog_get_timeleft: wrapper to get the time left before a reboot
> * @wddev: the watchdog device to get the remaining time from
> * @timeleft: the time that's left
> @@ -388,6 +420,27 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd,
> if (wdd->timeout == 0)
> return -EOPNOTSUPP;
> return put_user(wdd->timeout, p);
> + case WDIOC_SETPRETIMEOUT:
> + /* check if we support the pretimeout */
> + if (!(wdd->info->options & WDIOF_PRETIMEOUT))
> + return -EOPNOTSUPP;
> + if (get_user(val, p))
> + return -EFAULT;
> + err = watchdog_set_pretimeout(wdd, val);
> + if (err < 0)
> + return err;
> + /*
> + * If the watchdog is active then we send a keepalive ping
> + * to make sure that the watchdog keeps running (and if
> + * possible that it takes the new pretimeout)
> + */
> + watchdog_ping(wdd);
> + /* Fall */
> + case WDIOC_GETPRETIMEOUT:
> + /* check if we support the pretimeout */
> + if (wdd->info->options & WDIOF_PRETIMEOUT)
> + return put_user(wdd->pretimeout, p);
> + return -EOPNOTSUPP;
> case WDIOC_GETTIMELEFT:
> err = watchdog_get_timeleft(wdd, &val);
> if (err)
> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> index a746bf5..b1e325d 100644
> --- a/include/linux/watchdog.h
> +++ b/include/linux/watchdog.h
> @@ -25,6 +25,7 @@ struct watchdog_device;
> * @ping: The routine that sends a keepalive ping to the watchdog device.
> * @status: The routine that shows the status of the watchdog device.
> * @set_timeout:The routine for setting the watchdog devices timeout value.
> + * @set_pretimeout:The routine for setting the watchdog devices pretimeout value
> * @get_timeleft:The routine that get's the time that's left before a reset.
> * @ref: The ref operation for dyn. allocated watchdog_device structs
> * @unref: The unref operation for dyn. allocated watchdog_device structs
> @@ -44,6 +45,7 @@ struct watchdog_ops {
> int (*ping)(struct watchdog_device *);
> unsigned int (*status)(struct watchdog_device *);
> int (*set_timeout)(struct watchdog_device *, unsigned int);
> + int (*set_pretimeout)(struct watchdog_device *, unsigned int);
> unsigned int (*get_timeleft)(struct watchdog_device *);
> void (*ref)(struct watchdog_device *);
> void (*unref)(struct watchdog_device *);
> @@ -62,6 +64,9 @@ struct watchdog_ops {
> * @timeout: The watchdog devices timeout value.
> * @min_timeout:The watchdog devices minimum timeout value.
> * @max_timeout:The watchdog devices maximum timeout value.
> + * @pretimeout: The watchdog devices pretimeout value.
> + * @min_pretimeout:The watchdog devices minimum pretimeout value.
> + * @max_pretimeout:The watchdog devices maximum pretimeout value.
> * @driver-data:Pointer to the drivers private data.
> * @lock: Lock for watchdog core internal use only.
> * @status: Field that contains the devices internal status bits.
> @@ -86,6 +91,9 @@ struct watchdog_device {
> unsigned int timeout;
> unsigned int min_timeout;
> unsigned int max_timeout;
> + unsigned int pretimeout;
> + unsigned int min_pretimeout;
> + unsigned int max_pretimeout;
> void *driver_data;
> struct mutex lock;
> unsigned long status;
> @@ -117,7 +125,17 @@ static inline void watchdog_set_nowayout(struct watchdog_device *wdd, bool noway
> static inline bool watchdog_timeout_invalid(struct watchdog_device *wdd, unsigned int t)
> {
> return ((wdd->max_timeout != 0) &&
> - (t < wdd->min_timeout || t > wdd->max_timeout));
> + (t < wdd->min_timeout || t > wdd->max_timeout ||
> + t <= wdd->pretimeout));
> +}
> +
> +/* Use the following function to check if a pretimeout value is invalid */
> +static inline bool watchdog_pretimeout_invalid(struct watchdog_device *wdd,
> + unsigned int t)
> +{
> + return (wdd->max_pretimeout != 0 &&
> + (t < wdd->min_pretimeout || t > wdd->max_pretimeout ||
> + (wdd->timeout != 0 && t >= wdd->timeout)));

This validates a pretimeout as valid if max_pretimeout == 0,
no matter what timeout is set to.

Do we really need to check if timeout == 0 ? Can that ever happen ?

> }
>
> /* Use the following functions to manipulate watchdog driver specific data */
> @@ -132,11 +150,20 @@ static inline void *watchdog_get_drvdata(struct watchdog_device *wdd)
> }
>
> /* drivers/watchdog/watchdog_core.c */
> -extern int watchdog_init_timeout(struct watchdog_device *wdd,
> - unsigned int timeout_parm, struct device *dev);
> +extern int watchdog_init_timeouts(struct watchdog_device *wdd,
> + unsigned int pretimeout_parm,
> + unsigned int timeout_parm,
> + struct device *dev);

No 'extern' here, please. Yes, we'll need to fix that for the other
function declarations, but that is not a reason to introduce new ones.

Thanks,
Guenter

2015-06-02 16:55:18

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Timur,

Thanks , feedback inline

On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
> On 06/01/2015 11:05 PM, [email protected] wrote:
>
>> +/*
>> + * help functions for accessing 32bit registers of SBSA Generic Watchdog
>> + */
>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>> + struct watchdog_device *wdd)
>> +{
>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>> +
>> + writel_relaxed(val, gwdt->control_base + reg);
>> +}
>> +
>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>> + struct watchdog_device *wdd)
>> +{
>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>> +
>> + writel_relaxed(val, gwdt->refresh_base + reg);
>> +}
>> +
>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device
>> *wdd)
>> +{
>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>> +
>> + return readl_relaxed(gwdt->control_base + reg);
>> +}
>
>
> I still think you should get rid of these functions and just call
> readl_relaxed() and writel_relaxed() every time, but I won't complain again
> if you keep them.

yes, that make sense, and will reduce the size of code, and I think
the code's readability will be OK too.
will try in my next patch,

>
>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>> +{
>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>> + struct watchdog_device *wdd = &gwdt->wdd;
>> +
>> + if (wdd->pretimeout)
>> + /* The pretimeout is valid, go panic */
>> + panic("SBSA Watchdog pre-timeout");
>> + else
>> + /* We don't use pretimeout, trigger WS1 now*/
>> + sbsa_gwdt_set_wcv(wdd, 0);
>
>
> I don't like this.

If so, what is your idea ,if pretimeout == 0?

the reason of using WCV as (timout - pretimeout): it can provide the
longer timeout period,
(1)If we use WOR, it can only provide 10s @ 400MHz(max).
as Guenter said earlier, the default timer out for most watchdog will
be 30s, so I think 10s limit will be a little short
(2)we can always program WCV just like ping.
(3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
we still can make this pretimeout longer by programming WCV(I don't
think it's necessary)


> The triggering of the hardware reset should never depend
> on an interrupt being handled properly.

if this fail, system reset in 1S, because WOR == (1s)

> You should always program WCV
> correctly in advance. This is especially true since pre-timeout will
> probably rarely be used.

always programming WCV is doable. But I absolutely can not agree "
pre-timeout will probably rarely be used"
If so, SBSA watchdog is just a normal watchdog, This use case just
makes this HW useless.
If so, go to use SP805.
you still don't see the importance of this warning and pretimeout to a
real server.

If the software of a real server goes wrong, then you just directly restart it ,
never try to sync/protect the current data, never try to figure out
what is wrong with it.
I don't think that is a good server software.

At least, I don't thinks " pre-timeout will probably rarely be used"
is a good idea for a server.
in another word, in a server ,pre-timeout should always be used.

> So what happens if the CPU is totally hung and

Again, system reset in 1S, because WOR == (1s).

> this interrupt handler is never called? When will the timeout occur?

if this interrupt hardler is never called, what I can see is "some
one is feeding the dog".
OK, in case, WS0 is triggered, but this interrupt hardler isn't
called, then software really goes wrong. Then , Again, Again system
reset in 1S, because WOR == (1s).

>
>> +static int sbsa_gwdt_probe(struct platform_device *pdev)
>> +{
>> + u64 first_period_max = U64_MAX;
>> + struct device *dev = &pdev->dev;
>> + struct watchdog_device *wdd;
>> + struct sbsa_gwdt *gwdt;
>> + struct resource *res;
>> + void *rf_base, *cf_base;
>> + int ret, irq;
>> + u32 status;
>> +
>> + gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL);
>> + if (!gwdt)
>> + return -ENOMEM;
>> + platform_set_drvdata(pdev, gwdt);
>
>
> You should probably do this *after* calling platform_get_irq_byname().

it just dose (pdev->) dev->driver_data = gwdt
If we got gwdt, we can do that.

But maybe I miss something(or a rule of usage), so please let know
why this has to be called *after* calling platform_get_irq_byname().

>
>> +
>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>> "refresh");
>> + rf_base = devm_ioremap_resource(dev, res);
>> + if (IS_ERR(rf_base))
>> + return PTR_ERR(rf_base);
>> +
>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>> "control");
>> + cf_base = devm_ioremap_resource(dev, res);
>> + if (IS_ERR(cf_base))
>> + return PTR_ERR(cf_base);
>> +
>> + irq = platform_get_irq_byname(pdev, "ws0");
>> + if (irq < 0) {
>> + dev_err(dev, "unable to get ws0 interrupt.\n");
>> + return irq;
>> + }
>> +
>> + /*
>> + * Get the frequency of system counter from the cp15 interface of
>> ARM
>> + * Generic timer. We don't need to check it, because if it returns
>> "0",
>> + * system would panic in very early stage.
>> + */
>> + gwdt->clk = arch_timer_get_cntfrq();
>> + gwdt->refresh_base = rf_base;
>> + gwdt->control_base = cf_base;
>> +
>> + wdd = &gwdt->wdd;
>> + wdd->parent = dev;
>> + wdd->info = &sbsa_gwdt_info;
>> + wdd->ops = &sbsa_gwdt_ops;
>> + watchdog_set_drvdata(wdd, gwdt);
>> + watchdog_set_nowayout(wdd, nowayout);
>> +
>> + wdd->min_pretimeout = 0;
>> + wdd->max_pretimeout = U32_MAX / gwdt->clk;
>> + wdd->min_timeout = 1;
>> + do_div(first_period_max, gwdt->clk);
>> + wdd->max_timeout = first_period_max;
>> +
>> + wdd->pretimeout = DEFAULT_PRETIMEOUT;
>> + wdd->timeout = DEFAULT_TIMEOUT;
>> + watchdog_init_timeouts(wdd, pretimeout, timeout, dev);
>> +
>> + status = sbsa_gwdt_cf_read(SBSA_GWDT_WCS, wdd);
>> + if (status & SBSA_GWDT_WCS_WS1) {
>> + dev_warn(dev, "System reset by WDT(WCV: %llx)\n",
>> + sbsa_gwdt_get_wcv(wdd));
>
>
> "System was previously reset via watchdog" is much clearer.

OK

>
>> + wdd->bootstatus |= WDIOF_CARDRESET;
>> + }
>> + /* Check if watchdog is already enabled */
>> + if (status & SBSA_GWDT_WCS_EN) {
>> + dev_warn(dev, "already enabled!\n");
>
>
> "watchdog is already enabled".

I think I don't need to print "watchdog", dev_warn(dev, has help us on this.
If you do so , the message will be "watchdog: watchdog0: watchdog is
already enabled"

> Never use exclamation marks in kernel
> messages.

that make sense , will delete it.

>
>> + sbsa_gwdt_keepalive(wdd);
>> + }
>> +
>> + /* update pretimeout to WOR */
>> + sbsa_gwdt_set_pretimeout(wdd, wdd->pretimeout);
>> +
>> + ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0,
>> + pdev->name, gwdt);
>> + if (ret) {
>> + dev_err(dev, "unable to request IRQ %d\n", irq);
>> + return ret;
>> + }
>> +
>> + ret = watchdog_register_device(wdd);
>> + if (ret)
>> + return ret;
>> +
>> + dev_info(dev, "Initialized with %ds timeout, %ds pretimeout @ %u
>> Hz\n",
>> + wdd->timeout, wdd->pretimeout, gwdt->clk);
>
>
> if (wdd->pretimeout)
> "watchdog initialized to %us timeout and %us pre-timeout at %u
> Hz\n", wdd->timeout, wdd->pretimeout, gwdt->clk
> else
> "watchdog initialized to %us timeout at %u Hz\n", wdd->timeout,
> gwdt->clk
>
> I think it's unlikely that users will use pre-timeout, so your code should
> treat pre-timeout as a special case, not the normal usage.

I don't think so, that why you didn't use pretimeout in your driver.
Because you don't see the meaning of "pretimeout" to a server.

>
> --
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-02 17:07:51

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/02/2015 09:55 AM, Fu Wei wrote:
> Hi Timur,
>
> Thanks , feedback inline
>
> On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
>> On 06/01/2015 11:05 PM, [email protected] wrote:
>>
>>> +/*
>>> + * help functions for accessing 32bit registers of SBSA Generic Watchdog
>>> + */
>>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>>> + struct watchdog_device *wdd)
>>> +{
>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>> +
>>> + writel_relaxed(val, gwdt->control_base + reg);
>>> +}
>>> +
>>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>>> + struct watchdog_device *wdd)
>>> +{
>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>> +
>>> + writel_relaxed(val, gwdt->refresh_base + reg);
>>> +}
>>> +
>>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device
>>> *wdd)
>>> +{
>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>> +
>>> + return readl_relaxed(gwdt->control_base + reg);
>>> +}
>>
>>
>> I still think you should get rid of these functions and just call
>> readl_relaxed() and writel_relaxed() every time, but I won't complain again
>> if you keep them.
>
> yes, that make sense, and will reduce the size of code, and I think
> the code's readability will be OK too.
> will try in my next patch,
>
>>
>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>> +{
>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>> +
>>> + if (wdd->pretimeout)
>>> + /* The pretimeout is valid, go panic */
>>> + panic("SBSA Watchdog pre-timeout");
>>> + else
>>> + /* We don't use pretimeout, trigger WS1 now*/
>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>
>>
>> I don't like this.
>
> If so, what is your idea ,if pretimeout == 0?
>
> the reason of using WCV as (timout - pretimeout): it can provide the
> longer timeout period,
> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
> as Guenter said earlier, the default timer out for most watchdog will
> be 30s, so I think 10s limit will be a little short
> (2)we can always program WCV just like ping.
> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
> we still can make this pretimeout longer by programming WCV(I don't
> think it's necessary)
>
>
>> The triggering of the hardware reset should never depend
>> on an interrupt being handled properly.
>
> if this fail, system reset in 1S, because WOR == (1s)
>
So ?

>> You should always program WCV
>> correctly in advance. This is especially true since pre-timeout will
>> probably rarely be used.
>
> always programming WCV is doable. But I absolutely can not agree "
> pre-timeout will probably rarely be used"
> If so, SBSA watchdog is just a normal watchdog, This use case just
> makes this HW useless.
> If so, go to use SP805.
> you still don't see the importance of this warning and pretimeout to a
> real server.
>

If pretimeout isn't used, why not just set WCV = timeout, WOR = 0 ?

Guenter

> If the software of a real server goes wrong, then you just directly restart it ,
> never try to sync/protect the current data, never try to figure out
> what is wrong with it.
> I don't think that is a good server software.
>
> At least, I don't thinks " pre-timeout will probably rarely be used"
> is a good idea for a server.
> in another word, in a server ,pre-timeout should always be used.
>
>> So what happens if the CPU is totally hung and
>
> Again, system reset in 1S, because WOR == (1s).
>
>> this interrupt handler is never called? When will the timeout occur?
>
> if this interrupt hardler is never called, what I can see is "some
> one is feeding the dog".
> OK, in case, WS0 is triggered, but this interrupt hardler isn't
> called, then software really goes wrong. Then , Again, Again system
> reset in 1S, because WOR == (1s).
>
>>
>>> +static int sbsa_gwdt_probe(struct platform_device *pdev)
>>> +{
>>> + u64 first_period_max = U64_MAX;
>>> + struct device *dev = &pdev->dev;
>>> + struct watchdog_device *wdd;
>>> + struct sbsa_gwdt *gwdt;
>>> + struct resource *res;
>>> + void *rf_base, *cf_base;
>>> + int ret, irq;
>>> + u32 status;
>>> +
>>> + gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL);
>>> + if (!gwdt)
>>> + return -ENOMEM;
>>> + platform_set_drvdata(pdev, gwdt);
>>
>>
>> You should probably do this *after* calling platform_get_irq_byname().
>
> it just dose (pdev->) dev->driver_data = gwdt
> If we got gwdt, we can do that.
>
> But maybe I miss something(or a rule of usage), so please let know
> why this has to be called *after* calling platform_get_irq_byname().
>
>>
>>> +
>>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>>> "refresh");
>>> + rf_base = devm_ioremap_resource(dev, res);
>>> + if (IS_ERR(rf_base))
>>> + return PTR_ERR(rf_base);
>>> +
>>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>>> "control");
>>> + cf_base = devm_ioremap_resource(dev, res);
>>> + if (IS_ERR(cf_base))
>>> + return PTR_ERR(cf_base);
>>> +
>>> + irq = platform_get_irq_byname(pdev, "ws0");
>>> + if (irq < 0) {
>>> + dev_err(dev, "unable to get ws0 interrupt.\n");
>>> + return irq;
>>> + }
>>> +
>>> + /*
>>> + * Get the frequency of system counter from the cp15 interface of
>>> ARM
>>> + * Generic timer. We don't need to check it, because if it returns
>>> "0",
>>> + * system would panic in very early stage.
>>> + */
>>> + gwdt->clk = arch_timer_get_cntfrq();
>>> + gwdt->refresh_base = rf_base;
>>> + gwdt->control_base = cf_base;
>>> +
>>> + wdd = &gwdt->wdd;
>>> + wdd->parent = dev;
>>> + wdd->info = &sbsa_gwdt_info;
>>> + wdd->ops = &sbsa_gwdt_ops;
>>> + watchdog_set_drvdata(wdd, gwdt);
>>> + watchdog_set_nowayout(wdd, nowayout);
>>> +
>>> + wdd->min_pretimeout = 0;
>>> + wdd->max_pretimeout = U32_MAX / gwdt->clk;
>>> + wdd->min_timeout = 1;
>>> + do_div(first_period_max, gwdt->clk);
>>> + wdd->max_timeout = first_period_max;
>>> +
>>> + wdd->pretimeout = DEFAULT_PRETIMEOUT;
>>> + wdd->timeout = DEFAULT_TIMEOUT;
>>> + watchdog_init_timeouts(wdd, pretimeout, timeout, dev);
>>> +
>>> + status = sbsa_gwdt_cf_read(SBSA_GWDT_WCS, wdd);
>>> + if (status & SBSA_GWDT_WCS_WS1) {
>>> + dev_warn(dev, "System reset by WDT(WCV: %llx)\n",
>>> + sbsa_gwdt_get_wcv(wdd));
>>
>>
>> "System was previously reset via watchdog" is much clearer.
>
> OK
>
>>
>>> + wdd->bootstatus |= WDIOF_CARDRESET;
>>> + }
>>> + /* Check if watchdog is already enabled */
>>> + if (status & SBSA_GWDT_WCS_EN) {
>>> + dev_warn(dev, "already enabled!\n");
>>
>>
>> "watchdog is already enabled".
>
> I think I don't need to print "watchdog", dev_warn(dev, has help us on this.
> If you do so , the message will be "watchdog: watchdog0: watchdog is
> already enabled"
>
> > Never use exclamation marks in kernel
>> messages.
>
> that make sense , will delete it.
>
>>
>>> + sbsa_gwdt_keepalive(wdd);
>>> + }
>>> +
>>> + /* update pretimeout to WOR */
>>> + sbsa_gwdt_set_pretimeout(wdd, wdd->pretimeout);
>>> +
>>> + ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0,
>>> + pdev->name, gwdt);
>>> + if (ret) {
>>> + dev_err(dev, "unable to request IRQ %d\n", irq);
>>> + return ret;
>>> + }
>>> +
>>> + ret = watchdog_register_device(wdd);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + dev_info(dev, "Initialized with %ds timeout, %ds pretimeout @ %u
>>> Hz\n",
>>> + wdd->timeout, wdd->pretimeout, gwdt->clk);
>>
>>
>> if (wdd->pretimeout)
>> "watchdog initialized to %us timeout and %us pre-timeout at %u
>> Hz\n", wdd->timeout, wdd->pretimeout, gwdt->clk
>> else
>> "watchdog initialized to %us timeout at %u Hz\n", wdd->timeout,
>> gwdt->clk
>>
>> I think it's unlikely that users will use pre-timeout, so your code should
>> treat pre-timeout as a special case, not the normal usage.
>
> I don't think so, that why you didn't use pretimeout in your driver.
> Because you don't see the meaning of "pretimeout" to a server.
>
>>
>> --
>> Qualcomm Innovation Center, Inc.
>> The Qualcomm Innovation Center, Inc. is a member of the
>> Code Aurora Forum, a Linux Foundation Collaborative Project.
>
>
>

2015-06-02 17:22:07

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/02/2015 11:55 AM, Fu Wei wrote:

>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>> +{
>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>> +
>>> + if (wdd->pretimeout)
>>> + /* The pretimeout is valid, go panic */
>>> + panic("SBSA Watchdog pre-timeout");
>>> + else
>>> + /* We don't use pretimeout, trigger WS1 now*/
>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>
>>
>> I don't like this.
>
> If so, what is your idea ,if pretimeout == 0?

If pretimeout == 0, then WCV should be set to timeout/2. The WS0
timeout will occur after timeout/2 seconds, and the driver will ignore
it in the interrupt handler. Then after another timeout/2 seconds, WS1
will timeout and the system will reset.

> the reason of using WCV as (timout - pretimeout): it can provide the
> longer timeout period,
> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
> as Guenter said earlier, the default timer out for most watchdog will
> be 30s, so I think 10s limit will be a little short
> (2)we can always program WCV just like ping.
> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
> we still can make this pretimeout longer by programming WCV(I don't
> think it's necessary)

I understand that, but like I said, I think pretimeout will be rarely
used, and so the driver should work best with pre-timeout enabled.

>> The triggering of the hardware reset should never depend
>> on an interrupt being handled properly.
>
> if this fail, system reset in 1S, because WOR == (1s)
>
>> You should always program WCV
>> correctly in advance. This is especially true since pre-timeout will
>> probably rarely be used.
>
> always programming WCV is doable. But I absolutely can not agree "
> pre-timeout will probably rarely be used"
> If so, SBSA watchdog is just a normal watchdog,

So what? What's wrong with that? Users don't have any control over
which watchdog hardware is on the SOC, and they don't care. They will
use the same watchdog software as they do on x86.

Users *want* a normal watchdog. Pre-timeout is only supported on time
watchdog hardware, so no one will standardize on it. ARM servers
supposed to be interchangeable with x86 servers. Customers are not
going to do anything special on an ARM server that they don't do on an
x86 server. I've been working in the ARM server industry for over two
years, and that's why everyone says.

> This use case just
> makes this HW useless.
> If so, go to use SP805.
> you still don't see the importance of this warning and pretimeout to a
> real server.
>
> If the software of a real server goes wrong, then you just directly restart it ,
> never try to sync/protect the current data, never try to figure out
> what is wrong with it.
> I don't think that is a good server software.

And that is exactly why my driver treats WS0 as the real timeout, and
WS1 as a "backup" timeout. When WS0 expires, the system tries to do a
"polite" reset. If that doesn't work, then WS1 will do a hard reset.

This also allows me to have a timeout that's twice as long as your
driver, so my timeout occurs in WS0, not WS1.

> At least, I don't thinks " pre-timeout will probably rarely be used"
> is a good idea for a server.
> in another word, in a server ,pre-timeout should always be used.

Except that few servers today support pre-timeout, so customers won't
depend on it.

>> You should probably do this *after* calling platform_get_irq_byname().
>
> it just dose (pdev->) dev->driver_data = gwdt
> If we got gwdt, we can do that.
>
> But maybe I miss something(or a rule of usage), so please let know
> why this has to be called *after* calling platform_get_irq_byname().

It's just so that you can avoid calling kzalloc() if the
platform_get_irq_byname() fails. It's not important.

> I think I don't need to print "watchdog", dev_warn(dev, has help us on this.
> If you do so , the message will be "watchdog: watchdog0: watchdog is
> already enabled"

Ok.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

2015-06-03 18:16:55

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/01/2015 11:05 PM, [email protected] wrote:
> + if (wdd->pretimeout)
> + /* The pretimeout is valid, go panic */
> + panic("SBSA Watchdog pre-timeout");

The problem with this is that WS1 will still occur. So a few seconds
after the panic() call, the hardware will reset. There won't be any
time to debug or log anything.

> + /* We don't use pretimeout, trigger WS1 now*/
> + sbsa_gwdt_set_wcv(wdd, 0);

Are you sure this will work? This will set WCV to 0, which means it
the WS1 reset won't happen until the timestamp counter wraps around to
0. That could be a very long time from now.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

2015-06-03 18:25:18

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On Wed, Jun 03, 2015 at 01:16:41PM -0500, Timur Tabi wrote:
> On 06/01/2015 11:05 PM, [email protected] wrote:
> >+ if (wdd->pretimeout)
> >+ /* The pretimeout is valid, go panic */
> >+ panic("SBSA Watchdog pre-timeout");
>
> The problem with this is that WS1 will still occur. So a few seconds after
> the panic() call, the hardware will reset. There won't be any time to debug
> or log anything.
>

In general the idea here would be to use a crashdump kernel, which,
when loaded, would reset the watchdog before it fires. This kernel
would then write a core dump to a specified location.

If arm64 doesn't support a crashdump kernel, it might still be possible
to log the backtrace somewhere (eg in nvram using pstore if that is
supported via acpi or efi).

Is there reason to believe that this all won't work on arm64 ?

Thanks,
Guenter

2015-06-03 18:53:41

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/03/2015 01:25 PM, Guenter Roeck wrote:
> In general the idea here would be to use a crashdump kernel, which,
> when loaded, would reset the watchdog before it fires. This kernel
> would then write a core dump to a specified location.

What is the mechanism for resetting the watchdog? The only code that
knows about the hardware registers is this driver. Does the crashdump
kernel call the watchdog stop function?

> If arm64 doesn't support a crashdump kernel, it might still be possible
> to log the backtrace somewhere (eg in nvram using pstore if that is
> supported via acpi or efi).

I think it's expected that the firmware support a crash dump mechanism
of some kind. But if we're talking about firmware support, then why
bother with the panic() in the first place?

> Is there reason to believe that this all won't work on arm64 ?

No, but I'm still trying to figure out why pre-timeout is valuable. If
we don't disable WS1, then we risk having the hardware reset before we
can take advantage of what the panic() offers. In which case, what's
the point of pre-timeout?

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

2015-06-03 19:29:32

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On Wednesday 03 June 2015 13:53:29 Timur Tabi wrote:
> On 06/03/2015 01:25 PM, Guenter Roeck wrote:
> > In general the idea here would be to use a crashdump kernel, which,
> > when loaded, would reset the watchdog before it fires. This kernel
> > would then write a core dump to a specified location.
>
> What is the mechanism for resetting the watchdog? The only code that
> knows about the hardware registers is this driver. Does the crashdump
> kernel call the watchdog stop function?

It might or might not, depending on what you want to achieve. In most
cases, I'd expect the crashdump kernel to have it enabled if we want
to let the user set a pretimeout, but that is a policy question that
is not for the kernel to decide.

> > If arm64 doesn't support a crashdump kernel, it might still be possible
> > to log the backtrace somewhere (eg in nvram using pstore if that is
> > supported via acpi or efi).
>
> I think it's expected that the firmware support a crash dump mechanism
> of some kind. But if we're talking about firmware support, then why
> bother with the panic() in the first place?

panic() is what triggers all the crash dump or pstore mechanisms, it
has to do that anyway.

> > Is there reason to believe that this all won't work on arm64 ?
>
> No, but I'm still trying to figure out why pre-timeout is valuable. If
> we don't disable WS1, then we risk having the hardware reset before we
> can take advantage of what the panic() offers. In which case, what's
> the point of pre-timeout?

The timeouts are both configurable, so the sysadmin has to make sure that
the time between them is long enough to do whatever is necessary.

Arnd

2015-06-08 16:05:52

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Gurnter

On 3 June 2015 at 01:07, Guenter Roeck <[email protected]> wrote:
> On 06/02/2015 09:55 AM, Fu Wei wrote:
>>
>> Hi Timur,
>>
>> Thanks , feedback inline
>>
>> On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
>>>
>>> On 06/01/2015 11:05 PM, [email protected] wrote:
>>>
>>>> +/*
>>>> + * help functions for accessing 32bit registers of SBSA Generic
>>>> Watchdog
>>>> + */
>>>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>>>> + struct watchdog_device *wdd)
>>>> +{
>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>> +
>>>> + writel_relaxed(val, gwdt->control_base + reg);
>>>> +}
>>>> +
>>>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>>>> + struct watchdog_device *wdd)
>>>> +{
>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>> +
>>>> + writel_relaxed(val, gwdt->refresh_base + reg);
>>>> +}
>>>> +
>>>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device
>>>> *wdd)
>>>> +{
>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>> +
>>>> + return readl_relaxed(gwdt->control_base + reg);
>>>> +}
>>>
>>>
>>>
>>> I still think you should get rid of these functions and just call
>>> readl_relaxed() and writel_relaxed() every time, but I won't complain
>>> again
>>> if you keep them.
>>
>>
>> yes, that make sense, and will reduce the size of code, and I think
>> the code's readability will be OK too.
>> will try in my next patch,
>>
>>>
>>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>>> +{
>>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>>> +
>>>> + if (wdd->pretimeout)
>>>> + /* The pretimeout is valid, go panic */
>>>> + panic("SBSA Watchdog pre-timeout");
>>>> + else
>>>> + /* We don't use pretimeout, trigger WS1 now*/
>>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>>
>>>
>>>
>>> I don't like this.
>>
>>
>> If so, what is your idea ,if pretimeout == 0?
>>
>> the reason of using WCV as (timout - pretimeout): it can provide the
>> longer timeout period,
>> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
>> as Guenter said earlier, the default timer out for most watchdog will
>> be 30s, so I think 10s limit will be a little short
>> (2)we can always program WCV just like ping.
>> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
>> we still can make this pretimeout longer by programming WCV(I don't
>> think it's necessary)
>>
>>
>>> The triggering of the hardware reset should never depend
>>> on an interrupt being handled properly.
>>
>>
>> if this fail, system reset in 1S, because WOR == (1s)
>>
> So ?

Even the interrupt routine isn't triggered, (WOR + system counter) --> WCV,
then, sy system reset in 1S.

the hardware reset doesn't depend on an interrupt.


>
>>> You should always program WCV
>>> correctly in advance. This is especially true since pre-timeout will
>>> probably rarely be used.
>>
>>
>> always programming WCV is doable. But I absolutely can not agree "
>> pre-timeout will probably rarely be used"
>> If so, SBSA watchdog is just a normal watchdog, This use case just
>> makes this HW useless.
>> If so, go to use SP805.
>> you still don't see the importance of this warning and pretimeout to a
>> real server.
>>
>
> If pretimeout isn't used, why not just set WCV = timeout, WOR = 0 ?

Because if WOR = 0 , according to SBSA, once you want to enable watchdog,
(0 + system counter) --> WCV , then , WS0 and WS1 will be triggered immediately.
we have not a chance(a time slot) to update WCV.

>
> Guenter
>
>
>> If the software of a real server goes wrong, then you just directly
>> restart it ,
>> never try to sync/protect the current data, never try to figure out
>> what is wrong with it.
>> I don't think that is a good server software.
>>
>> At least, I don't thinks " pre-timeout will probably rarely be used"
>> is a good idea for a server.
>> in another word, in a server ,pre-timeout should always be used.
>>
>>> So what happens if the CPU is totally hung and
>>
>>
>> Again, system reset in 1S, because WOR == (1s).
>>
>>> this interrupt handler is never called? When will the timeout occur?
>>
>>
>> if this interrupt hardler is never called, what I can see is "some
>> one is feeding the dog".
>> OK, in case, WS0 is triggered, but this interrupt hardler isn't
>> called, then software really goes wrong. Then , Again, Again system
>> reset in 1S, because WOR == (1s).
>>
>>>
>>>> +static int sbsa_gwdt_probe(struct platform_device *pdev)
>>>> +{
>>>> + u64 first_period_max = U64_MAX;
>>>> + struct device *dev = &pdev->dev;
>>>> + struct watchdog_device *wdd;
>>>> + struct sbsa_gwdt *gwdt;
>>>> + struct resource *res;
>>>> + void *rf_base, *cf_base;
>>>> + int ret, irq;
>>>> + u32 status;
>>>> +
>>>> + gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL);
>>>> + if (!gwdt)
>>>> + return -ENOMEM;
>>>> + platform_set_drvdata(pdev, gwdt);
>>>
>>>
>>>
>>> You should probably do this *after* calling platform_get_irq_byname().
>>
>>
>> it just dose (pdev->) dev->driver_data = gwdt
>> If we got gwdt, we can do that.
>>
>> But maybe I miss something(or a rule of usage), so please let know
>> why this has to be called *after* calling platform_get_irq_byname().
>>
>>>
>>>> +
>>>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>>>> "refresh");
>>>> + rf_base = devm_ioremap_resource(dev, res);
>>>> + if (IS_ERR(rf_base))
>>>> + return PTR_ERR(rf_base);
>>>> +
>>>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>>>> "control");
>>>> + cf_base = devm_ioremap_resource(dev, res);
>>>> + if (IS_ERR(cf_base))
>>>> + return PTR_ERR(cf_base);
>>>> +
>>>> + irq = platform_get_irq_byname(pdev, "ws0");
>>>> + if (irq < 0) {
>>>> + dev_err(dev, "unable to get ws0 interrupt.\n");
>>>> + return irq;
>>>> + }
>>>> +
>>>> + /*
>>>> + * Get the frequency of system counter from the cp15 interface
>>>> of
>>>> ARM
>>>> + * Generic timer. We don't need to check it, because if it
>>>> returns
>>>> "0",
>>>> + * system would panic in very early stage.
>>>> + */
>>>> + gwdt->clk = arch_timer_get_cntfrq();
>>>> + gwdt->refresh_base = rf_base;
>>>> + gwdt->control_base = cf_base;
>>>> +
>>>> + wdd = &gwdt->wdd;
>>>> + wdd->parent = dev;
>>>> + wdd->info = &sbsa_gwdt_info;
>>>> + wdd->ops = &sbsa_gwdt_ops;
>>>> + watchdog_set_drvdata(wdd, gwdt);
>>>> + watchdog_set_nowayout(wdd, nowayout);
>>>> +
>>>> + wdd->min_pretimeout = 0;
>>>> + wdd->max_pretimeout = U32_MAX / gwdt->clk;
>>>> + wdd->min_timeout = 1;
>>>> + do_div(first_period_max, gwdt->clk);
>>>> + wdd->max_timeout = first_period_max;
>>>> +
>>>> + wdd->pretimeout = DEFAULT_PRETIMEOUT;
>>>> + wdd->timeout = DEFAULT_TIMEOUT;
>>>> + watchdog_init_timeouts(wdd, pretimeout, timeout, dev);
>>>> +
>>>> + status = sbsa_gwdt_cf_read(SBSA_GWDT_WCS, wdd);
>>>> + if (status & SBSA_GWDT_WCS_WS1) {
>>>> + dev_warn(dev, "System reset by WDT(WCV: %llx)\n",
>>>> + sbsa_gwdt_get_wcv(wdd));
>>>
>>>
>>>
>>> "System was previously reset via watchdog" is much clearer.
>>
>>
>> OK
>>
>>>
>>>> + wdd->bootstatus |= WDIOF_CARDRESET;
>>>> + }
>>>> + /* Check if watchdog is already enabled */
>>>> + if (status & SBSA_GWDT_WCS_EN) {
>>>> + dev_warn(dev, "already enabled!\n");
>>>
>>>
>>>
>>> "watchdog is already enabled".
>>
>>
>> I think I don't need to print "watchdog", dev_warn(dev, has help us on
>> this.
>> If you do so , the message will be "watchdog: watchdog0: watchdog is
>> already enabled"
>>
>> > Never use exclamation marks in kernel
>>>
>>> messages.
>>
>>
>> that make sense , will delete it.
>>
>>>
>>>> + sbsa_gwdt_keepalive(wdd);
>>>> + }
>>>> +
>>>> + /* update pretimeout to WOR */
>>>> + sbsa_gwdt_set_pretimeout(wdd, wdd->pretimeout);
>>>> +
>>>> + ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0,
>>>> + pdev->name, gwdt);
>>>> + if (ret) {
>>>> + dev_err(dev, "unable to request IRQ %d\n", irq);
>>>> + return ret;
>>>> + }
>>>> +
>>>> + ret = watchdog_register_device(wdd);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + dev_info(dev, "Initialized with %ds timeout, %ds pretimeout @ %u
>>>> Hz\n",
>>>> + wdd->timeout, wdd->pretimeout, gwdt->clk);
>>>
>>>
>>>
>>> if (wdd->pretimeout)
>>> "watchdog initialized to %us timeout and %us pre-timeout at %u
>>> Hz\n", wdd->timeout, wdd->pretimeout, gwdt->clk
>>> else
>>> "watchdog initialized to %us timeout at %u Hz\n", wdd->timeout,
>>> gwdt->clk
>>>
>>> I think it's unlikely that users will use pre-timeout, so your code
>>> should
>>> treat pre-timeout as a special case, not the normal usage.
>>
>>
>> I don't think so, that why you didn't use pretimeout in your driver.
>> Because you don't see the meaning of "pretimeout" to a server.
>>
>>>
>>> --
>>> Qualcomm Innovation Center, Inc.
>>> The Qualcomm Innovation Center, Inc. is a member of the
>>> Code Aurora Forum, a Linux Foundation Collaborative Project.
>>
>>
>>
>>
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-08 16:10:53

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,


On 4 June 2015 at 02:25, Guenter Roeck <[email protected]> wrote:
> On Wed, Jun 03, 2015 at 01:16:41PM -0500, Timur Tabi wrote:
>> On 06/01/2015 11:05 PM, [email protected] wrote:
>> >+ if (wdd->pretimeout)
>> >+ /* The pretimeout is valid, go panic */
>> >+ panic("SBSA Watchdog pre-timeout");
>>
>> The problem with this is that WS1 will still occur. So a few seconds after
>> the panic() call, the hardware will reset. There won't be any time to debug
>> or log anything.
>>
>
> In general the idea here would be to use a crashdump kernel, which,
> when loaded, would reset the watchdog before it fires. This kernel
> would then write a core dump to a specified location.
>
> If arm64 doesn't support a crashdump kernel, it might still be possible
> to log the backtrace somewhere (eg in nvram using pstore if that is
> supported via acpi or efi).

yes, you are right , thanks for explaining this.

>
> Is there reason to believe that this all won't work on arm64 ?

I don't think there is a reason.

>
> Thanks,
> Guenter



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-08 16:45:04

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 4/7] Watchdog: introdouce "pretimeout" into framework

Hi Guenter,

Great thanks for your review,


On 3 June 2015 at 00:12, Guenter Roeck <[email protected]> wrote:
> On 06/01/2015 09:05 PM, [email protected] wrote:
>>
>> From: Fu Wei <[email protected]>
>>
>> Also update Documentation/watchdog/watchdog-kernel-api.txt to
>> introduce:
>> (1)the new elements in the watchdog_device and watchdog_ops struct;
>> (2)the new API "watchdog_init_timeouts"
>>
>> Reasons:
>> (1)kernel already has two watchdog drivers are using "pretimeout":
>> drivers/char/ipmi/ipmi_watchdog.c
>> drivers/watchdog/kempld_wdt.c(but the definition is different)
>> (2)some other drivers are going to use this: ARM SBSA Generic Watchdog
>>
>> Signed-off-by: Fu Wei <[email protected]>
>> ---
>> Documentation/watchdog/watchdog-kernel-api.txt | 47 ++++++++--
>> drivers/watchdog/watchdog_core.c | 115
>> +++++++++++++++++++------
>> drivers/watchdog/watchdog_dev.c | 53 ++++++++++++
>> include/linux/watchdog.h | 33 ++++++-
>> 4 files changed, 212 insertions(+), 36 deletions(-)
>>
>> diff --git a/Documentation/watchdog/watchdog-kernel-api.txt
>> b/Documentation/watchdog/watchdog-kernel-api.txt
>> index a0438f3..95b355d 100644
>> --- a/Documentation/watchdog/watchdog-kernel-api.txt
>> +++ b/Documentation/watchdog/watchdog-kernel-api.txt
>> @@ -49,6 +49,9 @@ struct watchdog_device {
>> unsigned int timeout;
>> unsigned int min_timeout;
>> unsigned int max_timeout;
>> + unsigned int pretimeout;
>> + unsigned int min_pretimeout;
>> + unsigned int max_pretimeout;
>> void *driver_data;
>> struct mutex lock;
>> unsigned long status;
>> @@ -70,6 +73,9 @@ It contains following fields:
>> * timeout: the watchdog timer's timeout value (in seconds).
>> * min_timeout: the watchdog timer's minimum timeout value (in seconds).
>> * max_timeout: the watchdog timer's maximum timeout value (in seconds).
>> +* pretimeout: the watchdog timer's pretimeout value (in seconds).
>> +* min_pretimeout: the watchdog timer's minimum pretimeout value (in
>> seconds).
>> +* max_pretimeout: the watchdog timer's maximum pretimeout value (in
>> seconds).
>> * bootstatus: status of the device after booting (reported with watchdog
>> WDIOF_* status bits).
>> * driver_data: a pointer to the drivers private data of a watchdog
>> device.
>> @@ -92,6 +98,7 @@ struct watchdog_ops {
>> int (*ping)(struct watchdog_device *);
>> unsigned int (*status)(struct watchdog_device *);
>> int (*set_timeout)(struct watchdog_device *, unsigned int);
>> + int (*set_pretimeout)(struct watchdog_device *, unsigned int);
>> unsigned int (*get_timeleft)(struct watchdog_device *);
>> void (*ref)(struct watchdog_device *);
>> void (*unref)(struct watchdog_device *);
>> @@ -153,9 +160,19 @@ they are supported. These optional
>> routines/operations are:
>> and -EIO for "could not write value to the watchdog". On success this
>> routine should set the timeout value of the watchdog_device to the
>> achieved timeout value (which may be different from the requested one
>> - because the watchdog does not necessarily has a 1 second resolution).
>> + because the watchdog does not necessarily has a 1 second resolution;
>> + If the driver supports pretimeout, then the timeout value must be
>> greater
>> + than that).
>> (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of
>> the
>> watchdog's info structure).
>> +* set_pretimeout: this routine checks and changes the pretimeout of the
>> + watchdog timer device. It returns 0 on success, -EINVAL for "parameter
>> out of
>> + range" and -EIO for "could not write value to the watchdog". On success
>> this
>> + routine should set the pretimeout value of the watchdog_device to the
>> + achieved pretimeout value (which may be different from the requested
>> one
>> + because the watchdog does not necessarily has a 1 second resolution).
>> + (Note: the WDIOF_PRETIMEOUT needs to be set in the options field of the
>> + watchdog's info structure).
>> * get_timeleft: this routines returns the time that's left before a
>> reset.
>> * ref: the operation that calls kref_get on the kref of a dynamically
>> allocated watchdog_device struct.
>> @@ -219,8 +236,28 @@ extern int watchdog_init_timeout(struct
>> watchdog_device *wdd,
>> unsigned int timeout_parm, struct
>> device *dev);
>>
>> The watchdog_init_timeout function allows you to initialize the timeout
>> field
>> -using the module timeout parameter or by retrieving the timeout-sec
>> property from
>> -the device tree (if the module timeout parameter is invalid). Best
>> practice is
>> -to set the default timeout value as timeout value in the watchdog_device
>> and
>> -then use this function to set the user "preferred" timeout value.
>> +using the module timeout parameter or by retrieving the first element of
>> +the timeout-sec property from the device tree (if the module timeout
>> parameter
>> +is invalid). Best practice is to set the default timeout value as timeout
>> value
>> +in the watchdog_device and then use this function to set the user
>> "preferred"
>> +timeout value.
>> +This routine returns zero on success and a negative errno code for
>> failure.
>> +
>> +Some watchdog timers have two stage of timeouts(timeout and pretimeout),
>> +to initialize the timeout and pretimeout fields at the same time, the
>> following
>> +function can be used:
>> +
>> +extern int watchdog_init_timeouts(struct watchdog_device *wdd,
>> + unsigned int pretimeout_parm,
>> + unsigned int timeout_parm,
>> + struct device *dev);
>> +
>> +The watchdog_init_timeouts function allows you to initialize the
>> pretimeout and
>> +timeout fields using the module pretimeout and timeout parameter or by
>> +retrieving the elements in the timeout-sec property(the first element is
>> for
>> +timeout, the second one is for pretimeout) from the device tree(if the
>> module
>> +pretimeout and timeout parameter are invalid).
>> +Best practice is to set the default pretimeout and timeout value as
>> pretimeout
>> +and timeout value in the watchdog_device and then use this function to
>> set the
>> +user "preferred" pretimeout value.
>> This routine returns zero on success and a negative errno code for
>> failure.
>> diff --git a/drivers/watchdog/watchdog_core.c
>> b/drivers/watchdog/watchdog_core.c
>> index cec9b55..ff18db3 100644
>> --- a/drivers/watchdog/watchdog_core.c
>> +++ b/drivers/watchdog/watchdog_core.c
>> @@ -43,60 +43,119 @@
>> static DEFINE_IDA(watchdog_ida);
>> static struct class *watchdog_class;
>>
>> -static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
>> +static void watchdog_check_min_max_timeouts(struct watchdog_device *wdd)
>> {
>> /*
>> - * Check that we have valid min and max timeout values, if
>> - * not reset them both to 0 (=not used or unknown)
>> + * Check that we have valid min and max pretimeout and timeout
>> values,
>> + * if not reset them both to 0 (=not used or unknown)
>> */
>> + if (wdd->min_pretimeout > wdd->max_pretimeout) {
>> + pr_info("Invalid min and max pretimeout, resetting to
>> 0\n");
>> + wdd->min_pretimeout = 0;
>> + wdd->max_pretimeout = 0;
>> + }
>> if (wdd->min_timeout > wdd->max_timeout) {
>> pr_info("Invalid min and max timeout values, resetting to
>> 0!\n");
>> wdd->min_timeout = 0;
>> wdd->max_timeout = 0;
>> }
>> + /*
>> + * Check that we have valid min timeout and max pretimeout values,
>> + * if not reset them.
>> + */
>> + if (wdd->min_pretimeout && wdd->min_timeout < wdd->min_pretimeout)
>> {
>> + pr_info("Invalid min_timeout, resetting to
>> min_pretimeout+1\n");
>> + wdd->min_timeout = wdd->min_pretimeout + 1;
>> + }
>
>
> min_timeout = 10
> max_timeout = 20
> min_pretimeout = 30
> max_pretimeout = 40
>
> will result in min_timeout set to 31.
>
>> + if (wdd->max_pretimeout && wdd->max_timeout < wdd->max_pretimeout)
>> {
>> + pr_info("Invalid max_pretimeout, resetting to
>> max_timeout-1\n");
>> + wdd->max_pretimeout = wdd->max_timeout - 1;
>
>
> and then you set max_pretimeout to 19.
>
> So we'll end up with
>
> min_timeout = 31
> max_timeout = 20
> min_pretimeout = 30
> max_pretimeout = 19
>
> Another example: max_pretimeout = 10, max_timeout = 0 -> max_pretimeout =
> -1.
>
>> + }
>
>
> You'll need to determine valid ranges and then enforce those.
> Maybe something like
> min_pretimeout < min_timeout < max_timeout
> and
> min_pretimeout < max_pretimeout < max_timeout
>
> Not sure if we should adjust anything to a value different to 0.
> Seems to me this is just asking for trouble.

yes, this is a new problem, but let me fix this in my new patchset:
(1)these values is set by driver in the initialization procedure,
according to hardware info.
(2) I thinks there is not any driver should call this function after
initialization.
So if we find any incorrect value, I think the hardware info we got
is wrong , maybe we can just give up

any suggestion ?

>
>> }
>>
>> /**
>> - * watchdog_init_timeout() - initialize the timeout field
>> + * watchdog_init_timeouts() - initialize the pretimeout and timeout field
>> + * @pretimeout_parm: pretimeout module parameter
>> * @timeout_parm: timeout module parameter
>> * @dev: Device that stores the timeout-sec property
>> *
>> - * Initialize the timeout field of the watchdog_device struct with either
>> the
>> - * timeout module parameter (if it is valid value) or the timeout-sec
>> property
>> - * (only if it is a valid value and the timeout_parm is out of bounds).
>> - * If none of them are valid then we keep the old value (which should
>> normally
>> - * be the default timeout value.
>> + * Initialize the pretimeout and timeout field of the watchdog_device
>> struct
>> + * with either the pretimeout and timeout module parameter (if it is
>> valid
>> + * value) or the timeout-sec property (only if it is a valid value and
>> the
>
>
> just 'if it is valid', or 'if it is a valid value' in both cases.

OK. np,. Thanks

>
>> + * pretimeout_parm and timeout_parm is out of bounds). If none of them
>> are
>
>
> s/and/or/

Thanks , fixed

>
>
>> + * valid, then we keep the old value (which should normally be the
>> default
>> + * timeout value).
>> *
>> * A zero is returned on success and -EINVAL for failure.
>> */
>> -int watchdog_init_timeout(struct watchdog_device *wdd,
>> - unsigned int timeout_parm, struct device
>> *dev)
>> +int watchdog_init_timeouts(struct watchdog_device *wdd,
>> + unsigned int pretimeout_parm,
>> + unsigned int timeout_parm,
>> + struct device *dev)
>> {
>> - unsigned int t = 0;
>> - int ret = 0;
>> + int ret = 0, length = 0;
>> + u32 timeouts[2] = {0};
>>
>> - watchdog_check_min_max_timeout(wdd);
>> + watchdog_check_min_max_timeouts(wdd);
>>
>> - /* try to get the timeout module parameter first */
>> - if (!watchdog_timeout_invalid(wdd, timeout_parm) && timeout_parm)
>> {
>> - wdd->timeout = timeout_parm;
>> - return ret;
>> + /*
>> + * Try to get the pretimeout module parameter first,
>> + * but it can be "0", that means we don't use pretimeout.
>> + */
>> + if (pretimeout_parm) {
>> + if (!watchdog_pretimeout_invalid(wdd, pretimeout_parm))
>> + timeouts[1] = pretimeout_parm;
>> + ret = -EINVAL; /* pretimeout_parm is not "0", and invalid
>> */
>
>
> pretimeout_parm is always invalid ? Why ?

Sorry, I lost a "else" , this is a bug/typo

>
>> }
>> - if (timeout_parm)
>> +
>> + /*
>> + * Try to get the timeout module parameter,
>> + * if it's valid and pretimeout is ont invalid(ret == 0),
>
>
> s/ont/not/

Thanks , fixed

>
>> + * assignment and return zero. Otherwise, try dtb.
>> + */
>> + if (timeout_parm) {
>> + if (!watchdog_timeout_invalid(wdd, timeout_parm) && !ret)
>> {
>
>
> Unless I am missing something, you'll never get here if the pretimeout
> module
> parameter is set.

yes, because of the bug above.
Will fix it in my next patchset , thanks :-)

>
>> + wdd->timeout = timeout_parm;
>> + wdd->pretimeout = timeouts[1];
>> + return 0;
>> + }
>> ret = -EINVAL;
>> + }
>>
>> - /* try to get the timeout_sec property */
>> + /*
>> + * We get here, the situation is one of them:
>> + * (1)at least one of parameters is invalid(ret = -EINVAL);
>> + * (2)both of them are 0.(ret = 0)
>> + * So give up the module parameter(at least drop the invalid one),
>> + * try to get the timeout_sec property from dtb.
>
>
> Please simplify the comment.
>
> Either at least one of the module parameters is invalid,
> or both of them are 0. Try to get the timeout_sec property.

Thanks , will do.

>
>> + */
>> if (dev == NULL || dev->of_node == NULL)
>> return ret;
>> - of_property_read_u32(dev->of_node, "timeout-sec", &t);
>> - if (!watchdog_timeout_invalid(wdd, t) && t)
>> - wdd->timeout = t;
>> - else
>> - ret = -EINVAL;
>>
>> - return ret;
>> + /*
>> + * We get here, that means maybe we can get timeouts from dtb,
>> + * if "timeout-sec" is available and the data is valid.
>> + */
>
>
> That comment is not needed; it is obvious.

OK, will remove it

>
>> + of_find_property(dev->of_node, "timeout-sec", &length);
>> + if (length > 0 && length <= sizeof(u32) * 2) {
>
>
> You need to check the return value of of_find_property() instead of assuming
> that length will be 0 if it is not found.

you are right, will do, thanks

>
>> + of_property_read_u32_array(dev->of_node,
>> + "timeout-sec", timeouts,
>> + length / sizeof(u32));
>> + if (length == sizeof(u32) * 2 && timeouts[1] &&
>> + watchdog_pretimeout_invalid(wdd, timeouts[1]))
>> + return -EINVAL; /* pretimeout is invalid
>> */
>
>
> Obvious comment.

OK, will remove it

>
>
>> +
>> + if (!watchdog_timeout_invalid(wdd, timeouts[0]) &&
>> + timeouts[0]) {
>> + wdd->timeout = timeouts[0];
>> + wdd->pretimeout = timeouts[1];
>> + return 0;
>> + }
>> + }
>> +
>> + return -EINVAL;
>> }
>> -EXPORT_SYMBOL_GPL(watchdog_init_timeout);
>> +EXPORT_SYMBOL_GPL(watchdog_init_timeouts);
>>
>> /**
>> * watchdog_register_device() - register a watchdog device
>> @@ -119,7 +178,7 @@ int watchdog_register_device(struct watchdog_device
>> *wdd)
>> if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
>> return -EINVAL;
>>
>> - watchdog_check_min_max_timeout(wdd);
>> + watchdog_check_min_max_timeouts(wdd);
>>
>> /*
>> * Note: now that all watchdog_device data has been verified, we
>> diff --git a/drivers/watchdog/watchdog_dev.c
>> b/drivers/watchdog/watchdog_dev.c
>> index 6aaefba..af0777e 100644
>> --- a/drivers/watchdog/watchdog_dev.c
>> +++ b/drivers/watchdog/watchdog_dev.c
>> @@ -218,6 +218,38 @@ out_timeout:
>> }
>>
>> /*
>> + * watchdog_set_pretimeout: set the watchdog timer pretimeout
>> + * @wddev: the watchdog device to set the timeout for
>> + * @pretimeout: pretimeout to set in seconds
>> + */
>> +
>> +static int watchdog_set_pretimeout(struct watchdog_device *wddev,
>> + unsigned int pretimeout)
>> +{
>> + int err;
>> +
>> + if (!wddev->ops->set_pretimeout ||
>> + !(wddev->info->options & WDIOF_PRETIMEOUT))
>> + return -EOPNOTSUPP;
>> +
>> + if (watchdog_pretimeout_invalid(wddev, pretimeout))
>> + return -EINVAL;
>> +
>> + mutex_lock(&wddev->lock);
>> +
>> + if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
>> + err = -ENODEV;
>> + goto out_pretimeout;
>> + }
>> +
>> + err = wddev->ops->set_pretimeout(wddev, pretimeout);
>> +
>> +out_pretimeout:
>> + mutex_unlock(&wddev->lock);
>> + return err;
>> +}
>> +
>> +/*
>> * watchdog_get_timeleft: wrapper to get the time left before a
>> reboot
>> * @wddev: the watchdog device to get the remaining time from
>> * @timeleft: the time that's left
>> @@ -388,6 +420,27 @@ static long watchdog_ioctl(struct file *file,
>> unsigned int cmd,
>> if (wdd->timeout == 0)
>> return -EOPNOTSUPP;
>> return put_user(wdd->timeout, p);
>> + case WDIOC_SETPRETIMEOUT:
>> + /* check if we support the pretimeout */
>> + if (!(wdd->info->options & WDIOF_PRETIMEOUT))
>> + return -EOPNOTSUPP;
>> + if (get_user(val, p))
>> + return -EFAULT;
>> + err = watchdog_set_pretimeout(wdd, val);
>> + if (err < 0)
>> + return err;
>> + /*
>> + * If the watchdog is active then we send a keepalive ping
>> + * to make sure that the watchdog keeps running (and if
>> + * possible that it takes the new pretimeout)
>> + */
>> + watchdog_ping(wdd);
>> + /* Fall */
>> + case WDIOC_GETPRETIMEOUT:
>> + /* check if we support the pretimeout */
>> + if (wdd->info->options & WDIOF_PRETIMEOUT)
>> + return put_user(wdd->pretimeout, p);
>> + return -EOPNOTSUPP;
>> case WDIOC_GETTIMELEFT:
>> err = watchdog_get_timeleft(wdd, &val);
>> if (err)
>> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
>> index a746bf5..b1e325d 100644
>> --- a/include/linux/watchdog.h
>> +++ b/include/linux/watchdog.h
>> @@ -25,6 +25,7 @@ struct watchdog_device;
>> * @ping: The routine that sends a keepalive ping to the watchdog
>> device.
>> * @status: The routine that shows the status of the watchdog device.
>> * @set_timeout:The routine for setting the watchdog devices timeout
>> value.
>> + * @set_pretimeout:The routine for setting the watchdog devices
>> pretimeout value
>> * @get_timeleft:The routine that get's the time that's left before a
>> reset.
>> * @ref: The ref operation for dyn. allocated watchdog_device
>> structs
>> * @unref: The unref operation for dyn. allocated watchdog_device
>> structs
>> @@ -44,6 +45,7 @@ struct watchdog_ops {
>> int (*ping)(struct watchdog_device *);
>> unsigned int (*status)(struct watchdog_device *);
>> int (*set_timeout)(struct watchdog_device *, unsigned int);
>> + int (*set_pretimeout)(struct watchdog_device *, unsigned int);
>> unsigned int (*get_timeleft)(struct watchdog_device *);
>> void (*ref)(struct watchdog_device *);
>> void (*unref)(struct watchdog_device *);
>> @@ -62,6 +64,9 @@ struct watchdog_ops {
>> * @timeout: The watchdog devices timeout value.
>> * @min_timeout:The watchdog devices minimum timeout value.
>> * @max_timeout:The watchdog devices maximum timeout value.
>> + * @pretimeout: The watchdog devices pretimeout value.
>> + * @min_pretimeout:The watchdog devices minimum pretimeout value.
>> + * @max_pretimeout:The watchdog devices maximum pretimeout value.
>> * @driver-data:Pointer to the drivers private data.
>> * @lock: Lock for watchdog core internal use only.
>> * @status: Field that contains the devices internal status bits.
>> @@ -86,6 +91,9 @@ struct watchdog_device {
>> unsigned int timeout;
>> unsigned int min_timeout;
>> unsigned int max_timeout;
>> + unsigned int pretimeout;
>> + unsigned int min_pretimeout;
>> + unsigned int max_pretimeout;
>> void *driver_data;
>> struct mutex lock;
>> unsigned long status;
>> @@ -117,7 +125,17 @@ static inline void watchdog_set_nowayout(struct
>> watchdog_device *wdd, bool noway
>> static inline bool watchdog_timeout_invalid(struct watchdog_device *wdd,
>> unsigned int t)
>> {
>> return ((wdd->max_timeout != 0) &&
>> - (t < wdd->min_timeout || t > wdd->max_timeout));
>> + (t < wdd->min_timeout || t > wdd->max_timeout ||
>> + t <= wdd->pretimeout));
>> +}
>> +
>> +/* Use the following function to check if a pretimeout value is invalid
>> */
>> +static inline bool watchdog_pretimeout_invalid(struct watchdog_device
>> *wdd,
>> + unsigned int t)
>> +{
>> + return (wdd->max_pretimeout != 0 &&
>> + (t < wdd->min_pretimeout || t > wdd->max_pretimeout ||
>> + (wdd->timeout != 0 && t >= wdd->timeout)));
>
>
> This validates a pretimeout as valid if max_pretimeout == 0,
> no matter what timeout is set to.
>
> Do we really need to check if timeout == 0 ? Can that ever happen ?

(wdd->timeout != 0 && t >= wdd->timeout)

if we have set timeout(wdd->timeout != 0), and t(pretimeout) >
wdd->timeout, t is a invalid value.
Am I correct?

the reason of adding "wdd->timeout != 0": if driver calls this
function before setting up wdd->timeout.


>
>> }
>>
>> /* Use the following functions to manipulate watchdog driver specific
>> data */
>> @@ -132,11 +150,20 @@ static inline void *watchdog_get_drvdata(struct
>> watchdog_device *wdd)
>> }
>>
>> /* drivers/watchdog/watchdog_core.c */
>> -extern int watchdog_init_timeout(struct watchdog_device *wdd,
>> - unsigned int timeout_parm, struct device
>> *dev);
>> +extern int watchdog_init_timeouts(struct watchdog_device *wdd,
>> + unsigned int pretimeout_parm,
>> + unsigned int timeout_parm,
>> + struct device *dev);
>
>
> No 'extern' here, please. Yes, we'll need to fix that for the other
> function declarations, but that is not a reason to introduce new ones.

OK, I do agree with you on this, have fixed it in my next patchset.

>
> Thanks,
> Guenter
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-08 18:26:38

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/08/2015 09:05 AM, Fu Wei wrote:
> Hi Gurnter
>
> On 3 June 2015 at 01:07, Guenter Roeck <[email protected]> wrote:
>> On 06/02/2015 09:55 AM, Fu Wei wrote:
>>>
>>> Hi Timur,
>>>
>>> Thanks , feedback inline
>>>
>>> On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
>>>>
>>>> On 06/01/2015 11:05 PM, [email protected] wrote:
>>>>
>>>>> +/*
>>>>> + * help functions for accessing 32bit registers of SBSA Generic
>>>>> Watchdog
>>>>> + */
>>>>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>>>>> + struct watchdog_device *wdd)
>>>>> +{
>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>> +
>>>>> + writel_relaxed(val, gwdt->control_base + reg);
>>>>> +}
>>>>> +
>>>>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>>>>> + struct watchdog_device *wdd)
>>>>> +{
>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>> +
>>>>> + writel_relaxed(val, gwdt->refresh_base + reg);
>>>>> +}
>>>>> +
>>>>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device
>>>>> *wdd)
>>>>> +{
>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>> +
>>>>> + return readl_relaxed(gwdt->control_base + reg);
>>>>> +}
>>>>
>>>>
>>>>
>>>> I still think you should get rid of these functions and just call
>>>> readl_relaxed() and writel_relaxed() every time, but I won't complain
>>>> again
>>>> if you keep them.
>>>
>>>
>>> yes, that make sense, and will reduce the size of code, and I think
>>> the code's readability will be OK too.
>>> will try in my next patch,
>>>
>>>>
>>>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>>>> +{
>>>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>>>> +
>>>>> + if (wdd->pretimeout)
>>>>> + /* The pretimeout is valid, go panic */
>>>>> + panic("SBSA Watchdog pre-timeout");
>>>>> + else
>>>>> + /* We don't use pretimeout, trigger WS1 now*/
>>>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>>>
>>>>
>>>>
>>>> I don't like this.
>>>
>>>
>>> If so, what is your idea ,if pretimeout == 0?
>>>
>>> the reason of using WCV as (timout - pretimeout): it can provide the
>>> longer timeout period,
>>> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
>>> as Guenter said earlier, the default timer out for most watchdog will
>>> be 30s, so I think 10s limit will be a little short
>>> (2)we can always program WCV just like ping.
>>> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
>>> we still can make this pretimeout longer by programming WCV(I don't
>>> think it's necessary)
>>>
>>>
>>>> The triggering of the hardware reset should never depend
>>>> on an interrupt being handled properly.
>>>
>>>
>>> if this fail, system reset in 1S, because WOR == (1s)
>>>
>> So ?
>
> Even the interrupt routine isn't triggered, (WOR + system counter) --> WCV,
> then, sy system reset in 1S.
>
> the hardware reset doesn't depend on an interrupt.
>
>
>>
>>>> You should always program WCV
>>>> correctly in advance. This is especially true since pre-timeout will
>>>> probably rarely be used.
>>>
>>>
>>> always programming WCV is doable. But I absolutely can not agree "
>>> pre-timeout will probably rarely be used"
>>> If so, SBSA watchdog is just a normal watchdog, This use case just
>>> makes this HW useless.
>>> If so, go to use SP805.
>>> you still don't see the importance of this warning and pretimeout to a
>>> real server.
>>>
>>
>> If pretimeout isn't used, why not just set WCV = timeout, WOR = 0 ?
>
> Because if WOR = 0 , according to SBSA, once you want to enable watchdog,
> (0 + system counter) --> WCV , then , WS0 and WS1 will be triggered immediately.
> we have not a chance(a time slot) to update WCV.
>

I would have thought that this is exactly what we want if pretimeout is not used.
What am I missing here ?

Guenter

2015-06-09 03:59:40

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,


On 9 June 2015 at 02:26, Guenter Roeck <[email protected]> wrote:
> On 06/08/2015 09:05 AM, Fu Wei wrote:
>>
>> Hi Gurnter
>>
>> On 3 June 2015 at 01:07, Guenter Roeck <[email protected]> wrote:
>>>
>>> On 06/02/2015 09:55 AM, Fu Wei wrote:
>>>>
>>>>
>>>> Hi Timur,
>>>>
>>>> Thanks , feedback inline
>>>>
>>>> On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On 06/01/2015 11:05 PM, [email protected] wrote:
>>>>>
>>>>>> +/*
>>>>>> + * help functions for accessing 32bit registers of SBSA Generic
>>>>>> Watchdog
>>>>>> + */
>>>>>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>>>>>> + struct watchdog_device *wdd)
>>>>>> +{
>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>> +
>>>>>> + writel_relaxed(val, gwdt->control_base + reg);
>>>>>> +}
>>>>>> +
>>>>>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>>>>>> + struct watchdog_device *wdd)
>>>>>> +{
>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>> +
>>>>>> + writel_relaxed(val, gwdt->refresh_base + reg);
>>>>>> +}
>>>>>> +
>>>>>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device
>>>>>> *wdd)
>>>>>> +{
>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>> +
>>>>>> + return readl_relaxed(gwdt->control_base + reg);
>>>>>> +}
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I still think you should get rid of these functions and just call
>>>>> readl_relaxed() and writel_relaxed() every time, but I won't complain
>>>>> again
>>>>> if you keep them.
>>>>
>>>>
>>>>
>>>> yes, that make sense, and will reduce the size of code, and I think
>>>> the code's readability will be OK too.
>>>> will try in my next patch,
>>>>
>>>>>
>>>>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>>>>> +{
>>>>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>>>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>>>>> +
>>>>>> + if (wdd->pretimeout)
>>>>>> + /* The pretimeout is valid, go panic */
>>>>>> + panic("SBSA Watchdog pre-timeout");
>>>>>> + else
>>>>>> + /* We don't use pretimeout, trigger WS1 now*/
>>>>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I don't like this.
>>>>
>>>>
>>>>
>>>> If so, what is your idea ,if pretimeout == 0?
>>>>
>>>> the reason of using WCV as (timout - pretimeout): it can provide the
>>>> longer timeout period,
>>>> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
>>>> as Guenter said earlier, the default timer out for most watchdog will
>>>> be 30s, so I think 10s limit will be a little short
>>>> (2)we can always program WCV just like ping.
>>>> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
>>>> we still can make this pretimeout longer by programming WCV(I don't
>>>> think it's necessary)
>>>>
>>>>
>>>>> The triggering of the hardware reset should never depend
>>>>> on an interrupt being handled properly.
>>>>
>>>>
>>>>
>>>> if this fail, system reset in 1S, because WOR == (1s)
>>>>
>>> So ?
>>
>>
>> Even the interrupt routine isn't triggered, (WOR + system counter) -->
>> WCV,
>> then, sy system reset in 1S.
>>
>> the hardware reset doesn't depend on an interrupt.
>>
>>
>>>
>>>>> You should always program WCV
>>>>> correctly in advance. This is especially true since pre-timeout will
>>>>> probably rarely be used.
>>>>
>>>>
>>>>
>>>> always programming WCV is doable. But I absolutely can not agree "
>>>> pre-timeout will probably rarely be used"
>>>> If so, SBSA watchdog is just a normal watchdog, This use case just
>>>> makes this HW useless.
>>>> If so, go to use SP805.
>>>> you still don't see the importance of this warning and pretimeout to a
>>>> real server.
>>>>
>>>
>>> If pretimeout isn't used, why not just set WCV = timeout, WOR = 0 ?
>>
>>
>> Because if WOR = 0 , according to SBSA, once you want to enable watchdog,
>> (0 + system counter) --> WCV , then , WS0 and WS1 will be triggered
>> immediately.
>> we have not a chance(a time slot) to update WCV.
>>
>
> I would have thought that this is exactly what we want if pretimeout is not
> used.

Although pretimeout == 0 is not good for a server, but If
administrator set up pretimeout == 0. *I thinks we should trigger WS1
ASAP* . Because WS1 maybe a interrupter or a reboot signal, that is
why we can not reboot system directly.

This driver is SBSA watchdog driver, that means we need to follow SBSA spec:
(1) SBSA watchdog has two stage timeouts --> timeout and pretimeout is
the best solution in watchdog framework, at least for now.
(2) The watchdog has the following output signals:
Watchdog Signal 0 (WS0)---> "The initial signal is typically wired
to an interrupt and alerts the system."(original word from spec), I
thinks the key work should be "interrupt" and "alerts". So in WS0
interrupt routine, reset is absolutely a wrong operation. Although I
think we should make this "alerts" more useful. But for the first
version of driver, I thinks panic is useful enough.
Watchdog Signal 1 (WS1). ---> "The signal is fed to a higher agent
as an interrupt or reset for it to take executive action." (original
word from spec) . The key work should be "interrupt", "or" and "reset"
. So WS1 maybe a interrupt.
so even in the WS0 interrupt routine, if pretimeout == 0 , we need to
trigger WS1(that is what my patch is doing now, set WCV to 0, so WCV
is always less than SystemCounter, and in this situation(WS0 = TRUE),
WS1 will be trigger immediately), but definitely not a reset too.

But in worst case, if the WS0 is triggered, but the interrupt routine
doesn't work(can not set up WCV), it doesn't matter, we just need to
wait for a WOR(1s in my driver) timeout, then WS1 will be triggered.
That is hardware mechanism, once we config SBSA watchdog correctly,
that should work. If it doesn't, I think the chip design doesn't
follow the SBSA spec.

Make a summary here, for SBSA watchdog driver, it need to support two
stage timeouts and need to trigger WS0/1 when timeouts occur(can not
simply reset system in interrupt routines).
If a driver doesn't do these above, the driver can not be called SBSA
watchdog driver.

But according to SBSA, even pretimeout == 0, we can not setup WOR = 0.
if we make WOR = 0 , once we set up WCS(enabling watchdog), that cause
a explicit watchdog refresh, then WCV = (0 + system counter), so WS0
and WS1 will be triggered serially and immediately(in theory, the
"delay" also depend on implementation). So in my patchset , if
pretimeout == 0, WOR will be 1s at least to make sure we have time to
setup WCV. I have made comments in the patch for explaining this.

Maybe some people want to ask: if we can not set up WOR = 0, but
pretimeout can be 0 and timeout can not, why I still want to use WOR
for pretimeout and using WCV as (timout - pretimeout) ??
For this:
(1)WCV can provide the longer timeout period, If we use WOR, it can
only provide 10s @ 400MHz(max). The default timer out for most
watchdog will be 30s, so I think 10s limit will be a little short.
(2)we can always program(write) WCV just like ping.
(3)if the first timeout occurs, WOR will be loaded to WCV(WOR + system
counter) automatically , so why not just use WOR as pretimeout?
Although we still can make this pretimeout longer by programming WCV,
I don't think it's necessary for now.


> What am I missing here ?
>
> Guenter
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-09 04:37:24

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/08/2015 08:59 PM, Fu Wei wrote:
> Hi Guenter,
>
>
> On 9 June 2015 at 02:26, Guenter Roeck <[email protected]> wrote:
>> On 06/08/2015 09:05 AM, Fu Wei wrote:
>>>
>>> Hi Gurnter
>>>
>>> On 3 June 2015 at 01:07, Guenter Roeck <[email protected]> wrote:
>>>>
>>>> On 06/02/2015 09:55 AM, Fu Wei wrote:
>>>>>
>>>>>
>>>>> Hi Timur,
>>>>>
>>>>> Thanks , feedback inline
>>>>>
>>>>> On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>> On 06/01/2015 11:05 PM, [email protected] wrote:
>>>>>>
>>>>>>> +/*
>>>>>>> + * help functions for accessing 32bit registers of SBSA Generic
>>>>>>> Watchdog
>>>>>>> + */
>>>>>>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>>>>>>> + struct watchdog_device *wdd)
>>>>>>> +{
>>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>>> +
>>>>>>> + writel_relaxed(val, gwdt->control_base + reg);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>>>>>>> + struct watchdog_device *wdd)
>>>>>>> +{
>>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>>> +
>>>>>>> + writel_relaxed(val, gwdt->refresh_base + reg);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct watchdog_device
>>>>>>> *wdd)
>>>>>>> +{
>>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>>> +
>>>>>>> + return readl_relaxed(gwdt->control_base + reg);
>>>>>>> +}
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I still think you should get rid of these functions and just call
>>>>>> readl_relaxed() and writel_relaxed() every time, but I won't complain
>>>>>> again
>>>>>> if you keep them.
>>>>>
>>>>>
>>>>>
>>>>> yes, that make sense, and will reduce the size of code, and I think
>>>>> the code's readability will be OK too.
>>>>> will try in my next patch,
>>>>>
>>>>>>
>>>>>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>>>>>> +{
>>>>>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>>>>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>>>>>> +
>>>>>>> + if (wdd->pretimeout)
>>>>>>> + /* The pretimeout is valid, go panic */
>>>>>>> + panic("SBSA Watchdog pre-timeout");
>>>>>>> + else
>>>>>>> + /* We don't use pretimeout, trigger WS1 now*/
>>>>>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I don't like this.
>>>>>
>>>>>
>>>>>
>>>>> If so, what is your idea ,if pretimeout == 0?
>>>>>
>>>>> the reason of using WCV as (timout - pretimeout): it can provide the
>>>>> longer timeout period,
>>>>> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
>>>>> as Guenter said earlier, the default timer out for most watchdog will
>>>>> be 30s, so I think 10s limit will be a little short
>>>>> (2)we can always program WCV just like ping.
>>>>> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
>>>>> we still can make this pretimeout longer by programming WCV(I don't
>>>>> think it's necessary)
>>>>>
>>>>>
>>>>>> The triggering of the hardware reset should never depend
>>>>>> on an interrupt being handled properly.
>>>>>
>>>>>
>>>>>
>>>>> if this fail, system reset in 1S, because WOR == (1s)
>>>>>
>>>> So ?
>>>
>>>
>>> Even the interrupt routine isn't triggered, (WOR + system counter) -->
>>> WCV,
>>> then, sy system reset in 1S.
>>>
>>> the hardware reset doesn't depend on an interrupt.
>>>
>>>
>>>>
>>>>>> You should always program WCV
>>>>>> correctly in advance. This is especially true since pre-timeout will
>>>>>> probably rarely be used.
>>>>>
>>>>>
>>>>>
>>>>> always programming WCV is doable. But I absolutely can not agree "
>>>>> pre-timeout will probably rarely be used"
>>>>> If so, SBSA watchdog is just a normal watchdog, This use case just
>>>>> makes this HW useless.
>>>>> If so, go to use SP805.
>>>>> you still don't see the importance of this warning and pretimeout to a
>>>>> real server.
>>>>>
>>>>
>>>> If pretimeout isn't used, why not just set WCV = timeout, WOR = 0 ?
>>>
>>>
>>> Because if WOR = 0 , according to SBSA, once you want to enable watchdog,
>>> (0 + system counter) --> WCV , then , WS0 and WS1 will be triggered
>>> immediately.
>>> we have not a chance(a time slot) to update WCV.
>>>
>>
>> I would have thought that this is exactly what we want if pretimeout is not
>> used.
>
> Although pretimeout == 0 is not good for a server, but If
> administrator set up pretimeout == 0. *I thinks we should trigger WS1
> ASAP* . Because WS1 maybe a interrupter or a reboot signal, that is
> why we can not reboot system directly.
>
> This driver is SBSA watchdog driver, that means we need to follow SBSA spec:
> (1) SBSA watchdog has two stage timeouts --> timeout and pretimeout is
> the best solution in watchdog framework, at least for now.
> (2) The watchdog has the following output signals:
> Watchdog Signal 0 (WS0)---> "The initial signal is typically wired
> to an interrupt and alerts the system."(original word from spec), I
> thinks the key work should be "interrupt" and "alerts". So in WS0
> interrupt routine, reset is absolutely a wrong operation. Although I
> think we should make this "alerts" more useful. But for the first
> version of driver, I thinks panic is useful enough.
> Watchdog Signal 1 (WS1). ---> "The signal is fed to a higher agent
> as an interrupt or reset for it to take executive action." (original
> word from spec) . The key work should be "interrupt", "or" and "reset"
> . So WS1 maybe a interrupt.
> so even in the WS0 interrupt routine, if pretimeout == 0 , we need to
> trigger WS1(that is what my patch is doing now, set WCV to 0, so WCV
> is always less than SystemCounter, and in this situation(WS0 = TRUE),
> WS1 will be trigger immediately), but definitely not a reset too.
>
> But in worst case, if the WS0 is triggered, but the interrupt routine
> doesn't work(can not set up WCV), it doesn't matter, we just need to
> wait for a WOR(1s in my driver) timeout, then WS1 will be triggered.
> That is hardware mechanism, once we config SBSA watchdog correctly,
> that should work. If it doesn't, I think the chip design doesn't
> follow the SBSA spec.
>
> Make a summary here, for SBSA watchdog driver, it need to support two
> stage timeouts and need to trigger WS0/1 when timeouts occur(can not
> simply reset system in interrupt routines).
> If a driver doesn't do these above, the driver can not be called SBSA
> watchdog driver.
>
> But according to SBSA, even pretimeout == 0, we can not setup WOR = 0.
> if we make WOR = 0 , once we set up WCS(enabling watchdog), that cause
> a explicit watchdog refresh, then WCV = (0 + system counter), so WS0
> and WS1 will be triggered serially and immediately(in theory, the

I still don't understand why this would be a problem.

> "delay" also depend on implementation). So in my patchset , if
> pretimeout == 0, WOR will be 1s at least to make sure we have time to
> setup WCV. I have made comments in the patch for explaining this.
>
> Maybe some people want to ask: if we can not set up WOR = 0, but
> pretimeout can be 0 and timeout can not, why I still want to use WOR
> for pretimeout and using WCV as (timout - pretimeout) ??
> For this:
> (1)WCV can provide the longer timeout period, If we use WOR, it can
> only provide 10s @ 400MHz(max). The default timer out for most
> watchdog will be 30s, so I think 10s limit will be a little short.
> (2)we can always program(write) WCV just like ping.
> (3)if the first timeout occurs, WOR will be loaded to WCV(WOR + system
> counter) automatically , so why not just use WOR as pretimeout?
> Although we still can make this pretimeout longer by programming WCV,
> I don't think it's necessary for now.
>

Too bad I don't have an arm64 system to test myself. I am not sure
I understand why WOR must be set to > 0 if pretimeout == 0, and even
if it must be set to a value > 0 I don't understand why
setting it to 1 (instead of 1 second) would not be sufficient.

Guenter

2015-06-09 06:38:06

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,

Thanks for reply so quickly.

On 9 June 2015 at 12:37, Guenter Roeck <[email protected]> wrote:
> On 06/08/2015 08:59 PM, Fu Wei wrote:
>>
>> Hi Guenter,
>>
>>
>> On 9 June 2015 at 02:26, Guenter Roeck <[email protected]> wrote:
>>>
>>> On 06/08/2015 09:05 AM, Fu Wei wrote:
>>>>
>>>>
>>>> Hi Gurnter
>>>>
>>>> On 3 June 2015 at 01:07, Guenter Roeck <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On 06/02/2015 09:55 AM, Fu Wei wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Timur,
>>>>>>
>>>>>> Thanks , feedback inline
>>>>>>
>>>>>> On 2 June 2015 at 23:32, Timur Tabi <[email protected]> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 06/01/2015 11:05 PM, [email protected] wrote:
>>>>>>>
>>>>>>>> +/*
>>>>>>>> + * help functions for accessing 32bit registers of SBSA Generic
>>>>>>>> Watchdog
>>>>>>>> + */
>>>>>>>> +static void sbsa_gwdt_cf_write(unsigned int reg, u32 val,
>>>>>>>> + struct watchdog_device *wdd)
>>>>>>>> +{
>>>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>>>> +
>>>>>>>> + writel_relaxed(val, gwdt->control_base + reg);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static void sbsa_gwdt_rf_write(unsigned int reg, u32 val,
>>>>>>>> + struct watchdog_device *wdd)
>>>>>>>> +{
>>>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>>>> +
>>>>>>>> + writel_relaxed(val, gwdt->refresh_base + reg);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static u32 sbsa_gwdt_cf_read(unsigned int reg, struct
>>>>>>>> watchdog_device
>>>>>>>> *wdd)
>>>>>>>> +{
>>>>>>>> + struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);
>>>>>>>> +
>>>>>>>> + return readl_relaxed(gwdt->control_base + reg);
>>>>>>>> +}
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I still think you should get rid of these functions and just call
>>>>>>> readl_relaxed() and writel_relaxed() every time, but I won't complain
>>>>>>> again
>>>>>>> if you keep them.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> yes, that make sense, and will reduce the size of code, and I think
>>>>>> the code's readability will be OK too.
>>>>>> will try in my next patch,
>>>>>>
>>>>>>>
>>>>>>>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>>>>>>>> +{
>>>>>>>> + struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>>>>>>>> + struct watchdog_device *wdd = &gwdt->wdd;
>>>>>>>> +
>>>>>>>> + if (wdd->pretimeout)
>>>>>>>> + /* The pretimeout is valid, go panic */
>>>>>>>> + panic("SBSA Watchdog pre-timeout");
>>>>>>>> + else
>>>>>>>> + /* We don't use pretimeout, trigger WS1 now*/
>>>>>>>> + sbsa_gwdt_set_wcv(wdd, 0);
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I don't like this.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> If so, what is your idea ,if pretimeout == 0?
>>>>>>
>>>>>> the reason of using WCV as (timout - pretimeout): it can provide the
>>>>>> longer timeout period,
>>>>>> (1)If we use WOR, it can only provide 10s @ 400MHz(max).
>>>>>> as Guenter said earlier, the default timer out for most watchdog will
>>>>>> be 30s, so I think 10s limit will be a little short
>>>>>> (2)we can always program WCV just like ping.
>>>>>> (3)if a timeout arrives, WOR will be use, so use it as pretimeout, but
>>>>>> we still can make this pretimeout longer by programming WCV(I don't
>>>>>> think it's necessary)
>>>>>>
>>>>>>
>>>>>>> The triggering of the hardware reset should never depend
>>>>>>> on an interrupt being handled properly.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> if this fail, system reset in 1S, because WOR == (1s)
>>>>>>
>>>>> So ?
>>>>
>>>>
>>>>
>>>> Even the interrupt routine isn't triggered, (WOR + system counter) -->
>>>> WCV,
>>>> then, sy system reset in 1S.
>>>>
>>>> the hardware reset doesn't depend on an interrupt.
>>>>
>>>>
>>>>>
>>>>>>> You should always program WCV
>>>>>>> correctly in advance. This is especially true since pre-timeout will
>>>>>>> probably rarely be used.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> always programming WCV is doable. But I absolutely can not agree "
>>>>>> pre-timeout will probably rarely be used"
>>>>>> If so, SBSA watchdog is just a normal watchdog, This use case just
>>>>>> makes this HW useless.
>>>>>> If so, go to use SP805.
>>>>>> you still don't see the importance of this warning and pretimeout to a
>>>>>> real server.
>>>>>>
>>>>>
>>>>> If pretimeout isn't used, why not just set WCV = timeout, WOR = 0 ?
>>>>
>>>>
>>>>
>>>> Because if WOR = 0 , according to SBSA, once you want to enable
>>>> watchdog,
>>>> (0 + system counter) --> WCV , then , WS0 and WS1 will be triggered
>>>> immediately.
>>>> we have not a chance(a time slot) to update WCV.
>>>>
>>>
>>> I would have thought that this is exactly what we want if pretimeout is
>>> not
>>> used.
>>
>>
>> Although pretimeout == 0 is not good for a server, but If
>> administrator set up pretimeout == 0. *I thinks we should trigger WS1
>> ASAP* . Because WS1 maybe a interrupter or a reboot signal, that is
>> why we can not reboot system directly.
>>
>> This driver is SBSA watchdog driver, that means we need to follow SBSA
>> spec:
>> (1) SBSA watchdog has two stage timeouts --> timeout and pretimeout is
>> the best solution in watchdog framework, at least for now.
>> (2) The watchdog has the following output signals:
>> Watchdog Signal 0 (WS0)---> "The initial signal is typically wired
>> to an interrupt and alerts the system."(original word from spec), I
>> thinks the key work should be "interrupt" and "alerts". So in WS0
>> interrupt routine, reset is absolutely a wrong operation. Although I
>> think we should make this "alerts" more useful. But for the first
>> version of driver, I thinks panic is useful enough.
>> Watchdog Signal 1 (WS1). ---> "The signal is fed to a higher agent
>> as an interrupt or reset for it to take executive action." (original
>> word from spec) . The key work should be "interrupt", "or" and "reset"
>> . So WS1 maybe a interrupt.
>> so even in the WS0 interrupt routine, if pretimeout == 0 , we need to
>> trigger WS1(that is what my patch is doing now, set WCV to 0, so WCV
>> is always less than SystemCounter, and in this situation(WS0 = TRUE),
>> WS1 will be trigger immediately), but definitely not a reset too.
>>
>> But in worst case, if the WS0 is triggered, but the interrupt routine
>> doesn't work(can not set up WCV), it doesn't matter, we just need to
>> wait for a WOR(1s in my driver) timeout, then WS1 will be triggered.
>> That is hardware mechanism, once we config SBSA watchdog correctly,
>> that should work. If it doesn't, I think the chip design doesn't
>> follow the SBSA spec.
>>
>> Make a summary here, for SBSA watchdog driver, it need to support two
>> stage timeouts and need to trigger WS0/1 when timeouts occur(can not
>> simply reset system in interrupt routines).
>> If a driver doesn't do these above, the driver can not be called SBSA
>> watchdog driver.
>>
>> But according to SBSA, even pretimeout == 0, we can not setup WOR = 0.
>> if we make WOR = 0 , once we set up WCS(enabling watchdog), that cause
>> a explicit watchdog refresh, then WCV = (0 + system counter), so WS0
>> and WS1 will be triggered serially and immediately(in theory, the
>
>
> I still don't understand why this would be a problem.

that 20+ lines pseudocode from the page 23 of SBSA spec 2.3 can
explain this well.
Please let me know which line confuse you, I will explain them one by one below.

>
>> "delay" also depend on implementation). So in my patchset , if
>> pretimeout == 0, WOR will be 1s at least to make sure we have time to
>> setup WCV. I have made comments in the patch for explaining this.
>>
>> Maybe some people want to ask: if we can not set up WOR = 0, but
>> pretimeout can be 0 and timeout can not, why I still want to use WOR
>> for pretimeout and using WCV as (timout - pretimeout) ??
>> For this:
>> (1)WCV can provide the longer timeout period, If we use WOR, it can
>> only provide 10s @ 400MHz(max). The default timer out for most
>> watchdog will be 30s, so I think 10s limit will be a little short.
>> (2)we can always program(write) WCV just like ping.
>> (3)if the first timeout occurs, WOR will be loaded to WCV(WOR + system
>> counter) automatically , so why not just use WOR as pretimeout?
>> Although we still can make this pretimeout longer by programming WCV,
>> I don't think it's necessary for now.
>>
>
> Too bad I don't have an arm64 system to test myself. I am not sure
> I understand why WOR must be set to > 0 if pretimeout == 0, and even

Although SBSA spec has published for years(2.2 was published at 15th
Jan 2014), but this is still a relatively new spec of hardware, for
now . I only has Foundation model and FAST model for testing.
But I would like to suggest: this driver should follow the SBSA spec
first, but not a specific hardware, even some of hardware has this
watchdog, but there maybe some hardware bug in it(at least I know
there is one).

I am using 1 second, just because I think 1 second is safe enough to
reload WCV. (from write "1" in WCS to reload WCV[31:0] and WCV[63:32]
)
Actually, in most of case , we need WOR > 1, only when we enable the
watchdog( write "1" to WCS, this causes a ExplicitRefresh, see Line 7
"ExplicitRefresh == TRUE" and Line 8 below).
in this cause , we just need a time slot to reload WCV, the driver
doesn't really spend 1s on this.
But If WOR ==0, (Line 1) TimeoutRefresh = TRUE in a very short time,
the minimum time depends on the implementation. Then The first timeout
stage occur(see Line 7 (TimeoutRefresh == TRUE and WS0 == FALSE), and
Line 8 below again ) and WS0 was triggered(Line 16~18). Because WOR
==0 too, (Line 1) TimeoutRefresh = TRUE occur in a very short time
again, then see Line 16, 17, 20

In worst case, if pretimeout is 0, First timeout stage occur(see Line
7 (TimeoutRefresh == TRUE and WS0 == FALSE), and Line 8 below, then
Line 16~18) but the interrupt routine doesn't work, we can still got
WS1 in the time we configured in WOR. (Line 1) TimeoutRefresh = TRUE
occur when SystemCounter[63:0] > SystemCounter[63:0] +
ZeroExtend(WOR[31:0]), then Line 16, 17, 20

---------------------------------------------------
CompareValue : WCV
WatchdogOffsetValue : WOR


1 TimeoutRefresh = (SystemCounter[63:0] > CompareValue[63:0])

2 If WatchdogColdReset
3 WatchdogEnable = DISABLED
4 Endif

5 If LoadNewCompareValue
6 CompareValue = new_value
7 ElseIf ExplicitRefresh == TRUE or (TimeoutRefresh == TRUE and WS0 == FALSE)
8 CompareValue = SystemCounter[63:0] +
ZeroExtend(WatchdogOffsetValue[31:0])
9 Endif

10 If WatchdogEnable == DISABLED
11 WS0 = FALSE
12 WS1 = FALSE
13 ElseIf ExplicitRefresh == TRUE
14 WS0 = FALSE
15 WS1 = FALSE
16 ElseIf TimeoutRefresh == TRUE
17 If WS0 == FALSE
18 WS0 = TRUE
19 Else
20 WS1 = TRUE
21 Endif
22 Endif
---------------------------------------------------

I would like to stress that pretimeout == 0 should not happen in a
real server system, that is why we defined a SBSA watchdog, but not
a normal one
But we still need to thinking about the situation that administrator
want to do this on a very special purpose.

maybe we can set min_pretimeout = 1 for this driver. that is just a suggestion.


> if it must be set to a value > 0 I don't understand why
> setting it to 1 (instead of 1 second) would not be sufficient.

it don't have to be 1 second , it can be 0.1, 0.5 or 0.01 second. like
I said before, we just need a time slot to setup WCV in
sbsa_gwdt_start. I have commented this in sbsa_gwdt_set_pretimeout in
this patchset.
But I think the minimum time slot depend on implementation, the spec
doesn't mention about this.
If pretimeout == 0, and we set up WOR a little bigger, it *ONLY*
affect "the worst case" I mention above. in this case, a administrator
set up pretimeout == 0 which should not happen in a real server, then
the system goes very wrong. I don't think it is important that this
server system reset in 1 second or 0.0001 second, at this time, this
server can not provide any useful info anyway(because we don't use
pretimeout).
If system may go wrong, the administrator should set up pretimeout >
0 to figure out what is wrong with it just like he always should do.

>
> Guenter
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-09 08:05:44

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/08/2015 11:37 PM, Fu Wei wrote:

>
> I would like to stress that pretimeout == 0 should not happen in a
> real server system, that is why we defined a SBSA watchdog, but not
> a normal one

Clarification - In _your opinion_, a server should always use pretimeout.

> But we still need to thinking about the situation that administrator
> want to do this on a very special purpose.
>
I could as well argue that setting pretimeout is the special situation.
Some administrators may not want to bother but just want the system to reset
if a watchdog timeout occurs. _Maybe_ if it happens multiple times,
they might want to set up pretimeout to figure out why.

Declaring that one _has_ to configure pretimeout is just a personal
opinion, nothing else. We don't know what server administrators want,
and we should not dictate anything unless technically necessary.

> maybe we can set min_pretimeout = 1 for this driver. that is just a suggestion.
>
No. It is not technically necessary.

>
>> if it must be set to a value > 0 I don't understand why
>> setting it to 1 (instead of 1 second) would not be sufficient.
>
> it don't have to be 1 second , it can be 0.1, 0.5 or 0.01 second. like
> I said before, we just need a time slot to setup WCV in
> sbsa_gwdt_start. I have commented this in sbsa_gwdt_set_pretimeout in
> this patchset.

I think what you really want to say is that you want to have time to
handle the interrupt. But handling the interrupt is not asked for if
pretimeout == 0.

> But I think the minimum time slot depend on implementation, the spec
> doesn't mention about this.

Yes, because a minimum value does not exist.

> If pretimeout == 0, and we set up WOR a little bigger, it *ONLY*
> affect "the worst case" I mention above. in this case, a administrator
> set up pretimeout == 0 which should not happen in a real server, then
> the system goes very wrong. I don't think it is important that this
> server system reset in 1 second or 0.0001 second, at this time, this
> server can not provide any useful info anyway(because we don't use
> pretimeout).
> If system may go wrong, the administrator should set up pretimeout >
> 0 to figure out what is wrong with it just like he always should do.
>

Yes, exactly. But otherwise, if pretimeout is set to 0, we want
to reset immediately as directed.

As I interpret the specification, WOR=0 forces WS1 immediately after WS0.

1) if TimeoutRefresh = True:
CompareValue := SystemCounter + WOR (= SystemCounter
WS0 := True
2) TimeoutRefresh is True again, WS0 == True:
WS1 = True

This is exactly the behavior we want if pretimeout == 0. In this situation,
we don't want to handle the interrupt, we just want to reset the system as
fast as possible.

Having said that, have you tested what happens in your system if you
set WOR=0 ?

Thanks,
Guenter

2015-06-09 10:46:25

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,

Thanks for your feedback

On 9 June 2015 at 16:04, Guenter Roeck <[email protected]> wrote:
> On 06/08/2015 11:37 PM, Fu Wei wrote:
>
>>
>> I would like to stress that pretimeout == 0 should not happen in a
>> real server system, that is why we defined a SBSA watchdog, but not
>> a normal one
>
>
> Clarification - In _your opinion_, a server should always use pretimeout.
>
>> But we still need to thinking about the situation that administrator
>> want to do this on a very special purpose.
>>
> I could as well argue that setting pretimeout is the special situation.
> Some administrators may not want to bother but just want the system to reset
> if a watchdog timeout occurs. _Maybe_ if it happens multiple times,
> they might want to set up pretimeout to figure out why.

you are right,
before this driver and device are really used in some server, we don't
know How do administrators use this pretimeout,
and the percentage of using pretimeout, But here you did mention a
good usage of pretimeout: we can figure out what is wrong with system.

except the usages, this two stage timeouts is a main feature of SBSA
watchdog, if we drop this feature, I don't think that is a SBSA
watchdog driver. it becomes normal watchdog driver using SBSA watchdog
hardware.

>
> Declaring that one _has_ to configure pretimeout is just a personal
> opinion, nothing else. We don't know what server administrators want,
> and we should not dictate anything unless technically necessary.

yes, agree with you on "we should not dictate anything unless
technically necessary."

>
>> maybe we can set min_pretimeout = 1 for this driver. that is just a
>> suggestion.
>>
> No. It is not technically necessary.

it is just a thought. but I never do that , because min_pretimeout = 0
is technically doable in this driver.

>
>>
>>> if it must be set to a value > 0 I don't understand why
>>> setting it to 1 (instead of 1 second) would not be sufficient.
>>
>>
>> it don't have to be 1 second , it can be 0.1, 0.5 or 0.01 second. like
>> I said before, we just need a time slot to setup WCV in
>> sbsa_gwdt_start. I have commented this in sbsa_gwdt_set_pretimeout in
>> this patchset.
>
>
> I think what you really want to say is that you want to have time to
> handle the interrupt. But handling the interrupt is not asked for if
> pretimeout == 0.

No, that is not what I want to say. pretimeout == 0 but WOR can not be
0 is nothing to do with interrupt.
In another word, pretimeout == 0 we should trigger WS1 ASAP, we don't
need to handle the interrupt.
I have explained this in previous email:

"
we need WOR > 1, only when we enable the watchdog( write "1" to WCS,
this causes a ExplicitRefresh, see Line 7 "ExplicitRefresh == TRUE"
and Line 8 below).
in this cause , we just need a time slot to reload WCV, the driver
doesn't really spend 1s on this.
But If WOR ==0, (Line 1) TimeoutRefresh = TRUE in a very short time,
the minimum time depends on the implementation. Then The first timeout
stage occur(see Line 7 (TimeoutRefresh == TRUE and WS0 == FALSE), and
Line 8 below again ) and WS0 was triggered(Line 16~18). Because WOR
==0 too, (Line 1) TimeoutRefresh = TRUE occur in a very short time
again, then see Line 16, 17, 20
"
in one word, according to SBSA, If we make WOR == 0, once we enable
watchdog, WS0 and WS1 will be triggered immediately, we have not
chance to set up WCV for (timeout - pretimeout).
In another word, WOR == 0, we can not enable watchdog, enabling
become a warn reset button.

>
>> But I think the minimum time slot depend on implementation, the spec
>> doesn't mention about this.
>
>
> Yes, because a minimum value does not exist.
>
>> If pretimeout == 0, and we set up WOR a little bigger, it *ONLY*
>> affect "the worst case" I mention above. in this case, a administrator
>> set up pretimeout == 0 which should not happen in a real server, then
>> the system goes very wrong. I don't think it is important that this
>> server system reset in 1 second or 0.0001 second, at this time, this
>> server can not provide any useful info anyway(because we don't use
>> pretimeout).
>> If system may go wrong, the administrator should set up pretimeout >
>> 0 to figure out what is wrong with it just like he always should do.
>>
>
> Yes, exactly. But otherwise, if pretimeout is set to 0, we want
> to reset immediately as directed.

In this driver, if pretimeout is set to 0, we want to *trigger WS1*
immediately as directed.
Yes, If we could.

>
> As I interpret the specification, WOR=0 forces WS1 immediately after WS0.
>
> 1) if TimeoutRefresh = True:
> CompareValue := SystemCounter + WOR (= SystemCounter
> WS0 := True
> 2) TimeoutRefresh is True again, WS0 == True:
> WS1 = True
>
> This is exactly the behavior we want if pretimeout == 0. In this situation,
> we don't want to handle the interrupt, we just want to reset the system as
> fast as possible.

Yes, if WOR only affect in TimeoutRefresh, we cat always make WOR == pretimeout
But the problem is if we enable watchdog (write 0x01 to WCS will
cause an explicit watchdog refresh), then
1) if ExplicitRefresh = True:
CompareValue := SystemCounter + WOR
WS0 := True
2) TimeoutRefresh is True again, WS0 == True:
WS1 = True

so once we enable watchdog, system reset, that is not what we want.
this behavior is following SBSA spec.

FYI:
-----
An explicit watchdog refresh occurs when one of a number of different
events occur:
The Watchdog Refresh Register is written.
The Watchdog Offset Register is written.
The Watchdog Control and Status register is written.
-----

>
> Having said that, have you tested what happens in your system if you
> set WOR=0 ?

I have not HW, but I have tested it on Foundation model, the result is
just what I expect:
-------------------------------------
pretimeout=0 ,

and

static int sbsa_gwdt_set_pretimeout(struct watchdog_device *wdd,
unsigned int pretimeout)
{
struct sbsa_gwdt *gwdt = to_sbsa_gwdt(wdd);

wdd->pretimeout = pretimeout;

/* refresh the WOR, that will cause an explicit watchdog refresh */
sbsa_gwdt_cf_write(SBSA_GWDT_WOR, pretimeout * gwdt->clk, wdd);

return 0;
}
-------------------------------------
Once I enable watchdog, system down(there is a WS1 interrupter in
Foundation model, but not more info for that), there is not panic(if
ws0 is triggered).

>
> Thanks,
> Guenter
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-09 16:24:08

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/09/2015 03:46 AM, Fu Wei wrote:

> Yes, if WOR only affect in TimeoutRefresh, we cat always make WOR == pretimeout
> But the problem is if we enable watchdog (write 0x01 to WCS will
> cause an explicit watchdog refresh), then
> 1) if ExplicitRefresh = True:
> CompareValue := SystemCounter + WOR
> WS0 := True
> 2) TimeoutRefresh is True again, WS0 == True:
> WS1 = True
>
> so once we enable watchdog, system reset, that is not what we want.
> this behavior is following SBSA spec.
>

Ok, I admit I am a bit slow ;-).

WS0 := True would be set the next time around, since

if ExplicitRefresh == True
WS0 = False
WS1 = False

but I see your point. Essentially, the specification is broken
for all practical purposes, since, as you point out, enabling
the watchdog overwrites and explicitly sets WCV. Effectively
this means that just using WCV to program the timeout period
is not really possible.

I am not really sure how to address this. We can either only use WOR,
and forget about pretimeout, or we can enforce a minimum pretimeout.
In the latter case, we'll have to write WCV after writing WOR.

Thanks,
Guenter

2015-06-09 16:30:06

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/09/2015 11:22 AM, Guenter Roeck wrote:
>
>
> but I see your point. Essentially, the specification is broken
> for all practical purposes, since, as you point out, enabling
> the watchdog overwrites and explicitly sets WCV. Effectively
> this means that just using WCV to program the timeout period
> is not really possible.
>
> I am not really sure how to address this. We can either only use WOR,
> and forget about pretimeout, or we can enforce a minimum pretimeout.
> In the latter case, we'll have to write WCV after writing WOR.

In talking with our hardware engineers, using WCV to program the timeout
period is not a valid operation. This is why I keep arguing against the
pre-timeout feature, and I don't agree that servers should always use
pre-timeout.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

2015-06-09 16:46:20

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/09/2015 09:29 AM, Timur Tabi wrote:
> On 06/09/2015 11:22 AM, Guenter Roeck wrote:
>>
>>
>> but I see your point. Essentially, the specification is broken
>> for all practical purposes, since, as you point out, enabling
>> the watchdog overwrites and explicitly sets WCV. Effectively
>> this means that just using WCV to program the timeout period
>> is not really possible.
>>
>> I am not really sure how to address this. We can either only use WOR,
>> and forget about pretimeout, or we can enforce a minimum pretimeout.
>> In the latter case, we'll have to write WCV after writing WOR.
>
> In talking with our hardware engineers, using WCV to program the timeout period is not a valid operation. This is why I keep arguing against the pre-timeout feature, and I don't agree that servers should always use pre-timeout.
>

Not sure if "not valid" is correct - after all, it is mentioned in the
specification. However, it is at the very least fragile.

I tend to agree that we should just forget about pretimeout and
use your original approach, where the timeout value is used
to program WOR. Everything else is really just asking for trouble.

Guenter

2015-06-09 16:54:01

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/09/2015 11:45 AM, Guenter Roeck wrote:

> I tend to agree that we should just forget about pretimeout and
> use your original approach, where the timeout value is used
> to program WOR. Everything else is really just asking for trouble.

The driver that I submitted is effectively the same as Fu's, except
without pre-timeout support.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

2015-06-10 03:41:38

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,

On 10 June 2015 at 00:45, Guenter Roeck <[email protected]> wrote:
> On 06/09/2015 09:29 AM, Timur Tabi wrote:
>>
>> On 06/09/2015 11:22 AM, Guenter Roeck wrote:
>>>
>>>
>>>
>>> but I see your point. Essentially, the specification is broken
>>> for all practical purposes, since, as you point out, enabling
>>> the watchdog overwrites and explicitly sets WCV. Effectively
>>> this means that just using WCV to program the timeout period
>>> is not really possible.
>>>
>>> I am not really sure how to address this. We can either only use WOR,
>>> and forget about pretimeout, or we can enforce a minimum pretimeout.
>>> In the latter case, we'll have to write WCV after writing WOR.
>>
>>
>> In talking with our hardware engineers, using WCV to program the timeout
>> period is not a valid operation. This is why I keep arguing against the
>> pre-timeout feature, and I don't agree that servers should always use
>> pre-timeout.
>>
>
> Not sure if "not valid" is correct - after all, it is mentioned in the
> specification. However, it is at the very least fragile.

I think we should focus on SBSA spec, but not a specific chip design,
because this is SBSA watchdog, not a driver for an IP core from a
specific chip vendor.
this operation is mentioned in the spec,
and I have tested my driver on Foundation model(from ARM) and a real hardware.

>
> I tend to agree that we should just forget about pretimeout and
> use your original approach, where the timeout value is used
> to program WOR. Everything else is really just asking for trouble.

I don't mind if we give up pretimeout, The reason I use pretimeout is:
this concept matches the function of two stage timeouts.

but, If we give up pretimeout, could you give me a suggestion:

How to config the two stage timeouts
(1)from enabling watchdog to WS0
(2)the time from WS1 to WS0

If we only have one timeout parameter, How to config the two stage timeouts?
Any suggestion ?

If we make the first stage timeout is timeout/2, this violates the
definition of timeout.
I don't think users expect interrupt, panic or reboot at timeout/2.

And WS1 definitely isn't a backup of WS0.

>
> Guenter
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-10 04:20:29

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,

On 10 June 2015 at 11:41, Fu Wei <[email protected]> wrote:
> Hi Guenter,
>
> On 10 June 2015 at 00:45, Guenter Roeck <[email protected]> wrote:
>> On 06/09/2015 09:29 AM, Timur Tabi wrote:
>>>
>>> On 06/09/2015 11:22 AM, Guenter Roeck wrote:
>>>>
>>>>
>>>>
>>>> but I see your point. Essentially, the specification is broken
>>>> for all practical purposes, since, as you point out, enabling
>>>> the watchdog overwrites and explicitly sets WCV. Effectively
>>>> this means that just using WCV to program the timeout period
>>>> is not really possible.
>>>>
>>>> I am not really sure how to address this. We can either only use WOR,
>>>> and forget about pretimeout, or we can enforce a minimum pretimeout.
>>>> In the latter case, we'll have to write WCV after writing WOR.
>>>
>>>
>>> In talking with our hardware engineers, using WCV to program the timeout
>>> period is not a valid operation. This is why I keep arguing against the
>>> pre-timeout feature, and I don't agree that servers should always use
>>> pre-timeout.
>>>
>>
>> Not sure if "not valid" is correct - after all, it is mentioned in the
>> specification. However, it is at the very least fragile.
>
> I think we should focus on SBSA spec, but not a specific chip design,
> because this is SBSA watchdog, not a driver for an IP core from a
> specific chip vendor.
> this operation is mentioned in the spec,
> and I have tested my driver on Foundation model(from ARM) and a real hardware.
>
>>
>> I tend to agree that we should just forget about pretimeout and
>> use your original approach, where the timeout value is used
>> to program WOR. Everything else is really just asking for trouble.

Another weakness of only using WOR is the timeout limited by this
32bit register.
10s @400MHz generic Timer

I don't think this limit is good for a server, once the server is in a
heavy load

>
> I don't mind if we give up pretimeout, The reason I use pretimeout is:
> this concept matches the function of two stage timeouts.
>
> but, If we give up pretimeout, could you give me a suggestion:
>
> How to config the two stage timeouts
> (1)from enabling watchdog to WS0
> (2)the time from WS1 to WS0
>
> If we only have one timeout parameter, How to config the two stage timeouts?
> Any suggestion ?
>
> If we make the first stage timeout is timeout/2, this violates the
> definition of timeout.
> I don't think users expect interrupt, panic or reboot at timeout/2.
>
> And WS1 definitely isn't a backup of WS0.
>
>>
>> Guenter
>>
>
>
>
> --
> Best regards,
>
> Fu Wei
> Software Engineer
> Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
> Ph: +86 21 61221326(direct)
> Ph: +86 186 2020 4684 (mobile)
> Room 1512, Regus One Corporate Avenue,Level 15,
> One Corporate Avenue,222 Hubin Road,Huangpu District,
> Shanghai,China 200021



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-10 14:22:52

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Fu Wei wrote:
> Another weakness of only using WOR is the timeout limited by this
> 32bit register.
> 10s @400MHz generic Timer
>
> I don't think this limit is good for a server, once the server is in a
> heavy load

Perhaps, but if that's the limitation of the hardware, then so be it.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

2015-06-10 14:37:02

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 10 June 2015 at 22:22, Timur Tabi <[email protected]> wrote:
> Fu Wei wrote:
>>
>> Another weakness of only using WOR is the timeout limited by this
>> 32bit register.
>> 10s @400MHz generic Timer
>>
>> I don't think this limit is good for a server, once the server is in a
>> heavy load
>
>
> Perhaps, but if that's the limitation of the hardware, then so be it.

it is not the real limitation, if we use WCV, SBSA spec has mentioned that.

------ SBSA 2.3-------Page 23 ----------
Note: the watchdog offset register is 32 bits wide. This gives a
maximum watch period of around 10s at a system
counter frequency of 400MHz. If a larger watch period is required then
the compare value can be programmed
directly into the compare value register
------ SBSA 2.3-------Page 23 ----------

if some hardware doesn't support writing WCV,
that means this hardware is non-standard SBSA watchdog.

>
> --
> Sent by an employee of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the
> Code Aurora Forum, hosted by The Linux Foundation.



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-10 14:41:17

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 10 June 2015 at 22:36, Fu Wei <[email protected]> wrote:
> On 10 June 2015 at 22:22, Timur Tabi <[email protected]> wrote:
>> Fu Wei wrote:
>>>
>>> Another weakness of only using WOR is the timeout limited by this
>>> 32bit register.
>>> 10s @400MHz generic Timer
>>>
>>> I don't think this limit is good for a server, once the server is in a
>>> heavy load
>>
>>
>> Perhaps, but if that's the limitation of the hardware, then so be it.
>
> it is not the real limitation, if we use WCV, SBSA spec has mentioned that.
>
> ------ SBSA 2.3-------Page 23 ----------
> Note: the watchdog offset register is 32 bits wide. This gives a
> maximum watch period of around 10s at a system
> counter frequency of 400MHz. If a larger watch period is required then
> the compare value can be programmed
> directly into the compare value register
> ------ SBSA 2.3-------Page 23 ----------

------ SBSA 2.3-------Page 24 ----------
0x010 – 0x013 WCV[31:0]
0x014 – 0x017 WCV[63:32]
Watchdog compare value.
*Read/Write* registers
containing the current value in the watchdog compare
register.
------ SBSA 2.3-------Page 24 ----------

>
> if some hardware doesn't support writing WCV,
> that means this hardware is non-standard SBSA watchdog.
>
>>
>> --
>> Sent by an employee of the Qualcomm Innovation Center, Inc.
>> The Qualcomm Innovation Center, Inc. is a member of the
>> Code Aurora Forum, hosted by The Linux Foundation.
>
>
>
> --
> Best regards,
>
> Fu Wei
> Software Engineer
> Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
> Ph: +86 21 61221326(direct)
> Ph: +86 186 2020 4684 (mobile)
> Room 1512, Regus One Corporate Avenue,Level 15,
> One Corporate Avenue,222 Hubin Road,Huangpu District,
> Shanghai,China 200021



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-10 15:41:33

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,


On 10 June 2015 at 11:41, Fu Wei <[email protected]> wrote:
> Hi Guenter,
>
> On 10 June 2015 at 00:45, Guenter Roeck <[email protected]> wrote:
>> On 06/09/2015 09:29 AM, Timur Tabi wrote:
>>>
>>> On 06/09/2015 11:22 AM, Guenter Roeck wrote:
>>>>
>>>>
>>>>
>>>> but I see your point. Essentially, the specification is broken
>>>> for all practical purposes, since, as you point out, enabling
>>>> the watchdog overwrites and explicitly sets WCV. Effectively
>>>> this means that just using WCV to program the timeout period
>>>> is not really possible.
>>>>
>>>> I am not really sure how to address this. We can either only use WOR,
>>>> and forget about pretimeout, or we can enforce a minimum pretimeout.
>>>> In the latter case, we'll have to write WCV after writing WOR.
>>>
>>>
>>> In talking with our hardware engineers, using WCV to program the timeout
>>> period is not a valid operation. This is why I keep arguing against the
>>> pre-timeout feature, and I don't agree that servers should always use
>>> pre-timeout.
>>>
>>
>> Not sure if "not valid" is correct - after all, it is mentioned in the
>> specification. However, it is at the very least fragile.
>
> I think we should focus on SBSA spec, but not a specific chip design,
> because this is SBSA watchdog, not a driver for an IP core from a
> specific chip vendor.
> this operation is mentioned in the spec,
> and I have tested my driver on Foundation model(from ARM) and a real hardware.
>
>>
>> I tend to agree that we should just forget about pretimeout and
>> use your original approach, where the timeout value is used
>> to program WOR. Everything else is really just asking for trouble.
>
> I don't mind if we give up pretimeout, The reason I use pretimeout is:
> this concept matches the function of two stage timeouts.
>
> but, If we give up pretimeout, could you give me a suggestion:
>
> How to config the two stage timeouts
> (1)from enabling watchdog to WS0
> (2)the time from WS1 to WS0
>
> If we only have one timeout parameter, How to config the two stage timeouts?
> Any suggestion ?

I have another thought for this, please allow me to sent anther
patchset in a day. see if you like it.

>
> If we make the first stage timeout is timeout/2, this violates the
> definition of timeout.
> I don't think users expect interrupt, panic or reboot at timeout/2.
>
> And WS1 definitely isn't a backup of WS0.
>
>>
>> Guenter
>>
>
>
>
> --
> Best regards,
>
> Fu Wei
> Software Engineer
> Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
> Ph: +86 21 61221326(direct)
> Ph: +86 186 2020 4684 (mobile)
> Room 1512, Regus One Corporate Avenue,Level 15,
> One Corporate Avenue,222 Hubin Road,Huangpu District,
> Shanghai,China 200021



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-10 17:54:32

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter,

On 10 June 2015 at 23:38, Fu Wei <[email protected]> wrote:
> Hi Guenter,
>
>
> On 10 June 2015 at 11:41, Fu Wei <[email protected]> wrote:
>> Hi Guenter,
>>
>> On 10 June 2015 at 00:45, Guenter Roeck <[email protected]> wrote:
>>> On 06/09/2015 09:29 AM, Timur Tabi wrote:
>>>>
>>>> On 06/09/2015 11:22 AM, Guenter Roeck wrote:
>>>>>
>>>>>
>>>>>
>>>>> but I see your point. Essentially, the specification is broken
>>>>> for all practical purposes, since, as you point out, enabling
>>>>> the watchdog overwrites and explicitly sets WCV. Effectively
>>>>> this means that just using WCV to program the timeout period
>>>>> is not really possible.
>>>>>
>>>>> I am not really sure how to address this. We can either only use WOR,
>>>>> and forget about pretimeout, or we can enforce a minimum pretimeout.
>>>>> In the latter case, we'll have to write WCV after writing WOR.
>>>>
>>>>
>>>> In talking with our hardware engineers, using WCV to program the timeout
>>>> period is not a valid operation. This is why I keep arguing against the
>>>> pre-timeout feature, and I don't agree that servers should always use
>>>> pre-timeout.
>>>>
>>>
>>> Not sure if "not valid" is correct - after all, it is mentioned in the
>>> specification. However, it is at the very least fragile.
>>
>> I think we should focus on SBSA spec, but not a specific chip design,
>> because this is SBSA watchdog, not a driver for an IP core from a
>> specific chip vendor.
>> this operation is mentioned in the spec,
>> and I have tested my driver on Foundation model(from ARM) and a real hardware.
>>
>>>
>>> I tend to agree that we should just forget about pretimeout and
>>> use your original approach, where the timeout value is used
>>> to program WOR. Everything else is really just asking for trouble.
>>
>> I don't mind if we give up pretimeout, The reason I use pretimeout is:
>> this concept matches the function of two stage timeouts.
>>
>> but, If we give up pretimeout, could you give me a suggestion:
>>
>> How to config the two stage timeouts
>> (1)from enabling watchdog to WS0
>> (2)the time from WS1 to WS0
>>
>> If we only have one timeout parameter, How to config the two stage timeouts?
>> Any suggestion ?
>
> I have another thought for this, please allow me to sent anther
> patchset in a day. see if you like it.

I have sent a non-pretimeout version patchset, please let me know if
you like the non-pretimeout version more.

Great thanks for your time. :-)

>
>>
>> If we make the first stage timeout is timeout/2, this violates the
>> definition of timeout.
>> I don't think users expect interrupt, panic or reboot at timeout/2.
>>
>> And WS1 definitely isn't a backup of WS0.
>>
>>>
>>> Guenter
>>>
>>
>>
>>
>> --
>> Best regards,
>>
>> Fu Wei
>> Software Engineer
>> Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
>> Ph: +86 21 61221326(direct)
>> Ph: +86 186 2020 4684 (mobile)
>> Room 1512, Regus One Corporate Avenue,Level 15,
>> One Corporate Avenue,222 Hubin Road,Huangpu District,
>> Shanghai,China 200021
>
>
>
> --
> Best regards,
>
> Fu Wei
> Software Engineer
> Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
> Ph: +86 21 61221326(direct)
> Ph: +86 186 2020 4684 (mobile)
> Room 1512, Regus One Corporate Avenue,Level 15,
> One Corporate Avenue,222 Hubin Road,Huangpu District,
> Shanghai,China 200021



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-11 00:22:50

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Fu Wei wrote:
> If we make the first stage timeout is timeout/2, this violates the
> definition of timeout.

The documentation says that the hardware needs to reset after the
timeout expires. If you program the hardware to timeout/2, the driver
can ignore WS0 and allow WS1 to reset the hardware. That conforms to
the documentation.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

2015-06-11 03:01:03

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 11 June 2015 at 08:22, Timur Tabi <[email protected]> wrote:
> Fu Wei wrote:
>>
>> If we make the first stage timeout is timeout/2, this violates the
>> definition of timeout.
>
>
> The documentation says that the hardware needs to reset after the timeout
> expires.

yes , you are absolutely on this. Great thanks for point out this.
my patch is doing this way: trigger WS1 at the timeout expires
Before that, using pretimeout for WS0 as an warning.


> If you program the hardware to timeout/2, the driver can ignore
> WS0 and allow WS1 to reset the hardware. That conforms to the
> documentation.

yes, technically we can do nothing in WS0, and just wait for WS1.
So SBSA watchdog become a one stage timeout watchdog, right?
So if user want the WS0 warning, How to make driver know that, and do
something in WS0 routine?
I think by this way, two stage timeouts, WS0 "alert" will be meaningless.

Could you suggest a good way to use WS0, so we can follow SBSA spec?

>
> --
> Sent by an employee of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the
> Code Aurora Forum, hosted by The Linux Foundation.



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-11 03:45:37

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Fu Wei wrote:
> Could you suggest a good way to use WS0, so we can follow SBSA spec?

To avoid the timeout/2 problem, WS0 calls panic, which is the "real"
timeout/reset. WS1 is then a "backup" that is ignored by the driver.
That is, the driver doesn't do anything with WS1 and it doesn't tell the
kernel about WS1.

I know you don't like the idea of WS1 as a backup timeout, but this is
one way to solve the problem.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

2015-06-11 05:14:03

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 06/10/2015 08:45 PM, Timur Tabi wrote:
> Fu Wei wrote:
>> Could you suggest a good way to use WS0, so we can follow SBSA spec?
>
> To avoid the timeout/2 problem, WS0 calls panic, which is the "real" timeout/reset. WS1 is then a "backup" that is ignored by the driver. That is, the driver doesn't do anything with WS1 and it doesn't tell the kernel about WS1.
>
> I know you don't like the idea of WS1 as a backup timeout, but this is one way to solve the problem.
>
This is what I would do.

Guenter

2015-06-11 05:32:28

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

On 11 June 2015 at 11:45, Timur Tabi <[email protected]> wrote:
> Fu Wei wrote:
>>
>> Could you suggest a good way to use WS0, so we can follow SBSA spec?
>
>
> To avoid the timeout/2 problem, WS0 calls panic, which is the "real"
> timeout/reset. WS1 is then a "backup" that is ignored by the driver. That
> is, the driver doesn't do anything with WS1 and it doesn't tell the kernel
> about WS1.

could you suggestion :
(1) how to config the time from WS0 to WS1 without touching timeout.
(2) if we don't config the time from WS0 to WS1, what is the better
time for ALL server and ALL situation

>
> I know you don't like the idea of WS1 as a backup timeout, but this is one
> way to solve the problem.

It is not "I don't like"
WS1 is not a backup in SBSA spec. Page 21 :
"The initial signal is typically wired to an interrupt and alerts the system. "

If this is the backup, it should say "reset the system"

>
>
> --
> Sent by an employee of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the
> Code Aurora Forum, hosted by The Linux Foundation.



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

2015-06-11 05:33:28

by Fu Wei

[permalink] [raw]
Subject: Re: [PATCH v4 5/7] Watchdog: introduce ARM SBSA watchdog driver

Hi Guenter

On 11 June 2015 at 13:13, Guenter Roeck <[email protected]> wrote:
> On 06/10/2015 08:45 PM, Timur Tabi wrote:
>>
>> Fu Wei wrote:
>>>
>>> Could you suggest a good way to use WS0, so we can follow SBSA spec?
>>
>>
>> To avoid the timeout/2 problem, WS0 calls panic, which is the "real"
>> timeout/reset. WS1 is then a "backup" that is ignored by the driver. That
>> is, the driver doesn't do anything with WS1 and it doesn't tell the kernel
>> about WS1.
>>
>> I know you don't like the idea of WS1 as a backup timeout, but this is one
>> way to solve the problem.
>>
> This is what I would do.

If so, Please check the non-pretimeout version. Thanks

>
> Guenter
>



--
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021