2018-06-20 10:38:23

by Dmitry Vyukov

Subject: [PATCH] include/asm-generic/bug.h: clarify valid uses of WARN()

From: Dmitry Vyukov <[email protected]>

Explicitly state that WARN*() should be used only for recoverable
kernel issues/bugs and that it should not be used for any kind of
invalid external inputs or transient conditions.

Motivation: it's a very useful capability to be able to understand
if a particular kernel splat means a kernel bug or simply an invalid
user-space program. For the former one wants to notify kernel developers,
while notifying kernel developers for the latter is annoying.
Even a kernel developer may not know what to do with a WARNING
in an unfamiliar subsystem. This is especially critical for any automated
testing systems that may use panic_on_warn and mail kernel developers.

The clear separation also serves as additional documentation:
is it a condition that must never occur because of additional
checks/logic elsewhere? Or is it simply a check for invalid inputs
or unfortunate conditions?

Use of pr_err() for user messages also leads to better error messages.
"Something is wrong in file foo on line X" is not a particularly useful
message for an end user. pr_err() forces developers to write more
meaningful error messages for users.

As of now we are almost there. We are doing systematic kernel testing
with panic_on_warn and are not seeing massive amounts of false positives.
But every now and then another WARN on ENOMEM or invalid inputs pops up
and leads to a lengthy argument each time. The goal of this change
is to officially document the rules.

Signed-off-by: Dmitry Vyukov <[email protected]>
---
include/asm-generic/bug.h | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index a7613e1b0c87..20561a60db9c 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -75,9 +75,19 @@ struct bug_entry {

/*
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
- * significant issues that need prompt attention if they should ever
- * appear at runtime. Use the versions with printk format strings
- * to provide better diagnostics.
+ * significant kernel issues that need prompt attention if they should ever
+ * appear at runtime.
+ *
+ * Do not use these macros when checking for invalid external inputs
+ * (e.g. invalid system call arguments, or invalid data coming from
+ * network/devices), and on transient conditions like ENOMEM or EAGAIN.
+ * These macros should be used for recoverable kernel issues only.
+ * For invalid external inputs, transient conditions, etc use
+ * pr_err[_once/_ratelimited]() followed by dump_stack(), if necessary.
+ * Do not include "BUG"/"WARNING" in format strings manually to make these
+ * conditions distinguishable from kernel issues.
+ *
+ * Use the versions with printk format strings to provide better diagnostics.
*/
#ifndef __WARN_TAINT
extern __printf(3, 4)
--
2.18.0.rc1.244.gcf134e6275-goog
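
To make the rule in the comment above concrete, here is a minimal userspace sketch of the distinction. It is only an illustration: WARN_ON() and pr_err() below are simplified stand-ins written for this example, not the kernel's definitions, and struct demo_queue, demo_resize() and warn_count are hypothetical names that do not appear in the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-ins for illustration only. The real macros live in
 * include/asm-generic/bug.h and include/linux/printk.h and do much more
 * (stack dump, taint, panic_on_warn handling).
 */
static int warn_count; /* number of emulated WARN splats so far */

static int warn_on(int cond, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	}
	return cond;
}

#define WARN_ON(cond) warn_on(!!(cond), __FILE__, __LINE__)
#define pr_err(...)   fprintf(stderr, __VA_ARGS__)

/* Hypothetical object whose fields are maintained only by "kernel" code. */
struct demo_queue {
	size_t len;
	size_t cap;
};

/* Hypothetical handler for a length requested by user space. */
static int demo_resize(struct demo_queue *q, long requested)
{
	/*
	 * Internal invariant: only our own code writes len and cap, so
	 * len > cap can only mean a bug on our side. That is what WARN*()
	 * is for. The issue is recoverable, so refuse the operation.
	 */
	if (WARN_ON(q->len > q->cap))
		return -EIO;

	/*
	 * Invalid external input: a caller can trigger this at will, so it
	 * is not a bug on our side. Report it with a meaningful message
	 * (without "BUG"/"WARNING" in the text) and return an error.
	 */
	if (requested < 0 || (size_t)requested > q->cap) {
		pr_err("demo: requested length %ld is outside 0..%zu\n",
		       requested, q->cap);
		return -EINVAL;
	}

	q->len = (size_t)requested;
	return 0;
}
```

In this sketch a bad argument to demo_resize() yields -EINVAL and a plain error message, while the warning counter only moves when the queue's own state is inconsistent. An automated tester running with panic_on_warn would therefore stop only in the second case.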



2018-06-21 00:03:03

by Greg Kroah-Hartman

Subject: Re: [PATCH] include/asm-generic/bug.h: clarify valid uses of WARN()

On Wed, Jun 20, 2018 at 12:37:16PM +0200, Dmitry Vyukov wrote:
> From: Dmitry Vyukov <[email protected]>
>
> Explicitly state that WARN*() should be used only for recoverable
> kernel issues/bugs and that it should not be used for any kind of
> invalid external inputs or transient conditions.
>
> Motivation: it's a very useful capability to be able to understand
> if a particular kernel splat means a kernel bug or simply an invalid
> user-space program. For the former one wants to notify kernel developers,
> while notifying kernel developers for the latter is annoying.
> Even a kernel developer may not know what to do with a WARNING
> in an unfamiliar subsystem. This is especially critical for any automated
> testing systems that may use panic_on_warn and mail kernel developers.
>
> The clear separation also serves as additional documentation:
> is it a condition that must never occur because of additional
> checks/logic elsewhere? Or is it simply a check for invalid inputs
> or unfortunate conditions?
>
> Use of pr_err() for user messages also leads to better error messages.
> "Something is wrong in file foo on line X" is not a particularly useful
> message for an end user. pr_err() forces developers to write more
> meaningful error messages for users.
>
> As of now we are almost there. We are doing systematic kernel testing
> with panic_on_warn and are not seeing massive amounts of false positives.
> But every now and then another WARN on ENOMEM or invalid inputs pops up
> and leads to a lengthy argument each time. The goal of this change
> is to officially document the rules.
>
> Signed-off-by: Dmitry Vyukov <[email protected]>
> ---
> include/asm-generic/bug.h | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)

Nice!

Acked-by: Greg Kroah-Hartman <[email protected]>