While Rusty Russell wants the return value of sysfs_create_file
ignored, it's annotated '__must_check'. Tejun Heo made the annotaion
and suggests just using BUG_ON(). Meanwhile the compiler warns that
the 'err' variable is set but unused. This patch uses Tejun's
suggestion. This eliminates the warning, satisfies the required check,
and fails-fast with notice if sysfs_create_file actually ever fails
(something that Rusty says should never happen when this code runs).
Signed-off-by: Louis Langholtz <[email protected]>
---
diff --git a/kernel/params.c b/kernel/params.c
index a22d6a7..b04a752 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -853,6 +853,7 @@ static void __init version_sysfs_builtin(void)
mk = locate_module_kobject(vattr->module_name);
if (mk) {
err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
+ BUG_ON(err);
kobject_uevent(&mk->kobj, KOBJ_ADD);
kobject_put(&mk->kobj);
}
On Sun, Jun 07, 2015 at 05:54:30PM -0600, Louis Langholtz wrote:
> While Rusty Russell wants the return value of sysfs_create_file
> ignored, it's annotated '__must_check'. Tejun Heo made the annotaion
> and suggests just using BUG_ON(). Meanwhile the compiler warns that
> the 'err' variable is set but unused. This patch uses Tejun's
> suggestion. This eliminates the warning, satisfies the required check,
> and fails-fast with notice if sysfs_create_file actually ever fails
> (something that Rusty says should never happen when this code runs).
>
> Signed-off-by: Louis Langholtz <[email protected]>
> ---
>
> diff --git a/kernel/params.c b/kernel/params.c
> index a22d6a7..b04a752 100644
> --- a/kernel/params.c
> +++ b/kernel/params.c
> @@ -853,6 +853,7 @@ static void __init version_sysfs_builtin(void)
> mk = locate_module_kobject(vattr->module_name);
> if (mk) {
> err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
> + BUG_ON(err);
Maybe BUG_ON(sysfs_create_file(...)); is simpler? Other than that,
Acked-by: Tejun Heo <[email protected]>
Thanks.
--
tejun
On Sun, Jun 7, 2015 at 5:00 PM, Tejun Heo <[email protected]> wrote:
> On Sun, Jun 07, 2015 at 05:54:30PM -0600, Louis Langholtz wrote:
>> @@ -853,6 +853,7 @@ static void __init version_sysfs_builtin(void)
>> mk = locate_module_kobject(vattr->module_name);
>> if (mk) {
>> err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
>> + BUG_ON(err);
>
> Maybe BUG_ON(sysfs_create_file(...)); is simpler? Other than that,
Hell no.
Stop with the random BUG_ON() additions.
I have said this before, and apparently I need to sat this again, and
probably I will have to say it in the future.
We don't add BUG_ON's for random reasons.
The *ONLY* acceptable reason for a BUG_ON() is if the machine is dead
anyway because of some major internal corruption.
We have too many BUG_ON's. We've had people add BUG_ON's because "this
cannot happen", and then it turns out they were wrong, and they just
killed the machine.
Dammit, there's no reason to add a BUG_ON() here in the first place,
and the reason of "but but it's an unused error return": is f*cking
retarded.
Stop this idiocy. We don't write crap code just to satisfy some random
coding standard or shut up a compiler error.
At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
ignore the error. But BUG_ON()? Hell no.
NO NO NO.
Quite frankly, if you want to add error handling, then dammit, add it
right. And no, BUG_ON() is _never_ proper error handling.
BUG_ON() is for things like "uhhuh, somebody is trying to free a page
that is already free". That is some serious internal corruption.
BUG_ON() is _not_ for "I'm not doing any error handling, so I'll
sprinkle random lines of BUG_ON() like fairy dust to make the compiler
happen".
Really. I'm getting very tired indeed of people adding BUG_ON's like
that. Stop it.
Linus
On Sun, Jun 07, 2015 at 05:17:20PM -0700, Linus Torvalds wrote:
> At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
> ignore the error. But BUG_ON()? Hell no.
Yeah, WARN_ON_ONCE() is the right one. The short history here is that
sysfs_create_file() has __must_check on it which triggered this whole
discussion. The rationale for having __must_check on the function was
that the function's failure leads to userland visible behavior
difference and it's very inviting to skip error handling on the
function as file creation is often the last operation to be performed
with no further dependency on it.
Thanks.
--
tejun
While Rusty Russell wants the return value of sysfs_create_file
ignored, it's annotated '__must_check'. Tejun Heo made the annotaion
and suggests just using WARN_ON_ONCE(). Meanwhile the compiler warns that
the 'err' variable is set but unused. This patch uses Tejun's
suggestion. This eliminates the warning, satisfies the required check,
and warns if sysfs_create_file actually ever fails (something that Rusty
says should never happen when this code runs).
Signed-off-by: Louis Langholtz <[email protected]>
---
diff --git a/kernel/params.c b/kernel/params.c
index a22d6a7..49406f9 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -853,6 +853,7 @@ static void __init version_sysfs_builtin(void)
mk = locate_module_kobject(vattr->module_name);
if (mk) {
err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
+ WARN_ON_ONCE(err);
kobject_uevent(&mk->kobj, KOBJ_ADD);
kobject_put(&mk->kobj);
}
Spake Linus:
> Hell no.
>
> Stop with the random BUG_ON() additions.
>
> I have said this before, and apparently I need to sat this again, and
> probably I will have to say it in the future.
>
> We don't add BUG_ON's for random reasons.
>
> The *ONLY* acceptable reason for a BUG_ON() is if the machine is dead
> anyway because of some major internal corruption.
>
> We have too many BUG_ON's. We've had people add BUG_ON's because "this
> cannot happen", and then it turns out they were wrong, and they just
> killed the machine.
>
> Dammit, there's no reason to add a BUG_ON() here in the first place,
> and the reason of "but but it's an unused error return": is f*cking
> retarded.
>
> Stop this idiocy. We don't write crap code just to satisfy some random
> coding standard or shut up a compiler error.
>
> At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
> ignore the error. But BUG_ON()? Hell no.
>
> NO NO NO.
>
> Quite frankly, if you want to add error handling, then dammit, add it
> right. And no, BUG_ON() is _never_ proper error handling.
>
> BUG_ON() is for things like "uhhuh, somebody is trying to free a page
> that is already free". That is some serious internal corruption.
>
> BUG_ON() is _not_ for "I'm not doing any error handling, so I'll
> sprinkle random lines of BUG_ON() like fairy dust to make the compiler
> happen".
>
> Really. I'm getting very tired indeed of people adding BUG_ON's like
> that. Stop it.
Signed-off-by: Joe Perches <[email protected]>
---
scripts/checkpatch.pl | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 69c4716..f3daa4e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3423,12 +3423,11 @@ sub process {
}
}
-# # no BUG() or BUG_ON()
-# if ($line =~ /\b(BUG|BUG_ON)\b/) {
-# print "Try to use WARN_ON & Recovery code rather than BUG() or BUG_ON()\n";
-# print "$herecurr";
-# $clean = 0;
-# }
+# avoid BUG() or BUG_ON()
+ if ($line =~ /\b(BUG|BUG_ON)\b/) {
+ WARN("BUG",
+ "Avoid using $1 unless there is a serious corruption - try to use WARN_ON & recovery code instead\n" . $herecurr);
+ }
if ($line =~ /\bLINUX_VERSION_CODE\b/) {
WARN("LINUX_VERSION_CODE",
On Jun 7, 2015, at 6:17 PM, Linus Torvalds <[email protected]> wrote:
> On Sun, Jun 7, 2015 at 5:00 PM, Tejun Heo <[email protected]> wrote:
>> On Sun, Jun 07, 2015 at 05:54:30PM -0600, Louis Langholtz wrote:
>>> @@ -853,6 +853,7 @@ static void __init version_sysfs_builtin(void)
>>> mk = locate_module_kobject(vattr->module_name);
>>> if (mk) {
>>> err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
>>> + BUG_ON(err);
>>
>> Maybe BUG_ON(sysfs_create_file(...)); is simpler? Other than that,
>
> Hell no.
>
> Stop with the random BUG_ON() additions.
> ...
> The *ONLY* acceptable reason for a BUG_ON() is if the machine is dead
> anyway because of some major internal corruption.
> ...
> At most, it could be a "WARN_ON_ONCE()"...
>
> Linus
Agreed. The comments in the bug.h file say this clearly too - to not use
BUG_ON "unless there's really no way out".
I originally just wanted a light-weight message to be issued on failure so
at least there's some notice that something unexpected happened (and
to have the must-check value used). I've submitted a second version
now as you probably saw (that instead uses WARN_ON_ONCE).-
* Linus Torvalds <[email protected]> wrote:
> Stop with the random BUG_ON() additions.
Yeah, so I propose the attached patch which attempts to resist new BUG_ON()
additions.
Thanks,
Ingo
================================>
>From 724052923fbae2e3a14e0b9383c89b18217d817f Mon Sep 17 00:00:00 2001
From: Ingo Molnar <[email protected]>
Date: Mon, 8 Jun 2015 09:01:43 +0200
Subject: [PATCH] debug: Deprecate BUG_ON() use in new code, introduce CRASH_ON()
So people still keep adding random BUG_ON() lines, as a mistaken practice
to put asserts that will never trigger, into supposedly perfect kernel code.
So such BUG_ON()s should either not be added, because the code is truly
perfect, or if there's a chance that it's imperfect, use WARN_ON() instead
and limp along, in the hope of getting some debug information back from
the user.
Using BUG_ON() will just hang or reboot most systems, with no useful
feedback provided. It's as user hostile as it gets.
Add a checkpatch rule that warns against new BUG_ON() uses:
WARNING: Using BUG_ON() is generally wrong, use WARN_ON() instead - or CRASH_ON() if the kernel absolutely must crash.
CRASH_ON() can be used in code that must absolutely crash right then,
in the very rare case where there's no way the system can be allowed
to continue execution.
It should be used sparingly, and its name will hopefully achieve this.
Signed-off-by: Ingo Molnar <[email protected]>
---
include/asm-generic/bug.h | 7 +++++++
scripts/checkpatch.pl | 6 ++++++
2 files changed, 13 insertions(+)
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 630dd2372238..c6424723277b 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -165,6 +165,13 @@ extern void warn_slowpath_null(const char *file, const int line);
#define WARN_TAINT(condition, taint, format...) WARN(condition, format)
#define WARN_TAINT_ONCE(condition, taint, format...) WARN(condition, format)
+/*
+ * BUG_ON() is deprecated, use either one of the WARN_ON() variants,
+ * or if it's absolutely unavoidable to crash the system due to
+ * some grave condition, use CRASH_ON():
+ */
+#define CRASH_ON(condition) BUG_ON(condition)
+
#endif
/*
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 89b1df4e72ab..6e0887057398 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5414,6 +5414,12 @@ sub process {
"Using $1 should generally have parentheses around the comparison\n" . $herecurr);
}
+# check for use of BUG_ON()
+ if ($line =~ /\bBUG_ON\s*\(/) {
+ WARN("BUG_ON",
+ "Using BUG_ON() is generally wrong, use WARN_ON() instead - or CRASH_ON() if the kernel absolutely must crash.\n" . $herecurr);
+ }
+
# whine mightly about in_atomic
if ($line =~ /\bin_atomic\s*\(/) {
if ($realfile =~ m@^drivers/@) {
Am 08.06.2015 um 09:12 schrieb Ingo Molnar:
>
> * Linus Torvalds <[email protected]> wrote:
>
>> Stop with the random BUG_ON() additions.
>
> Yeah, so I propose the attached patch which attempts to resist new BUG_ON()
> additions.
As this reminded me at flame I received once from a maintainer because I
wanted to avoid a desastrous memory corruption by using a BUG_ON().
maybe someone should mention that a BUG_ON or now CRASH_ON should be
still prefered instead of some random memory corruption which might lead
to worse things. Or how is the viewpoint of the kernel masters in regard
to memory corruptions and use of BUG_ON, WARN_ON or CRASH_ON?
Regards,
Alexander Holler
>
> Thanks,
>
> Ingo
>
> ================================>
> From 724052923fbae2e3a14e0b9383c89b18217d817f Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <[email protected]>
> Date: Mon, 8 Jun 2015 09:01:43 +0200
> Subject: [PATCH] debug: Deprecate BUG_ON() use in new code, introduce CRASH_ON()
>
> So people still keep adding random BUG_ON() lines, as a mistaken practice
> to put asserts that will never trigger, into supposedly perfect kernel code.
>
> So such BUG_ON()s should either not be added, because the code is truly
> perfect, or if there's a chance that it's imperfect, use WARN_ON() instead
> and limp along, in the hope of getting some debug information back from
> the user.
>
> Using BUG_ON() will just hang or reboot most systems, with no useful
> feedback provided. It's as user hostile as it gets.
>
> Add a checkpatch rule that warns against new BUG_ON() uses:
>
> WARNING: Using BUG_ON() is generally wrong, use WARN_ON() instead - or CRASH_ON() if the kernel absolutely must crash.
>
> CRASH_ON() can be used in code that must absolutely crash right then,
> in the very rare case where there's no way the system can be allowed
> to continue execution.
>
> It should be used sparingly, and its name will hopefully achieve this.
>
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> include/asm-generic/bug.h | 7 +++++++
> scripts/checkpatch.pl | 6 ++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> index 630dd2372238..c6424723277b 100644
> --- a/include/asm-generic/bug.h
> +++ b/include/asm-generic/bug.h
> @@ -165,6 +165,13 @@ extern void warn_slowpath_null(const char *file, const int line);
> #define WARN_TAINT(condition, taint, format...) WARN(condition, format)
> #define WARN_TAINT_ONCE(condition, taint, format...) WARN(condition, format)
>
> +/*
> + * BUG_ON() is deprecated, use either one of the WARN_ON() variants,
> + * or if it's absolutely unavoidable to crash the system due to
> + * some grave condition, use CRASH_ON():
> + */
> +#define CRASH_ON(condition) BUG_ON(condition)
> +
> #endif
>
> /*
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 89b1df4e72ab..6e0887057398 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -5414,6 +5414,12 @@ sub process {
> "Using $1 should generally have parentheses around the comparison\n" . $herecurr);
> }
>
> +# check for use of BUG_ON()
> + if ($line =~ /\bBUG_ON\s*\(/) {
> + WARN("BUG_ON",
> + "Using BUG_ON() is generally wrong, use WARN_ON() instead - or CRASH_ON() if the kernel absolutely must crash.\n" . $herecurr);
> + }
> +
> # whine mightly about in_atomic
> if ($line =~ /\bin_atomic\s*\(/) {
> if ($realfile =~ m@^drivers/@) {
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Mon, Jun 8, 2015 at 9:40 AM, Alexander Holler <[email protected]> wrote:
> Am 08.06.2015 um 09:12 schrieb Ingo Molnar:
>>
>>
>> * Linus Torvalds <[email protected]> wrote:
>>
>>> Stop with the random BUG_ON() additions.
>>
>>
>> Yeah, so I propose the attached patch which attempts to resist new
>> BUG_ON()
>> additions.
>
>
> As this reminded me at flame I received once from a maintainer because I
> wanted to avoid a desastrous memory corruption by using a BUG_ON().
Reference?
--
Thanks,
//richard
* Alexander Holler <[email protected]> wrote:
> Am 08.06.2015 um 09:12 schrieb Ingo Molnar:
> >
> >* Linus Torvalds <[email protected]> wrote:
> >
> >>Stop with the random BUG_ON() additions.
> >
> > Yeah, so I propose the attached patch which attempts to resist new BUG_ON()
> > additions.
>
> As this reminded me at flame I received once from a maintainer because I wanted
> to avoid a desastrous memory corruption by using a BUG_ON(). maybe someone
> should mention that a BUG_ON or now CRASH_ON should be still prefered instead of
> some random memory corruption which might lead to worse things. Or how is the
> viewpoint of the kernel masters in regard to memory corruptions and use of
> BUG_ON, WARN_ON or CRASH_ON?
So it depends on the actual change, but there's very few cases where a BUG_ON() is
justified, even if the code detects memory corruption.
Most instances of memory corruption either come from the hardware or come from
some other piece of code, so _your_ code crashing the system will be unexpected,
and in most cases unproductive to finding the cause of the corruption.
The best action is to stop doing whatever your code was doing, trying to bail out
with as little extra changes done to the system as possible.
An example for that are lockdep's asserts. An actual lockdep warning in a
released, production kernel is frequently connected to a real risk of data
corruption - yet what we do is that we report the bug non-intrusively and turn off
lockdep completely, so that it does not make the situation worse and that we have
a chance the messages can be saved and can be reported back to kernel developers.
The origins of widespread BUG_ON() use are twofold:
- 20 years ago we didn't have much of any locking in the kernel, so a BUG_ON()
resulted in essence in a graceful segfault of the application that happened to
trigger it, in most cases. Kernel logs were still possible to retrieve if the
bug did not trigger too often - and if not (because for example the crash
happened in the idle thread) then the backtrace was still visible on the VGA
text console.
- in the early days we didn't have WARN_ON(), we only had BUG_ON(), so people
used that. BUG_ON() used to be the 'graceful' assert, panic() was the
equivalent of CRASH_ON().
These days a BUG_ON() is almost always fatal due to unreleased locks, plus we
still don't print kernel crashes to the graphical console, so they are silent hard
lockups in 99% of the cases.
Thanks,
Ingo
Am 08.06.2015 um 10:08 schrieb Richard Weinberger:
> On Mon, Jun 8, 2015 at 9:40 AM, Alexander Holler <[email protected]> wrote:
>> Am 08.06.2015 um 09:12 schrieb Ingo Molnar:
>>>
>>>
>>> * Linus Torvalds <[email protected]> wrote:
>>>
>>>> Stop with the random BUG_ON() additions.
>>>
>>>
>>> Yeah, so I propose the attached patch which attempts to resist new
>>> BUG_ON()
>>> additions.
>>
>>
>> As this reminded me at flame I received once from a maintainer because I
>> wanted to avoid a desastrous memory corruption by using a BUG_ON().
>
> Reference?
https://lkml.org/lkml/2013/5/17/254
To explain: The bug already existed for several releases and the memory
corruption was that desatrous that it even leaded here to hard resets of
systems without any oops. And fixing it needed several more releases
(another year).
And in the above mentioned case and the kernel config settings I use(d),
only the wronggoing thread was killed by the BUG_ON (I proposed) before
it had the chance to corrupt the memory.
Maybe someone could clarify what Greg meant with "something _really_
bad", because in my humble opionion there aren't much more worse things
than memory corruptions (e.g. by wrong pointers, use after free or
similiar stuff) if that happens inside the kernel. The consequences of
such are almost always unpredictable and therefor I would and likely
will ever prefer a controlled shutdown, reset or similiar instead of
leaving a system running with corrupted memory. Regardless what any
maintainer will say.
Regards,
Alexander Holler
* Alexander Holler <[email protected]> wrote:
> Am 08.06.2015 um 10:08 schrieb Richard Weinberger:
> >On Mon, Jun 8, 2015 at 9:40 AM, Alexander Holler <[email protected]> wrote:
> >>Am 08.06.2015 um 09:12 schrieb Ingo Molnar:
> >>>
> >>>
> >>>* Linus Torvalds <[email protected]> wrote:
> >>>
> >>>>Stop with the random BUG_ON() additions.
> >>>
> >>>
> >>>Yeah, so I propose the attached patch which attempts to resist new
> >>>BUG_ON()
> >>>additions.
> >>
> >>
> >>As this reminded me at flame I received once from a maintainer because I
> >>wanted to avoid a desastrous memory corruption by using a BUG_ON().
> >
> >Reference?
>
> https://lkml.org/lkml/2013/5/17/254
>
> To explain: The bug already existed for several releases and the memory
> corruption was that desatrous that it even leaded here to hard resets of systems
> without any oops. And fixing it needed several more releases (another year).
>
> And in the above mentioned case and the kernel config settings I use(d), only
> the wronggoing thread was killed by the BUG_ON (I proposed) before it had the
> chance to corrupt the memory.
Firstly, the changelog of the patch that Greg rejected told nothing about all that
thinking, so at minimum it's a deficient changelog.
Secondly and more importantly, instead of doing a BUG_ON() you could have done:
if (WARN_ON_ONCE(port->itty))
return;
This would probably have prevented the tty related memory corruption just as much,
at the cost of a (small and infrequent) memory leak.
I.e. instead of crashing the machine, you need to try to find the least
destructive approach if a bug is detected.
I am pretty certain that Greg would have applied such a patch in an eye blink.
> Maybe someone could clarify what Greg meant with "something _really_ bad",
> because in my humble opionion there aren't much more worse things than memory
> corruptions (e.g. by wrong pointers, use after free or similiar stuff) if that
> happens inside the kernel. The consequences of such are almost always
> unpredictable and therefor I would and likely will ever prefer a controlled
> shutdown, reset or similiar instead of leaving a system running with corrupted
> memory. Regardless what any maintainer will say.
So a justified BUG_ON() would be something during early boot for example, where a
grave inconsistency is detected that we know will make the kernel unable to work
much further.
We have only a few such cases: not finding a root filesystem, or detecting an x86
kernel image with instructions in it that are incompatible with the CPU it is
running on. We can do nothing to improve the situation, so we try to print
something useful and stop-crash the box.
Thanks,
Ingo
* Ingo Molnar <[email protected]> wrote:
> Firstly, the changelog of the patch that Greg rejected told nothing about all
> that thinking, so at minimum it's a deficient changelog.
>
> Secondly and more importantly, instead of doing a BUG_ON() you could have done:
>
> if (WARN_ON_ONCE(port->itty))
> return;
>
> This would probably have prevented the tty related memory corruption just as
> much, at the cost of a (small and infrequent) memory leak.
>
> I.e. instead of crashing the machine, you need to try to find the least
> destructive approach if a bug is detected.
Also note that BUG_ON() will make data corruption _worse_ statistically. Why?
Because most data corruptions are unlikely to be perfectly detected by a BUG_ON(),
and the BUG_ON() delays the finding of the underlying bug, so the bug will hit
more people before it's fixed for good.
So even in the cases where you could argue that the system needs to stop, because
we have evidence of data corruption, it's statistically the better approach to
continue and get kernel log info back to developers.
Thanks,
Ingo
Am 08.06.2015 um 11:05 schrieb Ingo Molnar:
>
> * Alexander Holler <[email protected]> wrote:
>
>> Am 08.06.2015 um 10:08 schrieb Richard Weinberger:
>>> On Mon, Jun 8, 2015 at 9:40 AM, Alexander Holler <[email protected]> wrote:
>>>> Am 08.06.2015 um 09:12 schrieb Ingo Molnar:
>>>>>
>>>>>
>>>>> * Linus Torvalds <[email protected]> wrote:
>>>>>
>>>>>> Stop with the random BUG_ON() additions.
>>>>>
>>>>>
>>>>> Yeah, so I propose the attached patch which attempts to resist new
>>>>> BUG_ON()
>>>>> additions.
>>>>
>>>>
>>>> As this reminded me at flame I received once from a maintainer because I
>>>> wanted to avoid a desastrous memory corruption by using a BUG_ON().
>>>
>>> Reference?
>>
>> https://lkml.org/lkml/2013/5/17/254
>>
>> To explain: The bug already existed for several releases and the memory
>> corruption was that desatrous that it even leaded here to hard resets of systems
>> without any oops. And fixing it needed several more releases (another year).
>>
>> And in the above mentioned case and the kernel config settings I use(d), only
>> the wronggoing thread was killed by the BUG_ON (I proposed) before it had the
>> chance to corrupt the memory.
>
> Firstly, the changelog of the patch that Greg rejected told nothing about all that
> thinking, so at minimum it's a deficient changelog.
>
> Secondly and more importantly, instead of doing a BUG_ON() you could have done:
>
> if (WARN_ON_ONCE(port->itty))
> return;
>
> This would probably have prevented the tty related memory corruption just as much,
> at the cost of a (small and infrequent) memory leak.
>
> I.e. instead of crashing the machine, you need to try to find the least
> destructive approach if a bug is detected.
>
> I am pretty certain that Greg would have applied such a patch in an eye blink.
As you've said it, *probably*. But such a simple exit path as you're
proposing doesn't always exist. And I assume if it would have existed,
it would not have needed another year to get rid of at least the memory
corruption. It took me quiet some time to find the problem and I'm sure,
if I had seen during my adventures through the tty-subsystem that such a
simple return would have been enough, I would have used WARN_ON or
WARN_ON_ONCE. But I can't remember.
Alexander Holler
Am 08.06.2015 um 11:11 schrieb Ingo Molnar:
>
> * Ingo Molnar <[email protected]> wrote:
>
>> Firstly, the changelog of the patch that Greg rejected told nothing about all
>> that thinking, so at minimum it's a deficient changelog.
>>
>> Secondly and more importantly, instead of doing a BUG_ON() you could have done:
>>
>> if (WARN_ON_ONCE(port->itty))
>> return;
>>
>> This would probably have prevented the tty related memory corruption just as
>> much, at the cost of a (small and infrequent) memory leak.
>>
>> I.e. instead of crashing the machine, you need to try to find the least
>> destructive approach if a bug is detected.
>
> Also note that BUG_ON() will make data corruption _worse_ statistically. Why?
> Because most data corruptions are unlikely to be perfectly detected by a BUG_ON(),
> and the BUG_ON() delays the finding of the underlying bug, so the bug will hit
> more people before it's fixed for good.
>
> So even in the cases where you could argue that the system needs to stop, because
> we have evidence of data corruption, it's statistically the better approach to
> continue and get kernel log info back to developers.
Risking more, maybe even worse problems like corrupting file systems or
similiar in order to have a slightly chance of save log info?
Sorry, that isn't something I would propose.
Anyway, CRASH_ON didn't exist, so I only had the choice between BUG_ON
and WARN_ON, and for the latter you need a proper exit path which isn't
always easy to find. So I appreciate CRASH_ON, thanks.
Alexander Holler
* Alexander Holler <[email protected]> wrote:
> > I am pretty certain that Greg would have applied such a patch in an eye blink.
>
> As you've said it, *probably*. But such a simple exit path as you're proposing
> doesn't always exist. [...]
As I said it's case by case. I discussed your example (which was a deficient patch
for multiple reasons) but we'd be wasting everyone's time by discussion
hypothethical situations.
Thanks,
Ingo
* Alexander Holler <[email protected]> wrote:
> > Also note that BUG_ON() will make data corruption _worse_ statistically. Why?
> > Because most data corruptions are unlikely to be perfectly detected by a
> > BUG_ON(), and the BUG_ON() delays the finding of the underlying bug, so the
> > bug will hit more people before it's fixed for good.
> >
> > So even in the cases where you could argue that the system needs to stop,
> > because we have evidence of data corruption, it's statistically the better
> > approach to continue and get kernel log info back to developers.
>
> Risking more, maybe even worse problems like corrupting file systems or similiar
> in order to have a slightly chance of save log info?
That's not what I said - please read my argument and argue with that if you want,
not with some other straw-man argument...
Thanks,
Ingo
Ingo Molnar wrote:
> These days a BUG_ON() is almost always fatal due to unreleased locks,
> plus we still don't print kernel crashes to the graphical console,
> so they are silent hard lockups in 99% of the cases.
Is it a regression?
I've just tested with "echo c >/proc/sysrq-trigger" and
kernel.panic_on_oops=1 and the box just hanged. But I clearly remember
it started to work at some point and it was a miracle. Recently a colleague
of mine show such graphical panic (most of it as always scrolled away BTW).
It's usual setup with i915 driver, X and fbconsole which is never used.
As for BUG_ON() I hope rule will be relaxed for "not under locks" situations
and something will be done with inevitable miryads of trivial patches.
Count them:
$ chgrep -e BUG_ON -w -n | wc -l
9660
Am 08.06.2015 um 13:27 schrieb Ingo Molnar:
>
> * Alexander Holler <[email protected]> wrote:
>
>>> I am pretty certain that Greg would have applied such a patch in an eye blink.
>>
>> As you've said it, *probably*. But such a simple exit path as you're proposing
>> doesn't always exist. [...]
>
> As I said it's case by case. I discussed your example (which was a deficient patch
> for multiple reasons) but we'd be wasting everyone's time by discussion
> hypothethical situations.
Sure it was a deficient patch, but still better than what existed for a
year long in the kernel and which leaded to a reset of a system sometime
after a wireless remote connected device was turned off. And it showed
exactly where the problem existed. Maybe I should mention that almost a
year after I've posted that patch, people still stumbled over the
problem (and likely had the same hard time to find the reason for a
sudden reboot without any warning).
I just want to make clear that a brutforce slogan like BUG_ON is bad is
bad, especially as there are many people which just might repeat that
slogan without thinking about what they are repeating.
Alexander Holler
* Alexander Holler <[email protected]> wrote:
> Am 08.06.2015 um 13:27 schrieb Ingo Molnar:
> >
> >* Alexander Holler <[email protected]> wrote:
> >
> >>>I am pretty certain that Greg would have applied such a patch in an eye blink.
> >>
> >>As you've said it, *probably*. But such a simple exit path as you're proposing
> >>doesn't always exist. [...]
> >
> >As I said it's case by case. I discussed your example (which was a deficient patch
> >for multiple reasons) but we'd be wasting everyone's time by discussion
> >hypothethical situations.
>
> Sure it was a deficient patch, but still better than what existed for a year
> long in the kernel [...]
... but it was still worse than the simple solution I suggested:
if (WARN_ON_ONCE(port->itty))
return;
again, the BUG_ON() you wanted to introduce was wrong on multiple grounds, then
and now. Why are you still arguing about this?
> I just want to make clear that a brutforce slogan like BUG_ON is bad is bad,
I pointed out specific cases where a BUG_ON() is the right solution. They are
rare. Stop misrepresenting my words.
Thanks,
Ingo
Am 08.06.2015 um 21:35 schrieb Ingo Molnar:
> I pointed out specific cases where a BUG_ON() is the right solution. They are
> rare. Stop misrepresenting my words.
I'm just mimicking usual kernel maintainer behaviour, like most monkeys.
Sorry.
On Jun 7, 2015, at 6:58 PM, Tejun Heo <[email protected]> wrote:
> On Sun, Jun 07, 2015 at 05:17:20PM -0700, Linus Torvalds wrote:
>> At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
>> ignore the error. But BUG_ON()? Hell no.
>
> Yeah, WARN_ON_ONCE() is the right one...
>
> --
> tejun
On further consideration...
While removing the __must_check attribute from the sysfs_create_file
function seems to have larger ramifications (as it's called more often than
the code I'd offered a patch for), is it really that helpful to keep this
attribution?
I don't believe so now and think it should be removed. Here's why (stuff
that's partly been said already):
The underlying code for sysfs_create_file does call WARN to warn about
any errors. So it's not like the code is totally silent anyway. Then the unused
but set variable in params.c (that's set to its return value) can be removed
(and the compiler warning resolved). Incidentally, I do think it'd be helpful
to comment document this behavior of using WARN so that callers can
more readily recognize (and be assured that) they won't need to call WARN
(unless callers want to add __FILE__ and __LINE__ info to the printk output).
Having functions marked __must_check seems to make more sense when
their return values are *always* necessary for calling code to have any
business calling them. Like when there's one or more future calls to functions
that have to be made (or not made) for the given resource (the kobject) that
depend on that function's return value (such that the return value state is
always a later conditional). That doesn't appear to be the case however for
sysfs_create_file().
Seems what Rusty was indicating too.
We never did hear back from Andrew though who had added many (most?)
of the __must_check attributes to begin with to the sysfs interface functions
(in include/linux/sysfs.h). The full commit message back in 2006 stated:
add __must_check to device management code
We're getting a lot of crashes in the sysfs/kobject/device/bus/class code and
they're very hard to diagnose.
I'm suspecting that in some cases this is because drivers aren't checking
return values and aren't handling errors correctly. So the code blithely
blunders on and crashes later in very obscure ways.
There's just no reason to ignore errors which can and do occur. So the patch
sprinkles __must_check all over these APIs.
Causes 1,513 new warnings. Heh.
Adding __must_check probably made it easier for developers to identify calling
code that was depending on success from these functions. At the same time,
it's not true (at least currently) that these return values always need to
be checked to avoid incorrect behavior (though perhaps they should almost
always be checked).
Take the sysfs_chmod_file() function as another example of a function whose
return value is unnecessary to always check. It doesn't have the same
behavior though as sysfs_create_file() in internally calling WARN for any error
condition it might return. So not checking it may result in silent failures (hate
those), but still many of its callers don't do anything with its return value other
than warn about errors anyway (and don't seem like they need to do more).
--
Lou-
Hey, Louis.
On Wed, Jun 10, 2015 at 11:05:21AM -0600, Louis Langholtz wrote:
> The underlying code for sysfs_create_file does call WARN to warn about
> any errors. So it's not like the code is totally silent anyway. Then the unused
Not any errors. It triggers warning on missing ops and dup file
names. The former is a pretty fundamental usage error and we've had
too many of the latter unhandled despite of __must_check.
> but set variable in params.c (that's set to its return value) can be removed
> (and the compiler warning resolved). Incidentally, I do think it'd be helpful
> to comment document this behavior of using WARN so that callers can
> more readily recognize (and be assured that) they won't need to call WARN
> (unless callers want to add __FILE__ and __LINE__ info to the printk output).
>
> Having functions marked __must_check seems to make more sense when
> their return values are *always* necessary for calling code to have any
> business calling them. Like when there's one or more future calls to functions
> that have to be made (or not made) for the given resource (the kobject) that
> depend on that function's return value (such that the return value state is
> always a later conditional). That doesn't appear to be the case however for
> sysfs_create_file().
I don't know. Sounds like a weird rule. We had __must_check, still
had certain error types unhandled so people added explicit WARN to
force people to look into those issues and that means __must_check
should go away? Sure, if it warns on all errors and can return void
then there's nothing to discuss but that's not the case. Here, an
error return indicates userland visible behavior difference. I'd
venture to say that the return value is pretty darn important.
...
> Adding __must_check probably made it easier for developers to identify calling
> code that was depending on success from these functions. At the same time,
> it's not true (at least currently) that these return values always need to
I don't know. The "always" rule you're speaking of seems too
arbitrary and rigid to me. Always is a tricky word and tying oneself
to things like that usually doesn't lead to healthy trade-offs.
I don't think removing __must_check would be a disaster but at the
same time these terminal sysfs functions look like the perfect
candidates for such annotations where error returns indicate subtle
but visible behavior differences visible to userland while not
affecting in-kernel operations at all and thus can be easily ignored.
Thanks.
--
tejun
Linus Torvalds <[email protected]> writes:
> On Sun, Jun 7, 2015 at 5:00 PM, Tejun Heo <[email protected]> wrote:
> At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
> ignore the error. But BUG_ON()? Hell no.
Yeah, in practice it's already (1) paniced if we ran out of memory, or
(2) warned if we somehow tried to create two entries with the same name.
So the WARN_ON_ONCE() is a bit... meh. How's this, too snarky?
Thanks,
Rusty.
Subject: params: suppress unused variable error, warn once just in case code changes.
Reported-by: Louis Langholtz <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
diff --git a/kernel/params.c b/kernel/params.c
index 7edf31f2ce96..0b9bbdf830cb 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -884,6 +884,12 @@ static void __init version_sysfs_builtin(void)
mk = locate_module_kobject(vattr->module_name);
if (mk) {
err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
+ /*
+ * That should not fail at boot due to OOM, and it'll
+ * already warn if we somehow get two identical names,
+ * but this one line should quiet both gcc and lkml.
+ */
+ WARN_ON_ONCE(err);
kobject_uevent(&mk->kobj, KOBJ_ADD);
kobject_put(&mk->kobj);
}
On Fri, Jun 12, 2015 at 10:57:24AM +0930, Rusty Russell wrote:
> Linus Torvalds <[email protected]> writes:
> > On Sun, Jun 7, 2015 at 5:00 PM, Tejun Heo <[email protected]> wrote:
> > At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
> > ignore the error. But BUG_ON()? Hell no.
>
> Yeah, in practice it's already (1) paniced if we ran out of memory, or
> (2) warned if we somehow tried to create two entries with the same name.
>
> So the WARN_ON_ONCE() is a bit... meh. How's this, too snarky?
Sounds pretty passive agressive to me. At least reply to the actual
argument?
Thanks.
--
tejun
Hi Tejun,
On Jun 10, 2015, at 7:54 PM, Tejun Heo <[email protected]> wrote:
> Hey, Louis.
>
> On Wed, Jun 10, 2015 at 11:05:21AM -0600, Louis Langholtz wrote:
>> The underlying code for sysfs_create_file does call WARN to warn about
>> any errors. So it's not like the code is totally silent anyway. Then the unused
>
> Not any errors. It triggers warning on missing ops and dup file
> names.
Ah... I see now that the underlying code doesn't warn about all errors.
Specifically that code doesn't warn about __kernfs_create_file() returning
any error code other than EEXIST. So it's quiet on a few other error codes
it can return (like ENOMEM or ENOENT). Warnings are generated when
sysfs_create_file() returns EINVAL or EEXIST.
> The former is a pretty fundamental usage error and we've had
> too many of the latter unhandled despite of __must_check.
Admittedly the preconditions for sysfs_create_file() seem obscure to me.
Perhaps that's contributed to the problem you've described (some
comment documentation for the sysfs_create_file function I think
would be useful). And what should be done after this function returns?
Having this __must_check says its return value must be used. But how
should it be used? More on this question after next quote block...
>
>> ...
>> Having functions marked __must_check seems to make more sense when
>> their return values are *always* necessary for calling code to have any
>> business calling them. Like when there's one or more future calls to functions
>> that have to be made (or not made) for the given resource (the kobject) that
>> depend on that function's return value (such that the return value state is
>> always a later conditional). That doesn't appear to be the case however for
>> sysfs_create_file().
>
> I don't know. Sounds like a weird rule. We had __must_check, still
> had certain error types unhandled so people added explicit WARN to
> force people to look into those issues and that means __must_check
> should go away?
Taken in light of the sysfs_create_file function the rule seems (at least for
the moment) helpful in addressing what should be done about the return
value of sysfs_create_file().
Should the caller WARN about all errors returned from sysfs_create_file()?
Even those already reported (like when sysfs_create_file() returns EINVAL
or EEXIST)? If all errors should be WARNed (by code somewhere) then
would it be better to have the code underlying sysfs_create_file() do that?
If the code underlying sysfs_create_file() warns about all errors, then no
caller needs to duplicate that code. Alternatively the caller could just warn
when the return code isn't one that's already be warned about. But as the
code is now, that relies on how sysfs_create_file() underlying code is coded
to know what's already been warned and breaks as soon as someone
changes what sysfs_create_file() warns about without updating what the
callers warn about. That solution seems unnecessarily fragile.
What else should always be done besides provide WARN for errors? If
nothing else should always be done and errors are logged by the
underlying code, then what point is left for sysfs_create_file() being
marked __must_check? It shouldn't be - at least by this offered rule -
since its return value isn't always necessary for the caller to still have
business in calling it given that it has a worthwhile side-effect (when it
succeeds).
> Sure, if it warns on all errors and can return void
> then there's nothing to discuss but that's not the case. Here, an
> error return indicates userland visible behavior difference. I'd
> venture to say that the return value is pretty darn important.
I don't mean to diminish the significance of userland visible behavior
difference. These are significant. And I don't want sysfs_create_file()
to return void. But if there's no clear things that callers of sysfs_create_file()
should do based on its return value then marking it __must_check seems
wrong. I'm guessing (based on the commit log message) that Andrew
originally established the precedent for this marking though to help get
rid of cases where callers needed to do something different than they
were when errors occurred because their code was blindly assuming
success.
The return value of kmalloc I'd argue meanwhile should always be used.
Oddly it's not marked __must_check. But based on this rule, it makes more
sense to. What business would a caller have in calling kmalloc if it's not
going to use the return value? There is a side-effect of using up memory
but I don't know of a reason that's ever desirable for a properly running
kernel if the memory is not used (or at least passed to the free function).
Fortunately I don't see any callers that aren't using the return value and
marking it __must_check only helps to reaffirm what callers already know.
> ...
>> Adding __must_check probably made it easier for developers to identify calling
>> code that was depending on success from these functions. At the same time,
>> it's not true (at least currently) that these return values always need to
>
> I don't know. The "always" rule you're speaking of seems too
> arbitrary and rigid to me. Always is a tricky word and tying oneself
> to things like that usually doesn't lead to healthy trade-offs.
An understandable concern. I'd not want a rule to be established if
it can have holes poked into it. But __must_check is itself an always
kind of rule at least in so far as the compiler will always warn on the
result being unused (assuming it recognizes the attribute and it hasn't
been disabled). So having an always kind of rule for when
__must_check should or should not be used seems consistent with
what it's being applied to and seems like it'd be great to have if it's solid.
> I don't think removing __must_check would be a disaster
Sounds like what Rusty suggested too. Though I'm unclear if he preferred
removing this __must_check attribute or hacking around it with his suggested
doesnt_matter() macro. Hopefully it's the former.
> but at the
> same time these terminal sysfs functions look like the perfect
> candidates for such annotations where error returns indicate subtle
> but visible behavior differences visible to userland while not
> affecting in-kernel operations at all and thus can be easily ignored.
> ...
Disagreeing with this seems to put me at odds with Andrew Morton too.
Some of the sysfs functions may be proper candidates for this attribute
but not sysfs_create_file() as I've argued. Hopefully my not following in
these shoes will keep me out of trouble better than my following the
BUG_ON shoes (by offering the patch that used BUG_ON) got me into
trouble! ;)
> Thanks.
>
> --
> tejun
And my thanks to you too.
--
Lou-
Tejun Heo <[email protected]> writes:
> On Fri, Jun 12, 2015 at 10:57:24AM +0930, Rusty Russell wrote:
>> Linus Torvalds <[email protected]> writes:
>> > On Sun, Jun 7, 2015 at 5:00 PM, Tejun Heo <[email protected]> wrote:
>> > At most, it could be a "WARN_ON_ONCE()". Maybe even just silently
>> > ignore the error. But BUG_ON()? Hell no.
>>
>> Yeah, in practice it's already (1) paniced if we ran out of memory, or
>> (2) warned if we somehow tried to create two entries with the same name.
>>
>> So the WARN_ON_ONCE() is a bit... meh. How's this, too snarky?
>
> Sounds pretty passive agressive to me. At least reply to the actual
> argument?
Oh. Perhaps my sense of humour is miscalibrated.
err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
+ /*
+ * That should not fail at boot due to OOM, and it'll
+ * already warn if we somehow get two identical names,
+ * but this one line should quiet both gcc and lkml.
+ */
+ WARN_ON_ONCE(err);
Oh well, I'll skip the inline commentry entirely then:
===
Subject: params: suppress unused variable error, warn once just in case code changes.
It shouldn't fail due to OOM (it's boot time), and already warns if we
get two identical names. But you never know what the future holds, and
WARN_ON_ONCE() keeps gcc happy with minimal code.
Reported-by: Louis Langholtz <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
diff --git a/kernel/params.c b/kernel/params.c
index 7edf31f2ce96..0b9bbdf830cb 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -884,6 +884,7 @@ static void __init version_sysfs_builtin(void)
mk = locate_module_kobject(vattr->module_name);
if (mk) {
err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
+ WARN_ON_ONCE(err);
kobject_uevent(&mk->kobj, KOBJ_ADD);
kobject_put(&mk->kobj);
}
Hello, Rusty.
On Mon, Jun 15, 2015 at 05:19:26AM +0930, Rusty Russell wrote:
> Oh. Perhaps my sense of humour is miscalibrated.
Heh, prolly mine was. Sorry if I came off as aggressive.
> ===
> Subject: params: suppress unused variable error, warn once just in case code changes.
>
> It shouldn't fail due to OOM (it's boot time), and already warns if we
> get two identical names. But you never know what the future holds, and
> WARN_ON_ONCE() keeps gcc happy with minimal code.
>
> Reported-by: Louis Langholtz <[email protected]>
> Signed-off-by: Rusty Russell <[email protected]>
>
> diff --git a/kernel/params.c b/kernel/params.c
> index 7edf31f2ce96..0b9bbdf830cb 100644
> --- a/kernel/params.c
> +++ b/kernel/params.c
> @@ -884,6 +884,7 @@ static void __init version_sysfs_builtin(void)
> mk = locate_module_kobject(vattr->module_name);
> if (mk) {
> err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
> + WARN_ON_ONCE(err);
> kobject_uevent(&mk->kobj, KOBJ_ADD);
Looks good to me. Please feel free to add
Acked-by: Tejun Heo <[email protected]>
Thanks.
--
tejun