2010-07-28 16:40:10

by Kulikov Vasiliy

[permalink] [raw]
Subject: [PATCH 04/10] x86: mce: fix error handling

mcheck_init_device() poorly handles errors. If any request fails
unregister and free everything.

Signed-off-by: Kulikov Vasiliy <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce.c | 24 ++++++++++++++++++++----
1 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index ed41562..a1119f1 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2124,22 +2124,38 @@ static __init int mcheck_init_device(void)
if (!mce_available(&boot_cpu_data))
return -EIO;

- zalloc_cpumask_var(&mce_dev_initialized, GFP_KERNEL);
+ if (!zalloc_cpumask_var(&mce_dev_initialized, GFP_KERNEL))
+ return -ENOMEM;

mce_init_banks();

err = sysdev_class_register(&mce_sysclass);
if (err)
- return err;
+ goto err_free_cpumask_var;

for_each_online_cpu(i) {
err = mce_create_device(i);
if (err)
- return err;
+ goto mce_remove_devices;
}

register_hotcpu_notifier(&mce_cpu_notifier);
- misc_register(&mce_log_device);
+ err = misc_register(&mce_log_device);
+ if (err)
+ goto err_unreg_notifier;
+ return 0;
+
+err_unreg_notifier:
+ unregister_hotcpu_notifier(&mce_cpu_notifier);
+
+mce_remove_devices:
+ for_each_online_cpu(i)
+ mce_remove_device(i);
+
+ sysdev_class_unregister(&mce_sysclass);
+
+err_free_cpumask_var:
+ free_cpumask_var(mce_dev_initialized);

return err;
}
--
1.7.0.4


Subject: Re: [PATCH 04/10] x86: mce: fix error handling

From: Kulikov Vasiliy <[email protected]>
Date: Wed, Jul 28, 2010 at 12:39:44PM -0400

> mcheck_init_device() poorly handles errors. If any request fails
> unregister and free everything.
>
> Signed-off-by: Kulikov Vasiliy <[email protected]>

Acked-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

2010-07-28 17:07:37

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling

On 7/28/2010 6:39 PM, Kulikov Vasiliy wrote:
> mcheck_init_device() poorly handles errors. If any request fails
> unregister and free everything.

Actually these are at early boot time and only contain memory errors,
and if you run out of memory at this stage the system is usually
dead in the water anyways. The best you can do at this stage
is panicing, but silently returning from the the init function doesn't
help anyone. But someone else will likely panic anyways.

e.g. boot time allocations of cpu masks generally do not check for memory
failures and I think that's ok, not a bug.

Your patch would be good if the driver was modular, but it isn't.

-Andi

2010-07-28 17:14:14

by Kulikov Vasiliy

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling

Hi,

On Wed, Jul 28, 2010 at 19:07 +0200, Andi Kleen wrote:
> On 7/28/2010 6:39 PM, Kulikov Vasiliy wrote:
> >mcheck_init_device() poorly handles errors. If any request fails
> >unregister and free everything.
>
> Actually these are at early boot time and only contain memory errors,
> and if you run out of memory at this stage the system is usually
> dead in the water anyways. The best you can do at this stage
> is panicing, but silently returning from the the init function doesn't
> help anyone. But someone else will likely panic anyways.
>
> e.g. boot time allocations of cpu masks generally do not check for memory
> failures and I think that's ok, not a bug.
>
> Your patch would be good if the driver was modular, but it isn't.

I'm agree with you that if allocation fails at boot time, we are dead :)
But this coding style breaking rules that result from some functions
_must_ be checked for errors. Maybe we should add BUG_ON() here or
indicate someway that we have no ideas how to handle error?

2010-07-28 17:20:16

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling


> I'm agree with you that if allocation fails at boot time, we are dead :)
> But this coding style breaking rules that result from some functions
> _must_ be checked for errors. Maybe we should add BUG_ON() here or
> indicate someway that we have no ideas how to handle error?

What rules exactly? I don't think any of those functions are declared
with __must_check

Coding style should never get in the way of what is right.

The classic way to explicitely discard a return value is a cast to void,
but that is generally considered
ugly in the Linux kernel.

One could possibly add a comment about this at least.

-Andi

2010-07-29 09:35:53

by Kulikov Vasiliy

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling

On Wed, Jul 28, 2010 at 19:20 +0200, Andi Kleen wrote:
>
> >I'm agree with you that if allocation fails at boot time, we are dead :)
> >But this coding style breaking rules that result from some functions
> >_must_ be checked for errors. Maybe we should add BUG_ON() here or
> >indicate someway that we have no ideas how to handle error?
>
> What rules exactly? I don't think any of those functions are
> declared with __must_check

IMO memmory allocation fails are dangerous in kernel mode. As it is
probably not exploitable because of boot time, it can destroy some
sensitive data like dirty disk caches those are going to be written on
disk.

>
> Coding style should never get in the way of what is right.
>
> The classic way to explicitely discard a return value is a cast to
> void, but that is generally considered
> ugly in the Linux kernel.
>
> One could possibly add a comment about this at least.
>
> -Andi
>

2010-07-29 09:51:23

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling


> IMO memmory allocation fails are dangerous in kernel mode. As it is
> probably not exploitable because of boot time, it can destroy some
> sensitive data like dirty disk caches those are going to be written on
> disk.

It's true for runtime, but not for normal boot time.

Anyways if it happens on boot time the only thing you can do is panic,
but someone else
will likely panic anyways for you. Just ignoring it like your patch
effectively does
(because nothing will ever look at the ENOMEMs for an initcall) is wrong
though
In this case it's actually better to oops like the original code does.

BTW even with your patch likely later code will crash anyways because it
doesn't
expect init code to fail.

-Andi

2010-07-29 10:10:31

by walter harms

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling



Andi Kleen schrieb:
>
>> IMO memmory allocation fails are dangerous in kernel mode. As it is
>> probably not exploitable because of boot time, it can destroy some
>> sensitive data like dirty disk caches those are going to be written on
>> disk.
>
> It's true for runtime, but not for normal boot time.
>
> Anyways if it happens on boot time the only thing you can do is panic,
> but someone else
> will likely panic anyways for you. Just ignoring it like your patch
> effectively does
> (because nothing will ever look at the ENOMEMs for an initcall) is wrong
> though
> In this case it's actually better to oops like the original code does.
>
> BTW even with your patch likely later code will crash anyways because it
> doesn't
> expect init code to fail.
>

NTL it is nice to have a error message. for users it is worse if you crash suddenly
with out warning than having a crash with "OOM" before because it gives you a clue
what is going on.
short:
please think of users that are not kernel developers give them a hint what went wrong.

to make thinks more easy on boot we could replace kalloc() with kmalloc_or_die().
When anyone runs out of mem on boottime you can panic() instantly.

just my to cents,
wh

2010-07-29 10:17:24

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling

From: Andi Kleen <[email protected]>
Date: Thu, Jul 29, 2010 at 05:51:00AM -0400

>
> > IMO memmory allocation fails are dangerous in kernel mode. As it is
> > probably not exploitable because of boot time, it can destroy some
> > sensitive data like dirty disk caches those are going to be written on
> > disk.
>
> It's true for runtime, but not for normal boot time.
>
> Anyways if it happens on boot time the only thing you can do is panic,
> but someone else
> will likely panic anyways for you. Just ignoring it like your patch
> effectively does
> (because nothing will ever look at the ENOMEMs for an initcall)

Not true, initcall_debug will at least dump the -ENOMEM or the other
-E's if enabled on the cmdline. So even only for that case does the
patch make sense.

It's a whole different question whether it actually is prudent to turn
on error reporting of failed initcalls unconditionally.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

2010-07-31 18:19:50

by Kulikov Vasiliy

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling

On Thu, Jul 29, 2010 at 12:10 +0200, walter harms wrote:
>
>
> Andi Kleen schrieb:
> >
> >> IMO memmory allocation fails are dangerous in kernel mode. As it is
> >> probably not exploitable because of boot time, it can destroy some
> >> sensitive data like dirty disk caches those are going to be written on
> >> disk.
> >
> > It's true for runtime, but not for normal boot time.
> >
> > Anyways if it happens on boot time the only thing you can do is panic,
> > but someone else
> > will likely panic anyways for you. Just ignoring it like your patch
> > effectively does
> > (because nothing will ever look at the ENOMEMs for an initcall) is wrong
> > though
> > In this case it's actually better to oops like the original code does.
> >
> > BTW even with your patch likely later code will crash anyways because it
> > doesn't
> > expect init code to fail.
> >
>
> NTL it is nice to have a error message. for users it is worse if you crash suddenly
> with out warning than having a crash with "OOM" before because it gives you a clue
> what is going on.
> short:
> please think of users that are not kernel developers give them a hint what went wrong.
>
> to make thinks more easy on boot we could replace kalloc() with kmalloc_or_die().
The thing is that this driver does not call kmalloc() explicitly, it
uses function those call functions those call kmalloc() :)

If we call BUG_ON() in init code, it would not make big overhead and
would make fault exactly when bug was detected, independent from caller
checks. Andi, are you fine with it?

> When anyone runs out of mem on boottime you can panic() instantly.
>
> just my to cents,
> wh
>

2010-07-31 19:08:09

by Kulikov Vasiliy

[permalink] [raw]
Subject: Re: [PATCH 04/10] x86: mce: fix error handling

On Thu, Jul 29, 2010 at 12:10 +0200, walter harms wrote:
>
>
> Andi Kleen schrieb:
> >
> >> IMO memmory allocation fails are dangerous in kernel mode. As it is
> >> probably not exploitable because of boot time, it can destroy some
> >> sensitive data like dirty disk caches those are going to be written on
> >> disk.
> >
> > It's true for runtime, but not for normal boot time.
> >
> > Anyways if it happens on boot time the only thing you can do is panic,
> > but someone else
> > will likely panic anyways for you. Just ignoring it like your patch
> > effectively does
> > (because nothing will ever look at the ENOMEMs for an initcall) is wrong
> > though
> > In this case it's actually better to oops like the original code does.
> >
> > BTW even with your patch likely later code will crash anyways because it
> > doesn't
> > expect init code to fail.
> >
>
> NTL it is nice to have a error message. for users it is worse if you crash suddenly
> with out warning than having a crash with "OOM" before because it gives you a clue
> what is going on.
> short:
> please think of users that are not kernel developers give them a hint what went wrong.
>
> to make thinks more easy on boot we could replace kalloc() with kmalloc_or_die().
The thing is that this driver does not call kmalloc() explicitly, it
uses function those call functions those call kmalloc() :)

If we call BUG_ON() in init code, it would not make big overhead and
would make fault exactly when bug was detected, independent from caller
checks. Andi, are you fine with it?

> When anyone runs out of mem on boottime you can panic() instantly.
>
> just my to cents,
> wh
>