Hi Jiri,
On Fri, 30 Nov 2007 15:12:46 +0100, Jiri Slaby wrote:
> Ok, I don't see it merged in the latest -mm (mmotm). Could you, Mark, Rafael,
> sign off this version of the patch (Mark's + Rafael's fix)?
>
> --
>
> From: Mark M. Hoffman <[email protected]>
>
> coretemp, suspend fix
>
> It's not permitted to unregister a device/CPU if frozen and going to sleep.
> It causes a deadlock on systems where coretemp hwmon is loaded. Do it only
> in non-frozen states instead.
>
> Cc: Rafael J. Wysocki <[email protected]> (frozen fix)
> Cc: Mark M. Hoffman <[email protected]>
> Signed-off-by: Jiri Slaby <[email protected]>
>
> ---
> commit 4f0e19b172ed18fb29e8006c4470fd37aa245a7a
> tree bec1cc4f7a499efe94c5f9d2d208db325914f28e
> parent 877dcc2ef6c7c17a64155cf201886c49622250e9
> author Jiri Slaby <[email protected]> Tue, 27 Nov 2007 20:19:47 +0100
> committer Jiri Slaby <[email protected]> Thu, 29 Nov 2007 23:41:11 +0100
>
> drivers/hwmon/coretemp.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 5c82ec7..ce7457d 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -338,11 +338,13 @@ static int coretemp_cpu_callback(struct notifier_block *nfb,
>  	switch (action) {
>  	case CPU_ONLINE:
>  	case CPU_ONLINE_FROZEN:
> +	case CPU_DOWN_FAILED:
>  		coretemp_device_add(cpu);
> +	case CPU_DOWN_FAILED_FROZEN:
>  		break;
> -	case CPU_DEAD:
> -	case CPU_DEAD_FROZEN:
> +	case CPU_DOWN_PREPARE:
>  		coretemp_device_remove(cpu);
> +	case CPU_DOWN_PREPARE_FROZEN:
>  		break;
>  	}
>  	return NOTIFY_OK;
Should this change go to the stable tree(s) as well?
--
Jean Delvare
On 11/30/2007 11:15 PM, Jean Delvare wrote:
> Hi Jiri,
Hi.
> Should this change go to the stable tree(s) as well?
Sorry, I have no idea. Rafael?
On Friday, 30 of November 2007, Jiri Slaby wrote:
> On 11/30/2007 11:15 PM, Jean Delvare wrote:
> > Hi Jiri,
>
> Hi.
>
> > Should this change go to the stable tree(s) as well?
>
> Sorry, I have no idea. Rafael?
Well, actually, having looked once again at the patch, I think that it's
slightly wrong. Namely, it looks like we just should drop all of the _FROZEN
actions from there.
Fixed patch follows and I think it's also a candidate for -stable.
---
Subject: HWMON: coretemp, suspend fix
It's not permitted to unregister a device after devices have been suspended.
It causes deadlocks to appear on systems with coretemp hwmon loaded. To avoid
this, we can make coretemp_cpu_callback() do nothing if the _FROZEN bit is set
in action.
Also, in other cases it's generally too late to unregister the coretemp device
if the CPU is already dead, so it should be unregistered on CPU_DOWN_PREPARE.
Cc: Rafael J. Wysocki <[email protected]> (frozen fix)
Cc: Mark M. Hoffman <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---
drivers/hwmon/coretemp.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
Index: linux-2.6/drivers/hwmon/coretemp.c
===================================================================
--- linux-2.6.orig/drivers/hwmon/coretemp.c
+++ linux-2.6/drivers/hwmon/coretemp.c
@@ -337,11 +337,10 @@ static int coretemp_cpu_callback(struct
 	switch (action) {
 	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
+	case CPU_DOWN_FAILED:
 		coretemp_device_add(cpu);
 		break;
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
+	case CPU_DOWN_PREPARE:
 		coretemp_device_remove(cpu);
 		break;
 	}
On Saturday, 1 of December 2007, Rafael J. Wysocki wrote:
> On Friday, 30 of November 2007, Jiri Slaby wrote:
> > On 11/30/2007 11:15 PM, Jean Delvare wrote:
> > > Hi Jiri,
> >
[--snip--]
> > >
> > > Should this change go to the stable tree(s) as well?
> >
> > Sorry, I have no idea. Rafael?
>
> Well, actually, having looked once again at the patch, I think that it's
> slightly wrong. Namely, it looks like we just should drop all of the _FROZEN
> actions from there.
>
> Fixed patch follows and I think it's also a candidate for -stable.
Crap, I forgot to add the sign-off, so here it goes again:
---
Subject: HWMON: coretemp, suspend fix
It's not permitted to unregister a device after devices have been suspended.
It causes deadlocks to appear on systems with coretemp hwmon loaded. To avoid
this, we can make coretemp_cpu_callback() do nothing if the _FROZEN bit is set
in action.
Also, in other cases it's generally too late to unregister the coretemp device
if the CPU is already dead, so it should be unregistered on CPU_DOWN_PREPARE.
Signed-off-by: Rafael J. Wysocki <[email protected]> (frozen fix)
Cc: Mark M. Hoffman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Andrew Morton <[email protected]>
---
drivers/hwmon/coretemp.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
Index: linux-2.6/drivers/hwmon/coretemp.c
===================================================================
--- linux-2.6.orig/drivers/hwmon/coretemp.c
+++ linux-2.6/drivers/hwmon/coretemp.c
@@ -337,11 +337,10 @@ static int coretemp_cpu_callback(struct
 	switch (action) {
 	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
+	case CPU_DOWN_FAILED:
 		coretemp_device_add(cpu);
 		break;
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
+	case CPU_DOWN_PREPARE:
 		coretemp_device_remove(cpu);
 		break;
 	}
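
For anyone following the logic of the patch above, here is a rough user-space
sketch of the resulting behaviour. It is not the driver code. The mock_*
functions and the registered[] array are hypothetical stand-ins, and the
action values are assumptions chosen to resemble the kernel's hotplug
constants, with _FROZEN variants formed by OR-ing in CPU_TASKS_FROZEN. Since
the switch only matches the plain actions, any action with the frozen bit set
falls through and does nothing.

```c
/* Assumed stand-ins for the kernel's CPU hotplug action codes. */
enum {
	CPU_ONLINE       = 0x0002,
	CPU_DOWN_PREPARE = 0x0005,
	CPU_DOWN_FAILED  = 0x0006,
	CPU_DEAD         = 0x0007,
	CPU_TASKS_FROZEN = 0x0010,	/* OR-ed in for the _FROZEN variants */
};

#define MOCK_NR_CPUS 4

/* Hypothetical bookkeeping: 1 if the mock device for a CPU is registered. */
static int registered[MOCK_NR_CPUS];

static void mock_device_add(unsigned int cpu)
{
	if (cpu < MOCK_NR_CPUS)
		registered[cpu] = 1;
}

static void mock_device_remove(unsigned int cpu)
{
	if (cpu < MOCK_NR_CPUS)
		registered[cpu] = 0;
}

/*
 * Mirrors the shape of the patched switch: only non-frozen actions are
 * handled, and removal happens before the CPU goes down, not after.
 */
static void mock_cpu_callback(unsigned long action, unsigned int cpu)
{
	switch (action) {
	case CPU_ONLINE:
	case CPU_DOWN_FAILED:
		mock_device_add(cpu);
		break;
	case CPU_DOWN_PREPARE:
		mock_device_remove(cpu);
		break;
	}
}
```

With this sketch, calling mock_cpu_callback(CPU_DOWN_PREPARE | CPU_TASKS_FROZEN, 1)
leaves registered[1] untouched, while the plain CPU_DOWN_PREPARE clears it and
CPU_DOWN_FAILED sets it again.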
Hi:
* Rafael J. Wysocki <[email protected]> [2007-12-01 00:51:40 +0100]:
> On Saturday, 1 of December 2007, Rafael J. Wysocki wrote:
> > On Friday, 30 of November 2007, Jiri Slaby wrote:
> > > On 11/30/2007 11:15 PM, Jean Delvare wrote:
> > > > Hi Jiri,
> > >
> [--snip--]
> > > >
> > > > Should this change go to the stable tree(s) as well?
> > >
> > > Sorry, I have no idea. Rafael?
> >
> > Well, actually, having looked once again at the patch, I think that it's
> > slightly wrong. Namely, it looks like we just should drop all of the _FROZEN
> > actions from there.
> >
> > Fixed patch follows and I think it's also a candidate for -stable.
>
> Crap, I forgot to add the sign-off, so here it goes again:
>
Sorry for the delay, it took me some time to RTF code and convince
myself that nothing can tickle this driver's sysfs files after a
CPU_DOWN_PREPARE_FROZEN. As long as I didn't misread that...
Acked-by: Mark M. Hoffman <[email protected]>
PS: while reading kernel/power/disk.c, I saw this...
static void power_down(void)
{
	switch (hibernation_mode) {
	case HIBERNATION_TEST:
	case HIBERNATION_TESTPROC:
		break;
	case HIBERNATION_REBOOT:
		kernel_restart(NULL);
		break;
	case HIBERNATION_PLATFORM:
		hibernation_platform_enter();
	case HIBERNATION_SHUTDOWN:
		kernel_power_off();
		break;
	}
	kernel_halt();
	/*
	 * Valid image is on the disk, if we continue we risk serious data
	 * corruption after resume.
	 */
	printk(KERN_CRIT "Please power me down manually\n");
	while(1);
}
Shouldn't that be while(1) cpu_relax(); ?
Regards,
--
Mark M. Hoffman
[email protected]
On Sunday, 2 of December 2007, Mark M. Hoffman wrote:
> Hi:
>
> * Rafael J. Wysocki <[email protected]> [2007-12-01 00:51:40 +0100]:
> > On Saturday, 1 of December 2007, Rafael J. Wysocki wrote:
> > > On Friday, 30 of November 2007, Jiri Slaby wrote:
> > > > On 11/30/2007 11:15 PM, Jean Delvare wrote:
[--snip--]
>
> PS: while reading kernel/power/disk.c, I saw this...
>
> static void power_down(void)
> {
> 	switch (hibernation_mode) {
> 	case HIBERNATION_TEST:
> 	case HIBERNATION_TESTPROC:
> 		break;
> 	case HIBERNATION_REBOOT:
> 		kernel_restart(NULL);
> 		break;
> 	case HIBERNATION_PLATFORM:
> 		hibernation_platform_enter();
> 	case HIBERNATION_SHUTDOWN:
> 		kernel_power_off();
> 		break;
> 	}
> 	kernel_halt();
> 	/*
> 	 * Valid image is on the disk, if we continue we risk serious data
> 	 * corruption after resume.
> 	 */
> 	printk(KERN_CRIT "Please power me down manually\n");
> 	while(1);
> }
>
> Shouldn't that be while(1) cpu_relax(); ?
Yes, it should.
Thanks for pointing that out, I'll fix it.
Greetings,
Rafael
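
As an illustration of the while(1) cpu_relax(); suggestion discussed above,
here is a small user-space sketch, not the actual fix. cpu_relax_stub() is a
hypothetical stand-in for the kernel's cpu_relax(). It assumes GCC or Clang
inline assembly, uses the x86 "pause" hint where available, and falls back to
a plain compiler barrier elsewhere. The loop is bounded by max_spins only so
that it can be run and checked; the kernel loop would spin forever.

```c
/* Hypothetical user-space stand-in for the kernel's cpu_relax(). */
static inline void cpu_relax_stub(void)
{
#if defined(__i386__) || defined(__x86_64__)
	/* Spin-wait hint to the CPU; also acts as a compiler barrier. */
	__asm__ __volatile__("pause" ::: "memory");
#else
	/* Fallback: compiler barrier only. */
	__asm__ __volatile__("" ::: "memory");
#endif
}

/*
 * Busy-wait with a relax hint on every iteration. Bounded here for
 * demonstration; returns the number of iterations actually performed.
 */
static unsigned long spin_relaxed(unsigned long max_spins)
{
	unsigned long n = 0;

	while (n < max_spins) {
		cpu_relax_stub();
		n++;
	}
	return n;
}
```

The point of the hint is that a bare while(1); spins as fast as the CPU
allows, whereas a relax hint in the loop body lets the processor know it is
only busy-waiting.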