2007-12-19 15:07:12

by Miles Lane

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

I discovered that I can use IMAP with GMail now, so I can send messages
using Thunderbird and avoid the line wrapping problem.

I tried a series: suspend-to-disk, suspend-to-ram, and suspend-to-disk.
Here is the result:

[ 11.827653] PM: Creating hibernation image:
[ 11.827658] WARNING: at arch/x86/kernel/smp_32.c:561 native_smp_call_function_mask()
[ 11.827661] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
[ 11.827665] [<c0107d55>] show_trace_log_lvl+0x12/0x25
[ 11.827673] [<c010848a>] show_trace+0xd/0x10
[ 11.827677] [<c0108763>] dump_stack+0x57/0x5f
[ 11.827681] [<c0117db4>] native_smp_call_function_mask+0x41/0x126
[ 11.827686] [<c01192d9>] smp_call_function+0x18/0x1f
[ 11.827690] [<c012c624>] on_each_cpu+0x12/0x40
[ 11.827695] [<c0166ece>] drain_all_pages+0x13/0x16
[ 11.827700] [<c014f7b3>] swsusp_save+0x18/0x46b
[ 11.827705] [<c03103fa>] swsusp_arch_suspend+0x2a/0x2c
[ 11.827710] [<c014e7d8>] hibernate+0xba/0x16e
[ 11.827714] [<c014d56b>] state_store+0x45/0xac
[ 11.827717] [<c01ffe95>] kobj_attr_store+0x1a/0x22
[ 11.827722] [<c01b92c7>] sysfs_write_file+0xb8/0xe3
[ 11.827726] [<c01837eb>] vfs_write+0xa4/0x120
[ 11.827731] [<c0183d5e>] sys_write+0x3b/0x60
[ 11.827734] [<c0106bae>] sysenter_past_esp+0x6b/0xc1
[ 11.827738] =======================
[ 11.920363] PM: Need to copy 124108 pages
[ 11.920368] PM: Normal pages needed: 46468 + 1024 + 40, available pages: 182806
[ 15.623893] PM: Hibernation image created (124108 pages copied)
[ 15.624618] Intel machine check architecture supported.
[ 15.624625] Intel machine check reporting enabled on CPU#0.
[ 15.624992]
[ 15.624993] =================================
[ 15.624995] [ INFO: inconsistent lock state ]
[ 15.624998] 2.6.24-rc5-mm1 #8
[ 15.624999] ---------------------------------
[ 15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
[ 15.625005] pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 15.625007] (&cpu_base->lock_key){++..}, at: [<c013c453>] retrigger_next_event+0x63/0x9f
[ 15.625017] {in-hardirq-W} state was registered at:
[ 15.625019] [<c0145432>] __lock_acquire+0x408/0xbf4
[ 15.625025] [<c0145c94>] lock_acquire+0x76/0x9d
[ 15.625029] [<c039aa08>] _spin_lock+0x19/0x28
[ 15.625035] [<c013cd92>] hrtimer_interrupt+0x72/0x1b0
[ 15.625039] [<c011a2b7>] smp_apic_timer_interrupt+0x69/0x7c
[ 15.625045] [<c0107777>] apic_timer_interrupt+0x33/0x38
[ 15.625050] [<c01054b5>] mwait_idle+0x1b/0x1d
[ 15.625054] [<c01055e9>] cpu_idle+0xb3/0xd4
[ 15.625058] [<c03986c5>] rest_init+0x49/0x4b
[ 15.625062] [<c04f696d>] start_kernel+0x357/0x35f
[ 15.625069] [<00000000>] 0x0
[ 15.625082] [<ffffffff>] 0xffffffff
[ 15.625087] irq event stamp: 1182359
[ 15.625089] hardirqs last enabled at (1182359): [<c0106cb3>] restore_nocheck+0x12/0x15
[ 15.625094] hardirqs last disabled at (1182358): [<c010776d>] apic_timer_interrupt+0x29/0x38
[ 15.625098] softirqs last enabled at (933018): [<c0137d89>] __rcu_offline_cpu+0x32/0x62
[ 15.625104] softirqs last disabled at (933016): [<c039aa22>] _spin_lock_bh+0xb/0x2d
[ 15.625109]
[ 15.625110] other info that might help us debug this:
[ 15.625112] 2 locks held by pm-hibernate/9940:
[ 15.625114] #0: (&buffer->mutex){--..}, at: [<c01b9234>] sysfs_write_file+0x25/0xe3
[ 15.625121] #1: (pm_mutex){--..}, at: [<c014e72e>] hibernate+0x10/0x16e
[ 15.625127]
[ 15.625128] stack backtrace:
[ 15.625131] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
[ 15.625133] [<c0107d55>] show_trace_log_lvl+0x12/0x25
[ 15.625138] [<c010848a>] show_trace+0xd/0x10
[ 15.625141] [<c0108763>] dump_stack+0x57/0x5f
[ 15.625144] [<c0143e45>] print_usage_bug+0x10a/0x117
[ 15.625148] [<c01447de>] mark_lock+0x1e7/0x3fe
[ 15.625152] [<c014549f>] __lock_acquire+0x475/0xbf4
[ 15.625156] [<c0145c94>] lock_acquire+0x76/0x9d
[ 15.625159] [<c039aa08>] _spin_lock+0x19/0x28
[ 15.625163] [<c013c453>] retrigger_next_event+0x63/0x9f
[ 15.625167] [<c013caf7>] hres_timers_resume+0x4d/0x4f
[ 15.625170] [<c013eed1>] timekeeping_resume+0x117/0x11e
[ 15.625175] [<c027b2ba>] __sysdev_resume+0x14/0x34
[ 15.625179] [<c027b752>] sysdev_resume+0x21/0x57
[ 15.625183] [<c027f426>] device_power_up+0x8/0xf
[ 15.625188] [<c014e6e7>] hibernation_snapshot+0x13c/0x173
[ 15.625192] [<c014e7d8>] hibernate+0xba/0x16e
[ 15.625195] [<c014d56b>] state_store+0x45/0xac
[ 15.625199] [<c01ffe95>] kobj_attr_store+0x1a/0x22
[ 15.625203] [<c01b92c7>] sysfs_write_file+0xb8/0xe3
[ 15.625207] [<c01837eb>] vfs_write+0xa4/0x120
[ 15.625211] [<c0183d5e>] sys_write+0x3b/0x60
[ 15.625214] [<c0106bae>] sysenter_past_esp+0x6b/0xc1
[ 15.625217] =======================
[ 15.625242] agpgart-intel 0000:00:00.0: EARLY resume
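
The "inconsistent lock state" report above can be read as a bookkeeping check: the lock class was first seen in hardirq context (via hrtimer_interrupt), and is now being taken in process context with hardirqs enabled (HE1, via retrigger_next_event). The following is a minimal user-space sketch of that kind of check, not lockdep's actual implementation; every name in it (demo_lock_class, demo_lock_acquire, the DEMO_* bits) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical usage bits remembered per lock class. */
enum demo_usage {
    DEMO_USED_IN_HARDIRQ = 1u << 0, /* taken from hardirq context      */
    DEMO_HARDIRQS_ON_USE = 1u << 1  /* taken while hardirqs were enabled */
};

struct demo_lock_class {
    unsigned int usage;
};

/*
 * Record one acquisition of the lock class.
 * Returns false once the class has been taken both from hardirq context
 * and with hardirqs enabled, which is the combination the report flags:
 * an interrupt could arrive while the lock is held and spin on it forever.
 */
bool demo_lock_acquire(struct demo_lock_class *cls,
                       bool in_hardirq, bool hardirqs_enabled)
{
    assert(cls != NULL);

    if (in_hardirq)
        cls->usage |= DEMO_USED_IN_HARDIRQ;
    else if (hardirqs_enabled)
        cls->usage |= DEMO_HARDIRQS_ON_USE;

    const unsigned int both = DEMO_USED_IN_HARDIRQ | DEMO_HARDIRQS_ON_USE;
    return (cls->usage & both) != both;
}
```

In this model, a first call with in_hardirq set corresponds to the "{in-hardirq-W} state was registered at" trace, and a later call with in_hardirq clear and hardirqs_enabled set corresponds to the acquisition in the stack backtrace; the second call is the one that returns false. Taking the lock in process context with hardirqs disabled would not trip the check.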


2007-12-19 18:46:01

by Daniel Walker

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Wed, 2007-12-19 at 10:06 -0500, Miles Lane wrote:
> [ 11.827653] PM: Creating hibernation image:
> [ 11.827658] WARNING: at arch/x86/kernel/smp_32.c:561
> native_smp_call_function_mask()
> [ 11.827661] Pid: 9940, comm: pm-hibernate Not tainted
> 2.6.24-rc5-mm1 #8
> [ 11.827665] [<c0107d55>] show_trace_log_lvl+0x12/0x25
> [ 11.827673] [<c010848a>] show_trace+0xd/0x10
> [ 11.827677] [<c0108763>] dump_stack+0x57/0x5f
> [ 11.827681] [<c0117db4>] native_smp_call_function_mask+0x41/0x126
> [ 11.827686] [<c01192d9>] smp_call_function+0x18/0x1f
> [ 11.827690] [<c012c624>] on_each_cpu+0x12/0x40
> [ 11.827695] [<c0166ece>] drain_all_pages+0x13/0x16
> [ 11.827700] [<c014f7b3>] swsusp_save+0x18/0x46b
> [ 11.827705] [<c03103fa>] swsusp_arch_suspend+0x2a/0x2c
> [ 11.827710] [<c014e7d8>] hibernate+0xba/0x16e
> [ 11.827714] [<c014d56b>] state_store+0x45/0xac
> [ 11.827717] [<c01ffe95>] kobj_attr_store+0x1a/0x22
> [ 11.827722] [<c01b92c7>] sysfs_write_file+0xb8/0xe3
> [ 11.827726] [<c01837eb>] vfs_write+0xa4/0x120
> [ 11.827731] [<c0183d5e>] sys_write+0x3b/0x60
> [ 11.827734] [<c0106bae>] sysenter_past_esp+0x6b/0xc1
> [ 11.827738] =======================
...
> [ 15.624993] =================================
> [ 15.624995] [ INFO: inconsistent lock state ]
> [ 15.624998] 2.6.24-rc5-mm1 #8
> [ 15.624999] ---------------------------------
> [ 15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.

It looks like swsusp_save() calls drain_all_pages(), which calls
on_each_cpu(). On return, on_each_cpu() unconditionally enables
interrupts, so the rest of the resume process runs with interrupts enabled
(which, it looks like, shouldn't happen), and you get the lockdep
warning as a result.

Not sure if this has been found already, or not?

Should drain_all_pages() really be drain_local_pages() ?

Daniel
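
The behaviour described in this message can be illustrated with a small user-space model. It is only a sketch under stated assumptions: the interrupt flag is a plain boolean, there are no other CPUs, and all model_* names are hypothetical stand-ins rather than kernel functions. The one property it keeps is the one at issue: the helper disables interrupts, runs the function, and then enables interrupts without looking at the caller's previous state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the local CPU's interrupt-enable flag. */
bool model_irqs_enabled = true;

void model_local_irq_disable(void) { model_irqs_enabled = false; }
void model_local_irq_enable(void)  { model_irqs_enabled = true; }

/*
 * Simplified model of a "run func on every CPU" helper as described in
 * the message: the local call is bracketed by disable/enable, so the
 * caller's earlier interrupt state is not preserved.
 */
int model_on_each_cpu(void (*func)(void *), void *info)
{
    assert(func != NULL);
    model_local_irq_disable();
    func(info);
    model_local_irq_enable(); /* unconditional: previous state is lost */
    return 0;
}

/* Placeholder for the per-CPU work (draining pages in the thread). */
void model_drain(void *unused) { (void)unused; }

/* Report the interrupt state the caller observes after the helper returns. */
bool model_irq_state_after_call(bool enabled_before)
{
    model_irqs_enabled = enabled_before;
    model_on_each_cpu(model_drain, NULL);
    return model_irqs_enabled;
}
```

Calling model_irq_state_after_call(false) models the hibernation path, which enters with interrupts off; in this model the call comes back with the flag set, which matches the HE1 state shown in the lockdep report.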

2007-12-19 19:27:20

by Daniel Walker

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Wed, 2007-12-19 at 10:42 -0800, Daniel Walker wrote:
> On Wed, 2007-12-19 at 10:06 -0500, Miles Lane wrote:
> > [ 11.827653] PM: Creating hibernation image:
> > [ 11.827658] WARNING: at arch/x86/kernel/smp_32.c:561
> > native_smp_call_function_mask()
> > [ 11.827661] Pid: 9940, comm: pm-hibernate Not tainted
> > 2.6.24-rc5-mm1 #8
> > [ 11.827665] [<c0107d55>] show_trace_log_lvl+0x12/0x25
> > [ 11.827673] [<c010848a>] show_trace+0xd/0x10
> > [ 11.827677] [<c0108763>] dump_stack+0x57/0x5f
> > [ 11.827681] [<c0117db4>] native_smp_call_function_mask+0x41/0x126
> > [ 11.827686] [<c01192d9>] smp_call_function+0x18/0x1f
> > [ 11.827690] [<c012c624>] on_each_cpu+0x12/0x40
> > [ 11.827695] [<c0166ece>] drain_all_pages+0x13/0x16
> > [ 11.827700] [<c014f7b3>] swsusp_save+0x18/0x46b
> > [ 11.827705] [<c03103fa>] swsusp_arch_suspend+0x2a/0x2c
> > [ 11.827710] [<c014e7d8>] hibernate+0xba/0x16e
> > [ 11.827714] [<c014d56b>] state_store+0x45/0xac
> > [ 11.827717] [<c01ffe95>] kobj_attr_store+0x1a/0x22
> > [ 11.827722] [<c01b92c7>] sysfs_write_file+0xb8/0xe3
> > [ 11.827726] [<c01837eb>] vfs_write+0xa4/0x120
> > [ 11.827731] [<c0183d5e>] sys_write+0x3b/0x60
> > [ 11.827734] [<c0106bae>] sysenter_past_esp+0x6b/0xc1
> > [ 11.827738] =======================
> ...
> > [ 15.624993] =================================
> > [ 15.624995] [ INFO: inconsistent lock state ]
> > [ 15.624998] 2.6.24-rc5-mm1 #8
> > [ 15.624999] ---------------------------------
> > [ 15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
>
> It looks like the swsusp_save() calls drain_all_pages() , which calls
> on_each_cpu() .. On return on_each_cpu() unconditionally enables
> interrupts so the rest of the resume process has interrupt enable
> (which , it looks like, shouldn't happen) and then you get the lockdep()
> warning due to the above..
>
> Not sure if this has been found already, or not?
>
> Should drain_all_pages() really be drain_local_pages() ?

It looks like it was drain_local_pages, but the following patch

page-allocator-clean-up-pcp-draining-functions.patch

changes that in -mm. I added Christoph Lameter to the CC since it's
his patch.

Daniel

2007-12-19 20:03:18

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Wed, 19 Dec 2007, Daniel Walker wrote:

> > It looks like the swsusp_save() calls drain_all_pages() , which calls
> > on_each_cpu() .. On return on_each_cpu() unconditionally enables
> > interrupts so the rest of the resume process has interrupt enable
> > (which , it looks like, shouldn't happen) and then you get the lockdep()
> > warning due to the above..
> >
> > Not sure if this has been found already, or not?

Hmmm... It will unconditionally enable interrupts regardless of how we call
this. We could explicitly save and restore interrupts in
swsusp_save(), I guess. Why is swsusp_save() disabling interrupts?
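
A rough user-space sketch of the save-and-restore idea mentioned here, assuming a boolean stands in for the interrupt flag. The sim_* names are hypothetical; sim_irq_save() and sim_irq_restore() only imitate the general shape of a save/restore pair and are not the kernel macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the local CPU's interrupt-enable flag. */
bool sim_irqs_on = true;

/* Remember the current state, then disable. */
bool sim_irq_save(void)
{
    bool was_on = sim_irqs_on;
    sim_irqs_on = false;
    return was_on;
}

/* Put back whatever state was remembered. */
void sim_irq_restore(bool saved)
{
    sim_irqs_on = saved;
}

/* A callee that always returns with interrupts enabled. */
void sim_enabling_callee(void)
{
    sim_irqs_on = false;
    sim_irqs_on = true;
}

/* Wrap the callee so the caller's state is the same before and after. */
bool sim_guarded_call(bool enabled_before)
{
    sim_irqs_on = enabled_before;

    bool saved = sim_irq_save();
    assert(!sim_irqs_on); /* off inside the guarded region, before the callee */
    sim_enabling_callee();
    sim_irq_restore(saved);

    return sim_irqs_on;
}
```

One limitation is visible even in this toy version: the wrapper restores the caller's state afterwards, but it does not stop the callee from turning interrupts on while it runs. That is one reason why avoiding the enabling call altogether may be the cleaner option on a path that must stay interrupt-free.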

> > Should drain_all_pages() really be drain_local_pages() ?
>
> It looks like it was drain_local_pages, but the following patch
>
> page-allocator-clean-up-pcp-draining-functions.patch
>
> Changes that in -mm .. I added Christoph Lameter to the CC since it's
> his patch ..

We could export drain_local_pages() again, but then I do not understand
why we would only drain the pages of this processor and not of all other
processors as well. It seems that the intent of software suspend was to
flush them all, right?

2007-12-19 23:10:19

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Wednesday, 19 of December 2007, Christoph Lameter wrote:
> On Wed, 19 Dec 2007, Daniel Walker wrote:
>
> > > It looks like the swsusp_save() calls drain_all_pages() , which calls
> > > on_each_cpu() .. On return on_each_cpu() unconditionally enables
> > > interrupts so the rest of the resume process has interrupt enable
> > > (which , it looks like, shouldn't happen) and then you get the lockdep()
> > > warning due to the above..
> > >
> > > Not sure if this has been found already, or not?
>
> Hmmm... It will unconditionally enable interrupts regardless how we call
> this. We could explicity save and restore interrrupts in
> swsusp_save() I guess. Why is swsusp_save() disabling interrupts?

Actually, it's called with interrupts disabled, because its job is to create
the hibernation image. At this point everything is off except for the CPU
running swsusp_save().

> > > Should drain_all_pages() really be drain_local_pages() ?
> >
> > It looks like it was drain_local_pages, but the following patch
> >
> > page-allocator-clean-up-pcp-draining-functions.patch
> >
> > Changes that in -mm .. I added Christoph Lameter to the CC since it's
> > his patch ..
>
> We could reexport drain_local_pages() again but then I do not understand
> why we would only drain the pages of this processor and not of all other
> processors as well. It seems that software suspend intend was to flush
> them all right?

Well, not exactly. We are on one CPU at this point, the others have been
disabled.
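
The point about running on a single CPU can be sketched with a toy model of per-CPU page caches. This is an illustration only: the demo_* names and the page counts are made up, and the model assumes that the lists of the offlined CPUs were already emptied when those CPUs were taken down, which the thread implies but does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Hypothetical per-CPU state: online flag and number of cached pages. */
struct demo_cpu {
    bool online;
    int  pcp_pages;
};

/* Move one CPU's cached pages into the shared free pool. */
int demo_drain_cpu(struct demo_cpu *cpu, int *free_pool)
{
    assert(cpu != NULL && free_pool != NULL);
    int moved = cpu->pcp_pages;
    *free_pool += moved;
    cpu->pcp_pages = 0;
    return moved;
}

/* Count pages still sitting in any per-CPU list, online or not. */
int demo_cached_pages(const struct demo_cpu *cpus, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += cpus[i].pcp_pages;
    return total;
}

/*
 * Scenario from the thread: only CPU 0 is still running, and the other
 * CPUs are offline with empty lists (assumption). Draining just the
 * local CPU then leaves nothing cached anywhere.
 */
int demo_remaining_after_local_drain(void)
{
    struct demo_cpu cpus[DEMO_NR_CPUS] = {
        { true, 37 }, { false, 0 }, { false, 0 }, { false, 0 },
    };
    int free_pool = 0;

    demo_drain_cpu(&cpus[0], &free_pool);
    return demo_cached_pages(cpus, DEMO_NR_CPUS);
}
```

Under that assumption a local drain and a drain of all CPUs give the same result, and the local version needs no cross-CPU call and so no change to the interrupt state.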

2007-12-19 23:22:34

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:

> > We could reexport drain_local_pages() again but then I do not understand
> > why we would only drain the pages of this processor and not of all other
> > processors as well. It seems that software suspend intend was to flush
> > them all right?
>
> Well, not exactly. We are on one CPU at this point, the others have been
> disabled.

OK, so the others are already flushed. Here is a patch to re-export
drain_local_pages() and use it for software suspend:

Signed-off-by: Christoph Lameter <[email protected]>

---
include/linux/gfp.h | 1 +
kernel/power/snapshot.c | 2 +-
mm/page_alloc.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
===================================================================
--- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
+++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c 2007-12-19 15:16:34.179661929 -0800
@@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)

 	printk(KERN_INFO "PM: Creating hibernation image: \n");

-	drain_all_pages();
+	drain_local_pages(NULL);
 	nr_pages = count_data_pages();
 	nr_highmem = count_highmem_pages();
 	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
+++ linux-2.6.24-rc5-mm1/mm/page_alloc.c 2007-12-19 15:12:19.850545818 -0800
@@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
 /*
  * Spill all of this CPU's per-cpu pages back into the buddy allocator.
  */
-static void drain_local_pages(void *arg)
+void drain_local_pages(void *arg)
 {
 	drain_pages(smp_processor_id());
 }
Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
===================================================================
--- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
+++ linux-2.6.24-rc5-mm1/include/linux/gfp.h 2007-12-19 15:16:11.951564369 -0800
@@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
void page_alloc_init(void);
void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
void drain_all_pages(void);
+void drain_local_pages(void *dummy);

#endif /* __LINUX_GFP_H */

2007-12-19 23:49:47

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Thursday, 20 of December 2007, Christoph Lameter wrote:
> On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
>
> > > We could reexport drain_local_pages() again but then I do not understand
> > > why we would only drain the pages of this processor and not of all other
> > > processors as well. It seems that software suspend intend was to flush
> > > them all right?
> >
> > Well, not exactly. We are on one CPU at this point, the others have been
> > disabled.
>
> Ok so the others are flush. Here is a patch to re-export
> drain_local_pages() again and use it for software suspend:
>
> Signed-off-by: Christoph Lameter <[email protected]>
>
> ---
> include/linux/gfp.h | 1 +
> kernel/power/snapshot.c | 2 +-
> mm/page_alloc.c | 2 +-
> 3 files changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> ===================================================================
> --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c 2007-12-19 15:16:34.179661929 -0800
> @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
>
> printk(KERN_INFO "PM: Creating hibernation image: \n");
>
> - drain_all_pages();
> + drain_local_pages(NULL);
> nr_pages = count_data_pages();
> nr_highmem = count_highmem_pages();
> printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);

You've omitted the second instance, right before the copy_data_pages() call.

> Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c 2007-12-19 15:12:19.850545818 -0800
> @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> /*
> * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> */
> -static void drain_local_pages(void *arg)
> +void drain_local_pages(void *arg)
> {
> drain_pages(smp_processor_id());
> }
> Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> ===================================================================
> --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h 2007-12-19 15:16:11.951564369 -0800
> @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> void page_alloc_init(void);
> void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> void drain_all_pages(void);
> +void drain_local_pages(void *dummy);
>
> #endif /* __LINUX_GFP_H */

2007-12-20 01:11:50

by Miles Lane

[permalink] [raw]
Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> >
> > > > We could reexport drain_local_pages() again but then I do not understand
> > > > why we would only drain the pages of this processor and not of all other
> > > > processors as well. It seems that software suspend intend was to flush
> > > > them all right?
> > >
> > > Well, not exactly. We are on one CPU at this point, the others have been
> > > disabled.
> >
> > Ok so the others are flush. Here is a patch to re-export
> > drain_local_pages() again and use it for software suspend:
> >
> > Signed-off-by: Christoph Lameter <[email protected]>
> >
> > ---
> > include/linux/gfp.h | 1 +
> > kernel/power/snapshot.c | 2 +-
> > mm/page_alloc.c | 2 +-
> > 3 files changed, 3 insertions(+), 2 deletions(-)
> >
> > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > ===================================================================
> > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c 2007-12-19 15:16:34.179661929 -0800
> > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> >
> > printk(KERN_INFO "PM: Creating hibernation image: \n");
> >
> > - drain_all_pages();
> > + drain_local_pages(NULL);
> > nr_pages = count_data_pages();
> > nr_highmem = count_highmem_pages();
> > printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
>
> You've omitted the second instance, right before the copy_data_pages() call.

I will wait for a revised patch and then test.
(Sorry for the duplicate message. I am resending because I accidentally
sent an HTML message the first time. Whoops.)

> > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > ===================================================================
> > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c 2007-12-19 15:12:19.850545818 -0800
> > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> > /*
> > * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> > */
> > -static void drain_local_pages(void *arg)
> > +void drain_local_pages(void *arg)
> > {
> > drain_pages(smp_processor_id());
> > }
> > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > ===================================================================
> > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h 2007-12-19 15:16:11.951564369 -0800
> > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> > void page_alloc_init(void);
> > void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> > void drain_all_pages(void);
> > +void drain_local_pages(void *dummy);
> >
> > #endif /* __LINUX_GFP_H */
>