2006-01-12 09:00:44

by Chase Venters

Subject: Bad page state at free_hot_cold_page

Greetings,
(I'm posting this to LKML and CK because I'm not sure if any of 2.6.15-ck1's
changes might cause this scenario)
Recently I've noticed that after my desktop has been up for a while, my music
playback / mouse cursor movement will on occasion pause briefly. I got
frustrated with it a minute ago and decided to kill artsd, wondering if there
could be issues with both arts and amarok's backend holding the audio device
open at once.
When running killall artsd, I locked up for a second and found this in dmesg:

Bad page state at free_hot_cold_page (in process 'artsd', page b1761620)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761640)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761660)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761680)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b17616a0)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b17616c0)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b17616e0)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761700)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761720)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761740)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761760)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b1761780)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b17617a0)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b17617c0)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page b17617e0)
flags:0x80000404 mapping:00000000 mapcount:0 count:0
Backtrace:
[<b0148e9a>] bad_page+0x84/0xbc
[<b0149699>] free_hot_cold_page+0x65/0x13a
[<b0153bf1>] zap_pte_range+0x1d1/0x28f
[<b0153d70>] unmap_page_range+0xc1/0x122
[<b0153ebe>] unmap_vmas+0xed/0x242
[<b0158099>] unmap_region+0xb4/0x156
[<b01583e2>] do_munmap+0x108/0x144
[<b015846f>] sys_munmap+0x51/0x76
[<b0102eff>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed

Any thoughts? Further diagnostics?

Thanks,
Chase


2006-01-12 09:24:19

by Andreas Mohr

Subject: Re: [ck] Bad page state at free_hot_cold_page

Hi,

On Thu, Jan 12, 2006 at 03:00:59AM -0600, Chase Venters wrote:
> Greetings,
> (I'm posting this to LKML and CK because I'm not sure if any of 2.6.15-ck1's
> changes might cause this scenario)
> Recently I've noticed that after my desktop has been up for a while, my music
> playback / mouse cursor movement will on occasion pause briefly. I got
> frustrated with it a minute ago and decided to kill artsd, wondering if there
> could be issues with both arts and amarok's backend holding the audio device
> open at once.
> When running killall artsd, I locked up for a second and found this in dmesg:
>
> Bad page state at free_hot_cold_page (in process 'artsd', page b1761620)
> flags:0x80000404 mapping:00000000 mapcount:0 count:0
> Backtrace:
> [<b0148e9a>] bad_page+0x84/0xbc
> [<b0149699>] free_hot_cold_page+0x65/0x13a
> [<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
> [<b0153bf1>] zap_pte_range+0x1d1/0x28f
> [<b0153d70>] unmap_page_range+0xc1/0x122
> [<b0153ebe>] unmap_vmas+0xed/0x242
> [<b0158099>] unmap_region+0xb4/0x156
> [<b01583e2>] do_munmap+0x108/0x144
> [<b015846f>] sys_munmap+0x51/0x76
> [<b0102eff>] sysenter_past_esp+0x54/0x75
> Trying to fix it up, but a reboot is needed

AFAIK random page state toggling often happens due to bad RAM.

Care to run memtest86 or similar to confirm this?
Or also try running an older kernel to verify whether it doesn't happen there.
But I'm betting on bad RAM :-\

Andreas Mohr

2006-01-12 09:35:45

by Chase Venters

Subject: Re: [ck] Bad page state at free_hot_cold_page

On Thursday 12 January 2006 03:24, Andreas Mohr wrote:
> AFAIK random page state toggling often happens due to bad RAM.
>
> Care to run memtest86 or similar to confirm this?
> Or also try running an older kernel to verify whether it doesn't happen
> there. But I'm betting on bad RAM :-\

Andreas,
I've been looking into this problem a little bit more (did some digging to
try and teach myself a little bit about the page flags). I noticed after
posting that I had bad page states reported in dmesg for amarokapp (new ones)
as well.
So I got a bit curious and looked to see what bits were stuck on... and I
started to wonder if it could have something to do with ALSA. (Unfortunately
I'm quickly reaching the limit of my current understanding of the kernel's
innards)
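
As an illustration of the bit inspection described above, here is a minimal Python sketch that decodes a flags word from a "Bad page state" report. The bit-to-name table is an assumption, recalled from 2.6.15's include/linux/page-flags.h; it should be checked against the exact tree in use, since patchsets such as -ck may differ. The helper name decode_page_flags is hypothetical.

```python
# ASSUMPTION: bit numbering as recalled from 2.6.15's
# include/linux/page-flags.h. Verify against the running kernel's source.
PAGE_FLAG_BITS = {
    0: "PG_locked", 1: "PG_error", 2: "PG_referenced", 3: "PG_uptodate",
    4: "PG_dirty", 5: "PG_lru", 6: "PG_active", 7: "PG_slab",
    8: "PG_checked", 9: "PG_arch_1", 10: "PG_reserved", 11: "PG_private",
    12: "PG_writeback", 13: "PG_nosave", 14: "PG_compound",
    15: "PG_swapcache", 16: "PG_mappedtodisk", 17: "PG_reclaim",
    18: "PG_nosave_free", 19: "PG_uncached",
}


def decode_page_flags(flags):
    """Return (names of set flag bits, leftover bits not in the table)."""
    names = [name for bit, name in sorted(PAGE_FLAG_BITS.items())
             if flags & (1 << bit)]
    known_mask = sum(1 << bit for bit in PAGE_FLAG_BITS)
    # Leftover high bits are not state flags in this table; they are
    # assumed here to carry zone/node placement information.
    return names, flags & ~known_mask


names, leftover = decode_page_flags(0x80000404)  # value from the dmesg output
print(names, hex(leftover))
```

Under that assumed table, the low bits of 0x80000404 are bit 2 and bit 10, so the word is the same non-zero pattern on every reported page rather than a scattering of flipped bits.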
Anyway, I tried a test - made sure both amarokapp and artsd were dead, then
used rmmod to pluck out every last "snd" module. I put them back in, fired up
amarok, and went to play a song. It played fine with no noticeable latency,
so I tried switching to another track. Doing so caused the system to freeze
for a few seconds and another set of bad page states to go rushing through
the log.
I can accept bad memory (though I think it's unlikely on this system because
I've tested that fairly recently) but the page states, while bad, are
consistently (not randomly) so.
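
The "consistently, not randomly" observation can be checked mechanically. The sketch below, with a hypothetical helper named page_strides, extracts the page addresses from the first four reports in the original dmesg output and computes the distance between successive ones. Reading a constant stride as "adjacent struct page entries" is an assumption about this kernel's configuration, not something the log states.

```python
import re

# First four report headers, copied from the dmesg output earlier in the thread.
LOG = """\
Bad page state at free_hot_cold_page (in process 'artsd', page b1761620)
Bad page state at free_hot_cold_page (in process 'artsd', page b1761640)
Bad page state at free_hot_cold_page (in process 'artsd', page b1761660)
Bad page state at free_hot_cold_page (in process 'artsd', page b1761680)
"""

PAGE_RE = re.compile(
    r"Bad page state at \w+ \(in process '[^']+', page ([0-9a-f]+)\)"
)


def page_strides(log_text):
    """Return the differences between successive reported page addresses."""
    addrs = [int(m.group(1), 16) for m in PAGE_RE.finditer(log_text)]
    return [later - earlier for earlier, later in zip(addrs, addrs[1:])]


print([hex(s) for s in page_strides(LOG)])
```

Every stride in this sample is 0x20 bytes. If that matches the size of struct page on this build (an assumption), the bad pages are neighbouring page frames, which fits a single contiguous buffer being unmapped better than it fits scattered memory faults.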
I'd reboot right now and test it, but at the moment I'm capable of
reproducing these page state errors 100% of the time, so if there are any
sorts of things I can do to debug the thing while it's up I'd like to move
forward with that before I reboot and lose whatever 'ideal state' got me here
in the first place.

Thanks,
Chase


2006-01-12 11:55:20

by Nigel Cunningham

Subject: Re: [ck] Bad page state at free_hot_cold_page : Three more reports, suspend2 but non-ck as far as I know.

Hi.

On Thursday 12 January 2006 19:24, Andreas Mohr wrote:
> Hi,
>
> On Thu, Jan 12, 2006 at 03:00:59AM -0600, Chase Venters wrote:
> > Greetings,
> > (I'm posting this to LKML and CK because I'm not sure if any of
> > 2.6.15-ck1's changes might cause this scenario)
> > Recently I've noticed that after my desktop has been up for a while, my
> > music playback / mouse cursor movement will on occasion pause briefly. I
> > got frustrated with it a minute ago and decided to kill artsd, wondering
> > if there could be issues with both arts and amarok's backend holding the
> > audio device open at once.
> > When running killall artsd, I locked up for a second and found this in
> > dmesg:
> >
> > Bad page state at free_hot_cold_page (in process 'artsd', page b1761620)
> > flags:0x80000404 mapping:00000000 mapcount:0 count:0
> > Backtrace:
> > [<b0148e9a>] bad_page+0x84/0xbc
> > [<b0149699>] free_hot_cold_page+0x65/0x13a
> > [<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
> > [<b0153bf1>] zap_pte_range+0x1d1/0x28f
> > [<b0153d70>] unmap_page_range+0xc1/0x122
> > [<b0153ebe>] unmap_vmas+0xed/0x242
> > [<b0158099>] unmap_region+0xb4/0x156
> > [<b01583e2>] do_munmap+0x108/0x144
> > [<b015846f>] sys_munmap+0x51/0x76
> > [<b0102eff>] sysenter_past_esp+0x54/0x75
> > Trying to fix it up, but a reboot is needed
>
> AFAIK random page state toggling often happens due to bad RAM.
>
> Care to run memtest86 or similar to confirm this?
> Or also try running an older kernel to verify whether it doesn't happen
> there. But I'm betting on bad RAM :-\

I've had the same reports, so far as I know without the ck patches, and also
in artsd....

One message:

Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page c16962e8)
flags:0x80000414 mapping:00000000 mapcount:0 count:0
Backtrace:
[<c0155660>] bad_page+0x84/0xbc
[<c0155e5f>] free_hot_cold_page+0x55/0x169
[<c0156759>] __pagevec_free+0x16/0x1e
[<c015c6f2>] release_pages+0x16e/0x190
[<c01614dd>] unmap_page_range+0xb4/0x13a
[<c0169bd9>] free_pages_and_swap_cache+0x5d/0x83
[<c016177b>] unmap_vmas+0x218/0x22e
[<c0165cc8>] unmap_region+0xb7/0x159
[<c0166037>] do_munmap+0x10f/0x179
[<c01660f2>] sys_munmap+0x51/0x76
[<c01032a7>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'artsd', page c16962c4)
flags:0x80000414 mapping:00000000 mapcount:0 count:0
Backtrace:
[<c0155660>] bad_page+0x84/0xbc
[<c0155e5f>] free_hot_cold_page+0x55/0x169
[<c0156759>] __pagevec_free+0x16/0x1e
[<c015c6f2>] release_pages+0x16e/0x190
[<c01614dd>] unmap_page_range+0xb4/0x13a
[<c0169bd9>] free_pages_and_swap_cache+0x5d/0x83
[<c016177b>] unmap_vmas+0x218/0x22e
[<c0165cc8>] unmap_region+0xb7/0x159
[<c0166037>] do_munmap+0x10f/0x179
[<c01660f2>] sys_munmap+0x51/0x76
[<c01032a7>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed

To which another user replied:

I am seeing this with the fglrx (commercial ATI) driver version 8.20.8.

A third user:

Bad page state at free_hot_cold_page (in process 'artsd', page c15ba7e0)
flags:0x80000414 mapping:00000000 mapcount:0 count:0
Backtrace:
[<c01420fa>] bad_page+0x5c/0x92
[<c01427db>] free_hot_cold_page+0x5a/0xe9
[<c014aa0c>] zap_pte_range+0x164/0x1ea
[<c014ab3b>] unmap_page_range+0xa9/0xf3
[<c014ac4c>] unmap_vmas+0xc7/0x18c
[<c014e22b>] unmap_region+0x7d/0xed
[<c014e4a3>] do_munmap+0xdd/0xf3
[<c014e4ea>] sys_munmap+0x31/0x4b
[<c0102bc9>] syscall_call+0x7/0xb

One person said these were happening before they'd ever suspended, so I don't
believe it's suspend2 related.

Regards,

Nigel

2006-01-12 22:00:50

by Jan Engelhardt

Subject: Re: [ck] Bad page state at free_hot_cold_page

>> Greetings,
>> (I'm posting this to LKML and CK because I'm not sure if any of 2.6.15-ck1's
>> changes might cause this scenario)
>> Recently I've noticed that after my desktop has been up for a while, my music
>> playback / mouse cursor movement will on occasion pause briefly. I got
>> frustrated with it a minute ago and decided to kill artsd, wondering if there
>> could be issues with both arts and amarok's backend holding the audio device
>> open at once.
>> When running killall artsd, I locked up for a second and found this in dmesg:
>>
>> Bad page state at free_hot_cold_page (in process 'artsd', page b1761620)
>> flags:0x80000404 mapping:00000000 mapcount:0 count:0
>> Backtrace:
>> [<b0148e9a>] bad_page+0x84/0xbc
>> [<b0149699>] free_hot_cold_page+0x65/0x13a
>> [<b05b6901>] _spin_unlock_irqrestore+0xf/0x23
>> [<b0153bf1>] zap_pte_range+0x1d1/0x28f
>> [<b0153d70>] unmap_page_range+0xc1/0x122
>> [<b0153ebe>] unmap_vmas+0xed/0x242
>> [<b0158099>] unmap_region+0xb4/0x156
>> [<b01583e2>] do_munmap+0x108/0x144
>> [<b015846f>] sys_munmap+0x51/0x76
>> [<b0102eff>] sysenter_past_esp+0x54/0x75
>> Trying to fix it up, but a reboot is needed

Ftr, I get the same stackdump when shutting down X. But look no
further: I use an ancient nvidia and 2.6.15 (vanilla). No bad ram here,
even tested last week before install. :)


Jan Engelhardt