2008-06-27 16:43:34

by Soeren Sonnenburg

[permalink] [raw]
Subject: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

Dear all,

Anyone else seeing this? Happened after resuming from s2both (from
disk)...

VFS: Mounted root (ext2 filesystem).
------------[ cut here ]------------
kernel BUG at kernel/power/snapshot.c:493!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in:

Pid: 568, comm: linuxrc Not tainted (2.6.26-rc7-sonne #35)
EIP: 0060:[<c0157f99>] EFLAGS: 00010282 CPU: 0
EIP is at memory_bm_set_bit+0x29/0x30
EAX: fffffff2 EBX: f7165f1c ECX: 00000000 EDX: 03000200
ESI: f7115000 EDI: b7f2a008 EBP: f7165f28 ESP: f7165f18
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process linuxrc (pid: 568, ti=f7164000 task=f7b65f80 task.ti=f7164000)
Stack: f7165f1c 00000013 f7117000 00000002 f7165f5c c0158e9b bf96e5ec c02dd570
00001000 c05fa980 00000008 00000008 00000000 00000000 f7000200 c05fa980
b7f2a008 f7165f70 c015b3af f7000200 b7f2a008 00001000 f7165f90 c0185e28
Call Trace:
[<c0158e9b>] ? snapshot_write_next+0xcb/0x7a0
[<c02dd570>] ? write_chan+0x0/0x380
[<c015b3af>] ? snapshot_write+0x1f/0x70
[<c0185e28>] ? vfs_write+0x98/0x120
[<c015b390>] ? snapshot_write+0x0/0x70
[<c0185f6d>] ? sys_write+0x3d/0x70
[<c010333a>] ? syscall_call+0x7/0xb
=======================
Code: 00 00 55 89 e5 53 83 ec 0c 8d 4d f8 8d 5d f4 89 1c 24 e8 4b ff ff ff 85 c0
EIP: [<c0157f99>] memory_bm_set_bit+0x29/0x30 SS:ESP 0068:f7165f18
---[ end trace 70a926e889075b3c ]---


Soeren


2008-06-27 17:06:22

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Friday, 27 of June 2008, Soeren Sonnenburg wrote:
> Dear all,
>
> Anyone else seeing this? Happened after resuming from s2both (from
> disk)...

At what point exactly did it happen?

> VFS: Mounted root (ext2 filesystem).
> ------------[ cut here ]------------
> kernel BUG at kernel/power/snapshot.c:493!
> invalid opcode: 0000 [#1] PREEMPT SMP
> Modules linked in:
>
> Pid: 568, comm: linuxrc Not tainted (2.6.26-rc7-sonne #35)
> EIP: 0060:[<c0157f99>] EFLAGS: 00010282 CPU: 0
> EIP is at memory_bm_set_bit+0x29/0x30
> EAX: fffffff2 EBX: f7165f1c ECX: 00000000 EDX: 03000200
> ESI: f7115000 EDI: b7f2a008 EBP: f7165f28 ESP: f7165f18
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process linuxrc (pid: 568, ti=f7164000 task=f7b65f80 task.ti=f7164000)
> Stack: f7165f1c 00000013 f7117000 00000002 f7165f5c c0158e9b bf96e5ec c02dd570
> 00001000 c05fa980 00000008 00000008 00000000 00000000 f7000200 c05fa980
> b7f2a008 f7165f70 c015b3af f7000200 b7f2a008 00001000 f7165f90 c0185e28
> Call Trace:
> [<c0158e9b>] ? snapshot_write_next+0xcb/0x7a0
> [<c02dd570>] ? write_chan+0x0/0x380
> [<c015b3af>] ? snapshot_write+0x1f/0x70
> [<c0185e28>] ? vfs_write+0x98/0x120
> [<c015b390>] ? snapshot_write+0x0/0x70
> [<c0185f6d>] ? sys_write+0x3d/0x70
> [<c010333a>] ? syscall_call+0x7/0xb
> =======================
> Code: 00 00 55 89 e5 53 83 ec 0c 8d 4d f8 8d 5d f4 89 1c 24 e8 4b ff ff ff 85 c0
> EIP: [<c0157f99>] memory_bm_set_bit+0x29/0x30 SS:ESP 0068:f7165f18
> ---[ end trace 70a926e889075b3c ]---

2008-06-27 18:14:45

by Soeren Sonnenburg

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Fri, 2008-06-27 at 19:07 +0200, Rafael J. Wysocki wrote:
> On Friday, 27 of June 2008, Soeren Sonnenburg wrote:
> > Dear all,
> >
> > Anyone else seeing this? Happened after resuming from s2both (from
> > disk)...
>
> At what point exactly did it happen?

I am attaching the full dmesg.

HTH,
Soeren


Attachments:
dmesg.gz (8.35 kB)

2008-06-27 19:34:25

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Friday, 27 of June 2008, Soeren Sonnenburg wrote:
> On Fri, 2008-06-27 at 19:07 +0200, Rafael J. Wysocki wrote:
> > On Friday, 27 of June 2008, Soeren Sonnenburg wrote:
> > > Dear all,
> > >
> > > Anyone else seeing this? Happened after resuming from s2both (from
> > > disk)...
> >
> > At what point exactly did it happen?
>
> I am attaching the full dmesg.

This looks like userland resume trying to feed invalid image metadata to the
kernel through /dev/snapshot.

Hard to say why this happened, though. It looks like the swap header of the
resume device wasn't restored correctly after reading the image and that would
be a user space bug.

Thanks,
Rafael

2008-07-09 13:08:49

by Martin Michlmayr

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

* Soeren Sonnenburg <[email protected]> [2008-06-27 18:43]:
> Anyone else seeing this? Happened after resuming from s2both (from
> disk)...

FWIW, I got something similar a few days ago with s2disk and 2.6.25.
I had never seen this before but then I got the same bug two or three
times within a day. It happens when resuming the laptop from disk.
The kernel continues to start fine but doesn't restore the previous
state.

[ 10.024084] ------------[ cut here ]------------
[ 10.025255] kernel BUG at kernel/power/snapshot.c:493!
[ 10.025255] invalid opcode: 0000 [1] SMP
[ 10.025255] CPU 0
[ 10.025255] Modules linked in: sha256_generic aes_x86_64 aes_generic cbc dm_crypt crypto_blkcipher dm_mirror dm_snapshot dm_mod ide_cd_mod cdrom ide_disk ata_generic libata scsi_mod dock piix ide_pci_generic ide_core ehci_hcd uhci_hcd thermal processor fan
[ 10.025255] Pid: 1520, comm: resume Not tainted 2.6.25-2-amd64 #1
[ 10.025255] RIP: 0010:[<ffffffff80257d19>] [<ffffffff80257d19>] memory_bm_set_bit+0x15/0x28
[ 10.025255] RSP: 0018:ffff81007c9fbe90 EFLAGS: 00010282
[ 10.025255] RAX: 00000000fffffff2 RBX: 0000000000000001 RCX: ffff81007c9fbe9c
[ 10.025255] RDX: ffff81007c9fbe90 RSI: 0100000000000000 RDI: ffffffff80618ac0
[ 10.025255] RBP: ffff81007c016000 R08: ffff81007c9fbe90 R09: 0000000000000002
[ 10.025255] R10: 0000000000000002 R11: ffffffff802efae3 R12: 00007f853856a010
[ 10.025255] R13: ffffffff80618b20 R14: 0000000000000004 R15: 0000000000001000
[ 10.025255] FS: 0000000002090850(0063) GS:ffffffff80520000(0000) knlGS:0000000000000000
[ 10.025255] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 10.025255] CR2: 0000000000445230 CR3: 000000007c914000 CR4: 00000000000006e0
[ 10.025255] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 10.025255] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 10.025255] Process resume (pid: 1520, threadinfo ffff81007c9fa000, task ffff81007cd4a400)
[ 10.025255] Stack: ffff81007e877180 0000000000000092 ffffffff80258e7a 0000000000000000
[ 10.025255] ffff81007d8d4380 ffffffff80618b20 00007f853856a010 ffff81007c9fbf50
[ 10.025255] 0000000000000004 00007f8538568010 ffffffff80259fd8 000000000001d688
[ 10.025255] Call Trace:
[ 10.025255] [<ffffffff80258e7a>] ? snapshot_write_next+0xec/0x646
[ 10.025255] [<ffffffff80259fd8>] ? snapshot_write+0x22/0x5c
[ 10.025255] [<ffffffff8029ab65>] ? vfs_write+0xad/0x156
[ 10.025255] [<ffffffff8029b107>] ? sys_write+0x45/0x6e
[ 10.025255] [<ffffffff8020bd9a>] ? system_call_after_swapgs+0x8a/0x8f
[ 10.025255]
[ 10.025255]
[ 10.025255] Code: 6d 48 8d 34 37 48 89 d7 48 c1 fe 03 48 0f af f0 e9 72 ff ff ff c3 48 83 ec 10 48 8d 4c 24 0c 48 89 e2 e8 e4 fe ff ff 85 c0 74 04 <0f> 0b eb fe 8b 54 24 0c 48 8b 04 24 f0 0f ab 10 59 5e c3 48 8b
[ 10.025255] RIP [<ffffffff80257d19>] memory_bm_set_bit+0x15/0x28
[ 10.025255] RSP <ffff81007c9fbe90>
[ 10.025266] ---[ end trace a54fd545107adef9 ]---

Full dmesg is attached.
--
Martin Michlmayr
http://www.cyrius.com/


Attachments:
(No filename) (3.00 kB)
dmesg-snapshot-bug (40.93 kB)
Download all attachments

2008-07-09 20:59:20

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Wednesday, 9 of July 2008, Martin Michlmayr wrote:
> * Soeren Sonnenburg <[email protected]> [2008-06-27 18:43]:
> > Anyone else seeing this? Happened after resuming from s2both (from
> > disk)...
>
> FWIW, I got something similar a few days ago with s2disk and 2.6.25.
> I had never seen this before but then I got the same bug two or three
> times within a day. It happens when resuming the laptop from disk.
> The kernel continues to start fine but doesn't restore the previous
> state.

This is a symptom of the userspace resume trying to feed invalid image
metadata to the kernel (yes, I'll have to change that code to return error
code in that case).

Please try to run mkswap on your resume partition and see if that helps.

Thanks,
Rafael

2008-07-11 15:38:48

by Martin Michlmayr

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

* Rafael J. Wysocki <[email protected]> [2008-07-09 23:00]:
> This is a symptom of the userspace resume trying to feed invalid image
> metadata to the kernel (yes, I'll have to change that code to return error
> code in that case).
>
> Please try to run mkswap on your resume partition and see if that helps.

Nope, I'm still getting the same.
--
Martin Michlmayr
http://www.cyrius.com/

2008-07-11 19:04:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Friday, 11 of July 2008, Martin Michlmayr wrote:
> * Rafael J. Wysocki <[email protected]> [2008-07-09 23:00]:
> > This is a symptom of the userspace resume trying to feed invalid image
> > metadata to the kernel (yes, I'll have to change that code to return error
> > code in that case).
> >
> > Please try to run mkswap on your resume partition and see if that helps.
>
> Nope, I'm still getting the same.

If you have the user space resume utility in your initrd, please try to disable
or remove it. I'd like to verify if it is responsible for this message.

Thanks,
Rafael

2008-07-11 19:54:24

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

Hi!

> > > Anyone else seeing this? Happened after resuming from s2both (from
> > > disk)...
> >
> > At what point exactly did it happen?
>
> I am attaching the full dmesg.

Something is seriously wrong here.

> usbcore: registered new interface driver usbhid
> usbhid: v2.6:USB HID core driver
> TCP cubic registered
> NET: Registered protocol family 10
> lo: Disabled Privacy Extensions
> Mobile IPv6
> NET: Registered protocol family 17
> NET: Registered protocol family 15
> Using IPI No-Shortcut mode
> RAMDISK: ext2 filesystem found at block 0
> RAMDISK: Loading 2000KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|done.
> EXT2-fs warning: checktime reached, running e2fsck is recommended
> VFS: Mounted root (ext2 filesystem).
> ------------[ cut here ]------------
> kernel BUG at kernel/power/snapshot.c:493!
> invalid opcode: 0000 [#1] PREEMPT SMP
> Modules linked in:

This is _after_ we mounted filesystems... we should not be resuming at
this point.

> Pid: 568, comm: linuxrc Not tainted (2.6.26-rc7-sonne #35)
> EIP: 0060:[<c0157f99>] EFLAGS: 00010282 CPU: 0
> EIP is at memory_bm_set_bit+0x29/0x30
> EAX: fffffff2 EBX: f7165f1c ECX: 00000000 EDX: 03000200
> ESI: f7115000 EDI: b7f2a008 EBP: f7165f28 ESP: f7165f18
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process linuxrc (pid: 568, ti=f7164000 task=f7b65f80 task.ti=f7164000)
> Stack: f7165f1c 00000013 f7117000 00000002 f7165f5c c0158e9b bf96e5ec c02dd570
> 00001000 c05fa980 00000008 00000008 00000000 00000000 f7000200 c05fa980
> b7f2a008 f7165f70 c015b3af f7000200 b7f2a008 00001000 f7165f90 c0185e28
> Call Trace:
> [<c0158e9b>] ? snapshot_write_next+0xcb/0x7a0
> [<c02dd570>] ? write_chan+0x0/0x380
> [<c015b3af>] ? snapshot_write+0x1f/0x70
> [<c0185e28>] ? vfs_write+0x98/0x120
> [<c015b390>] ? snapshot_write+0x0/0x70
> [<c0185f6d>] ? sys_write+0x3d/0x70
> [<c010333a>] ? syscall_call+0x7/0xb
> =======================
> Code: 00 00 55 89 e5 53 83 ec 0c 8d 4d f8 8d 5d f4 89 1c 24 e8 4b ff ff ff 85 c0 75 10 8b 55 f8 8b 45 f4 f0 0f ab 02 83 c4 0c 5b 5d c3 <0f> 0b eb fe 8d 76 00 8b 0d 0c a9 5f c0 55 89 e5 85 c9 74 14 8b
> EIP: [<c0157f99>] memory_bm_set_bit+0x29/0x30 SS:ESP 0068:f7165f18
> ---[ end trace 70a926e889075b3c ]---

> Clocksource tsc unstable (delta = -208377644 ns)
> kjournald starting. Commit interval 5 seconds
> EXT3 FS on sda3, internal journal
> ext3_orphan_cleanup: deleting unreferenced inode 1805649
> ext3_orphan_cleanup: deleting unreferenced inode 1228813
> EXT3-fs: sda3: 2 orphan inodes deleted
> EXT3-fs: recovery complete.
> EXT3-fs: mounted filesystem with ordered data mode.
> VFS: Mounted root (ext3 filesystem).
> Trying to move old root to /initrd ... /initrd does not exist. Ignored.
> Unmounting old root
> Trying to free ramdisk memory ... okay
> Freeing unused kernel memory: 268k freed
> kjournald starting. Commit interval 5 seconds
> EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
> EXT3 FS on dm-0, internal journal
> EXT3-fs: recovery complete.
> EXT3-fs: mounted filesystem with ordered data mode.

You should really fsck at this point.


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-07-11 20:33:00

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Friday, 11 of July 2008, Pavel Machek wrote:
> Hi!
>
> > > > Anyone else seeing this? Happened after resuming from s2both (from
> > > > disk)...
> > >
> > > At what point exactly did it happen?
> >
> > I am attaching the full dmesg.
>
> Something is seriously wrong here.
>
> > usbcore: registered new interface driver usbhid
> > usbhid: v2.6:USB HID core driver
> > TCP cubic registered
> > NET: Registered protocol family 10
> > lo: Disabled Privacy Extensions
> > Mobile IPv6
> > NET: Registered protocol family 17
> > NET: Registered protocol family 15
> > Using IPI No-Shortcut mode
> > RAMDISK: ext2 filesystem found at block 0
> > RAMDISK: Loading 2000KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|done.
> > EXT2-fs warning: checktime reached, running e2fsck is recommended
> > VFS: Mounted root (ext2 filesystem).
> > ------------[ cut here ]------------
> > kernel BUG at kernel/power/snapshot.c:493!
> > invalid opcode: 0000 [#1] PREEMPT SMP
> > Modules linked in:
>
> This is _after_ we mounted filesystems... we should not be resuming at
> this point.

This is an initrd, isn't it?

In which case we can be resuming (using the user land "resume") exactly at this
point, no?

> > Pid: 568, comm: linuxrc Not tainted (2.6.26-rc7-sonne #35)
> > EIP: 0060:[<c0157f99>] EFLAGS: 00010282 CPU: 0
> > EIP is at memory_bm_set_bit+0x29/0x30
> > EAX: fffffff2 EBX: f7165f1c ECX: 00000000 EDX: 03000200
> > ESI: f7115000 EDI: b7f2a008 EBP: f7165f28 ESP: f7165f18
> > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > Process linuxrc (pid: 568, ti=f7164000 task=f7b65f80 task.ti=f7164000)
> > Stack: f7165f1c 00000013 f7117000 00000002 f7165f5c c0158e9b bf96e5ec c02dd570
> > 00001000 c05fa980 00000008 00000008 00000000 00000000 f7000200 c05fa980
> > b7f2a008 f7165f70 c015b3af f7000200 b7f2a008 00001000 f7165f90 c0185e28
> > Call Trace:
> > [<c0158e9b>] ? snapshot_write_next+0xcb/0x7a0
> > [<c02dd570>] ? write_chan+0x0/0x380
> > [<c015b3af>] ? snapshot_write+0x1f/0x70
> > [<c0185e28>] ? vfs_write+0x98/0x120
> > [<c015b390>] ? snapshot_write+0x0/0x70
> > [<c0185f6d>] ? sys_write+0x3d/0x70
> > [<c010333a>] ? syscall_call+0x7/0xb
> > =======================
> > Code: 00 00 55 89 e5 53 83 ec 0c 8d 4d f8 8d 5d f4 89 1c 24 e8 4b ff ff ff 85 c0 75 10 8b 55 f8 8b 45 f4 f0 0f ab 02 83 c4 0c 5b 5d c3 <0f> 0b eb fe 8d 76 00 8b 0d 0c a9 5f c0 55 89 e5 85 c9 74 14 8b
> > EIP: [<c0157f99>] memory_bm_set_bit+0x29/0x30 SS:ESP 0068:f7165f18
> > ---[ end trace 70a926e889075b3c ]---
>
> > Clocksource tsc unstable (delta = -208377644 ns)
> > kjournald starting. Commit interval 5 seconds
> > EXT3 FS on sda3, internal journal
> > ext3_orphan_cleanup: deleting unreferenced inode 1805649
> > ext3_orphan_cleanup: deleting unreferenced inode 1228813
> > EXT3-fs: sda3: 2 orphan inodes deleted
> > EXT3-fs: recovery complete.
> > EXT3-fs: mounted filesystem with ordered data mode.
> > VFS: Mounted root (ext3 filesystem).
> > Trying to move old root to /initrd ... /initrd does not exist. Ignored.
> > Unmounting old root
> > Trying to free ramdisk memory ... okay
> > Freeing unused kernel memory: 268k freed
> > kjournald starting. Commit interval 5 seconds
> > EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
> > EXT3 FS on dm-0, internal journal
> > EXT3-fs: recovery complete.
> > EXT3-fs: mounted filesystem with ordered data mode.
>
> You should really fsck at this point.

I don't really think so.

Rafael

2008-07-11 20:39:58

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Fri 2008-07-11 22:33:52, Rafael J. Wysocki wrote:
> On Friday, 11 of July 2008, Pavel Machek wrote:
> > Hi!
> >
> > > > > Anyone else seeing this? Happened after resuming from s2both (from
> > > > > disk)...
> > > >
> > > > At what point exactly did it happen?
> > >
> > > I am attaching the full dmesg.
> >
> > Something is seriously wrong here.
> >
> > > usbcore: registered new interface driver usbhid
> > > usbhid: v2.6:USB HID core driver
> > > TCP cubic registered
> > > NET: Registered protocol family 10
> > > lo: Disabled Privacy Extensions
> > > Mobile IPv6
> > > NET: Registered protocol family 17
> > > NET: Registered protocol family 15
> > > Using IPI No-Shortcut mode
> > > RAMDISK: ext2 filesystem found at block 0
> > > RAMDISK: Loading 2000KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|done.

(1)
> > > EXT2-fs warning: checktime reached, running e2fsck is recommended
(2)
> > > VFS: Mounted root (ext2 filesystem).
> > > ------------[ cut here ]------------
> > > kernel BUG at kernel/power/snapshot.c:493!
> > > invalid opcode: 0000 [#1] PREEMPT SMP
> > > Modules linked in:
> >
> > This is _after_ we mounted filesystems... we should not be resuming at
> > this point.
>
> This is an initrd, isn't it?

Does not look like so: it mounted some filesystem at (1), and
checktime is reached. I thought initrd is read-only RAM disk -- it
should not reach check time, no?

> In which case we can be resuming (using the user land "resume") exactly at this
> point, no?
>
> > > Pid: 568, comm: linuxrc Not tainted (2.6.26-rc7-sonne #35)
> > > EIP: 0060:[<c0157f99>] EFLAGS: 00010282 CPU: 0
> > > EIP is at memory_bm_set_bit+0x29/0x30
> > > EAX: fffffff2 EBX: f7165f1c ECX: 00000000 EDX: 03000200
> > > ESI: f7115000 EDI: b7f2a008 EBP: f7165f28 ESP: f7165f18
> > > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > > Process linuxrc (pid: 568, ti=f7164000 task=f7b65f80 task.ti=f7164000)
> > > Stack: f7165f1c 00000013 f7117000 00000002 f7165f5c c0158e9b bf96e5ec c02dd570
> > > 00001000 c05fa980 00000008 00000008 00000000 00000000 f7000200 c05fa980
> > > b7f2a008 f7165f70 c015b3af f7000200 b7f2a008 00001000 f7165f90 c0185e28
> > > Call Trace:
> > > [<c0158e9b>] ? snapshot_write_next+0xcb/0x7a0
> > > [<c02dd570>] ? write_chan+0x0/0x380
> > > [<c015b3af>] ? snapshot_write+0x1f/0x70
> > > [<c0185e28>] ? vfs_write+0x98/0x120
> > > [<c015b390>] ? snapshot_write+0x0/0x70
> > > [<c0185f6d>] ? sys_write+0x3d/0x70
> > > [<c010333a>] ? syscall_call+0x7/0xb
> > > =======================
> > > Code: 00 00 55 89 e5 53 83 ec 0c 8d 4d f8 8d 5d f4 89 1c 24 e8 4b ff ff ff 85 c0 75 10 8b 55 f8 8b 45 f4 f0 0f ab 02 83 c4 0c 5b 5d c3 <0f> 0b eb fe 8d 76 00 8b 0d 0c a9 5f c0 55 89 e5 85 c9 74 14 8b
> > > EIP: [<c0157f99>] memory_bm_set_bit+0x29/0x30 SS:ESP 0068:f7165f18
> > > ---[ end trace 70a926e889075b3c ]---
> >
> > > Clocksource tsc unstable (delta = -208377644 ns)
> > > kjournald starting. Commit interval 5 seconds
> > > EXT3 FS on sda3, internal journal
> > > ext3_orphan_cleanup: deleting unreferenced inode 1805649
> > > ext3_orphan_cleanup: deleting unreferenced inode 1228813
> > > EXT3-fs: sda3: 2 orphan inodes deleted
> > > EXT3-fs: recovery complete.
> > > EXT3-fs: mounted filesystem with ordered data mode.
> > > VFS: Mounted root (ext3 filesystem).
> > > Trying to move old root to /initrd ... /initrd does not exist. Ignored.
> > > Unmounting old root
> > > Trying to free ramdisk memory ... okay
> > > Freeing unused kernel memory: 268k freed
> > > kjournald starting. Commit interval 5 seconds
> > > EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
> > > EXT3 FS on dm-0, internal journal
> > > EXT3-fs: recovery complete.
> > > EXT3-fs: mounted filesystem with ordered data mode.
> >
> > You should really fsck at this point.
>
> I don't really think so.

Well, "mounted fs with errors" indicates a problem, right?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-07-11 20:45:37

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

On Friday, 11 of July 2008, Pavel Machek wrote:
> On Fri 2008-07-11 22:33:52, Rafael J. Wysocki wrote:
> > On Friday, 11 of July 2008, Pavel Machek wrote:
> > > Hi!
> > >
> > > > > > Anyone else seeing this? Happened after resuming from s2both (from
> > > > > > disk)...
> > > > >
> > > > > At what point exactly did it happen?
> > > >
> > > > I am attaching the full dmesg.
> > >
> > > Something is seriously wrong here.
> > >
> > > > usbcore: registered new interface driver usbhid
> > > > usbhid: v2.6:USB HID core driver
> > > > TCP cubic registered
> > > > NET: Registered protocol family 10
> > > > lo: Disabled Privacy Extensions
> > > > Mobile IPv6
> > > > NET: Registered protocol family 17
> > > > NET: Registered protocol family 15
> > > > Using IPI No-Shortcut mode
> > > > RAMDISK: ext2 filesystem found at block 0
> > > > RAMDISK: Loading 2000KiB [1 disk] into ram disk... |/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|done.
>
> (1)
> > > > EXT2-fs warning: checktime reached, running e2fsck is recommended
> (2)
> > > > VFS: Mounted root (ext2 filesystem).
> > > > ------------[ cut here ]------------
> > > > kernel BUG at kernel/power/snapshot.c:493!
> > > > invalid opcode: 0000 [#1] PREEMPT SMP
> > > > Modules linked in:
> > >
> > > This is _after_ we mounted filesystems... we should not be resuming at
> > > this point.
> >
> > This is an initrd, isn't it?
>
> Does not look like so: it mounted some filesystem at (1), and
> checktime is reached. I thought initrd is read-only RAM disk -- it
> should not reach check time, no?

Yes, it will reach the check time if it's ext2 (which is the case).

Thanks,
Rafael

2008-07-15 12:44:32

by Martin Michlmayr

[permalink] [raw]
Subject: Re: 2.6.26-rc7: BUG at kernel/power/snapshot.c:493

* Rafael J. Wysocki <[email protected]> [2008-07-11 21:05]:
> If you have the user space resume utility in your initrd, please try
> to disable or remove it. I'd like to verify if it is responsible
> for this message.

I haven't had a chance to try this yet. However, I made a new
observation in the meantime: when I suspend my system with 2.6.25 and
resume it with 2.6.25, I get the problem. But if I suspend my system
with 2.6.25 and then resume it with 2.6.26-rc9 it resumes just fine.
Suspending with 2.6.26-rc9 and resuming with 2.6.26-rc9 also works
fine.

--
Martin Michlmayr
http://www.cyrius.com/