2008-02-26 13:20:20

by Otavio Salvador

Subject: oops when using git gc --auto

Hello,

Today I got this oops. Does anyone have an idea of what's going wrong?

Unable to handle kernel paging request at 0000020000000000 RIP:
[<ffffffff802735c3>] find_get_pages+0x3c/0x69
PGD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in: sha256_generic aes_generic aes_x86_64 cbc blkcipher nvidia(P) rfcomm l2cap bluetooth ac battery ipv6 nfs lockd nfs_acl sunrpc bridge ext2 mbcache dm_crypt tun kvm_intel kvm loop snd_usb_audio snd_usb_lib snd_rawmidi snd_hda_intel e1000e i2c_i801 serio_raw snd_seq_device snd_pcm intel_agp button snd_timer pcspkr psmouse snd_hwdep snd snd_page_alloc soundcore evdev i2c_core xfs dm_mirror dm_snapshot dm_mod raid0 md_mod sg sr_mod cdrom sd_mod usbhid hid usb_storage pata_marvell floppy ahci ata_generic libata scsi_mod ehci_hcd uhci_hcd thermal processor fan
Pid: 15684, comm: git Tainted: P 2.6.24-1-amd64 #1
RIP: 0010:[<ffffffff802735c3>] [<ffffffff802735c3>] find_get_pages+0x3c/0x69
RSP: 0018:ffff8100394dfd98 EFLAGS: 00010097
RAX: 0000000000000009 RBX: 000000000000000e RCX: 0000000000000009
RDX: 0000020000000000 RSI: 000000000000000a RDI: 0000000000000040
RBP: ffff810042964350 R08: 0000000000000040 R09: 000000000000000a
R10: ffff8100425a06c8 R11: 000000000000000a R12: 000000000000000e
R13: ffff8100394dfdf8 R14: ffff810042964350 R15: 0000000000000000
FS: 00002ae326df2190(0000) GS:ffff81007d7aeb40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000020000000000 CR3: 00000000358f9000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process git (pid: 15684, threadinfo ffff8100394de000, task ffff8100359cd800)
Stack: 000000000000000d ffff8100394dfde8 000000000000000d 000000000000000e
000000000000000e ffffffff802794d6 ffff8100014a7768 ffffffff80279b04
0000000000000000 ffffffffffffffff 0000000000000000 0000000000000000
Call Trace:
[<ffffffff802794d6>] pagevec_lookup+0x17/0x1e
[<ffffffff80279b04>] truncate_inode_pages_range+0x108/0x2bd
[<ffffffff802a9e3a>] generic_delete_inode+0xbf/0x127
[<ffffffff802a1a4d>] do_unlinkat+0xd5/0x144
[<ffffffff802989e2>] sys_write+0x45/0x6e
[<ffffffff8020be2e>] system_call+0x7e/0x83


Code: 48 8b 02 25 00 40 02 00 48 3d 00 40 02 00 75 04 48 8b 52 10
RIP [<ffffffff802735c3>] find_get_pages+0x3c/0x69
RSP <ffff8100394dfd98>
CR2: 0000020000000000
---[ end trace cb43a9f4488b815a ]---

--
O T A V I O S A L V A D O R
---------------------------------------------
E-mail: [email protected] UIN: 5906116
GNU/Linux User: 239058 GPG ID: 49A5F855
Home Page: http://otavio.ossystems.com.br
---------------------------------------------
"Microsoft sells you Windows ... Linux gives
you the whole house."


2008-02-26 13:40:35

by Nick Piggin

Subject: Re: oops when using git gc --auto

On Wednesday 27 February 2008 00:22, Otavio Salvador wrote:
> Hello,
>
> Today I got this oops. Does anyone have an idea of what's going wrong?
>
> Unable to handle kernel paging request at 0000020000000000 RIP:
> [<ffffffff802735c3>] find_get_pages+0x3c/0x69

At this point, the most likely candidate is a memory corruption
error, probably hardware. Can you run memtest86 for a few hours
to get a bit more confidence in the hw (preferably overnight)?

I did recently see another quite similar corruption in the
pagecache radix-tree, though. Coincidence maybe?

> PGD 0
> Oops: 0000 [1] SMP
> CPU 3
> Modules linked in: sha256_generic aes_generic aes_x86_64 cbc blkcipher
> nvidia(P) rfcomm l2cap bluetooth ac battery ipv6 nfs lockd nfs_acl sunrpc
> bridge ext2 mbcache dm_crypt tun kvm_intel kvm loop snd_usb_audio
> snd_usb_lib snd_rawmidi snd_hda_intel e1000e i2c_i801 serio_raw
> snd_seq_device snd_pcm intel_agp button snd_timer pcspkr psmouse snd_hwdep
> snd snd_page_alloc soundcore evdev i2c_core xfs dm_mirror dm_snapshot
> dm_mod raid0 md_mod sg sr_mod cdrom sd_mod usbhid hid usb_storage
> pata_marvell floppy ahci ata_generic libata scsi_mod ehci_hcd uhci_hcd
> thermal processor fan Pid: 15684, comm: git Tainted: P
> 2.6.24-1-amd64 #1
> RIP: 0010:[<ffffffff802735c3>] [<ffffffff802735c3>]
> find_get_pages+0x3c/0x69 RSP: 0018:ffff8100394dfd98 EFLAGS: 00010097
> RAX: 0000000000000009 RBX: 000000000000000e RCX: 0000000000000009
> RDX: 0000020000000000 RSI: 000000000000000a RDI: 0000000000000040
> RBP: ffff810042964350 R08: 0000000000000040 R09: 000000000000000a
> R10: ffff8100425a06c8 R11: 000000000000000a R12: 000000000000000e
> R13: ffff8100394dfdf8 R14: ffff810042964350 R15: 0000000000000000
> FS: 00002ae326df2190(0000) GS:ffff81007d7aeb40(0000)
> knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000020000000000 CR3: 00000000358f9000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process git (pid: 15684, threadinfo ffff8100394de000, task
> ffff8100359cd800) Stack: 000000000000000d ffff8100394dfde8
> 000000000000000d 000000000000000e 000000000000000e ffffffff802794d6
> ffff8100014a7768 ffffffff80279b04 0000000000000000 ffffffffffffffff
> 0000000000000000 0000000000000000 Call Trace:
> [<ffffffff802794d6>] pagevec_lookup+0x17/0x1e
> [<ffffffff80279b04>] truncate_inode_pages_range+0x108/0x2bd
> [<ffffffff802a9e3a>] generic_delete_inode+0xbf/0x127
> [<ffffffff802a1a4d>] do_unlinkat+0xd5/0x144
> [<ffffffff802989e2>] sys_write+0x45/0x6e
> [<ffffffff8020be2e>] system_call+0x7e/0x83
>
>
> Code: 48 8b 02 25 00 40 02 00 48 3d 00 40 02 00 75 04 48 8b 52 10
> RIP [<ffffffff802735c3>] find_get_pages+0x3c/0x69
> RSP <ffff8100394dfd98>
> CR2: 0000020000000000
> ---[ end trace cb43a9f4488b815a ]---
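One quick check that fits the memory-corruption theory is to look at the bit pattern of the faulting address. The sketch below is not part of the original exchange; it only takes the CR2 value from the oops above and lists which bits are set. The helper name `set_bits` is made up for illustration. A NULL pointer with exactly one bit set is a pattern often associated with a flipped bit, although on its own it does not prove a hardware fault.

```python
# Fault address reported in CR2 (and RDX) by the oops above.
FAULT_ADDR = 0x0000020000000000


def set_bits(value):
    """Return the positions of all bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


bits = set_bits(FAULT_ADDR)
if len(bits) == 1:
    # Exactly one bit set: the value is a power of two.
    print(f"single bit set: bit {bits[0]}")
else:
    print(f"{len(bits)} bits set: {bits}")
```

For this address the only bit set is bit 41, so the pointer is one bit away from NULL.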

2008-02-26 14:19:20

by Otavio Salvador

Subject: Re: oops when using git gc --auto

Nick Piggin <[email protected]> writes:

> On Wednesday 27 February 2008 00:22, Otavio Salvador wrote:
>> Hello,
>>
>> Today I got this oops. Does anyone have an idea of what's going wrong?
>>
>> Unable to handle kernel paging request at 0000020000000000 RIP:
>> [<ffffffff802735c3>] find_get_pages+0x3c/0x69
>
> At this point, the most likely candidate is a memory corruption
> error, probably hardware. Can you run memtest86 for a few hours
> to get a bit more confidence in the hw (preferably overnight)?

The memory modules are new, but I can try; no problem. I will get back
to you by tomorrow.

> I did recently see another quite similar corruption in the
> pagecache radix-tree, though. Coincidence maybe?

I hope not.

2008-02-26 19:34:11

by Otavio Salvador

Subject: Re: oops when using git gc --auto

Nick Piggin <[email protected]> writes:

> On Wednesday 27 February 2008 00:22, Otavio Salvador wrote:
>> Hello,
>>
>> Today I got this oops. Does anyone have an idea of what's going wrong?
>>
>> Unable to handle kernel paging request at 0000020000000000 RIP:
>> [<ffffffff802735c3>] find_get_pages+0x3c/0x69
>
> At this point, the most likely candidate is a memory corruption
> error, probably hardware. Can you run memtest86 for a few hours
> to get a bit more confidence in the hw (preferably overnight)?
>
> I did recently see another quite similar corruption in the
> pagecache radix-tree, though. Coincidence maybe?

I left it running over lunch and everything passed. I also left burnP6
running later and nothing happened. The hardware looks OK.

I've just got another oops, with the same kernel.

Unable to handle kernel paging request at ffff83006d922370 RIP:
[<ffffffff8027a79b>] shrink_page_list+0x16f/0x570
PGD 0
Oops: 0000 [1] SMP
CPU 2
Modules linked in: sha256_generic aes_generic aes_x86_64 cbc blkcipher nvidia(P) rfcomm l2cap bluetooth ac battery ipv6 nfs lockd nfs_acl sunrpc bridge ext2 mbcache dm_crypt tun kvm_intel kvm loop snd_hda_intel snd_usb_audio snd_pcm snd_timer snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd i2c_i801 soundcore snd_page_alloc intel_agp serio_raw button pcspkr e1000e i2c_core psmouse evdev xfs dm_mirror dm_snapshot dm_mod raid0 md_mod sg sr_mod cdrom sd_mod usbhid hid pata_marvell usb_storage floppy ahci ata_generic libata scsi_mod uhci_hcd ehci_hcd thermal processor fan
Pid: 213, comm: kswapd0 Tainted: P 2.6.24-1-amd64 #1
RIP: 0010:[<ffffffff8027a79b>] [<ffffffff8027a79b>] shrink_page_list+0x16f/0x570
RSP: 0018:ffff81007ac8bbe0 EFLAGS: 00010286
RAX: 0000000000010009 RBX: ffff810001e888a8 RCX: ffff810001e888d0
RDX: ffff83006d922350 RSI: 0000000000000001 RDI: ffff810001e888a8
RBP: ffff81007d1b9258 R08: ffff81007d776407 R09: 0000000000000000
R10: 0000000000000009 R11: 0000000000000002 R12: 0000000000000001
R13: ffff81007ac8be70 R14: ffff81007ac8bda0 R15: ffff81007ac8be01
FS: 0000000000000000(0000) GS:ffff81007d776340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff83006d922370 CR3: 000000006d4ac000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process kswapd0 (pid: 213, threadinfo ffff81007ac8a000, task ffff81007cc03800)
Stack: 0000000600000000 0000000000000002 0000000000000002 0000000000000001
ffff810001dc79e0 ffff810001dc7a18 ffff810001dc7fc8 ffff810001dc8000
0000000000000000 0000000000000001 0000000000000000 0000000000000001
Call Trace:
[<ffffffff80279da1>] isolate_lru_pages+0x5d/0x1d9
[<ffffffff80279da1>] isolate_lru_pages+0x5d/0x1d9
[<ffffffff8027acb9>] shrink_inactive_list+0x11d/0x381
[<ffffffff8027b002>] shrink_zone+0xe5/0x108
[<ffffffff8027b500>] kswapd+0x2fc/0x49b
[<ffffffff80413b5b>] thread_return+0x3d/0xab
[<ffffffff80247ff2>] autoremove_wake_function+0x0/0x2e
[<ffffffff8027b204>] kswapd+0x0/0x49b
[<ffffffff80247ed3>] kthread+0x47/0x74
[<ffffffff8020cc48>] child_rip+0xa/0x12
[<ffffffff80247e8c>] kthread+0x0/0x74
[<ffffffff8020cc3e>] child_rip+0x0/0x12


Code: 48 83 7a 20 00 0f 85 47 03 00 00 48 8d 42 30 48 39 42 30 0f
RIP [<ffffffff8027a79b>] shrink_page_list+0x16f/0x570
RSP <ffff81007ac8bbe0>
CR2: ffff83006d922370
---[ end trace b01014a6540e7663 ]---
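The two dumps can be compared bit by bit. The sketch below is an illustration only, not something posted in the thread. It XORs each suspicious pointer against a guessed "good" value and reports the differing bits. The guess for the second oops, `ASSUMED_GOOD`, is an assumption: it supposes the intended pointer lay in the same `ffff8100...` range as the other kernel pointers in the register dump. The helper name `flipped_bits` is likewise hypothetical. Note that CR2 in the second oops is RDX plus 0x20, so RDX is the value examined here.

```python
FIRST_FAULT = 0x0000020000000000   # RDX/CR2 in the first oops
SECOND_PTR = 0xFFFF83006D922350    # RDX in the second oops (CR2 is RDX + 0x20)

# Assumption: the uncorrupted pointer was in the ffff8100... range,
# like the other kernel pointers visible in the register dump.
ASSUMED_GOOD = 0xFFFF81006D922350


def flipped_bits(observed, expected):
    """Return the bit positions at which observed and expected differ."""
    diff = observed ^ expected
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# First oops: compare against NULL.
print("first oops: ", flipped_bits(FIRST_FAULT, 0))
# Second oops: compare against the assumed direct-mapping address.
print("second oops:", flipped_bits(SECOND_PTR, ASSUMED_GOOD))
```

Under that assumption both pointers differ from a plausible value in the same single bit (bit 41). That would be consistent with a hardware fault rather than a software bug, but it is only suggestive, since the "good" value for the second pointer is a guess.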

2008-02-27 22:53:07

by Otavio Salvador

Subject: Re: oops when using git gc --auto

Otavio Salvador <[email protected]> writes:

> Nick Piggin <[email protected]> writes:
>
>> On Wednesday 27 February 2008 00:22, Otavio Salvador wrote:
>>> Hello,
>>>
>>> Today I got this oops. Does anyone have an idea of what's going wrong?
>>>
>>> Unable to handle kernel paging request at 0000020000000000 RIP:
>>> [<ffffffff802735c3>] find_get_pages+0x3c/0x69
>>
>> At this point, the most likely candidate is a memory corruption
>> error, probably hardware. Can you run memtest86 for a few hours
>> to get a bit more confidence in the hw (preferably overnight)?
>>
>> I did recently see another quite similar corruption in the
>> pagecache radix-tree, though. Coincidence maybe?
>
> I left it running over lunch and everything passed. I also left burnP6
> running later and nothing happened. The hardware looks OK.
>
> I've just got another oops, with the same kernel.
>
> Unable to handle kernel paging request at ffff83006d922370 RIP:
> [<ffffffff8027a79b>] shrink_page_list+0x16f/0x570

In the end, it was a motherboard issue.

Thanks for the help!
