2013-08-26 08:36:38

by Fabio Coatti

Subject: 3.10.9 Crash

Hi all,
I'm remotely managing a server running 3.10.9, and this morning I found it locked up.
The only data I could gather is a screenshot (through the AMT interface).
I ran it through gocr and this is the result (some characters got garbled, but the
bulk is OK).
The machine is pretty new and stable; nothing like this happened with previous
3.10.x releases.

Attached you can find .config file.
If you need more information just let me know (just CC: me as I'm not
subscribed ATM)


uname -a:
Linux calvin 3.10.9 #1 SMP PREEMPT Sat Aug 24 16:21:53 CEST 2013 x86_64
Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz GenuineIntel GNU/Linux

Screen dump: (at least what I've been able to recover and copy)
s lockd sunrpc fscache jfs hid_logitech usbhid xhci_hcd ohci_hcd uhci_hcd
usb_storage ehci_pci ehci_hcd usbcore usb_common
CPU: 3 PID: 2 Comm: kthreadd Tainted: G D 3.10.9 #1
Hardware name: System manufacturer System Product Name/P8Q77-M2, BIOS 0701
02/04/2013
task: ffff88040c9006f0 ti: ffff88040c914000 task.ti: ffff88040c914000
RIP: 0010:[<ffffffff810bd7dc>] [<ffffffff810bd7dc>] kmem_cache_alloc+0x10b/0x133
RSP: 0018: ffff88040c915d98 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88041ead4600 RCX: 00000000116c64fb
RDX: 00000000116c64f3 RSI: 0000000000000003 RDI: ffff88040c914000
RBP: ffff88040e403800 R08: 0000000000014600 R09: 656d22202c226b63
R10: 0000000000000040 R11: ffff88041ede8c00 R12: 00000000000000d0
R13: ffffffff81044f01 R14: 00000000116c64f3 R15: ffff88040c914000
FS: 0000000000000000(0000) GS:ffff88041eac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000626bc8 CR3: 000000000180c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
0000000000000000 ffff88040e405480 ffff88040c9006f0 0000000000800711
ffff88040c9006f0 ffff8801c5595340 0000000000000000 ffffffff81044f01
ffff88040e405480 ffff88040e405480 ffff8801c5595340 ffffffff81045350
Call Trace:
[<ffffffff81044f01>] ? prepare_creds+0x1d/0xa6
[<ffffffff81045350>] ? copy_creds+0x58/0xf0
[<ffffffff810260f9>] ? copy_process.part.31+0x21c/0x114b
[<ffffffff81040a36>] ? __kthread_parkme+0x5b/0x5b
[<ffffffff8104c07a>] ? arch_vtime_task_switch+0x56/0x69
[<ffffffff8102711c>] ? do_fork+0xc9/0x21a
[<ffffffff8149debf>] ? _raw_spin_unlock_irqrestore+0x1e/0x2a
[<ffffffff8102728c>] ? kernel_thread+0x1f/0x22
[<ffffffff81040ffc>] ? kthreadd+0xf6/0x13a
[<ffffffff81040f06>] ? kthread_create_on_cpu+0x55/0x55
[<ffffffff8149e7ec>] ? ret_from_fork+0x7c/0xb0
[<ffffffff81040f06>] ? kthread_create_on_cpu+0x55/0x55
Code: 03 41 81 e4 00 80 00 00 74 39 4d 85 c9 74 34 48 63 4d 1c 31 c0 4c 89 cf
f3 aa eb 27 48 63 45 20 49 8d 4e 08 4c 89 f2 4c 8b 45 00 <49> 8b 1c 01 4c 89
c8 65 49 0f c7 08 0f 94 c0 84 c0 75 bb e9 5_
RIP [<ffffffff810bd7dc>] kmem_cache_alloc+0x10b/0x133
RSP <ffff88040c915d98>
general protection fault: 0000 [#6] PREEMPT SMP
Modules linked in: nct6775 hwmon_vid bridge stp ipv6 llc fbcon bitblit
softcursor font snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device
usbkbd uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core usbmouse
videodev i915 cfbfillrect cfbimgblt cfbcopyarea intel_agp intel_gtt mperf
coretemp kvm_intel snd_hda_codec_realtek kvm snd_hda_intel snd_hda_codec
crc32_pclmul snd_hwdep drm_kms_helper
crc32c_intel i2c_i801 snd_pcm ghash_clmulni_intel iTCO_wdt snd_page_alloc
snd_timer snd iTCO_vendor_support soundcore xts aesni_intel lrw gf128mul
glue_helper ablk_helper cryptd aes_x86_64 fuse 1
_ lockd sunrpc fscache jfs hid_logitech usbhid xhci_hcd ohci_hcd uhci_hcd
usb_storage ehci_pci ehci_hcd usbcore usb_common
CPU: 3 PID: 2645 Comm: udevd Tainted: G D 3.10.9 #1
Hardware name: System manufacturer System Product Name/P8Q77-M2, BIOS 0701
02/04/2013
task: ffff88040866e120 ti: ffff88040822e000 task.ti: ffff88040822e000
RIP: 0010:[<ffffffff810bd7dc>] [<ffffffff810bd7dc>] kmem_cache_alloc+0x10b/0x133
RSP: 0018: ffff88040822fde0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88041ead4600 RCX: 00000000116c64fb
RDX: 00000000116c64f3 RSI: 0000000000000003 RDI: ffff88040822e000
RBP: ffff88040e403800 R08: 0000000000014600 R09: 656d22202c226b63
R10: 0000000000000040 R11: ffff88041ede8c00 R12: 00000000000000d0
R13: ffffffff81044f01 R14: 00000000116c64f3 R15: ffff88040822e000
FS: 00007f3531ecf780(0000) GS:ffff88041eac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3531ee0000 CR3: 00000004088e4000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
00007f3531ecfa50 ffff8804099cac00 ffff88040866e120 0000000001200011
ffff88040866e120 ffff8801c5593090 00007f3531ecfa50 ffffffff81044f01
ffff8804099cac00 ffff8804099cac00 ffff8801c5593090 ffffffff81045350
Call Trace:
[<ffffffff81044f01>] ? prepare_creds+0x1d/0xa6
[<ffffffff81045350>] ? copy_creds+0x58/0xf0
[<ffffffff810260f9>] ? copy_process.part.31+0x21c/0x114b
[<ffffffff814083d7>] ? sock_setsockopt+0x5a9/0x5de
[<ffffffff8102711c>] ? do_fork+0xc9/0x21a
[<ffffffff81053428>] ? cpuacct_account_field+0x11/0x51
[<ffffffff8104bf10>] ? vtime_account_user+0x42/0x4f
[<ffffffff8149dc68>] ? _raw_spin_unlock+0x1c/0x28
[<ffffffff8149eb39>] ? stub_clone+0x69/0x90
[<ffffffff8149ea11>] ? tracesys+0xd4/0xd9
Code: 03 41 81 e4 00 80 00 00 74 39 4d 85 c9 74 34 48 63 4d 1c 31 c0 4c 89 cf
f3 aa eb 27 48 63 45 20 49 8d 4e 08 4c 89 f2 4c 8b 45 00 <49> 8b 1c 01 4c 89
c8 65 49 0f c7 08 0f 94 c0 84 c0 75 bb e9 5l
RIP [<ffffffff810bd7dc>] kmem_cache_alloc+0x10b/0x133
RSP <ffff88040822fde0>


--
Fabio


Attachments:
config (86.80 kB)
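
A note on reading the dump above: both oopses show the same value in R09,
656d22202c226b63, and the bytes marked <49> 8b 1c 01 in the Code: line decode
to a load from (%r9,%rax,1). The Python sketch below is only an illustration
(the helper names are made up here and are not part of any kernel tool). It
prints the bytes of that register as they would sit in little-endian memory,
and checks whether the value is a canonical address, assuming the usual 48-bit
virtual addressing on x86_64.

```python
def register_as_text(hex_value: str) -> str:
    """Render a 64-bit register value as the ASCII bytes it occupies in memory."""
    # x86_64 is little-endian, so the lowest byte of the register comes first.
    raw = int(hex_value, 16).to_bytes(8, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def is_canonical(hex_value: str) -> bool:
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = int(hex_value, 16) >> 47  # keeps the 17 topmost bits
    return top == 0 or top == 0x1FFFF


r09 = "656d22202c226b63"              # R09 from both register dumps
print(register_as_text(r09))          # ck", "me
print(is_canonical(r09))              # False
print(is_canonical("ffff88041ead4600"))  # True (RBX, a normal kernel pointer)
```

A non-canonical address raises a general protection fault when dereferenced,
which matches the fault type reported. That R09 reads as printable text
suggests, but does not prove, that a pointer inside the allocator was
overwritten with string data before kmem_cache_alloc used it.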

2013-08-26 10:57:15

by Greg Kroah-Hartman

Subject: Re: 3.10.9 Crash

On Mon, Aug 26, 2013 at 10:36:30AM +0200, Fabio Coatti wrote:
> Hi all,
> I'm remotely managing a server running 3.10.9, and this morning I found it locked up.
> The only data I could gather is a screenshot (through the AMT interface).
> I ran it through gocr and this is the result (some characters got garbled, but the
> bulk is OK).
> The machine is pretty new and stable; nothing like this happened with previous
> 3.10.x releases.

Can you reproduce this easily? If so, can you use 'git bisect' to track
down the offending patch?

thanks,

greg k-h

2013-08-26 11:15:47

by Fabio Coatti

Subject: Re: 3.10.9 Crash

On Monday, 26 August 2013 03:59:30, Greg Kroah-Hartman wrote:
> On Mon, Aug 26, 2013 at 10:36:30AM +0200, Fabio Coatti wrote:
> > Hi all,
> > I'm remotely managing a server running 3.10.9, and this morning I found it locked up.
> > The only data I could gather is a screenshot (through the AMT interface).
> > I ran it through gocr and this is the result (some characters got garbled, but
> > the bulk is OK).
> > The machine is pretty new and stable; nothing like this happened with
> > previous 3.10.x releases.
>
> Can you reproduce this easily? If so, can you use 'git bisect' to track
> down the offending patch?


I wish I could :). I thought about it, but looking at the logs I found no clues
about how to reproduce the issue. The server is mostly idle; the only thing that
I can imagine happening at around the time of the crash is a log rotation. I
ran it again with no problems, but of course things can be different.
Right now I'm using that server with no problems (emerging packages and so
on), but of course that doesn't prove anything.
The only slightly unusual thing on that system (if it can be considered as such) is
a raid+lvm+btrfs stack (it's not an important production server).
--
Fabio