2011-04-29 22:10:21

by werner

Subject: 2.6.39-rc5-git2 boot crashs



Pid: 5635, comm: mount Tainted: G C 2.6.39-rc5-git2 #1 System manufacturer System Product Name/M2N8-VMX
EIP: 0060:[<c12d01fb>] EFLAGS: 00010246 CPU: 0
EIP is at logfs_drop_inode+0x3c/0x68
EAX: 00000000 EBX: f4db8000 ECX: f4db81f4 EDX: f4db81f4
ESI: f521c000 EDI: f5232c00 EBP: f5199e70 ESP: f5199e68
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process mount (pid: 5635, ti=f5198000 task=f523ae50 task.ti=f5198000)
Stack:
 c1f2344c f4db8000 f5199e84 c10ea544 ffffffea f5232c00 f68ac1c0 f5199ec0
 c12d77cd 00000000 00000000 c10ced5c 00000000 f521c000 00000400 f521c000
 f68a4b40 00000040 000000d0 00000000 f5106cb0 f5106cb0 f5199ef8 c10d9b11
Call Trace:
 [<c10ea544>] iput+0x5c/0x119
 [<c12d77cd>] logfs_mount+0x44f/0x5cc
 [<c10ced5c>] ? __kmalloc_track_caller+0x9b/0x157
 [<c10d9b11>] mount_fs+0x68/0x13e
 [<c10b1ce3>] ? kstrdup+0x30/0x41
 [<c10ee6c3>] vfs_kern_mount+0x53/0x7f
 [<c10ee747>] do_kern_mount+0x3c/0xbb
 [<c10eede8>] do_mount+0x622/0x66f
 [<c10ed9ca>] ? copy_mount_options+0xe/0xe7
 [<c10b1c15>] ? memdup_user+0x34/0x4b
 [<c10b1c5d>] ? strndup_user+0x31/0x42
 [<c10eeea2>] sys_mount+0x6d/0x9b
 [<c1eba70c>] syscall_call+0x7/0xb
Code: 8c 01 00 00 b8 30 4e 79 c2 e8 41 a1 be 00 8d 8b f4 01 00 00 8b 93 f4 01 00 00 8b 83 f8 01 00 00 89 42 04 89 10 8b 86 54 02 00 00
 48 04 89 83 f4 01 00 00 8d 86 54 02 00 00 89 83 f8 01 00 00
EIP: [<c12d01fb>] logfs_drop_inode+0x3c/0x68 SS:ESP 0068:f5199e68
CR2: 0000000000000004
---[ end trace cd59ca17c20fba5d ]---
---
Professional hosting for everyone - http://www.host.ru


2011-04-30 02:32:43

by Linus Torvalds

Subject: Fwd: 2.6.39-rc5-git2 boot crashs

I dunno if you guys saw this. Any ideas?

Dave Chinner and Al Viro on the recipients because they were working
on iput_final etc locking changes. And logfs people for obvious
reasons.

The Code: line is buggered and seems to be missing one instruction
byte, and I think it's because the user used a web interface, and the
"<>" around the byte messed things up. But the code around it decodes
to:

   0:   b8 30 4e 79 c2          mov    $0xc2794e30,%eax   (probably logfs_inode_lock address)
   5:   e8 41 a1 be 00          call   xxx                (probably _raw_spin_lock)
   a:   8d 8b f4 01 00 00       lea    0x1f4(%ebx),%ecx   (li->li_freeing_list address)
  10:   8b 93 f4 01 00 00       mov    0x1f4(%ebx),%edx   (li->li_freeing_list.next)
  16:   8b 83 f8 01 00 00       mov    0x1f8(%ebx),%eax   (li->li_freeing_list.prev)
  1c:   89 42 04                mov    %eax,0x4(%edx)     (next->prev = prev)
  1f:   89 10                   mov    %edx,(%eax)        (prev->next = next)
  ... something messed up ..
  29:   89 83 f4 01 00 00       mov    %eax,0x1f4(%ebx)
  2f:   8d 86 54 02 00 00       lea    0x254(%esi),%eax
  35:   89 83 f8 01 00 00       mov    %eax,0x1f8(%ebx)

and that's basically the code that does:

list_move(&li->li_freeing_list, &super->s_freeing_list);

and the removal from the old list has succeeded, but adding to the
super->s_freeing_list is failing.

It looks like a NULL pointer dereference with offset 4, so at a guess,
super->s_freeing_list.next is NULL, and it's the "next->prev = entry"
instruction that faults when inserting into that list.

How/why would s_freeing_list be NULL? I have no idea. But it looks
like a failed mount, so presumably it was never initialized.

Linus
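
To make the failure mode described above concrete, here is a minimal
userspace sketch in C of the list operations involved. It is a simplified
model written for illustration, not the kernel's list.h; the helper names
list_del_entry() and first_store_address() are assumptions made here. With
a list head that was never initialized (all zeroes), the first store in
list_add() goes to next->prev with next == NULL, that is, to address
offsetof(struct list_head, prev). With 32-bit pointers that is 4, which
would be consistent with the CR2 value in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of a kernel-style doubly linked list (simplified). */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized, empty list points at itself in both directions. */
static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Unlink 'entry' from whatever list it is currently on. */
static void list_del_entry(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Insert 'entry' right after 'head'. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = entry;	/* first store: faults if head->next is NULL */
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Remove 'entry' from its old list and add it after 'head'. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del_entry(entry);
	list_add(entry, head);
}

/* Address that the first store in list_add() would write to for 'head'. */
static uintptr_t first_store_address(const struct list_head *head)
{
	return (uintptr_t)head->next + offsetof(struct list_head, prev);
}
```

Calling list_move() with a zeroed head would crash, which is why the sketch
only computes the target address of the first store instead of performing it.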

---------- Forwarded message ----------
From: werner <[email protected]>
Date: Fri, Apr 29, 2011 at 3:10 PM
Subject: 2.6.39-rc5-git2 boot crashs
To: [email protected]




Pid: 5635, comm: mount Tainted: G C 2.6.39-rc5-git2 #1 System manufacturer System Product Name/M2N8-VMX
EIP: 0060:[<c12d01fb>] EFLAGS: 00010246 CPU: 0
EIP is at logfs_drop_inode+0x3c/0x68
EAX: 00000000 EBX: f4db8000 ECX: f4db81f4 EDX: f4db81f4
ESI: f521c000 EDI: f5232c00 EBP: f5199e70 ESP: f5199e68
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process mount (pid: 5635, ti=f5198000 task=f523ae50 task.ti=f5198000)
Stack:
 c1f2344c f4db8000 f5199e84 c10ea544 ffffffea f5232c00 f68ac1c0 f5199ec0
 c12d77cd 00000000 00000000 c10ced5c 00000000 f521c000 00000400 f521c000
 f68a4b40 00000040 000000d0 00000000 f5106cb0 f5106cb0 f5199ef8 c10d9b11
Call Trace:
 [<c10ea544>] iput+0x5c/0x119
 [<c12d77cd>] logfs_mount+0x44f/0x5cc
 [<c10ced5c>] ? __kmalloc_track_caller+0x9b/0x157
 [<c10d9b11>] mount_fs+0x68/0x13e
 [<c10b1ce3>] ? kstrdup+0x30/0x41
 [<c10ee6c3>] vfs_kern_mount+0x53/0x7f
 [<c10ee747>] do_kern_mount+0x3c/0xbb
 [<c10eede8>] do_mount+0x622/0x66f
 [<c10ed9ca>] ? copy_mount_options+0xe/0xe7
 [<c10b1c15>] ? memdup_user+0x34/0x4b
 [<c10b1c5d>] ? strndup_user+0x31/0x42
 [<c10eeea2>] sys_mount+0x6d/0x9b
 [<c1eba70c>] syscall_call+0x7/0xb
Code: 8c 01 00 00 b8 30 4e 79 c2 e8 41 a1 be 00 8d 8b f4 01 00 00 8b
93 f4 01 00 00 8b 83 f8 01 00 00 89 42 04 89 10 8b 86 54 02 00 00
 48 04 89 83 f4 01 00 00 8d 86 54 02 00 00 89 83 f8 01 00 00
EIP: [<c12d01fb>] logfs_drop_inode+0x3c/0x68 SS:ESP 0068:f5199e68
CR2: 0000000000000004
---[ end trace cd59ca17c20fba5d ]---

2011-04-30 02:47:38

by Linus Torvalds

Subject: Re: 2.6.39-rc5-git2 boot crashs

On Fri, Apr 29, 2011 at 7:31 PM, Linus Torvalds
<[email protected]> wrote:
>
> It looks like a NULL pointer dereference with offset 4, so at a guess,
> super->s_freeing_list.next is NULL, and it's the "next->prev = entry"
> instruction that faults when inserting into that list.
>
> How/why would s_freeing_list be NULL? I have no idea. But it looks
> like a failed mount, so presumably it was never initialized.

Hmm. super->s_freeing_list is initialized pretty late in
logfs_read_sb(), and any error path _before_ that point will result in
a "goto err1" in logfs_get_sb_device() which will do various iputs
etc. All without that list initialized. That would seem to be the
cause of this, possibly triggered by Al's changes to ->mount from
read_super.

Somebody who knows the code better than me (ie any reasonably
well-educated squirrel) should take another look, though.

Werner, if this is easily repeatable for you, could you test just
moving up the lines that initialize the superblock mutexes and the
s_freeing_list to the top of logfs_read_sb() rather than the end (ie
move the three lines that do

mutex_init(&super->s_dirop_mutex);
mutex_init(&super->s_object_alias_mutex);
INIT_LIST_HEAD(&super->s_freeing_list);

to be before the call to mempool_create())? That way those things will
be initialized much earlier, which is definitely what we want.

Whether that's the only problem and actually fixes it, I won't even
begin to guess, though.

Linus
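
The reordering suggested above can be sketched as a small userspace model
in C. All names here (model_super, model_inode, model_read_sb(),
model_drop_inode(), model_cleanup()) are hypothetical stand-ins and not the
logfs code; the drop step is simplified to a plain list_add(). The point is
only the ordering: if the list head is set up before any step that can fail,
an error path that drops an inode onto that list finds a valid empty list
rather than NULL pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'entry' right after 'head'; requires an initialized head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Hypothetical stand-ins for the inode and superblock private data. */
struct model_inode {
	struct list_head freeing_list;
};

struct model_super {
	struct list_head s_freeing_list;
	struct model_inode *s_mapping_inode;
};

/* Models the drop step: queue the inode on the superblock's freeing list. */
static void model_drop_inode(struct model_super *super,
			     struct model_inode *inode)
{
	list_add(&inode->freeing_list, &super->s_freeing_list);
}

/* Models the error path: release whatever was set up so far. */
static void model_cleanup(struct model_super *super)
{
	if (super->s_mapping_inode)
		model_drop_inode(super, super->s_mapping_inode);
}

/* Models the read-superblock step with the list initialized first. */
static int model_read_sb(struct model_super *super,
			 struct model_inode *mapping, bool fail_later_step)
{
	/* Proposed ordering: set up the list before anything can fail. */
	INIT_LIST_HEAD(&super->s_freeing_list);

	super->s_mapping_inode = mapping;	/* an inode now exists */
	if (fail_later_step)
		return -1;	/* caller is expected to run model_cleanup() */
	return 0;
}
```

With the INIT_LIST_HEAD() call moved below the failing step instead, the same
cleanup would write through a NULL next pointer.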

2011-04-30 02:55:56

by Al Viro

Subject: Re: 2.6.39-rc5-git2 boot crashs

On Fri, Apr 29, 2011 at 07:47:14PM -0700, Linus Torvalds wrote:
> On Fri, Apr 29, 2011 at 7:31 PM, Linus Torvalds
> <[email protected]> wrote:
> >
> > It looks like a NULL pointer dereference with offset 4, so at a guess,
> > super->s_freeing_list.next is NULL, and it's the "next->prev = entry"
> > instruction that faults when inserting into that list.
> >
> > How/why would s_freeing_list be NULL? I have no idea. But it looks
> > like a failed mount, so presumably it was never initialized.
>
> Hmm. super->s_freeing_list is initialized pretty late in
> logfs_read_sb(), and any error path _before_ that point will result in
> a "goto err1" in logfs_get_sb_device() which will do various iputs
> etc. All without that list initialized. That would seem to be the
> cause of this, possibly triggered by Al's changes to ->mount from
> read_super.

Then it ought to be reproducible with much earlier kernels. Say, 2.6.37 or
so... That part of ->mount() series went in during last Autumn...

2011-04-30 03:02:50

by Al Viro

Subject: Re: 2.6.39-rc5-git2 boot crashs

On Sat, Apr 30, 2011 at 03:55:45AM +0100, Al Viro wrote:

> > Hmm. super->s_freeing_list is initialized pretty late in
> > logfs_read_sb(), and any error path _before_ that point will result in
> > a "goto err1" in logfs_get_sb_device() which will do various iputs
> > etc. All without that list initialized. That would seem to be the
> > cause of this, possibly triggered by Al's changes to ->mount from
> > read_super.

Wait a bit; _can_ we get there with non-NULL ->s_master_inode et.al.?
iput(NULL) is a noop... I don't think so, since logfs_init_journal()
is not called until after we initialize that list.

Not that I'd object against taking that initialization earlier, of course,
but there seems to be something else going on... Which iput() it is?
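
The argument above rests on iput(NULL) doing nothing, so an error path that
puts every special inode unconditionally only reaches the drop code for
inodes that were actually set up. A minimal userspace sketch in C of that
behaviour follows; model_inode, model_iput(), model_drop_inode() and
model_put_all() are hypothetical names used for illustration, not the VFS
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode: a reference count and a flag. */
struct model_inode {
	int refcount;
	int dropped;
};

/* Stand-in for the filesystem's drop hook; just records the call. */
static void model_drop_inode(struct model_inode *inode)
{
	inode->dropped = 1;
}

/* Model of iput(): NULL is ignored, the last reference triggers the drop. */
static void model_iput(struct model_inode *inode)
{
	if (!inode)
		return;
	if (--inode->refcount == 0)
		model_drop_inode(inode);
}

/* Model of an error path that puts every slot, set up or not.
 * Returns how many inodes ended up in the drop hook. */
static int model_put_all(struct model_inode **inodes, size_t n)
{
	int drops = 0;

	for (size_t i = 0; i < n; i++) {
		model_iput(inodes[i]);
		if (inodes[i] && inodes[i]->dropped)
			drops++;
	}
	return drops;
}
```

If every slot is still NULL the drop hook is never reached, so an
uninitialized list could only matter once at least one inode pointer has
been set.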

2011-04-30 03:09:40

by Linus Torvalds

Subject: Re: 2.6.39-rc5-git2 boot crashs

On Fri, Apr 29, 2011 at 8:02 PM, Al Viro <[email protected]> wrote:
>
> Wait a bit; _can_ we get there with non-NULL ->s_master_inode et.al.?
> iput(NULL) is a noop... I don't think so, since logfs_init_journal()
> is not called until after we initialize that list.
>
> Not that I'd object against taking that initialization earlier, of course,
> but there seems to be something else going on... Which iput() it is?

Not something I can guess from the oops, sadly. Gcc has inlined
everything into logfs_mount, and the "0x44f/0x5cc" offset isn't very
helpful (with the same compiler version and config options it would be
possible to figure it out).

But looking at it, logfs_init_mapping() is currently called before
"s_freeing_list" is initialized, and it sets up at least
s_mapping_inode. So if anything fails between that point and the point
where we initialize s_freeing_list, I think we're toast.

I didn't check the other inodes, but at least that one does seem to be
potentially non-NULL. No?

Linus

2011-04-30 03:28:07

by Al Viro

Subject: Re: 2.6.39-rc5-git2 boot crashs

On Fri, Apr 29, 2011 at 08:09:16PM -0700, Linus Torvalds wrote:
> On Fri, Apr 29, 2011 at 8:02 PM, Al Viro <[email protected]> wrote:
> >
> > Wait a bit; _can_ we get there with non-NULL ->s_master_inode et.al.?
> > iput(NULL) is a noop... I don't think so, since logfs_init_journal()
> > is not called until after we initialize that list.
> >
> > Not that I'd object against taking that initialization earlier, of course,
> > but there seems to be something else going on... Which iput() it is?
>
> Not something I can guess from the oops, sadly. Gcc has inlined
> everything into logfs_mount, and the "0x44f/0x5cc" offset isn't very
> helpful (with the same compiler version and config options it would be
> possible to figure it out).
>
> But looking at it, logfs_init_mapping() is currently called before
> "s_freeing_list" is initialized, and it sets up at least
> s_mapping_inode. So if anything fails between that point and the point
> where we initialize s_freeing_list, I think we're toast.
>
> I didn't check the other inodes, but at least that one does seem to be
> potentially non-NULL. No?

Ho-hum... Point. Let's take that initialization up to the beginning of
logfs_read_sb(), see if oops goes away and then try to figure out WTF
we hadn't been seeing it all along. I don't see anything recent affecting
that area, but then logfs goes through many odd places during mount (including,
IIRC, mtd). So there might be many sources of the failure where we used to
have none and failure in that spot would, indeed, fuck the things up that
way...

In any case, taking that initialization to the beginning of logfs_read_sb()
(if not up to its only caller where we set ->s_op et.al. anyway) seems to be
the obviously right thing to do. Unless logfs folks have some subtle
objections?

2011-04-30 03:39:09

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

In my earlier report thread about the 2.6.39-rc3/-rc4 crashes, I
mentioned that the system stays in a changed state that survives a
reset after a crash. After a 'primary' crash (usually near the end of
booting), subsequent boots hit an early 'secondary' crash while ata0
is being initialized, with odd effects such as the graphics card (or
anything else) being identified as an ata device, followed by 'read
errors' on it and a crash. This 'secondary' effect repeated again and
again and went away only after booting a normal kernel (2.6.38.4 or
2.6.26.2). But if I then booted 2.6.39-rc3 or -rc4 again, it crashed
at the end of the boot, and on subsequent boots the reset-resistant
effect returned: it crashed again and again with ata0 problems, until
I rebooted with 2.6.38.4 or 2.6.26.2, or waited 5 minutes (perhaps
until the memory discharged).

None of these problems happen with 2.6.38.4 or 2.6.26.2.

Werner Landgraf



2011-04-30 04:00:34

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

Enclosed are the config and the compilation log (with some
warnings, which should also be fixed). Perhaps they are
useful for finding the error.

Just now my computer crashed again when I unzipped a big
file. As I said in my reports on 2.6.39-rc1 through -rc4,
the computer always crashes when I zip, unzip or move big
files. This looks like an error in the memory / paging
driver.

Regarding my last message, it should be ata1 rather than
ata0. (See the enclosed photo of such reboot-resistant
secondary crashes, which happened after a big primary
crash, where 'anything' was interpreted as ata.)

wl



Attachments:
config-2.6.39-rc5-git2-i486-1sys.bz2 (31.77 kB)
linux-2.6.39-rc5-git2-i486-1sys.log.bz2 (73.18 kB)
Foto0049.jpg (180.42 kB)

2011-04-30 04:01:18

by Linus Torvalds

Subject: Re: 2.6.39-rc5-git2 boot crashs

On Fri, Apr 29, 2011 at 8:39 PM, werner <[email protected]> wrote:
>
> In my earlier report thread about the 2.6.39-rc3/-rc4 crashes, I mentioned
> that the system stays in a changed state that survives a reset after a
> crash. After a 'primary' crash (usually near the end of booting),
> subsequent boots hit an early 'secondary' crash while ata0 is being
> initialized, with odd effects such as the graphics card (or anything else)
> being identified as an ata device, followed by 'read errors' on it and a
> crash. This 'secondary' effect repeated again and again and went away only
> after booting a normal kernel (2.6.38.4 or 2.6.26.2). But if I then booted
> 2.6.39-rc3 or -rc4 again, it crashed at the end of the boot, and on
> subsequent boots the reset-resistant effect returned: it crashed again and
> again with ata0 problems, until I rebooted with 2.6.38.4 or 2.6.26.2, or
> waited 5 minutes (perhaps until the memory discharged).
>
> None of these problems happen with 2.6.38.4 or 2.6.26.2.

Do you think you could bisect when that odd after-reset behavior started?

It does sound like you have some PCI-level problem (some device that
has "sticky" state and doesn't get reset properly). Most likely a
hardware "feature" (there is various PCI hardware that allows things
like device identifiers to be written to), coupled with a firmware bug
that doesn't reset things.

But it would be intriguing to hear when it started happening, so that
we can figure out exactly _what_ isn't getting properly reset..

The logfs oops may just be a result of "autodetect any random
filesystem" in that confused state. So when the state isn't confused,
you'd not see the oops, because nothing ever tries to mount the
invalid logfs image.

Linus

2011-04-30 04:13:33

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

The problem that the computer crashes when I zip, unzip or
move a big file started with -rc1.

The problem with the secondary reset-resistant crashes
after a primary end-of-boot or after-boot crash started
with -rc3 or -rc4; at least that is when I noticed it, but
it is possible that it also occurred before and I just
didn't notice it.

Now I see that the syslog doesn't contain everything
either (this was also better with 2.6.38.4). For example,
the crashes that currently occur during rtc0
initialization are not logged (in addition to the other
crashes recorded in the log files). Also, as you can see
from my second-to-last message, with the screen photo of
/dev/pts/0 regarding boot_vga, this doesn't appear either
in the log file that was attached to the same message.

For an -rc5, this kernel is rather bad. I hope you
manage to fix it by -rc10.

W.Landgraf






2011-04-30 04:20:52

by Linus Torvalds

Subject: Re: 2.6.39-rc5-git2 boot crashs

2011/4/29 werner <[email protected]>:
> The problem that the computer crashes when I zip, unzip or move a big file
> started with -rc1.

Please do try to bisect..

> For an -rc5, this kernel is rather bad. I hope you manage to fix it by
> -rc10.

Quite frankly, right now I think you're the only one that has reported
these kinds of problems, so we will need to rely on you to figure out
what is so odd about your setup.

If you can pinpoint when the crashes happened, that would help a lot.

Also, the crash that happens not-at-boot is in many ways way more
important. Clearly you have something odd going on after a reboot, and
I'd love to figure that out too, but in many ways the unzip one is way
more important.

Sadly, your previous mail had a nice picture, but the important
information had scrolled off the screen because of the ata1 exception
and command failed printouts. Any chance you could get a picture of
just the oops? If worst comes to worst, you might even have to disable
those ata debug printouts (normally they are really important, but if
they make the earlier oops scroll off the screen, they hurt more than
they help).

Oh, and please do an lspci -vvxx and a working dmesg too for that machine.

Linus

2011-04-30 04:29:25

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

For your information:
/ None of these problems occur with 2.6.38, .1, .2, .3, .4.
/ With 2.6.38-rc1 (or was it 2.6.37-rc1?) there was a
problem with khugepaged, which I also reported. That was
similar to the current crashes that are NOT visible in
syslog and happen a few minutes to hours after booting
(these crashes should be visible in syslog 10..30 seconds
before I reboot). At that time, the problem was fixed in
khugepaged. But it is possible that the fix at that time
wasn't good enough, so that now, after some other changes,
the same problem has 'returned'. Currently, since
2.6.39-rc1, it ALWAYS crashes if I zip or unzip files of a
few hundred MB or move big files. It is possible that it
is a problem with memory management.
/ I use gcc 4.3.3 and glibc 2.9.
I hope this information is useful for fixing these
problems.
wl


2011-04-30 05:02:50

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

Pls see enclosed:

/ lspci -vvxx
/ dmesg

with 2.6.39-rc5-git2.


One minute later, it crashed again. I'll send the screen
photo in another e-mail in a few seconds, because I can
only attach 3 files here. Afterwards, I rebooted with
2.6.38.4, which is stable and has never crashed. For
comparison,

/ dmesg

with 2.6.38.4 (from just now) is attached as well.

I don't know whether the syslog recorded what happened
before the ata1 crash; I'll look for it.


wl

=========================================================================
On Fri, 29 Apr 2011 21:46:18 -0700
Linus Torvalds <[email protected]> wrote:
> 2011/4/29 werner <[email protected]>:
>> Not only now; for a long time (since approx. -rc1) I have been
>> reporting that the computer crashes when zipping / unzipping big files
>
> I understood.
>
>> The photo in my last mail is complete; after the ata1 error messages
>> there is nothing else, the computer just crashed.
>
> It's the part that happened *before* that is interesting. The stuff
> that scrolled off _because_ of the ata1 error messages.
>
> Linus

"werner" <[email protected]>


Attachments:
lspci.bz2 (3.74 kB)
dmesg.bz2 (19.22 kB)
dmesg.2.6.38.4.bz2 (17.88 kB)

2011-04-30 05:26:48

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

Enclosed is the part of the syslog from the time when I took the
screen photo of the ata1 problem. Unfortunately, neither my
computer's clock nor my mobile phone's clock is exact, and the
photo's file time may be from when I copied it to the computer, so
I don't know exactly which boot the photo belongs to. It is also
quite possible that these ata1 crashes are not logged in syslog
either, just as some crashes at rtc0 currently aren't logged. MORE
UNFORTUNATELY, THE LOG FILE DOESN'T SHOW THE KERNEL VERSION AT BOOT
TIME EITHER (YOU SHOULD IMPROVE THIS !!!), SO THE BOOTS BELONG
SOMETIMES TO 2.6.38.3 or .4, SOMETIMES TO 2.6.39-RC3 or RC4.
However, I hope the log file is useful for something.
wl





Attachments:
syslog.txt.bz2 (11.71 kB)

2011-04-30 17:10:21

by Linus Torvalds

Subject: Re: 2.6.39-rc5-git2 boot crashs

2011/4/29 werner <[email protected]>:
> Pls see enclosed:
>
> / lspci -vvxx
> / dmesg

Ok.

So what strikes me is that it looks like you're basically booting an
"allyesconfig" kernel, or at least something that has a _ton_ of crazy
drivers and filesystems that are entirely irrelevant for your setup.

And I'm wondering whether your problems are due to some buggy driver
that stomps on something that it shouldn't. It's clearly a regression
(your 2.6.38.4 dmesg shows the same "lots of irrelevant drivers and
filesystems" issue, but works for you), but it may explain why others
aren't seeing the problem. Your 2.6.39-rc5 dmesg does have a few new
drivers in it, and that seems to be because they simply didn't exist
back in 2.6.38 (but I didn't check).

So your lspci shows an AMD system with an nvidia chipset:

00:00.0 RAM memory: nVidia Corporation MCP61 Memory Controller (rev a1)
00:01.0 ISA bridge: nVidia Corporation MCP61 LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation MCP61 SMBus (rev a2)
00:01.2 RAM memory: nVidia Corporation MCP61 Memory Controller (rev a2)
00:02.0 USB Controller: nVidia Corporation MCP61 USB Controller (rev a2) (prog-if 10 [OHCI])
00:02.1 USB Controller: nVidia Corporation MCP61 USB Controller (rev a2) (prog-if 20 [EHCI])
00:04.0 PCI bridge: nVidia Corporation MCP61 PCI bridge (rev a1) (prog-if 01 [Subtractive decode])
00:05.0 Audio device: nVidia Corporation MCP61 High Definition Audio (rev a2)
00:06.0 IDE interface: nVidia Corporation MCP61 IDE (rev a2) (prog-if 8a [Master SecP PriP])
00:08.0 IDE interface: nVidia Corporation MCP61 SATA Controller (rev a2) (prog-if 85 [Master SecO PriO])
00:09.0 PCI bridge: nVidia Corporation MCP61 PCI Express bridge (rev a2) (prog-if 00 [Normal decode])
00:0b.0 PCI bridge: nVidia Corporation MCP61 PCI Express bridge (rev a2) (prog-if 00 [Normal decode])
00:0c.0 PCI bridge: nVidia Corporation MCP61 PCI Express bridge (rev a2) (prog-if 00 [Normal decode])
00:0d.0 VGA compatible controller: nVidia Corporation GeForce 6100 nForce 405 (rev a2) (prog-if 00 [VGA])
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:07.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
04:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit Ethernet Adapter (rev b0)

and in particular, your IDE/SATA controllers are clearly nVidia. But
the generic IDE driver seems to be a bit confused:

Uniform Multi-Platform E-IDE driver
amd74xx 0000:00:06.0: UDMA133 controller
amd74xx 0000:00:06.0: IDE controller (0x10de:0x03ec rev 0xa2)
amd74xx 0000:00:06.0: IDE port disabled
amd74xx 0000:00:06.0: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7
Probing IDE interface ide0...
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports

but that shouldn't matter, since it doesn't actually find anything there.

But could you try using a more reasonable config, and see if the
problem goes away? Don't configure logfs (you don't use it), don't
configure all the crazy random SCSI drivers, don't configure all the
laptop drivers etc (that cause various management drivers to be loaded
even though you don't even have the hardware afaik):

...
XGIfb: Options (null)
asus_wmi: Asus Management GUID not found
asus_wmi: Management GUID not found
asus_wmi: Management GUID not found
msi_laptop: driver 0.5 successfully loaded.
compal-laptop: Motherboard not recognized (You could try the module's force-parameter)
dell-wmi: No known WMI GUID found
dell_wmi_aio: No known WMI GUID found
acer_wmi: Acer Laptop ACPI-WMI Extras
acer_wmi: No or unsupported WMI interface, unable to load
acerhdf: Acer Aspire One Fan driver, v.0.5.24
acerhdf: unknown (unsupported) BIOS version System manufacturer/System Product Name/0413 , please report, aborting!
hp_accel: driver loaded
hdaps: supported laptop not found!
hdaps: driver init failed (ret=-19)!
fujitsu-laptop: driver 0.6.0 successfully loaded.
This machine doesn't have MSI-hotkeys through WMI
Topstar Laptop ACPI extras driver loaded
...

because if any of them corrupt memory or something like that, we
obviously want to find that bug, but we don't want to think it's some
bug in the drivers you actually _use_.

So if you could try a minimal config that supports only the hardware
(and filesystems) you actually _have_ and use (ie just disable IDE
entirely - you don't want it, you want the SATA_nv driver), that would
be great. Does that work better for you?

And if it does work better, then it would be really interesting to
start enabling things again, and see what causes the problem.

Ok?

Linus
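
The "start enabling things again" step above can be organised as a halving search rather than one option at a time. What follows is only a sketch under loud assumptions: the minimal config boots, exactly one option triggers the crash on its own, and `boots_ok` is a hypothetical stand-in for "build a kernel with these extra options, boot it, and report whether it survived". The option names are taken from this thread purely as examples; nothing here says which one is actually at fault.

```python
from typing import Callable, Sequence


def find_culprit(options: Sequence[str],
                 boots_ok: Callable[[Sequence[str]], bool]) -> str:
    """Return the one option whose presence makes boots_ok() fail.

    Assumes the minimal config boots and that exactly one entry in
    `options` breaks the boot by itself.
    """
    candidates = list(options)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Enable only the first half on top of the minimal config.
        if boots_ok(half):
            # First half is clean, so the culprit is in the second half.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0]


if __name__ == "__main__":
    suspects = ["CONFIG_LOGFS", "CONFIG_ACERHDF",
                "CONFIG_SENSORS_HDAPS", "CONFIG_BLK_DEV_HD"]
    # Fake boot test that arbitrarily blames the third entry.
    fake_boot = lambda enabled: suspects[2] not in enabled
    print(find_culprit(suspects, fake_boot))
```

If more than one option is involved, or the crash is intermittent, the halving no longer holds and each round would need repeating before trusting its result.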

2011-04-30 18:23:41

by Linus Torvalds

Subject: Re: 2.6.39-rc5-git2 boot crashs

2011/4/30 werner <[email protected]>:
>
> The reason I enable everything is that the kernel packages
> are built for a distro. Everything has to be enabled,
> because you never know what computer the users have.

I DO NOT CARE WHY YOU ENABLE EVERYTHING.

I want to know what makes it start to fail. We simply don't know why
you see this problem (and apparently your friend too), but one issue
may be a totally unrelated buggy driver.

In order to figure it out, you need to help us. And one thing that is
very odd and wrong about your setup is how you have compiled in
absolutely everything.

And yes, it's wrong, because the way to do distro kernels is to
compile in the sane and common stuff, and load the rest as modules as
required.

And that's exactly because

(a) some drivers cannot sanely auto-detect if they are needed
(sometimes they are just buggy, but often it's because the hardware
they drive is not sane and doesn't necessarily have any nice
enumeration model)

(b) bugs happen. They happen especially commonly with rare hardware,
since that by definition gets less testing (that rare hardware can
often be "high-end" hardware - you'd think they are higher quality,
but the reverse is usually true).

Probing absolutely everything at boot-time tends to just be more
dangerous. There's a reason why most distros ask something like "are
you using just standard devices, or specialized storage subsystems",
so that they don't need to worry about the rare and possibly buggy
cases quite as much.

Some drivers you enable tend to be more about embedded systems (ie the
whole MTD layer etc), and there's likely little reason to do that in a
standard distribution kernel at all.

> I've been doing essentially the same for almost 4 years. Sometimes
> there are problems during -rc1 to -rc4 or so, but in the end
> everything always works.

It's clearly a regression. Nobody disputes that. But you need to help
us find it. So just do a minimal kernel. Please. So that we can say
either "yes, it still shows up even when you only have the normal
drivers", or we can say "ok, it's one of the uncommon drivers that
screws things up".

And it's not just drivers. Please disable kernel features like
virtualization etc if you don't need them.

We cannot fix it if we cannot pinpoint what the problem is. We won't
ignore it if the problem goes away when you have disabled drivers - at
that point I'm going to ask you to try to enable the drivers and
features until it starts happening again, so that we know _what_
random thing is causing it.

And please stop arguing against people who are trying to help figure
out what is wrong, ok?

> AND: the crashes with the same kernel don't happen only on my computer,
> but also on the laptop of at least one friend; on that same laptop
> 2.6.38.4 runs normally.

With your "everything enabled" kernel, right?

If you don't want to minimize the configuration, you'll need to at
least bisect _exactly_ where the problem starts. There's about ten
thousand commits, spanning the whole kernel, in between 38 and 39-rc1.
We don't know what's wrong right now. And you are currently the only
one seeing it, along with your friend who presumably uses your kernel. So
there's something wrong that is triggered by something specific to
YOUR setup.

See what I'm trying to say here?

Linus
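
For scale, a rough estimate of what bisecting "about ten thousand commits" costs. This is a back-of-the-envelope sketch, assuming each round halves the remaining range and that every round means one kernel build plus one boot test; `bisect_steps` is a made-up helper name, and real history with merges can need a round more or less.

```python
import math


def bisect_steps(commit_count: int) -> int:
    """Approximate build-and-boot rounds needed to bisect commit_count commits."""
    return math.ceil(math.log2(commit_count))


if __name__ == "__main__":
    print(bisect_steps(10_000))  # 14
```

So the full range between 38 and 39-rc1 comes down to roughly fourteen builds, not thousands.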

2011-04-30 18:31:13

by werner

Subject: Re: 2.6.39-rc5-git2 boot crashs

For years I have always compiled the kernel with
everything enabled, because these kernel builds are for a
distro and should work on any computer. Normally things
become stable at -rc3 or -rc4, often already at -rc2.

Besides my computer, essentially the same problems also
happen on a friend's laptop.

Another big problem is the (after-boot) crashes when
zipping / unzipping big files. I can build packages only
with 2.6.38.4, not with 2.6.39-rcX.

Below is the diff between the configs of 2.6.38.4 and
2.6.39-rc5-git3. You could inspect it for anything that
might explain the reported problems. But I think at least
the problems that happen on my computer, and on my
friend's, are regressions, because we use none of the new
staging drivers.

wl

===============
3,4c3,4
< # Linux/i386 2.6.38.4 Kernel Configuration
< # Thu Apr 21 22:06:47 2011
---
> # Linux/i386 2.6.39-rc5-git2 Kernel Configuration
> # Fri Apr 29 15:48:26 2011
22c22
< # CONFIG_NEED_DMA_MAP_STATE is not set
---
> CONFIG_NEED_DMA_MAP_STATE=y
48a49
> CONFIG_HAVE_INTEL_TXT=y
51d51
< CONFIG_X86_TRAMPOLINE=y
64d63
< CONFIG_LOCK_KERNEL=y
85a85
> # CONFIG_FHANDLE is not set
97d96
< # CONFIG_GENERIC_HARDIRQS_NO_DEPRECATED is not set
99a99
> CONFIG_GENERIC_IRQ_SHOW=y
101,103c101
< # CONFIG_AUTO_IRQ_AFFINITY is not set
< # CONFIG_IRQ_PER_CPU is not set
< # CONFIG_HARDIRQS_SW_RESEND is not set
---
> CONFIG_IRQ_FORCED_THREADING=y
119c117,129
< # CONFIG_CGROUPS is not set
---
> CONFIG_CGROUPS=y
> # CONFIG_CGROUP_DEBUG is not set
> # CONFIG_CGROUP_NS is not set
> # CONFIG_CGROUP_FREEZER is not set
> # CONFIG_CGROUP_DEVICE is not set
> # CONFIG_CPUSETS is not set
> # CONFIG_CGROUP_CPUACCT is not set
> # CONFIG_RESOURCE_COUNTERS is not set
> # CONFIG_CGROUP_PERF is not set
> CONFIG_CGROUP_SCHED=y
> CONFIG_FAIR_GROUP_SCHED=y
> # CONFIG_RT_GROUP_SCHED is not set
> # CONFIG_BLK_CGROUP is not set
126c136
< # CONFIG_SCHED_AUTOGROUP is not set
---
> CONFIG_SCHED_AUTOGROUP=y
140d149
< CONFIG_EMBEDDED=y
157a167
> CONFIG_EMBEDDED=y
275c285
< CONFIG_X86_32_IRIS=m
---
> CONFIG_X86_32_IRIS=y
287d296
< CONFIG_X86_CPU=y
311c320
< # CONFIG_IOMMU_API is not set
---
> CONFIG_IOMMU_API=y
411,414d419
< CONFIG_PM=y
< # CONFIG_PM_DEBUG is not set
< CONFIG_PM_SLEEP_SMP=y
< CONFIG_PM_SLEEP=y
416a422
> CONFIG_HIBERNATE_CALLBACKS=y
418a425,426
> CONFIG_PM_SLEEP=y
> CONFIG_PM_SLEEP_SMP=y
420c428,429
< CONFIG_PM_OPS=y
---
> CONFIG_PM=y
> # CONFIG_PM_DEBUG is not set
448a458
> # CONFIG_ACPI_APEI_PCIEAER is not set
508c518
< # CONFIG_INTEL_IDLE is not set
---
> CONFIG_INTEL_IDLE=y
524a535,537
> CONFIG_DMAR=y
> CONFIG_DMAR_DEFAULT_ON=y
> CONFIG_DMAR_FLOPPY_WA=y
534c547
< # CONFIG_PCI_MSI is not set
---
> CONFIG_PCI_MSI=y
538a552
> CONFIG_PCI_LABEL=y
553,554d566
< CONFIG_OLPC_OPENFIRMWARE=y
< CONFIG_OLPC_OPENFIRMWARE_DT=y
586a599,607
> CONFIG_RAPIDIO=y
> CONFIG_RAPIDIO_DISC_TIMEOUT=30
> CONFIG_RAPIDIO_ENABLE_RX_TX_PORTS=y
> CONFIG_RAPIDIO_TSI57X=y
> CONFIG_RAPIDIO_CPS_XX=y
> CONFIG_RAPIDIO_TSI568=y
> CONFIG_RAPIDIO_CPS_GEN2=y
> CONFIG_RAPIDIO_TSI500=y
> CONFIG_RAPIDIO_DEBUG=y
616,618c637
< CONFIG_ASK_IP_FIB_HASH=y
< # CONFIG_IP_FIB_TRIE is not set
< CONFIG_IP_FIB_HASH=y
---
> CONFIG_IP_FIB_TRIE_STATS=y
621a641
> CONFIG_IP_ROUTE_CLASSID=y
626c646
< CONFIG_NET_IPIP=m
---
> CONFIG_NET_IPIP=y
640c660
< CONFIG_INET_TUNNEL=m
---
> CONFIG_INET_TUNNEL=y
645,646c665,666
< CONFIG_INET_DIAG=m
< CONFIG_INET_TCP_DIAG=m
---
> CONFIG_INET_DIAG=y
> CONFIG_INET_TCP_DIAG=y
664c684
< CONFIG_IPV6=m
---
> CONFIG_IPV6=y
705a726
> # CONFIG_NF_CONNTRACK_TIMESTAMP is not set
713a735
> CONFIG_NF_CONNTRACK_BROADCAST=m
714a737
> CONFIG_NF_CONNTRACK_SNMP=m
727a751
> # CONFIG_NETFILTER_XT_SET is not set
731a756
> CONFIG_NETFILTER_XT_TARGET_AUDIT=m
753a779
> CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
761a788
> CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
790a818,829
> CONFIG_IP_SET=m
> CONFIG_IP_SET_MAX=256
> CONFIG_IP_SET_BITMAP_IP=m
> CONFIG_IP_SET_BITMAP_IPMAC=m
> CONFIG_IP_SET_BITMAP_PORT=m
> CONFIG_IP_SET_HASH_IP=m
> CONFIG_IP_SET_HASH_IPPORT=m
> CONFIG_IP_SET_HASH_IPPORTIP=m
> CONFIG_IP_SET_HASH_IPPORTNET=m
> CONFIG_IP_SET_HASH_NET=m
> CONFIG_IP_SET_HASH_NETPORT=m
> CONFIG_IP_SET_LIST_SET=m
835d873
< # CONFIG_IP_NF_MATCH_ADDRTYPE is not set
929c967
< CONFIG_IP_SCTP=m
---
> CONFIG_IP_SCTP=y
941d978
< CONFIG_TIPC_NODES=255
945,946c982,983
< CONFIG_ATM=m
< CONFIG_ATM_CLIP=m
---
> CONFIG_ATM=y
> CONFIG_ATM_CLIP=y
948,949c985,986
< CONFIG_ATM_LANE=m
< CONFIG_ATM_MPOA=m
---
> CONFIG_ATM_LANE=y
> CONFIG_ATM_MPOA=y
992c1029
< CONFIG_WAN_ROUTER=m
---
> CONFIG_WAN_ROUTER=y
994d1030
< # CONFIG_PHONET_PIPECTRLR is not set
1007a1044
> # CONFIG_NET_SCH_SFB is not set
1014a1052,1053
> # CONFIG_NET_SCH_MQPRIO is not set
> # CONFIG_NET_SCH_CHOKE is not set
1024d1062
< CONFIG_NET_CLS_ROUTE=y
1031a1070
> # CONFIG_NET_CLS_CGROUP is not set
1050a1090
> CONFIG_RFS_ACCEL=y
1102a1143,1144
> CONFIG_CAN_C_CAN=m
> CONFIG_CAN_C_CAN_PLATFORM=m
1170,1171c1212,1213
< CONFIG_BT_L2CAP=m
< CONFIG_BT_SCO=m
---
> CONFIG_BT_L2CAP=y
> CONFIG_BT_SCO=y
1200a1243
> CONFIG_BT_WILINK=m
1239c1282
< CONFIG_RFKILL=m
---
> CONFIG_RFKILL=y
1267a1311
> CONFIG_ARCH_NO_SYSDEV_OPS=y
1273d1316
< # CONFIG_MTD_CONCAT is not set
1296c1339
< CONFIG_SM_FTL=m
---
> CONFIG_SM_FTL=y
1297a1341
> CONFIG_MTD_SWAP=y
1338a1383
> CONFIG_MTD_SC520CDP=y
1355a1401
> CONFIG_MTD_LATCH_ADDR=y
1392a1439,1440
> CONFIG_MTD_NAND_BCH=y
> CONFIG_MTD_NAND_ECC_BCH=y
1425,1428d1472
<
< #
< # UBI debugging options
< #
1444a1489
> CONFIG_OF_PCI=y
1521c1566
< # CONFIG_BLK_DEV_HD is not set
---
> CONFIG_BLK_DEV_HD=y
1522a1568
> CONFIG_SENSORS_LIS3LV02D=y
1540c1586
< CONFIG_ISL29020=m
---
> CONFIG_ISL29020=y
1572a1619
> CONFIG_SENSORS_LIS3_I2C=m
1698d1744
< CONFIG_SCSI_SAS_LIBSAS_DEBUG=y
1705c1751
< CONFIG_SCSI_CXGB4_ISCSI=m
---
> CONFIG_SCSI_CXGB4_ISCSI=y
1706a1753
> CONFIG_SCSI_BNX2X_FCOE=y
1709c1756
< CONFIG_SCSI_HPSA=m
---
> CONFIG_SCSI_HPSA=y
1711c1758
< CONFIG_SCSI_3W_SAS=m
---
> CONFIG_SCSI_3W_SAS=y
1751c1798
< CONFIG_VMWARE_PVSCSI=m
---
> CONFIG_VMWARE_PVSCSI=y
1813c1860
< CONFIG_SCSI_PM8001=m
---
> CONFIG_SCSI_PM8001=y
1873a1921
> CONFIG_PATA_ARASAN_CF=y
1956c2004,2010
< # CONFIG_TARGET_CORE is not set
---
> CONFIG_DM_FLAKEY=y
> CONFIG_TARGET_CORE=m
> CONFIG_TCM_IBLOCK=m
> CONFIG_TCM_FILEIO=m
> CONFIG_TCM_PSCSI=m
> CONFIG_LOOPBACK_TARGET=m
> # CONFIG_LOOPBACK_TARGET_CDB_DEBUG is not set
2155d2208
< # CONFIG_R8169_VLAN is not set
2178d2230
< CONFIG_CHELSIO_T3_DEPENDS=y
2180,2182c2232
< CONFIG_CHELSIO_T4_DEPENDS=y
< CONFIG_CHELSIO_T4=m
< CONFIG_CHELSIO_T4VF_DEPENDS=y
---
> CONFIG_CHELSIO_T4=y
2187a2238
> CONFIG_IXGBEVF=m
2243a2295
> # CONFIG_ATH5K_TRACER is not set
2294c2346,2362
< # CONFIG_IWLWIFI is not set
---
> CONFIG_IWLAGN=m
>
> #
> # Debugging Options
> #
> # CONFIG_IWLWIFI_DEBUG is not set
> CONFIG_IWLWIFI_DEVICE_TRACING=y
> CONFIG_IWL_P2P=y
> CONFIG_IWLWIFI_LEGACY=m
>
> #
> # Debugging Options
> #
> # CONFIG_IWLWIFI_LEGACY_DEBUG is not set
> # CONFIG_IWLWIFI_LEGACY_DEVICE_TRACING is not set
> CONFIG_IWL4965=m
> CONFIG_IWL3945=m
2327a2396
> CONFIG_RT2800PCI_RT53XX=y
2343a2413
> CONFIG_RTL8192CU=m
2344a2415
> CONFIG_RTL8192C_COMMON=m
2400a2472
> CONFIG_USB_VL600=m
2444c2516
< CONFIG_ATM_DUMMY=m
---
> # CONFIG_ATM_DUMMY is not set
2481a2554,2556
> CONFIG_RIONET=m
> CONFIG_RIONET_TX_SIZE=128
> CONFIG_RIONET_RX_SIZE=128
2489c2564
< CONFIG_PLIP=m
---
> CONFIG_PLIP=y
2500c2575
< CONFIG_PPPOATM=m
---
> CONFIG_PPPOATM=y
2511c2586
< # CONFIG_NETPOLL_TRAP is not set
---
> CONFIG_NETPOLL_TRAP=y
2693a2769
> CONFIG_KEYBOARD_QT1070=y
2776a2853
> CONFIG_TOUCHSCREEN_ATMEL_MXT=y
2793d2869
< # CONFIG_TOUCHSCREEN_QT602240 is not set
2796a2873
> CONFIG_TOUCHSCREEN_WM831X=y
2819a2897
> CONFIG_TOUCHSCREEN_TSC2005=y
2883c2961,2964
< CONFIG_DEVKMEM=y
---
> CONFIG_UNIX98_PTYS=y
> CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
> CONFIG_LEGACY_PTYS=y
> CONFIG_LEGACY_PTY_COUNT=256
2885d2965
< CONFIG_COMPUTONE=m
2889d2968
< CONFIG_DIGIEPCA=m
2892d2970
< CONFIG_ISI=m
2895a2974,2975
> CONFIG_NOZOMI=m
> CONFIG_ISI=m
2898,2899c2978
< CONFIG_RISCOM8=m
< CONFIG_SPECIALIX=m
---
> CONFIG_DEVKMEM=y
2901,2903d2979
< CONFIG_STALLION=m
< CONFIG_ISTALLION=m
< CONFIG_NOZOMI=m
2939c3015
< # CONFIG_SERIAL_OF_PLATFORM is not set
---
> CONFIG_SERIAL_OF_PLATFORM=m
2941d3016
< # CONFIG_SERIAL_GRLIB_GAISLER_APBUART is not set
2948,2951d3022
< CONFIG_UNIX98_PTYS=y
< CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
< CONFIG_LEGACY_PTYS=y
< CONFIG_LEGACY_PTY_COUNT=256
2953c3024
< CONFIG_PRINTER=m
---
> CONFIG_PRINTER=y
3053a3125,3126
> CONFIG_I2C_PXA=m
> CONFIG_I2C_PXA_PCI=y
3060a3134
> CONFIG_I2C_DIOLAN_U2C=m
3083a3158
> CONFIG_SPI_ALTERA=m
3087a3163
> CONFIG_SPI_OC_TINY=m
3210a3287,3288
> CONFIG_BATTERY_BQ27X00_I2C=y
> CONFIG_BATTERY_BQ27X00_PLATFORM=y
3261a3340
> CONFIG_SENSORS_LINEAGE=m
3274a3354
> CONFIG_SENSORS_LTC4151=m
3280a3361
> CONFIG_SENSORS_MAX6639=m
3284a3366,3370
> CONFIG_PMBUS=m
> CONFIG_SENSORS_PMBUS=m
> CONFIG_SENSORS_MAX16064=m
> CONFIG_SENSORS_MAX34440=m
> CONFIG_SENSORS_MAX8688=m
3294a3381,3382
> CONFIG_SENSORS_SCH5627=m
> CONFIG_SENSORS_ADS1015=m
3301a3390
> # CONFIG_SENSORS_TWL4030_MADC is not set
3318d3406
< CONFIG_SENSORS_LIS3_I2C=m
3326d3413
< CONFIG_SENSORS_LIS3LV02D=m
3421a3509
> CONFIG_TPS6105X=m
3424a3513
> CONFIG_TWL4030_MADC=m
3432a3522
> CONFIG_MFD_MAX8997=y
3451a3542
> CONFIG_AB8500_GPADC=y
3463c3554
< # CONFIG_REGULATOR_DUMMY is not set
---
> CONFIG_REGULATOR_DUMMY=y
3472a3564
> CONFIG_REGULATOR_MAX8997=m
3487a3580
> CONFIG_REGULATOR_TPS6105X=m
3500a3594
> CONFIG_MEDIA_CONTROLLER=y
3502a3597
> CONFIG_VIDEO_V4L2_SUBDEV_API=y
3523a3619
> CONFIG_IR_ITE_CIR=m
3562a3659,3661
> CONFIG_VIDEOBUF2_CORE=m
> CONFIG_VIDEOBUF2_MEMOPS=m
> CONFIG_VIDEOBUF2_VMALLOC=m
3660a3760
> CONFIG_MEDIA_ALTERA_CI=m
3669a3770
> CONFIG_VIDEO_NOON010PC30=m
3683a3785
> CONFIG_SOC_CAMERA_OV9740=m
3699a3802
> CONFIG_USB_GSPCA_NW80X=m
3725a3829
> CONFIG_USB_GSPCA_VICAM=m
3778a3883,3887
>
> #
> # Texas Instruments WL128x FM driver (ST based)
> #
> CONFIG_RADIO_WL128X=m
3829a3939
> CONFIG_DVB_USB_TECHNISAT_USB2=m
3861a3972,3975
>
> #
> # Supported FireWire (IEEE 1394) Adapters
> #
3863,3864d3976
< CONFIG_DVB_FIREDTV_FIREWIRE=y
< # CONFIG_DVB_FIREDTV_IEEE1394 is not set
3945a4058
> CONFIG_DVB_DIB9000=m
3948a4062
> CONFIG_DVB_STV0367=m
4028d4141
< # CONFIG_DRM_I830 is not set
4069a4183
> CONFIG_FB_CYBER2000_DDC=y
4157a4272
> CONFIG_LCD_LD9040=m
4164c4279
< # CONFIG_BACKLIGHT_MBP_NVIDIA is not set
---
> CONFIG_BACKLIGHT_APPLE=m
4391a4507,4510
> CONFIG_SND_USB_6FIRE=m
> CONFIG_SND_FIREWIRE=y
> CONFIG_SND_FIREWIRE_LIB=m
> CONFIG_SND_FIREWIRE_SPEAKERS=m
4411a4531
> CONFIG_SND_SOC_CS4271=m
4414a4535
> CONFIG_SND_SOC_DFBMCS320=m
4415a4537
> CONFIG_SND_SOC_MAX9850=m
4416a4539
> CONFIG_SND_SOC_SGTL5000=m
4420a4544
> CONFIG_SND_SOC_TVL320AIC32X4=m
4457a4582
> CONFIG_SND_SOC_WM8991=m
4461a4587
> CONFIG_SND_SOC_LM4857=m
4503c4629
< CONFIG_HID_3M_PCT=m
---
> CONFIG_HID_3M_PCT=y
4509c4635
< CONFIG_HID_CANDO=m
---
> CONFIG_HID_CANDO=y
4517d4642
< # CONFIG_HID_EGALAX is not set
4519a4645
> CONFIG_HID_KEYTOUCH=y
4521,4522c4647,4648
< CONFIG_HID_UCLOGIC=m
< CONFIG_HID_WALTOP=m
---
> CONFIG_HID_UCLOGIC=y
> CONFIG_HID_WALTOP=y
4525a4652
> CONFIG_HID_LCPOWER=y
4533c4660
< CONFIG_HID_MOSART=m
---
> CONFIG_HID_MOSART=y
4535c4662
< CONFIG_HID_MULTITOUCH=m
---
> CONFIG_HID_MULTITOUCH=y
4537c4664
< CONFIG_HID_ORTEK=m
---
> CONFIG_HID_ORTEK=y
4546,4550c4673,4680
< CONFIG_HID_QUANTA=m
< CONFIG_HID_ROCCAT=m
< CONFIG_HID_ROCCAT_KONE=m
< CONFIG_HID_ROCCAT_KONEPLUS=m
< CONFIG_HID_ROCCAT_PYRA=m
---
> CONFIG_HID_QUANTA=y
> CONFIG_HID_ROCCAT=y
> CONFIG_HID_ROCCAT_COMMON=y
> CONFIG_HID_ROCCAT_ARVO=y
> CONFIG_HID_ROCCAT_KONE=y
> CONFIG_HID_ROCCAT_KONEPLUS=y
> CONFIG_HID_ROCCAT_KOVAPLUS=y
> CONFIG_HID_ROCCAT_PYRA=y
4553c4683
< CONFIG_HID_STANTUM=m
---
> CONFIG_HID_STANTUM=y
4620c4750
< CONFIG_USB_TMC=m
---
> CONFIG_USB_TMC=y
4630a4761
> CONFIG_USB_STORAGE_REALTEK=y
4641a4773
> CONFIG_USB_STORAGE_ENE_UB6250=y
4744a4877
> # CONFIG_USB_GADGET_FUSB300 is not set
4749c4882,4883
< # CONFIG_USB_GADGET_CI13XXX_PCI is not set
---
> CONFIG_USB_GADGET_CI13XXX_PCI=y
> CONFIG_USB_CI13XXX_PCI=m
4752,4753c4886
< CONFIG_USB_GADGET_LANGWELL=y
< CONFIG_USB_LANGWELL=m
---
> # CONFIG_USB_GADGET_LANGWELL is not set
4818c4951
< CONFIG_MMC_SDHCI_OF=m
---
> CONFIG_MMC_SDHCI_OF=y
4822c4955
< CONFIG_MMC_SDRICOH_CS=m
---
> CONFIG_MMC_SDRICOH_CS=y
4825c4958
< CONFIG_MMC_USHC=m
---
> CONFIG_MMC_USHC=y
4839a4973
> CONFIG_MEMSTICK_R592=y
4846a4981
> CONFIG_LEDS_LM3530=m
5017,5020c5152,5155
< CONFIG_INTEL_MID_DMAC=m
< CONFIG_INTEL_IOATDMA=m
< CONFIG_TIMB_DMA=m
< CONFIG_PCH_DMA=m
---
> CONFIG_INTEL_MID_DMAC=y
> CONFIG_INTEL_IOATDMA=y
> CONFIG_TIMB_DMA=y
> CONFIG_PCH_DMA=y
5028,5029c5163,5164
< CONFIG_DMATEST=m
< CONFIG_DCA=m
---
> # CONFIG_DMATEST is not set
> CONFIG_DCA=y
5045a5181,5186
> CONFIG_STALLION=m
> CONFIG_ISTALLION=m
> CONFIG_DIGIEPCA=m
> CONFIG_RISCOM8=m
> CONFIG_SPECIALIX=m
> CONFIG_COMPUTONE=m
5064,5066c5205
< # CONFIG_USB_DABUSB is not set
< # CONFIG_USB_SE401 is not set
< # CONFIG_USB_VICAM is not set
---
> CONFIG_DVB_CXD2099=m
5075c5214
< # CONFIG_BRCM80211_PCI is not set
---
> CONFIG_BRCMSMAC=y
5076a5216
> CONFIG_BRCMDBG=y
5205a5346,5348
> CONFIG_FB_OLPC_DCON=m
> CONFIG_FB_OLPC_DCON_1=y
> CONFIG_FB_OLPC_DCON_1_5=y
5215a5359,5360
> CONFIG_RTS_PSTOR=m
> # CONFIG_RTS_PSTOR_DEBUG is not set
5220d5364
< # CONFIG_AUTOFS_FS is not set
5264a5409
> CONFIG_IIO_KFIFO_BUF=m
5277a5423,5424
> CONFIG_LIS3L02DQ_BUF_KFIFO=y
> # CONFIG_LIS3L02DQ_BUF_RING_SW is not set
5289a5437,5439
> CONFIG_AD7606=m
> CONFIG_AD7606_IFACE_PARALLEL=m
> CONFIG_AD7606_IFACE_SPI=m
5311a5462
> CONFIG_MAX517=m
5331d5481
< # CONFIG_ADIS16251 is not set
5377a5528
> CONFIG_IIO_SYSFS_TRIGGER=m
5378a5530
> CONFIG_XVMALLOC=y
5379a5532
> # CONFIG_ZRAM_DEBUG is not set
5382d5534
< CONFIG_SAMSUNG_LAPTOP=m
5389,5393d5540
<
< #
< # Texas Instruments shared transport line discipline
< #
< # CONFIG_ST_BT is not set
5399,5400d5545
< # CONFIG_LIRC_IT87 is not set
< # CONFIG_LIRC_ITE8709 is not set
5408d5552
< # CONFIG_SMB_FS is not set
5409a5554,5556
> CONFIG_EASYCAP_SND=y
> # CONFIG_EASYCAP_OSS is not set
> # CONFIG_EASYCAP_DEBUG is not set
5436a5584
> CONFIG_FT1000_PCMCIA=m
5458a5607,5612
> CONFIG_DRM_PSB=m
>
> #
> # Altera FPGA firmware download module
> #
> # CONFIG_ALTERA_STAPL is not set
5460,5461c5614,5615
< CONFIG_ACER_WMI=m
< CONFIG_ACERHDF=m
---
> CONFIG_ACER_WMI=y
> CONFIG_ACERHDF=y
5464c5618,5619
< CONFIG_FUJITSU_LAPTOP=m
---
> CONFIG_DELL_WMI_AIO=y
> CONFIG_FUJITSU_LAPTOP=y
5467,5468c5622,5624
< CONFIG_HP_WMI=m
< CONFIG_MSI_LAPTOP=m
---
> CONFIG_HP_ACCEL=y
> CONFIG_HP_WMI=y
> CONFIG_MSI_LAPTOP=y
5470,5471c5626,5627
< CONFIG_COMPAL_LAPTOP=m
< CONFIG_SONY_LAPTOP=m
---
> CONFIG_COMPAL_LAPTOP=y
> CONFIG_SONY_LAPTOP=y
5473,5475c5629,5630
< CONFIG_IDEAPAD_LAPTOP=m
< CONFIG_THINKPAD_ACPI=m
< CONFIG_THINKPAD_ACPI_ALSA_SUPPORT=y
---
> CONFIG_IDEAPAD_LAPTOP=y
> CONFIG_THINKPAD_ACPI=y
5481c5636
< CONFIG_SENSORS_HDAPS=m
---
> CONFIG_SENSORS_HDAPS=y
5483,5484c5638,5641
< CONFIG_EEEPC_LAPTOP=m
< CONFIG_EEEPC_WMI=m
---
> CONFIG_EEEPC_LAPTOP=y
> CONFIG_ASUS_WMI=y
> CONFIG_ASUS_NB_WMI=y
> CONFIG_EEEPC_WMI=y
5489c5646
< CONFIG_ACPI_TOSHIBA=m
---
> CONFIG_ACPI_TOSHIBA=y
5491,5494c5648,5653
< CONFIG_ACPI_CMPC=m
< CONFIG_INTEL_IPS=m
< CONFIG_IBM_RTL=m
< CONFIG_XO1_RFKILL=m
---
> CONFIG_ACPI_CMPC=y
> CONFIG_INTEL_IPS=y
> CONFIG_IBM_RTL=y
> CONFIG_XO1_RFKILL=y
> CONFIG_XO15_EBOOK=y
> CONFIG_SAMSUNG_LAPTOP=y
5505a5665
> CONFIG_DMI_SYSFS=y
5507a5668
> # CONFIG_SIGMA is not set
5661c5822
< CONFIG_LOGFS=m
---
> CONFIG_LOGFS=y
5679a5841
> CONFIG_PSTORE=y
5710c5872
< CONFIG_RPCSEC_GSS_KRB5=y
---
> CONFIG_RPCSEC_GSS_KRB5=m
5816a5979
> CONFIG_DEFAULT_MESSAGE_LOGLEVEL=4
5824a5988
> # CONFIG_DEBUG_SECTION_MISMATCH is not set
5829d5992
< CONFIG_BKL=y
5912a6076
> # CONFIG_INTEL_TXT is not set
6101a6266
> CONFIG_BCH=y
6110a6276
> CONFIG_CPU_RMAP=y
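
A config diff of this size is hard to scan by eye for the one thing the earlier reply asks about: options that used to be loadable modules and are now compiled in. The sketch below pulls those out of a normal-format `diff`. The function names and regexes are illustrative assumptions, not part of any kernel tool; the sample hunks are copied from the diff above.

```python
import re

# "< CONFIG_FOO=m" (old side) or "> CONFIG_FOO=y" (new side)
SET = re.compile(r"^([<>]) (CONFIG_\w+)=(.*)$")
# "< # CONFIG_FOO is not set"
UNSET = re.compile(r"^([<>]) # (CONFIG_\w+) is not set$")

SAMPLE = """\
508c518
< # CONFIG_INTEL_IDLE is not set
---
> CONFIG_INTEL_IDLE=y
5661c5822
< CONFIG_LOGFS=m
---
> CONFIG_LOGFS=y
5710c5872
< CONFIG_RPCSEC_GSS_KRB5=y
---
> CONFIG_RPCSEC_GSS_KRB5=m
"""


def parse_config_diff(text):
    """Return (old, new) dicts mapping option name to 'y', 'm', 'n' or a value."""
    old, new = {}, {}
    for line in text.splitlines():
        line = line.rstrip()
        match = SET.match(line)
        if match:
            side, name, value = match.groups()
        else:
            match = UNSET.match(line)
            if not match:
                continue  # hunk headers, '---' separators, comments
            side, name = match.groups()
            value = "n"
        (old if side == "<" else new)[name] = value
    return old, new


def became_builtin(text):
    """Options that were =m on the old side and are =y on the new side."""
    old, new = parse_config_diff(text)
    return sorted(name for name, value in new.items()
                  if value == "y" and old.get(name) == "m")


if __name__ == "__main__":
    print(became_builtin(SAMPLE))  # ['CONFIG_LOGFS']
```

Because both sides are collected into dictionaries keyed by option name, an option that was deleted in one hunk and re-added in another (as several are in this diff) is still matched up.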

2011-04-30 18:44:27

by Justin P. Mattock

Subject: Re: 2.6.39-rc5-git2 boot crashs

On 04/30/2011 11:31 AM, werner wrote:
> For years I have always compiled the kernel with everything enabled,
> because these kernel builds are for a distro and should work on any
> computer. Normally things become stable at -rc3 or -rc4, often
> already at -rc2.
>
> Besides my computer, essentially the same problems also happen on a
> friend's laptop.
>
> Another big problem is the (after-boot) crashes when zipping /
> unzipping big files. I can build packages only with 2.6.38.4, not
> with 2.6.39-rcX.
>
> Below is the diff between the configs of 2.6.38.4 and 2.6.39-rc5-git3.
> You could inspect it for anything that might explain the reported
> problems. But I think at least the problems that happen on my
> computer, and on my friend's, are regressions, because we use none of
> the new staging drivers.
>
> wl
>

Use make localmodconfig to get a small .config, and also use some kind
of early debugger (under Kernel hacking) to grab the crash info so Linus
can see the problem.

Justin P. Mattock
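
For context on the suggestion above: `make localmodconfig` trims the config down to the modules that are currently loaded, which it learns from `lsmod`. The sketch below only shows that first step, reading module names out of `lsmod`-style output. `loaded_modules` is a made-up helper, and the sample listing (names and sizes) is hypothetical.

```python
from typing import List

# Hypothetical lsmod output; the sizes and use counts are invented.
SAMPLE_LSMOD = """\
Module                  Size  Used by
sata_nv                22000  2
libata                150000  1 sata_nv
"""


def loaded_modules(lsmod_output: str) -> List[str]:
    """Return the module names from lsmod-style text, skipping the header row."""
    lines = lsmod_output.strip().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


if __name__ == "__main__":
    print(loaded_modules(SAMPLE_LSMOD))  # ['sata_nv', 'libata']
```

Since the target works from loaded modules, it only prunes options built as modules; anything compiled in with =y would still need to be disabled by hand.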