2004-06-19 00:50:27

by Chris Caputo

Subject: inode_unused list corruption in 2.4.26 - spin_lock problem?

In 2.4.26 on two different dual-proc x86 machines (one dual-P4 Xeon based,
the other dual-PIII) I am seeing crashes which are the result of the
inode_unused doubly linked list in fs/inode.c becoming corrupted.

A particular instance of the corruption I have isolated is in a call from
iput() to __refile_inode(). To try to diagnose this further I placed list
verification code before and after the list_del() and list_add() calls in
__refile_inode() and observed a healthy list become corrupted after the
del/add was completed.

It would seem to me that list corruption on otherwise healthy machines
would only be the result of the inode_lock spinlock not being properly
locked prior to the call to __refile_inode(), but as far as I can tell,
the call to atomic_dec_and_lock() in iput() is doing that properly.

So I am at a loss. Has anyone else seen this or does anyone have any idea
what routes I should be exploring to fix this problem?

Thank you,
Chris


2004-06-20 00:22:36

by Marcelo Tosatti

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?


Hi Chris,

I've seen your previous post -- should have answered you earlier.

On Fri, Jun 18, 2004 at 05:47:05PM -0700, Chris Caputo wrote:
> In 2.4.26 on two different dual-proc x86 machines (one dual-P4 Xeon based,
> the other dual-PIII) I am seeing crashes which are the result of the
> inode_unused doubly linked list in fs/inode.c becoming corrupted.

What steps are required to reproduce the problem?

> A particular instance of the corruption I have isolated is in a call from
> iput() to __refile_inode(). To try to diagnose this further I placed list
> verification code before and after the list_del() and list_add() calls in
> __refile_inode() and observed a healthy list become corrupted after the
> del/add was completed.

Can you show us this data in more detail?

> It would seem to me that list corruption on otherwise healthy machines
> would only be the result of the inode_lock spinlock not being properly
> locked prior to the call to __refile_inode(), but as far as I can tell,
> the call to atomic_dec_and_lock() in iput() is doing that properly.
>
> So I am at a loss. Has anyone else seen this or does anyone have any idea
> what routes I should be exploring to fix this problem?

The changes between 2.4.25->2.4.26 (which introduce __refile_inode() and
the unused_pagecache list) must have something to do with this.

David, Rik, can you give some help here?

2004-06-20 03:33:58

by Trond Myklebust

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Sat, 19/06/2004 at 20:15, Marcelo Tosatti wrote:

> The changes between 2.4.25->2.4.26 (which introduce __refile_inode() and
> the unused_pagecache list) must have something to do with this.

Here's one question:

Given the fact that in iput(), the inode remains hashed and the
inode->i_state does not change until after we've dropped the inode_lock,
called write_inode_now(), and then retaken the inode_lock, exactly what
is preventing a third party task from grabbing that inode?

(Better still: write_inode_now() itself actually calls __iget(), which
could cause that inode to be plonked right back onto the "inode_in_use"
list if ever refile_inode() gets called.)

So does the following patch help?

Cheers,
Trond

--- linux-2.4.27-pre3/fs/inode.c.orig 2004-05-20 20:41:41.000000000 -0400
+++ linux-2.4.27-pre3/fs/inode.c 2004-06-19 23:22:29.000000000 -0400
@@ -1200,6 +1200,7 @@ void iput(struct inode *inode)
 		struct super_block *sb = inode->i_sb;
 		struct super_operations *op = NULL;

+again:
 		if (inode->i_state == I_CLEAR)
 			BUG();

@@ -1241,11 +1242,16 @@ void iput(struct inode *inode)
 				if (!(inode->i_state & (I_DIRTY|I_LOCK)))
 					__refile_inode(inode);
 				inodes_stat.nr_unused++;
-				spin_unlock(&inode_lock);
-				if (!sb || (sb->s_flags & MS_ACTIVE))
+				if (!sb || (sb->s_flags & MS_ACTIVE)) {
+					spin_unlock(&inode_lock);
 					return;
-				write_inode_now(inode, 1);
-				spin_lock(&inode_lock);
+				}
+				if (inode->i_state & I_DIRTY) {
+					__iget(inode);
+					spin_unlock(&inode_lock);
+					write_inode_now(inode, 1);
+					goto again;
+				}
 				inodes_stat.nr_unused--;
 				list_del_init(&inode->i_hash);
 			}

2004-06-21 00:52:43

by Marcelo Tosatti

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Sat, Jun 19, 2004 at 11:33:55PM -0400, Trond Myklebust wrote:
> On Sat, 19/06/2004 at 20:15, Marcelo Tosatti wrote:
>
> > The changes between 2.4.25->2.4.26 (which introduce __refile_inode() and
> > the unused_pagecache list) must have something to do with this.
>
> Here's one question:
>
> Given the fact that in iput(), the inode remains hashed and the
> inode->i_state does not change until after we've dropped the inode_lock,
> called write_inode_now(), and then retaken the inode_lock, exactly what
> is preventing a third party task from grabbing that inode?
>
> (Better still: write_inode_now() itself actually calls __iget(), which
> could cause that inode to be plonked right back onto the "inode_in_use"
> list if ever refile_inode() gets called.)

Let's see if I get this right: while we drop the lock in iput() to call
write_inode_now(), an iget happens, possibly from write_inode_now() itself
(sync_one->__iget), causing the inode->i_list to be added to inode_in_use.

But then the call returns, locks inode_lock, decrements inodes_stat.nr_unused,
and deletes the inode from inode_in_use and adds it to inode_unused.

AFAICS it's an inode with i_count==1 in the unused list, which does not
mean "list corruption", right? Am I missing something here?

If you are indeed right all 2.4.x versions contain this bug.

Thanks for helping!

>
> So does the following patch help?
>
> +++ linux-2.4.27-pre3/fs/inode.c 2004-06-19 23:22:29.000000000 -0400
> @@ -1200,6 +1200,7 @@ void iput(struct inode *inode)
>  		struct super_block *sb = inode->i_sb;
>  		struct super_operations *op = NULL;
>
> +again:
>  		if (inode->i_state == I_CLEAR)
>  			BUG();
>
> @@ -1241,11 +1242,16 @@ void iput(struct inode *inode)
>  				if (!(inode->i_state & (I_DIRTY|I_LOCK)))
>  					__refile_inode(inode);
>  				inodes_stat.nr_unused++;
> -				spin_unlock(&inode_lock);
> -				if (!sb || (sb->s_flags & MS_ACTIVE))
> +				if (!sb || (sb->s_flags & MS_ACTIVE)) {
> +					spin_unlock(&inode_lock);
>  					return;
> -				write_inode_now(inode, 1);
> -				spin_lock(&inode_lock);
> +				}
> +				if (inode->i_state & I_DIRTY) {
> +					__iget(inode);
> +					spin_unlock(&inode_lock);
> +					write_inode_now(inode, 1);
> +					goto again;
> +				}
>  				inodes_stat.nr_unused--;
>  				list_del_init(&inode->i_hash);
>  			}

2004-06-21 17:10:46

by Trond Myklebust

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Sun, 20/06/2004 at 20:45, Marcelo Tosatti wrote:
> Let's see if I get this right: while we drop the lock in iput() to call
> write_inode_now(), an iget happens, possibly from write_inode_now() itself
> (sync_one->__iget), causing the inode->i_list to be added to inode_in_use.
>
> But then the call returns, locks inode_lock, decrements inodes_stat.nr_unused,
> and deletes the inode from inode_in_use and adds it to inode_unused.
>
> AFAICS it's an inode with i_count==1 in the unused list, which does not
> mean "list corruption", right? Am I missing something here?

Yes. Please don't forget that the inode is still hashed and is not yet
marked as FREEING: find_inode() can grab it on behalf of some other
process as soon as we drop that spinlock inside iput(). Then we have the
calls to clear_inode() + destroy_inode() just a few lines further down.
;-)

If the above scenario ever does occur, it will cause random Oopses for
third party processes. Since we do not see this too often, my guess is
that the write_inode_now() path must be very rarely (or never?) called.
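
The sequence described above can be walked through with a small
single-threaded userspace model. This is only a sketch under stated
assumptions: struct model_inode, model_lookup(), model_iput_unpatched() and
model_use_after_free() are hypothetical names chosen to mirror the
discussion, not kernel code, and the "third party" is simulated by a flag
instead of a second CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the few inode fields that matter for the race. */
struct model_inode {
	int i_count;     /* reference count */
	bool hashed;     /* still reachable through the inode hash */
	bool freeing;    /* stands in for I_FREEING */
	bool destroyed;  /* clear_inode() + destroy_inode() have run */
};

/* Models find_inode() + __iget(): only refuses unhashed or freeing inodes. */
static bool model_lookup(struct model_inode *inode)
{
	if (!inode->hashed || inode->freeing || inode->destroyed)
		return false;
	inode->i_count++;
	return true;
}

/*
 * Models the final iput() on a superblock that is not MS_ACTIVE, following
 * the unpatched ordering: the lock is dropped around write_inode_now()
 * while the inode is still hashed and not yet marked as freeing.
 */
static void model_iput_unpatched(struct model_inode *inode, bool racer_runs)
{
	assert(inode != NULL);

	inode->i_count--;           /* atomic_dec_and_lock(): count hits zero */

	/* spin_unlock(&inode_lock); write_inode_now(inode, 1); */
	if (racer_runs)
		model_lookup(inode); /* third party grabs the inode here */

	/* spin_lock(&inode_lock); teardown continues regardless */
	inode->hashed = false;      /* list_del_init(&inode->i_hash) */
	inode->freeing = true;      /* i_state |= I_FREEING */
	inode->destroyed = true;    /* clear_inode() + destroy_inode() */
}

/* True when somebody still holds a reference to a destroyed inode. */
static bool model_use_after_free(const struct model_inode *inode)
{
	return inode->destroyed && inode->i_count > 0;
}
```

Calling model_iput_unpatched() with racer_runs set leaves a destroyed inode
with a nonzero i_count, which is the "random Oopses for third party
processes" case; without the racer the teardown is clean.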

> If you are indeed right all 2.4.x versions contain this bug.

...and all 2.6.x versions...

I'm not saying this is the same problem that Chris is seeing, but I am
failing to see how iput() is safe as it stands right now. Please
enlighten me if I'm missing something.

Cheers,
Trond

2004-06-21 18:30:36

by Marcelo Tosatti

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Mon, Jun 21, 2004 at 01:10:21PM -0400, Trond Myklebust wrote:
> On Sun, 20/06/2004 at 20:45, Marcelo Tosatti wrote:
> > Let's see if I get this right: while we drop the lock in iput() to call
> > write_inode_now(), an iget happens, possibly from write_inode_now() itself
> > (sync_one->__iget), causing the inode->i_list to be added to inode_in_use.
> >
> > But then the call returns, locks inode_lock, decrements inodes_stat.nr_unused,
> > and deletes the inode from inode_in_use and adds it to inode_unused.
> >
> > AFAICS it's an inode with i_count==1 in the unused list, which does not
> > mean "list corruption", right? Am I missing something here?
>
> Yes. Please don't forget that the inode is still hashed and is not yet
> marked as FREEING: find_inode() can grab it on behalf of some other
> process as soon as we drop that spinlock inside iput(). Then we have the
> calls to clear_inode() + destroy_inode() just a few lines further down.
> ;-)
>
> If the above scenario ever does occur, it will cause random Oopses for
> third party processes. Since we do not see this too often, my guess is
> that the write_inode_now() path must be very rarely (or never?) called.

That's what I thought: if the scenario you described really happens, we
would see random oopses (processes using a deleted inode) instead of
Chris's list corruption.

Chris, _please_ post your full oopses.

> > If you are indeed right all 2.4.x versions contain this bug.
>
> ...and all 2.6.x versions...
>
> I'm not saying this is the same problem that Chris is seeing, but I am
> failing to see how iput() is safe as it stands right now. Please
> enlighten me if I'm missing something.

To me your analysis looks right, and we have a problem here.

I think Al Viro knows iput() very well. Maybe he should take a look
at your patch. CC'ed.

2004-06-24 01:51:23

by Chris Caputo

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Mon, 21 Jun 2004, Trond Myklebust wrote:
> On Sun, 20/06/2004 at 20:45, Marcelo Tosatti wrote:
> > Let's see if I get this right: while we drop the lock in iput() to call
> > write_inode_now(), an iget happens, possibly from write_inode_now() itself
> > (sync_one->__iget), causing the inode->i_list to be added to inode_in_use.
> >
> > But then the call returns, locks inode_lock, decrements inodes_stat.nr_unused,
> > and deletes the inode from inode_in_use and adds it to inode_unused.
> >
> > AFAICS it's an inode with i_count==1 in the unused list, which does not
> > mean "list corruption", right? Am I missing something here?
>
> Yes. Please don't forget that the inode is still hashed and is not yet
> marked as FREEING: find_inode() can grab it on behalf of some other
> process as soon as we drop that spinlock inside iput(). Then we have the
> calls to clear_inode() + destroy_inode() just a few lines further down.
> ;-)
>
> If the above scenario ever does occur, it will cause random Oopses for
> third party processes. Since we do not see this too often, my guess is
> that the write_inode_now() path must be very rarely (or never?) called.
>
> > If you are indeed right all 2.4.x versions contain this bug.
>
> ...and all 2.6.x versions...
>
> I'm not saying this is the same problem that Chris is seeing, but I am
> failing to see how iput() is safe as it stands right now. Please
> enlighten me if I'm missing something.

I think this is a different (albeit apparently valid) problem. In my case
MS_ACTIVE (in iput() below) will be set since I am not unmounting a volume
and so I believe iput() will return immediately after adding the inode to
the unused list.

That said, I have added your patch to my test setup in case it helps.

Thanks,
Chris

----

	if (!list_empty(&inode->i_hash)) {
		if (!(inode->i_state & (I_DIRTY|I_LOCK)))
			__refile_inode(inode);
		inodes_stat.nr_unused++;
		spin_unlock(&inode_lock);
		if (!sb || (sb->s_flags & MS_ACTIVE))
			return;
		write_inode_now(inode, 1);
		spin_lock(&inode_lock);
		inodes_stat.nr_unused--;
		list_del_init(&inode->i_hash);
	}

2004-06-24 01:51:08

by Chris Caputo

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Sat, 19 Jun 2004, Marcelo Tosatti wrote:
> On Fri, Jun 18, 2004 at 05:47:05PM -0700, Chris Caputo wrote:
> > In 2.4.26 on two different dual-proc x86 machines (one dual-P4 Xeon based,
> > the other dual-PIII) I am seeing crashes which are the result of the
> > inode_unused doubly linked list in fs/inode.c becoming corrupted.
>
> What steps are required to reproduce the problem?

Unfortunately I don't yet know how to (or if I can) repro this with a
stock kernel. I am using 2.4.26 + Ingo's tux3-2.4.23-A3 patch in
conjunction with a filesystem I wrote. (the tux module itself is not
being used, just the patches to the existing kernel)

> > A particular instance of the corruption I have isolated is in a call from
> > iput() to __refile_inode(). To try to diagnose this further I placed list
> > verification code before and after the list_del() and list_add() calls in
> > __refile_inode() and observed a healthy list become corrupted after the
> > del/add was completed.
>
> Can you show us this data in more detail?

In __refile_inode() before and after the list_add()/del() calls I call a
function which checks up to the first 10 items on the inode_unused list to
see if next and prev pointers are valid.
(inode->next->prev == inode && inode->prev->next == inode)
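
A check of the kind described above might look roughly like the following.
This is a minimal userspace sketch, not the code actually used in the
test kernel: list_first_bad_link() is a hypothetical name, and struct
list_head is redeclared here with the same two-pointer layout as the
kernel's so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Hypothetical checker: starting at head, examine the head plus at most
 * max_items entries. Returns the 0-based position of the first node whose
 * neighbours do not point back at it (position 0 is the head itself), or
 * -1 if every node examined is consistent. The caller is assumed to hold
 * whatever lock protects the list.
 */
static int list_first_bad_link(const struct list_head *head, int max_items)
{
	const struct list_head *pos = head;
	int i;

	assert(head != NULL);

	for (i = 0; i <= max_items; i++) {
		if (pos->next == NULL || pos->prev == NULL)
			return i;
		if (pos->next->prev != pos || pos->prev->next != pos)
			return i;
		pos = pos->next;
		if (pos == head)
			break;	/* wrapped around: the whole list was checked */
	}
	return -1;
}
```

Bounding the walk with max_items keeps the cost fixed per call, at the price
of missing corruption further down the list.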

So what I observed was a case where iput(), through the inlined __refile_inode():

1) checked inode_unused and saw that it was good
2) put an item on the inode_unused list
3) checked inode_unused and saw that it was now bad and that the item
added was the culprit.

This all happened within __refile_inode() with the inode_lock spinlock
grabbed by iput() and so I tend to think some other code is accessing the
inode_unused list _without_ grabbing the spinlock. I've checked the
inode.c code over and over, plus my filesystem code, and haven't yet found
a culprit. I also checked the tux diffs to see if it was messing with
inode objects in an inappropriate way.

Is it safe to assume that the x86 version of atomic_dec_and_lock(), which
iput() uses, is well trusted? I figure it's got to be, but doesn't hurt
to ask.

> > It would seem to me that list corruption on otherwise healthy machines
> > would only be the result of the inode_lock spinlock not being properly
> > locked prior to the call to __refile_inode(), but as far as I can tell,
> > the call to atomic_dec_and_lock() in iput() is doing that properly.
> >
> > So I am at a loss. Has anyone else seen this or does anyone have any idea
> > what routes I should be exploring to fix this problem?
>
> The changes between 2.4.25->2.4.26 (which introduce __refile_inode() and
> the unused_pagecache list) must have something to do with this.

__refile_inode() was introduced in 2.4.25. I'll try 2.4.24 to see if I
can reproduce there.

Marcelo, you asked for some oopses relating to the problem. Here are some.

Thanks,
Chris

---

Oops: 0000
CPU: 0
EIP: 0010:[<c015b465>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010a93
eax: 8d12e0e2 ebx: 7b402366 ecx: c40fdf18 edx: d9c92888
esi: 7b40235e edi: 7b402366 ebp: 000007cf esp: c40fdf10
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 7, stackpage=c40fd000)
Stack: e3b85380 000004e5 e3b85388 c9812688 00000002 c25caf40 c0317fd8 00023dd2
c015b664 00000cb4 c0138f35 00000006 000001d0 ffffffff 000001d0 00000002
00000006 000001d0 c0317fd8 c0317fd8 c013936a c40fdf84 000001d0 0000003c
Call Trace: [<c015b664>] [<c0138f35>] [<c013936a>] [<c01393e2>] [<c0139596>]
[<c0139608>] [<c0139748>] [<c01396b0>] [<c0105000>] [<c010587e>]
[<c01396b0>]
Code: 8b 5b 04 8b 86 1c 01 00 00 a8 38 0f 84 5d 01 00 00 81 fb 08


>>EIP; c015b465 <prune_icache+45/220> <=====

>>ecx; c40fdf18 <_end+3d3ffec/38676134>
>>edx; d9c92888 <_end+198d495c/38676134>
>>esp; c40fdf10 <_end+3d3ffe4/38676134>

Trace; c015b664 <shrink_icache_memory+24/40>
Trace; c0138f35 <shrink_cache+185/410>
Trace; c013936a <shrink_caches+4a/60>
Trace; c01393e2 <try_to_free_pages_zone+62/f0>
Trace; c0139596 <kswapd_balance_pgdat+66/b0>
Trace; c0139608 <kswapd_balance+28/40>
Trace; c0139748 <kswapd+98/c0>
Trace; c01396b0 <kswapd+0/c0>
Trace; c0105000 <_stext+0/0>
Trace; c010587e <arch_kernel_thread+2e/40>
Trace; c01396b0 <kswapd+0/c0>

Code; c015b465 <prune_icache+45/220>
00000000 <_EIP>:
Code; c015b465 <prune_icache+45/220> <=====
0: 8b 5b 04 mov 0x4(%ebx),%ebx <=====
Code; c015b468 <prune_icache+48/220>
3: 8b 86 1c 01 00 00 mov 0x11c(%esi),%eax
Code; c015b46e <prune_icache+4e/220>
9: a8 38 test $0x38,%al
Code; c015b470 <prune_icache+50/220>
b: 0f 84 5d 01 00 00 je 16e <_EIP+0x16e>
Code; c015b476 <prune_icache+56/220>
11: 81 fb 08 00 00 00 cmp $0x8,%ebx

---

Oops: 0000
CPU: 0
EIP: 0010:[<c015b465>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010a97
eax: 830418da ebx: 5954e741 ecx: c40fdf18 edx: c0318f08
esi: 5954e739 edi: 5954e741 ebp: 000001b4 esp: c40fdf10
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 7, stackpage=c40fd000)
Stack: f718b780 000013a1 f718b788 d04d3988 00000001 c3cfe850 c0317fd8 00023d9e
c015b664 00001555 c0138f35 00000006 000001d0 ffffffff 000001d0 00000001
00000004 000001d0 c0317fd8 c0317fd8 c013936a c40fdf84 000001d0 0000003c
Call Trace: [<c015b664>] [<c0138f35>] [<c013936a>] [<c01393e2>] [<c0139596>]
[<c0139608>] [<c0139748>] [<c01396b0>] [<c0105000>] [<c010587e>]
[<c01396b0>]
Code: 8b 5b 04 8b 86 1c 01 00 00 a8 38 0f 84 5d 01 00 00 81 fb 08


>>EIP; c015b465 <prune_icache+45/220> <=====

>>edx; c0318f08 <inode_unused+0/8>

Trace; c015b664 <shrink_icache_memory+24/40>
Trace; c0138f35 <shrink_cache+185/410>
Trace; c013936a <shrink_caches+4a/60>
Trace; c01393e2 <try_to_free_pages_zone+62/f0>
Trace; c0139596 <kswapd_balance_pgdat+66/b0>
Trace; c0139608 <kswapd_balance+28/40>
Trace; c0139748 <kswapd+98/c0>
Trace; c01396b0 <kswapd+0/c0>
Trace; c0105000 <_stext+0/0>
Trace; c010587e <arch_kernel_thread+2e/40>
Trace; c01396b0 <kswapd+0/c0>

Code; c015b465 <prune_icache+45/220>
00000000 <_EIP>:
Code; c015b465 <prune_icache+45/220> <=====
0: 8b 5b 04 mov 0x4(%ebx),%ebx <=====
Code; c015b468 <prune_icache+48/220>
3: 8b 86 1c 01 00 00 mov 0x11c(%esi),%eax
Code; c015b46e <prune_icache+4e/220>
9: a8 38 test $0x38,%al
Code; c015b470 <prune_icache+50/220>
b: 0f 84 5d 01 00 00 je 16e <_EIP+0x16e>
Code; c015b476 <prune_icache+56/220>
11: 81 fb 08 00 00 00 cmp $0x8,%ebx

---

I think this one was an infinite loop, which I used alt-sysrq-p to
diagnose:

Pid: 7, comm: kswapd
EIP: 0010:[<c014dd1c>] CPU: 2 EFLAGS: 00000246 Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EAX: c4e4f824 EBX: c4e4f804 ECX: 00000006 EDX: 00000000
ESI: c4e4f80c EDI: c4e4f804 EBP: 000009ee DS: 0018 ES: 0018
Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
line ignored
CR0: 8005003b CR2: bfffdfb4 CR3: 00101000 CR4: 000006d0
Call Trace: [<c0167f8e>] [<c01681d4>] [<c013d9d0>] [<c0168254>] [<c013fe69>]
[<c01404ae>] [<c0140533>] [<c01405b2>] [<c011a6cb>] [<c0140766>] [<c01407d8>]
[<c0140918>] [<c0140880>] [<c010595e>] [<c0140880>]
Warning (Oops_read): Code line not seen, dumping what data is available


>>EIP; c014dd1c <inode_has_buffers+6c/90> <=====

>>EAX; c4e4f824 <_end+4a6b2f8/3864fb34>
>>EBX; c4e4f804 <_end+4a6b2d8/3864fb34>
>>ESI; c4e4f80c <_end+4a6b2e0/3864fb34>
>>EDI; c4e4f804 <_end+4a6b2d8/3864fb34>

Trace; c0167f8e <prune_icache+7e/320>
Trace; c01681d4 <prune_icache+2c4/320>
Trace; c013d9d0 <kmem_cache_shrink+70/c0>
Trace; c0168254 <shrink_icache_memory+24/40>
Trace; c013fe69 <shrink_cache+1d9/6d0>
Trace; c01404ae <refill_inactive+14e/160>
Trace; c0140533 <shrink_caches+73/90>
Trace; c01405b2 <try_to_free_pages_zone+62/f0>
Trace; c011a6cb <schedule+34b/5f0>
Trace; c0140766 <kswapd_balance_pgdat+66/b0>
Trace; c01407d8 <kswapd_balance+28/40>
Trace; c0140918 <kswapd+98/c0>
Trace; c0140880 <kswapd+0/c0>
Trace; c010595e <arch_kernel_thread+2e/40>
Trace; c0140880 <kswapd+0/c0>

---

Another infinite loop (alt-sysrq-p):

Pid: 7, comm: kswapd
EIP: 0010:[<c0167bc8>] CPU: 3 EFLAGS: 00000206 Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EAX: 00000001 EBX: ebe7f40c ECX: 00000002 EDX: 00000000
ESI: ebe7f60c EDI: ebe7f604 EBP: c2851ee4 DS: 0018 ES: 0018
Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
line ignored
CR0: 8005003b CR2: 40014000 CR3: 00101000 CR4: 000006d0
Call Trace: [<c013d7e2>] [<c0167eb7>] [<c013fd05>] [<c0140360>] [<c01403e2>]
[<c0140462>] [<c0140638>] [<c01406a8>] [<c01407fa>] [<c0140760>] [<c010596e>]
[<c0140760>]

>>EIP; c0167bc8 <prune_icache+78/340> <=====

Trace; c013d7e2 <kmem_cache_shrink+72/c0>
Trace; c0167eb7 <shrink_icache_memory+27/40>
Trace; c013fd05 <shrink_cache+1d5/6d0>
Trace; c0140360 <refill_inactive+160/170>
Trace; c01403e2 <shrink_caches+72/90>
Trace; c0140462 <try_to_free_pages_zone+62/f0>
Trace; c0140638 <kswapd_balance_pgdat+78/c0>
Trace; c01406a8 <kswapd_balance+28/40>
Trace; c01407fa <kswapd+9a/c0>
Trace; c0140760 <kswapd+0/c0>
Trace; c010596e <arch_kernel_thread+2e/40>
Trace; c0140760 <kswapd+0/c0>

---

Infinite loop (alt-sysrq-p):

Pid: 9, comm: kupdated
EIP: 0010:[<c0168df3>] CPU: 2 EFLAGS: 00000282 Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EAX: f7bea000 EBX: f7bea000 ECX: f7bebfd0 EDX: 00000000
ESI: f7bebfc8 EDI: f7bebfd8 EBP: f7bebf10 DS: 0018 ES: 0018
Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
line ignored
CR0: 8005003b CR2: 40013090 CR3: 00101000 CR4: 000006d0
Call Trace: [<c011a1b2>] [<c0151333>] [<c015185e>] [<c0107a42>]
[<c010596e>] [<c0151690>]

>>EIP; c0168df3 <.text.lock.inode+69/256> <=====

>>EAX; f7bea000 <_end+377ffb74/38649bd4>
>>EBX; f7bea000 <_end+377ffb74/38649bd4>
>>ECX; f7bebfd0 <_end+37801b44/38649bd4>
>>ESI; f7bebfc8 <_end+37801b3c/38649bd4>
>>EDI; f7bebfd8 <_end+37801b4c/38649bd4>
>>EBP; f7bebf10 <_end+37801a84/38649bd4>

Trace; c011a1b2 <schedule_timeout+62/b0>
Trace; c0151333 <sync_old_buffers+53/160>
Trace; c015185e <kupdate+1ce/230>
Trace; c0107a42 <ret_from_fork+6/20>
Trace; c010596e <arch_kernel_thread+2e/40>
Trace; c0151690 <kupdate+0/230>

---

Infinite loop (alt-sysrq-p):

Pid: 7, comm: kswapd
EIP: 0010:[<c0167be0>] CPU: 3 EFLAGS: 00000202 Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EAX: 00000020 EBX: dcbc120c ECX: 00000086 EDX: daae3044
ESI: dcbc120c EDI: dcbc1204 EBP: c2851ee4 DS: 0018 ES: 0018
Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
line ignored
CR0: 8005003b CR2: 0807f260 CR3: 00101000 CR4: 000006d0
Call Trace: [<c013d7e2>] [<c0167ed7>] [<c013fd05>] [<c0140360>] [<c01403e2>]
[<c011a55b>] [<c0140462>] [<c0140638>] [<c01406a8>] [<c01407fa>] [<c0140760>]
[<c010596e>] [<c0140760>]

>>EIP; c0167be0 <prune_icache+90/360> <=====

>>EBX; dcbc120c <_end+1c7d6d80/38649bd4>
>>EDX; daae3044 <_end+1a6f8bb8/38649bd4>
>>ESI; dcbc120c <_end+1c7d6d80/38649bd4>
>>EDI; dcbc1204 <_end+1c7d6d78/38649bd4>
>>EBP; c2851ee4 <_end+2467a58/38649bd4>

Trace; c013d7e2 <kmem_cache_shrink+72/c0>
Trace; c0167ed7 <shrink_icache_memory+27/40>
Trace; c013fd05 <shrink_cache+1d5/6d0>
Trace; c0140360 <refill_inactive+160/170>
Trace; c01403e2 <shrink_caches+72/90>
Trace; c011a55b <schedule+34b/5f0>
Trace; c0140462 <try_to_free_pages_zone+62/f0>
Trace; c0140638 <kswapd_balance_pgdat+78/c0>
Trace; c01406a8 <kswapd_balance+28/40>
Trace; c01407fa <kswapd+9a/c0>
Trace; c0140760 <kswapd+0/c0>
Trace; c010596e <arch_kernel_thread+2e/40>
Trace; c0140760 <kswapd+0/c0>

2004-06-25 07:47:46

by Chris Caputo

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Wed, 23 Jun 2004, Chris Caputo wrote:
> On Mon, 21 Jun 2004, Trond Myklebust wrote:
> > On Sun, 20/06/2004 at 20:45, Marcelo Tosatti wrote:
> > > Let's see if I get this right: while we drop the lock in iput() to call
> > > write_inode_now(), an iget happens, possibly from write_inode_now() itself
> > > (sync_one->__iget), causing the inode->i_list to be added to inode_in_use.
> > >
> > > But then the call returns, locks inode_lock, decrements inodes_stat.nr_unused,
> > > and deletes the inode from inode_in_use and adds it to inode_unused.
> > >
> > > AFAICS it's an inode with i_count==1 in the unused list, which does not
> > > mean "list corruption", right? Am I missing something here?
> >
> > Yes. Please don't forget that the inode is still hashed and is not yet
> > marked as FREEING: find_inode() can grab it on behalf of some other
> > process as soon as we drop that spinlock inside iput(). Then we have the
> > calls to clear_inode() + destroy_inode() just a few lines further down.
> > ;-)
> >
> > If the above scenario ever does occur, it will cause random Oopses for
> > third party processes. Since we do not see this too often, my guess is
> > that the write_inode_now() path must be very rarely (or never?) called.
> >
> > > If you are indeed right all 2.4.x versions contain this bug.
> >
> > ...and all 2.6.x versions...
> >
> > I'm not saying this is the same problem that Chris is seeing, but I am
> > failing to see how iput() is safe as it stands right now. Please
> > enlighten me if I'm missing something.
>
> I think this is a different (albeit apparently valid) problem. In my case
> MS_ACTIVE (in iput() below) will be set since I am not unmounting a volume
> and so I believe iput() will return immediately after adding the inode to
> the unused list.
>
> That said, I have added your patch to my test setup in case it helps.

I was able to duplicate the problem I am seeing even with Trond's patch
applied. So the patch potentially solves a different problem but not the
one I am seeing.

Chris

2004-06-25 08:04:42

by Chris Caputo

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Wed, 23 Jun 2004, Chris Caputo wrote:
> On Sat, 19 Jun 2004, Marcelo Tosatti wrote:
> > Can you show us this data in more detail?
>
> In __refile_inode() before and after the list_add()/del() calls I call a
> function which checks up to the first 10 items on the inode_unused list to
> see if next and prev pointers are valid.
> (inode->next->prev == inode && inode->prev->next == inode)
>
> So what I observed was a case where iput(), through the inlined __refile_inode():
>
> 1) checked inode_unused and saw that it was good
> 2) put an item on the inode_unused list
> 3) checked inode_unused and saw that it was now bad and that the item
> added was the culprit.
>
> This all happened within __refile_inode() with the inode_lock spinlock
> grabbed by iput() and so I tend to think some other code is accessing the
> inode_unused list _without_ grabbing the spinlock. I've checked the
> inode.c code over and over, plus my filesystem code, and haven't yet found
> a culprit. I also checked the tux diffs to see if it was messing with
> inode objects in an inappropriate way.
>
> Is it safe to assume that the x86 version of atomic_dec_and_lock(), which
> iput() uses, is well trusted? I figure it's got to be, but doesn't hurt
> to ask.

An update on this.

Line #3 above is not entirely correct in that I have not seen the item
being added becoming immediately corrupt, but rather items beyond it.

Specifically I have now seen:

inode_unused->next->next->prev != inode_unused->next

and:

inode_unused->next->next->next->prev != inode_unused->next->next

My verification function doesn't check the whole inode_unused list (would
be too slow to do so), but it may be that items are only corrupted shortly
after being added to the list. Ie., someone is still using the inode
shortly after when they shouldn't be.

> __refile_inode() was introduced in 2.4.25. I'll try 2.4.24 to see if I
> can reproduce there.

No word yet on my 2.4.24 testing. (test still running without failure)

I'll keep digging,
Chris

2004-06-25 10:18:10

by Chris Caputo

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Fri, 25 Jun 2004, Chris Caputo wrote:
> On Wed, 23 Jun 2004, Chris Caputo wrote:
> > __refile_inode() was introduced in 2.4.25. I'll try 2.4.24 to see if I
> > can reproduce there.
>
> No word yet on my 2.4.24 testing. (test still running without failure)

I have now reproduced (below) with 2.4.24 (with tux patches + my driver).
In the code I believe the line doing the null deref is:

entry = entry->prev; [line 771 of 2.4.24 fs/inode.c]

Next I'll try to repro with simply a stock 2.4.26.

Chris

---

Unable to handle kernel NULL pointer dereference at virtual address 00000004
c01665f5
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01665f5>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010217
eax: 00000020 ebx: 00000000 ecx: 00000006 edx: 00000001
esi: 00000000 edi: fffffff8 ebp: c7f95f58 esp: c7f95f30
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 5, stackpage=c7f95000)
Stack: c7f95f58 c0163794 00000000 c7f95f3c c7f95f3c c7f95f64 c013d152 000001d0
0000003a 0000000a c7f95f64 c01666f7 00018057 c7f95f8c c013fdcf 00000006
000001d0 00000000 c0329a60 00000001 c0329a60 00000001 c7f94000 c7f95fa8
Call Trace: [<c0163794>] [<c013d152>] [<c01666f7>] [<c013fdcf>] [<c013ff78>]
[<c013ffe8>] [<c014013a>] [<c01400a0>] [<c01059ce>] [<c01400a0>]
Code: 8b 76 04 8b 87 30 01 00 00 a8 38 74 6e 81 fe 28 a9 32 c0 75


>>EIP; c01665f5 <prune_icache+65/140> <=====

>>ebp; c7f95f58 <_end+7bcdd8c/3862ae94>
>>esp; c7f95f30 <_end+7bcdd64/3862ae94>

Trace; c0163794 <prune_dcache+194/280>
Trace; c013d152 <kmem_cache_shrink+72/c0>
Trace; c01666f7 <shrink_icache_memory+27/40>
Trace; c013fdcf <try_to_free_pages_zone+8f/f0>
Trace; c013ff78 <kswapd_balance_pgdat+78/c0>
Trace; c013ffe8 <kswapd_balance+28/40>
Trace; c014013a <kswapd+9a/c0>
Trace; c01400a0 <kswapd+0/c0>
Trace; c01059ce <arch_kernel_thread+2e/40>
Trace; c01400a0 <kswapd+0/c0>

Code; c01665f5 <prune_icache+65/140>
00000000 <_EIP>:
Code; c01665f5 <prune_icache+65/140> <=====
0: 8b 76 04 mov 0x4(%esi),%esi <=====
Code; c01665f8 <prune_icache+68/140>
3: 8b 87 30 01 00 00 mov 0x130(%edi),%eax
Code; c01665fe <prune_icache+6e/140>
9: a8 38 test $0x38,%al
Code; c0166600 <prune_icache+70/140>
b: 74 6e je 7b <_EIP+0x7b>
Code; c0166602 <prune_icache+72/140>
d: 81 fe 28 a9 32 c0 cmp $0xc032a928,%esi
Code; c0166608 <prune_icache+78/140>
13: 75 00 jne 15 <_EIP+0x15>
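
The faulting address 00000004 in the oops above fits a load of entry->prev
through a NULL entry: esi is 0 and the faulting instruction is
mov 0x4(%esi),%esi. prev is the second pointer in struct list_head, so on
32-bit x86 it sits at offset 4. A small sketch of that arithmetic follows;
prev_fault_address() is a hypothetical helper made up for illustration, and
struct list_head is redeclared so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Address touched by reading entry->prev when entry lives at entry_addr.
 * With entry_addr == 0 this is simply the offset of prev, i.e. one pointer
 * width: 4 on 32-bit x86, 8 on a 64-bit build.
 */
static uintptr_t prev_fault_address(uintptr_t entry_addr)
{
	/* prev directly follows next, so its offset is one pointer width. */
	assert(offsetof(struct list_head, prev) == sizeof(struct list_head *));
	return entry_addr + offsetof(struct list_head, prev);
}
```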

2004-06-25 12:53:25

by Marcelo Tosatti

Subject: Re: inode_unused list corruption in 2.4.26 - spin_lock problem?

On Wed, Jun 23, 2004 at 06:50:48PM -0700, Chris Caputo wrote:
> On Sat, 19 Jun 2004, Marcelo Tosatti wrote:
> > On Fri, Jun 18, 2004 at 05:47:05PM -0700, Chris Caputo wrote:
> > > In 2.4.26 on two different dual-proc x86 machines (one dual-P4 Xeon based,
> > > the other dual-PIII) I am seeing crashes which are the result of the
> > > inode_unused doubly linked list in fs/inode.c becoming corrupted.
> >
> > What steps are required to reproduce the problem?
>
> Unfortunately I don't yet know how to (or if I can) repro this with a
> stock kernel. I am using 2.4.26 + Ingo's tux3-2.4.23-A3 patch in
> conjunction with a filesystem I wrote. (the tux module itself is not
> being used, just the patches to the existing kernel)
>
> > > A particular instance of the corruption I have isolated is in a call from
> > > iput() to __refile_inode(). To try to diagnose this further I placed list
> > > verification code before and after the list_del() and list_add() calls in
> > > __refile_inode() and observed a healthy list become corrupted after the
> > > del/add was completed.
> >
> > Can you show us this data in more detail?
>
> In __refile_inode() before and after the list_add()/del() calls I call a
> function which checks up to the first 10 items on the inode_unused list to
> see if next and prev pointers are valid.
> (inode->next->prev == inode && inode->prev->next == inode)
>
> So what I observed was a case where __refile_inode(), inlined into iput():
>
> 1) checked inode_unused and saw that it was good
> 2) put an item on the inode_unused list
> 3) checked inode_unused and saw that it was now bad and that the item
> added was the culprit.
>
> This all happened within __refile_inode() with the inode_lock spinlock
> grabbed by iput() and so I tend to think some other code is accessing the
> inode_unused list _without_ grabbing the spinlock. I've checked the
> inode.c code over and over, plus my filesystem code, and haven't yet found
> a culprit. I also checked the tux diffs to see if it was messing with
> inode objects in an inappropriate way.
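
The check described above can be sketched in user space roughly as follows. This is not the actual debugging code from the report, which was not posted. The names list_check, list_init, list_add_entry and list_del_entry are hypothetical stand-ins, and struct list_head here is a local copy of the kernel's two-pointer layout rather than the kernel header.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head, as the kernel's list_add() does. */
void list_add_entry(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Unlink entry from whatever list it is on. */
void list_del_entry(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/*
 * Walk at most max entries starting at head and verify, for each node p,
 * that p->next->prev == p and p->prev->next == p.
 * Returns 1 if every visited node is consistent, 0 otherwise.
 */
int list_check(const struct list_head *head, int max)
{
	const struct list_head *p = head;
	int i;

	assert(head != NULL);
	for (i = 0; i <= max; i++) {
		if (p->next == NULL || p->prev == NULL)
			return 0;
		if (p->next->prev != p || p->prev->next != p)
			return 0;
		p = p->next;
		if (p == head)	/* wrapped around: whole list checked */
			break;
	}
	return 1;
}
```

Calling list_check() immediately before and after a list_del_entry()/list_add_entry() pair, with the list's lock held, narrows a corruption down to that window. Note that a wild pointer can still fault inside the check itself, since it dereferences p->next and p->prev.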
>
> Is it safe to assume that the x86 version of atomic_dec_and_lock(), which
> iput() uses, is well trusted? I figure it's got to be, but doesn't hurt
> to ask.

Pretty sure it is; it's used all over. You can try the non-optimized version
in lib/dec_and_lock.c as a test.
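
For reference, the generic fallback takes the lock unconditionally before decrementing, so there is no lock-free fast path to get wrong. Below is a user-space sketch of that behaviour, assuming C11 atomics and a pthread mutex in place of atomic_t and spinlock_t. The name dec_and_lock is hypothetical and this is not the kernel source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/*
 * Decrement *count under the lock.
 * Returns 1 with the lock still held if the count reached zero,
 * otherwise drops the lock and returns 0.
 */
int dec_and_lock(atomic_int *count, pthread_mutex_t *lock)
{
	assert(count != NULL && lock != NULL);

	pthread_mutex_lock(lock);
	/* atomic_fetch_sub returns the value before the subtraction */
	if (atomic_fetch_sub(count, 1) == 1)
		return 1;	/* caller now owns the lock */
	pthread_mutex_unlock(lock);
	return 0;
}
```

If the corruption still shows up with the simple variant in place, the optimized x86 path is unlikely to be the cause.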

> > > It would seem to me that list corruption on otherwise healthy machines
> > > would only be the result of the inode_lock spinlock not being properly
> > > locked prior to the call to __refile_inode(), but as far as I can tell,
> > > the call to atomic_dec_and_lock() in iput() is doing that properly.
> > >
> > > So I am at a loss. Has anyone else seen this or does anyone have any idea
> > > what routes I should be exploring to fix this problem?
> >
> > The changes between 2.4.25->2.4.26 (which introduce __refile_inode() and
> > the unused_pagecache list) must have something to do with this.
>
> __refile_inode() was introduced in 2.4.25. I'll try 2.4.24 to see if I
> can reproduce there.
>
> Marcelo, you asked for some oopses relating to the problem. Here are some.

Chris, thanks. Would love to see you reproducing the oops with a stock kernel with ext2/3.

> ---
>
> Oops: 0000
> CPU: 0
> EIP: 0010:[<c015b465>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010a93
> eax: 8d12e0e2 ebx: 7b402366 ecx: c40fdf18 edx: d9c92888
> esi: 7b40235e edi: 7b402366 ebp: 000007cf esp: c40fdf10
> ds: 0018 es: 0018 ss: 0018
> Process kswapd (pid: 7, stackpage=c40fd000)
> Stack: e3b85380 000004e5 e3b85388 c9812688 00000002 c25caf40 c0317fd8 00023dd2
> c015b664 00000cb4 c0138f35 00000006 000001d0 ffffffff 000001d0 00000002
> 00000006 000001d0 c0317fd8 c0317fd8 c013936a c40fdf84 000001d0 0000003c
> Call Trace: [<c015b664>] [<c0138f35>] [<c013936a>] [<c01393e2>] [<c0139596>]
> [<c0139608>] [<c0139748>] [<c01396b0>] [<c0105000>] [<c010587e>]
> [<c01396b0>]
> Code: 8b 5b 04 8b 86 1c 01 00 00 a8 38 0f 84 5d 01 00 00 81 fb 08
>
>
> >>EIP; c015b465 <prune_icache+45/220> <=====
>
> >>ecx; c40fdf18 <_end+3d3ffec/38676134>
> >>edx; d9c92888 <_end+198d495c/38676134>
> >>esp; c40fdf10 <_end+3d3ffe4/38676134>
>
> Trace; c015b664 <shrink_icache_memory+24/40>
> Trace; c0138f35 <shrink_cache+185/410>
> Trace; c013936a <shrink_caches+4a/60>
> Trace; c01393e2 <try_to_free_pages_zone+62/f0>
> Trace; c0139596 <kswapd_balance_pgdat+66/b0>
> Trace; c0139608 <kswapd_balance+28/40>
> Trace; c0139748 <kswapd+98/c0>
> Trace; c01396b0 <kswapd+0/c0>
> Trace; c0105000 <_stext+0/0>
> Trace; c010587e <arch_kernel_thread+2e/40>
> Trace; c01396b0 <kswapd+0/c0>
>
> Code; c015b465 <prune_icache+45/220>
> 00000000 <_EIP>:
> Code; c015b465 <prune_icache+45/220> <=====
> 0: 8b 5b 04 mov 0x4(%ebx),%ebx <=====
> Code; c015b468 <prune_icache+48/220>
> 3: 8b 86 1c 01 00 00 mov 0x11c(%esi),%eax
> Code; c015b46e <prune_icache+4e/220>
> 9: a8 38 test $0x38,%al
> Code; c015b470 <prune_icache+50/220>
> b: 0f 84 5d 01 00 00 je 16e <_EIP+0x16e>
> Code; c015b476 <prune_icache+56/220>
> 11: 81 fb 08 00 00 00 cmp $0x8,%ebx
>
> ---
>
> Oops: 0000
> CPU: 0
> EIP: 0010:[<c015b465>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010a97
> eax: 830418da ebx: 5954e741 ecx: c40fdf18 edx: c0318f08
> esi: 5954e739 edi: 5954e741 ebp: 000001b4 esp: c40fdf10
> ds: 0018 es: 0018 ss: 0018
> Process kswapd (pid: 7, stackpage=c40fd000)
> Stack: f718b780 000013a1 f718b788 d04d3988 00000001 c3cfe850 c0317fd8 00023d9e
> c015b664 00001555 c0138f35 00000006 000001d0 ffffffff 000001d0 00000001
> 00000004 000001d0 c0317fd8 c0317fd8 c013936a c40fdf84 000001d0 0000003c
> Call Trace: [<c015b664>] [<c0138f35>] [<c013936a>] [<c01393e2>] [<c0139596>]
> [<c0139608>] [<c0139748>] [<c01396b0>] [<c0105000>] [<c010587e>]
> [<c01396b0>]
> Code: 8b 5b 04 8b 86 1c 01 00 00 a8 38 0f 84 5d 01 00 00 81 fb 08
>
>
> >>EIP; c015b465 <prune_icache+45/220> <=====
>
> >>edx; c0318f08 <inode_unused+0/8>
>
> Trace; c015b664 <shrink_icache_memory+24/40>
> Trace; c0138f35 <shrink_cache+185/410>
> Trace; c013936a <shrink_caches+4a/60>
> Trace; c01393e2 <try_to_free_pages_zone+62/f0>
> Trace; c0139596 <kswapd_balance_pgdat+66/b0>
> Trace; c0139608 <kswapd_balance+28/40>
> Trace; c0139748 <kswapd+98/c0>
> Trace; c01396b0 <kswapd+0/c0>
> Trace; c0105000 <_stext+0/0>
> Trace; c010587e <arch_kernel_thread+2e/40>
> Trace; c01396b0 <kswapd+0/c0>
>
> Code; c015b465 <prune_icache+45/220>
> 00000000 <_EIP>:
> Code; c015b465 <prune_icache+45/220> <=====
> 0: 8b 5b 04 mov 0x4(%ebx),%ebx <=====
> Code; c015b468 <prune_icache+48/220>
> 3: 8b 86 1c 01 00 00 mov 0x11c(%esi),%eax
> Code; c015b46e <prune_icache+4e/220>
> 9: a8 38 test $0x38,%al
> Code; c015b470 <prune_icache+50/220>
> b: 0f 84 5d 01 00 00 je 16e <_EIP+0x16e>
> Code; c015b476 <prune_icache+56/220>
> 11: 81 fb 08 00 00 00 cmp $0x8,%ebx
>
> ---
>
> I think this one was an infinite loop, which I used alt-sysrq-p to
> deduce:
>
> Pid: 7, comm: kswapd
> EIP: 0010:[<c014dd1c>] CPU: 2 EFLAGS: 00000246 Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: c4e4f824 EBX: c4e4f804 ECX: 00000006 EDX: 00000000
> ESI: c4e4f80c EDI: c4e4f804 EBP: 000009ee DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
> line ignored
> CR0: 8005003b CR2: bfffdfb4 CR3: 00101000 CR4: 000006d0
> Call Trace: [<c0167f8e>] [<c01681d4>] [<c013d9d0>] [<c0168254>] [<c013fe69>]
> [<c01404ae>] [<c0140533>] [<c01405b2>] [<c011a6cb>] [<c0140766>] [<c01407d8>]
> [<c0140918>] [<c0140880>] [<c010595e>] [<c0140880>]
> Warning (Oops_read): Code line not seen, dumping what data is available
>
>
> >>EIP; c014dd1c <inode_has_buffers+6c/90> <=====
>
> >>EAX; c4e4f824 <_end+4a6b2f8/3864fb34>
> >>EBX; c4e4f804 <_end+4a6b2d8/3864fb34>
> >>ESI; c4e4f80c <_end+4a6b2e0/3864fb34>
> >>EDI; c4e4f804 <_end+4a6b2d8/3864fb34>
>
> Trace; c0167f8e <prune_icache+7e/320>
> Trace; c01681d4 <prune_icache+2c4/320>
> Trace; c013d9d0 <kmem_cache_shrink+70/c0>
> Trace; c0168254 <shrink_icache_memory+24/40>
> Trace; c013fe69 <shrink_cache+1d9/6d0>
> Trace; c01404ae <refill_inactive+14e/160>
> Trace; c0140533 <shrink_caches+73/90>
> Trace; c01405b2 <try_to_free_pages_zone+62/f0>
> Trace; c011a6cb <schedule+34b/5f0>
> Trace; c0140766 <kswapd_balance_pgdat+66/b0>
> Trace; c01407d8 <kswapd_balance+28/40>
> Trace; c0140918 <kswapd+98/c0>
> Trace; c0140880 <kswapd+0/c0>
> Trace; c010595e <arch_kernel_thread+2e/40>
> Trace; c0140880 <kswapd+0/c0>
>
> ---
>
> Another infinite loop (alt-sysrq-p):
>
> Pid: 7, comm: kswapd
> EIP: 0010:[<c0167bc8>] CPU: 3 EFLAGS: 00000206 Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: 00000001 EBX: ebe7f40c ECX: 00000002 EDX: 00000000
> ESI: ebe7f60c EDI: ebe7f604 EBP: c2851ee4 DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
> line ignored
> CR0: 8005003b CR2: 40014000 CR3: 00101000 CR4: 000006d0
> Call Trace: [<c013d7e2>] [<c0167eb7>] [<c013fd05>] [<c0140360>] [<c01403e2>]
> [<c0140462>] [<c0140638>] [<c01406a8>] [<c01407fa>] [<c0140760>] [<c010596e>]
> [<c0140760>]
>
> >>EIP; c0167bc8 <prune_icache+78/340> <=====
>
> Trace; c013d7e2 <kmem_cache_shrink+72/c0>
> Trace; c0167eb7 <shrink_icache_memory+27/40>
> Trace; c013fd05 <shrink_cache+1d5/6d0>
> Trace; c0140360 <refill_inactive+160/170>
> Trace; c01403e2 <shrink_caches+72/90>
> Trace; c0140462 <try_to_free_pages_zone+62/f0>
> Trace; c0140638 <kswapd_balance_pgdat+78/c0>
> Trace; c01406a8 <kswapd_balance+28/40>
> Trace; c01407fa <kswapd+9a/c0>
> Trace; c0140760 <kswapd+0/c0>
> Trace; c010596e <arch_kernel_thread+2e/40>
> Trace; c0140760 <kswapd+0/c0>
>
> ---
>
> Infinite loop (alt-sysrq-p):
>
> Pid: 9, comm: kupdated
> EIP: 0010:[<c0168df3>] CPU: 2 EFLAGS: 00000282 Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: f7bea000 EBX: f7bea000 ECX: f7bebfd0 EDX: 00000000
> ESI: f7bebfc8 EDI: f7bebfd8 EBP: f7bebf10 DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
> line ignored
> CR0: 8005003b CR2: 40013090 CR3: 00101000 CR4: 000006d0
> Call Trace: [<c011a1b2>] [<c0151333>] [<c015185e>] [<c0107a42>]
> [<c010596e>] [<c0151690>]
>
> >>EIP; c0168df3 <.text.lock.inode+69/256> <=====
>
> >>EAX; f7bea000 <_end+377ffb74/38649bd4>
> >>EBX; f7bea000 <_end+377ffb74/38649bd4>
> >>ECX; f7bebfd0 <_end+37801b44/38649bd4>
> >>ESI; f7bebfc8 <_end+37801b3c/38649bd4>
> >>EDI; f7bebfd8 <_end+37801b4c/38649bd4>
> >>EBP; f7bebf10 <_end+37801a84/38649bd4>
>
> Trace; c011a1b2 <schedule_timeout+62/b0>
> Trace; c0151333 <sync_old_buffers+53/160>
> Trace; c015185e <kupdate+1ce/230>
> Trace; c0107a42 <ret_from_fork+6/20>
> Trace; c010596e <arch_kernel_thread+2e/40>
> Trace; c0151690 <kupdate+0/230>
>
> ---
>
> Infinite loop (alt-sysrq-p):
>
> Pid: 7, comm: kswapd
> EIP: 0010:[<c0167be0>] CPU: 3 EFLAGS: 00000202 Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: 00000020 EBX: dcbc120c ECX: 00000086 EDX: daae3044
> ESI: dcbc120c EDI: dcbc1204 EBP: c2851ee4 DS: 0018 ES: 0018
> Warning (Oops_set_regs): garbage 'DS: 0018 ES: 0018' at end of register
> line ignored
> CR0: 8005003b CR2: 0807f260 CR3: 00101000 CR4: 000006d0
> Call Trace: [<c013d7e2>] [<c0167ed7>] [<c013fd05>] [<c0140360>] [<c01403e2>]
> [<c011a55b>] [<c0140462>] [<c0140638>] [<c01406a8>] [<c01407fa>] [<c0140760>]
> [<c010596e>] [<c0140760>]
>
> >>EIP; c0167be0 <prune_icache+90/360> <=====
>
> >>EBX; dcbc120c <_end+1c7d6d80/38649bd4>
> >>EDX; daae3044 <_end+1a6f8bb8/38649bd4>
> >>ESI; dcbc120c <_end+1c7d6d80/38649bd4>
> >>EDI; dcbc1204 <_end+1c7d6d78/38649bd4>
> >>EBP; c2851ee4 <_end+2467a58/38649bd4>
>
> Trace; c013d7e2 <kmem_cache_shrink+72/c0>
> Trace; c0167ed7 <shrink_icache_memory+27/40>
> Trace; c013fd05 <shrink_cache+1d5/6d0>
> Trace; c0140360 <refill_inactive+160/170>
> Trace; c01403e2 <shrink_caches+72/90>
> Trace; c011a55b <schedule+34b/5f0>
> Trace; c0140462 <try_to_free_pages_zone+62/f0>
> Trace; c0140638 <kswapd_balance_pgdat+78/c0>
> Trace; c01406a8 <kswapd_balance+28/40>
> Trace; c01407fa <kswapd+9a/c0>
> Trace; c0140760 <kswapd+0/c0>
> Trace; c010596e <arch_kernel_thread+2e/40>
> Trace; c0140760 <kswapd+0/c0>