2008-08-03 12:50:35

by Hong Tran Duc

Subject: Oops when read/write or mount/unmount continuously ~ 600.000 times

Hi all,

I'm using kernel 2.4.20 with full preemption enabled (patched, with the
CONFIG option set). My CPU is a PowerPC 750FX, with an 80 GB HDD and 512 MB of RAM.

I get many Oopses when I mount/unmount or read/write on the ATA HDD
continuously, about 600,000 times over several hours. The Oops usually occurs
when the CPU traps with SIGSEGV or SIGILL, sometimes in the page management
code, sometimes in the scheduler, block I/O handling, or the filesystem.

The most frequent locations are:
Block I/O: make_request, generic_make_request, submit_bh, bdfind, bmap,
__wait_on_buffer ...
Filesystem: journal_commit_transaction, kill_super, invalidate_inode,
invalidate_list ...

The cause is almost always a broken linked list in those functions. E.g.
linkedlist->next or linkedlist->prev is NULL or set to an invalid address.
In the SIGILL case, the instruction pointer (NIP) is the same as the
return address register (LR).

The most recent Oops was in __wait_on_buffer(). The main steps of
__wait_on_buffer() are:
1. get_bh -> atomic_inc(bh->b_count);
2. add to the wait queue
3. loop: do some task manipulation, call *schedule()*
4. remove from the wait queue
5. put_bh -> atomic_dec(bh->b_count); *<- Oops here, SEGV because
the address of bh->b_count (r25) = 0x02 *

After analyzing the assembly code at this point (uploaded to the pastebin
below), I found that:
* At step (1), the address of bh->b_count is stored in register r25.
* From step (2) to (4), any code that uses r25 restores it from the stack
afterwards (r25 is a non-volatile register in the gcc ABI).
* At step 5, *r25 = 0x02 ??? I don't know why r25 changed. Maybe the
stack was corrupted somewhere?*
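
To make the sequence concrete, here is a minimal userspace sketch of the five steps. It is only a model under assumptions: struct mock_bh, mock_schedule() and wait_on_buffer_model() are hypothetical names, the wait queue calls are reduced to comments, and the counter is a plain int rather than the kernel's atomic_t. In the 2.4 sources, get_bh() is the helper that increments b_count and put_bh() is the one that decrements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct buffer_head. */
struct mock_bh {
    int b_count;   /* reference count (atomic_t in the kernel) */
    int locked;    /* 1 while I/O on the buffer is in flight */
};

static void get_bh(struct mock_bh *bh) { bh->b_count++; }   /* step 1 */
static void put_bh(struct mock_bh *bh) { bh->b_count--; }   /* step 5 */

/* Stand-in for schedule(): in this model the I/O simply completes. */
static void mock_schedule(struct mock_bh *bh) { bh->locked = 0; }

/* Models __wait_on_buffer(); returns how many times the loop "slept". */
static int wait_on_buffer_model(struct mock_bh *bh)
{
    int sleeps = 0;

    assert(bh != NULL);
    get_bh(bh);                 /* 1. pin the buffer */
    /* 2. add_wait_queue() would go here */
    while (bh->locked) {        /* 3. sleep until the buffer is unlocked */
        mock_schedule(bh);
        sleeps++;
    }
    /* 4. remove_wait_queue() would go here */
    put_bh(bh);                 /* 5. unpin; bh must still be the same pointer */
    return sleeps;
}
```

The point of the model is that the pointer used in step 5 has to be the same value that was used in step 1. In the compiled kernel code that value lives in r25 across schedule(), so a bad value at step 5 means the saved copy of the register was overwritten somewhere in between.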

This Oops is very difficult to reproduce (it takes about 2 hours of running
the stress test program). I tried increasing and reducing HZ by a factor of 10,
but the frequency of the bug did not change. I also tried ext2 and ext3, with
the same result.

I'm really confused now. I don't know where the real problem is, what
affects the frequency of the Oops, or how to debug and track down
this bug.

I'm posting my situation to this ML in the hope of getting some advice from you.

I uploaded some of the Oopses to pastebin here:
http://vnoss.net/p/5783
http://vnoss.net/p/5785

Source and assembly of __wait_on_buffer():
http://vnoss.net/p/5784


Thanks for your help,

--
nm.

GPG Key ID: 0xDD253B25
Fingerprint: 2B17 D64A 9561 A443 2ABC 1302 4641 D0B7 DD25 3B25


2008-08-03 13:38:49

by Matthew Wilcox

Subject: Re: Oops when read/write or mount/unmount continuously ~ 600.000 times

On Sun, Aug 03, 2008 at 07:49:50PM +0700, Hong Tran Duc wrote:
> I'm using kernel 2.4.20 with fully preemptive enable (patch & set the
> CONFIG option). My CPU is PowerPC 750FX, HDD 80GB, RAM 512,

2.4.20 was released in November 2002; almost 6 years ago. I don't think
you're going to find too many people interested in helping you debug
this. If you can reproduce the problem with something more recent (say
2.6.26 or even 2.4.36.6 if you can't use 2.6 for whatever reason), then
I think people will be more interested.

> The reasons is almost linked list on those function was broken. Ex:
> linkedlist->next linkedlist->prev = NULL or set to invalid address.
> In the situation SIGILL, the instruction pointer (NIP) is same as the
> return address register (LR).

In later kernels, we have a list debugging option which lets you find
list corruptions earlier.
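
As a rough illustration of what such a check does (in 2.6 kernels the option is CONFIG_DEBUG_LIST), here is a userspace sketch. It is not the kernel's code: list_entry_is_sane() and list_add_checked() are hypothetical names, and the real option reports the corruption with a warning instead of returning an error code. The idea is just to verify that an entry's neighbours point back at it before the list is modified.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's doubly linked, circular list node. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Returns 1 if both neighbours point back at entry, 0 if the links are broken. */
static int list_entry_is_sane(const struct list_head *entry)
{
    if (entry == NULL || entry->next == NULL || entry->prev == NULL)
        return 0;
    return entry->next->prev == entry && entry->prev->next == entry;
}

/* Insert node right after head; returns 0 on success, -1 if the list is already corrupt. */
static int list_add_checked(struct list_head *node, struct list_head *head)
{
    if (!list_entry_is_sane(head))
        return -1;              /* refuse to touch a damaged list */
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
    return 0;
}
```

With a check like this on every add and delete, a NULL or stray next/prev pointer is caught at the first list operation after the damage, which is usually much closer to the culprit than the eventual Oops.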

> The newest Oops, I got on function __wait_on_buffer(). The main
> sequences of __wait_on_buffer() are:
> 1. put_bh -> atomic_inc(bh->b_count);
> 2. add wait queue
> 3. loop: do some thing task manipulation, call *schedule()*
> 4. remove wait queue
> 5. get_bh -> atomic_dec(bh->b_count); *<- Got Oops here, SEGV because
> bh->b_count = R25 = 0x02 *
>
> After analysis assembly code (I upload on pastebin bellow) at this
> point, I found that:
> * At the point (1) -> address of bh->b_count stored in register r25.
> * The point from (2) ->(4) all of affect to register 25 will be restored
> from stack (r25 act as non violent register in gcc ABI).
> * An step 5, *r25 = 0x02 ??? I don't know why r25 is changed ? May be
> stack on somewhere was corrupted ?*

The implementation of __wait_on_buffer has completely changed since
then. It's probably not worth trying to debug this.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2008-08-03 15:19:20

by Hong Tran Duc

Subject: Re: Oops when read/write or mount/unmount continuously ~ 600.000 times

Matthew Wilcox wrote:
> On Sun, Aug 03, 2008 at 07:49:50PM +0700, Hong Tran Duc wrote:
>
>> The reasons is almost linked list on those function was broken. Ex:
>> linkedlist->next linkedlist->prev = NULL or set to invalid address.
>> In the situation SIGILL, the instruction pointer (NIP) is same as the
>> return address register (LR).
>>
>
> In later kernels, we have a list debugging option which lets you find
> list corruptions earlier.
>
I don't have much experience with the Linux kernel architecture, so I don't
know where to focus.

Currently, I suspect these modules are involved in this Oops:
block I/O management, the filesystem, and perhaps wait queues. Is that correct?
Or could you suggest what is most suspicious, or tell me more about
the debugging option you mentioned above?

Thanks for your help,


--
nm.

GPG Key ID: 0xDD253B25
Fingerprint: 2B17 D64A 9561 A443 2ABC 1302 4641 D0B7 DD25 3B25