2004-01-16 02:59:17

by Tsuchiya Yoshihiro

Subject: Re: filesystem bug?


Hi Stephen,

>Now, I can't tell from this whether it's a bash bug or an exit/signal
> bug, but it doesn't look like a filesystem problem for now. I'm going
> to try with a different shell to see if that helps.

I tried with /bin/zsh, and it seems you are right. The script
has been running fine for about 2 hours.

So next I will try to track down the EIO (inode corruption) problem.

Thank you so much,

Yoshi

--
Yoshihiro Tsuchiya




2004-01-16 12:29:50

by Stephen C. Tweedie

Subject: Re: filesystem bug?

Hi,

On Fri, 2004-01-16 at 02:59, Tsuchiya Yoshihiro wrote:

> I tried with /bin/zsh, and it seems you are right. The script
> has been running fine for about 2 hours.

Thank you for checking.

> So next I will try to track down the EIO (inode corruption) problem.

OK. Under exactly what circumstances have you seen this in the past, as
opposed to the other problem? I have not been able to reproduce this
one so far.

Cheers,
Stephen

2004-01-19 07:53:02

by Tsuchiya Yoshihiro

Subject: Re: filesystem bug?

Hello,

Stephen C. Tweedie wrote:

>OK. Under exactly what circumstances have you seen this in the past, as
>opposed to the other problem? I have not been able to reproduce this
>one so far.
>
>

The combinations of kernel version and filesystem type where I have seen it are:
2.4.20-8 ext2
2.4.20-19.9 ext2, ext3
2.4.20-24.9 ext2
2.4.20-28.9 ext2

I run the test with mozilla-1.3.tar.gz and 6 processes in the script;
with ext2 the problem appears within a few hours.
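
A test of this shape might look roughly like the sketch below. The script itself is not shown in this thread, so the workload here (each process repeatedly unpacking the tarball into its own directory and then deleting the tree) is an assumption, and churn and run_stress are hypothetical names.

```python
import multiprocessing
import os
import shutil
import tarfile


def churn(tarball, workdir, iterations):
    """Unpack and delete the tarball repeatedly; return the first OSError as text, or None."""
    for _ in range(iterations):
        try:
            os.makedirs(workdir, exist_ok=True)
            with tarfile.open(tarball) as archive:
                archive.extractall(workdir)
            # Removing the tree exercises unlink/rmdir on every extracted inode.
            shutil.rmtree(workdir)
        except OSError as exc:
            # An EIO from a corrupted inode would surface here.
            return f"{workdir}: {exc}"
    return None


def run_stress(tarball, base_dir, processes=6, iterations=100):
    """Run the workers in parallel, each in its own subdirectory of base_dir."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX system with fork
    jobs = [
        (tarball, os.path.join(base_dir, f"worker{i}"), iterations)
        for i in range(processes)
    ]
    with ctx.Pool(processes) as pool:
        results = pool.starmap(churn, jobs)
    # Keep only the workers that reported an error.
    return [result for result in results if result is not None]
```

run_stress returns one message per worker that hit an error, so an empty list means the run completed cleanly. The defaults of 6 processes match the description above; the iteration count is arbitrary.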

I haven't seen the problem on 2.4.20, 2.4.23, or 2.4.24.

So now I am testing the following:
2.4.24-pre2 ext2 (mozilla-1.3.tar.gz)
2.4.24 ext2 (nvi-1.79.tar.gz)
2.4.20 ext3 (mozilla-1.3.tar.gz)
2.4.23 ext3 (mozilla-1.3.tar.gz)
2.4.24 ext3 (mozilla-1.3.tar.gz)
2.4.20-28.9 ext3 (mozilla-1.3.tar.gz)

Apart from 2.4.20-28.9, they have been running for three days
and look fine so far.

What exactly is the race condition between read_inode() and
clear_inode() that you mentioned?

Thanks,
Yoshi
--
Yoshihiro Tsuchiya


2004-01-19 13:12:27

by Stephen C. Tweedie

Subject: Re: filesystem bug?

Hi,

On Mon, 2004-01-19 at 07:52, Tsuchiya Yoshihiro wrote:

> >OK. Under exactly what circumstances have you seen this in the past, as
> >opposed to the other problem? I have not been able to reproduce this
> >one so far.

> Apart from 2.4.20-28.9, they have been running for three days
> and look fine so far.
>
> What exactly is the race condition between read_inode() and
> clear_inode() that you mentioned?

This one:

http://linux.bkbits.net:8080/linux-2.4/[email protected]

Cheers,
Stephen

2004-01-20 08:36:14

by Tsuchiya Yoshihiro

Subject: Re: filesystem bug?

Stephen C. Tweedie wrote:

>>Apart from 2.4.20-28.9, they have been running for three days
>>and look fine so far.
>>
>>What exactly is the race condition between read_inode() and
>>clear_inode() that you mentioned?
>
>This one:
>
>http://linux.bkbits.net:8080/linux-2.4/[email protected]

Thank you. I don't think this one explains all of my problem:
1. The corrupted inode was still in the parent directory. That is
strange, because unlink removes the directory entry first and only then
does iput delete the inode.

2. Sometimes i_nlink was 0 and i_dtime was set, which I think is
related to this patch. At other times, part of an inode block was
zeroed out, and I do not understand how that could happen at all.
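
The first symptom above (a name still present in the parent directory whose inode cannot be read) can be looked for after a test run with a small scanner. This is only a sketch: find_bad_entries and its stat_fn parameter are hypothetical names, and it assumes the corruption shows up as an error such as EIO from lstat().

```python
import errno
import os
import stat


def find_bad_entries(root, stat_fn=os.lstat):
    """Return (path, errno) pairs for directory entries whose inode cannot be read."""
    bad = []
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            names = os.listdir(directory)
        except OSError as exc:
            # The directory itself could not be listed.
            bad.append((directory, exc.errno))
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                info = stat_fn(path)
            except OSError as exc:
                # The name is still in the parent directory, but reading
                # its inode failed (EIO is the symptom discussed here).
                bad.append((path, exc.errno))
                continue
            if stat.S_ISDIR(info.st_mode):
                pending.append(path)
    return bad
```

Running find_bad_entries on the mount point of the test filesystem lists every entry that is still linked but unreadable, and comparing each reported errno against errno.EIO separates the inode corruption case from ordinary permission errors. The stat_fn parameter exists only so that the failure path can be exercised without a damaged filesystem.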

Thank you,
Yoshi
--
Yoshihiro Tsuchiya


2004-01-20 16:27:40

by Stephen C. Tweedie

Subject: Re: filesystem bug?

Hi,

On Tue, 2004-01-20 at 08:36, Tsuchiya Yoshihiro wrote:

> >http://linux.bkbits.net:8080/linux-2.4/[email protected]

> 2. Sometimes i_nlink was 0 and i_dtime was set, which I think is
> related to this patch. At other times, part of an inode block was
> zeroed out, and I do not understand how that could happen at all.

Yep. I'd really need to see exactly which kernel versions these
specific problems reproduce on to take this much further, though. I'll
be travelling for the next week and a half, but I'll look for more
results once I'm back from that.

Thanks,
Stephen