2002-01-01 13:39:03

by Ville Herva

[permalink] [raw]
Subject: Ext2 groups descriptor corruption in 2.2 - Phillips's patch seems to work

A while ago Daniel Phillips posted a patch for 2.4.0-test11 to cure groups
descriptor corruption in ext2.

The corruption had been seen with 2.2 and 2.0 for a long time. It happens
most likely in situations where there are many hard links to same inode (see
first report below).

I gather this fix has gone into 2.4 mainline in different form (see Al
Viro's fix below). It never went to 2.2 (nor 2.0) afaik.

The fix applies pretty cleanly to 2.2.20, and has been tested by me and
Samuli K?rkk?inen (again, see first problem report) for about a month now.
Samuli reports that he had been able to reproduce the bug in about 10-15
backup runs with unpatched 2.2.18. With the patch applied, it hasn't
appeared in well over 20 runs. My experience is similar with 2.2.20.

I haven't tested the patch on 2.0, but the problem seems to exists in 2.0 as
well. Taking a quick look, it seems the patch can be applied to 2.0.40pre3
with minor tweaking (s/ext2_get_group_desc/get_group_desc/g, remove
le16_to_cpu() from patch, and perhaps the add the additional get_group_desc
return value error checking to 2.0 ialloc.c that 2.2 has.)

Alan, David, perhaps this would be worth applying to next 2.2/2.0 pre? Al,
Daniel, what do you think? Should the patch be tidied up before 2.2
inclusion (as was done for 2.4)?

Daniel Phillips's patch (2000-11-30 0:03:37):
http://marc.theaimsgroup.com/?l=linux-kernel&m=97554270404066&w=2

To which Al Viro replied (2000-11-30 0:33:30):
http://marc.theaimsgroup.com/?l=linux-kernel&m=97554444306920&w=2
"Yes, it is. Moreover, correct solution is slightly different and changes
ext2_get_group_desc() semantics. Wait until tomorrow, OK?"

Al Viro's fix (2000-11-30 22:13:27):
http://marc.theaimsgroup.com/?l=linux-kernel&m=97562262616388&w=2
"search for appropriate cylinder group had been taken out of the
ext2_new_inode() into helper functions - find_cg_dir(sb, parent_group) and
find_cg_other(sb, parent_group). Bug caught by Daniel (wrong bh being
dirtied when we update the free inodes counter) - fixed."


Some corruption reports:
http://marc.theaimsgroup.com/?l=linux-kernel&m=98882213127507&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=99730849009286&w=2
http://www.linuxsa.org.au/mailing-list/2000-04/309.html
http://groups.google.com/groups?q=ext2_new_inode:+Free+inodes+count+corrupted+in+group&hl=en&rnum=1&selm=734g9b%24dkd%241%40coranto.ucs.mun.ca
http://groups.google.com/groups?q=ext2_check_inodes_bitmap:+Wrong+free+inodes+count&start=20&hl=en&rnum=21&selm=linux.kernel.20000707120824.26346.qmail%40yusufg.portal2.com


-- v --

[email protected]


2002-01-02 11:39:45

by Daniel Phillips

[permalink] [raw]
Subject: Re: Ext2 groups descriptor corruption in 2.2 - Phillips's patch seems to work

On January 1, 2002 02:28 pm, Ville Herva wrote:
> A while ago Daniel Phillips posted a patch for 2.4.0-test11 to cure groups
> descriptor corruption in ext2.
>
> The corruption had been seen with 2.2 and 2.0 for a long time. It happens
> most likely in situations where there are many hard links to same inode (see
> first report below).
>
> I gather this fix has gone into 2.4 mainline in different form (see Al
> Viro's fix below). It never went to 2.2 (nor 2.0) afaik.
>
> The fix applies pretty cleanly to 2.2.20, and has been tested by me and
> Samuli K?rkk?inen (again, see first problem report) for about a month now.
> Samuli reports that he had been able to reproduce the bug in about 10-15
> backup runs with unpatched 2.2.18. With the patch applied, it hasn't
> appeared in well over 20 runs. My experience is similar with 2.2.20.
>
> I haven't tested the patch on 2.0, but the problem seems to exists in 2.0 as
> well. Taking a quick look, it seems the patch can be applied to 2.0.40pre3
> with minor tweaking (s/ext2_get_group_desc/get_group_desc/g, remove
> le16_to_cpu() from patch, and perhaps the add the additional get_group_desc
> return value error checking to 2.0 ialloc.c that 2.2 has.)
>
> Alan, David, perhaps this would be worth applying to next 2.2/2.0 pre? Al,
> Daniel, what do you think? Should the patch be tidied up before 2.2
> inclusion (as was done for 2.4)?

Since you have done such a thorough job of documenting the whole thing, why
not drop the other shoe and submit the patches?

--
Daniel

2002-01-02 11:43:45

by Alan

[permalink] [raw]
Subject: Re: Ext2 groups descriptor corruption in 2.2 - Phillips's patch seems to work

> Since you have done such a thorough job of documenting the whole thing, why
> not drop the other shoe and submit the patches?

He did 8)

I've asked him to run them past Stephen and the other ext2 folks

2002-01-04 00:03:02

by Andreas Dilger

[permalink] [raw]
Subject: Re: Ext2 groups descriptor corruption in 2.2 - Phillips's patch seems to work

On Jan 02, 2002 11:53 +0000, Alan Cox wrote:
> > Since you have done such a thorough job of documenting the whole thing, why
> > not drop the other shoe and submit the patches?
>
> He did 8)
>
> I've asked him to run them past Stephen and the other ext2 folks

Well, nobody else has spoken up on this, so I might as well. I remember
when Daniel originally posted this fix, and at the time it was the right
thing to do. Al's patch fixed more than Daniel's did, but I think it is
too much change to be adding to 2.2, so we should use the minimal fix
from Daniel, unless there are objections.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-01-04 07:47:42

by Ville Herva

[permalink] [raw]
Subject: Re: Ext2 groups descriptor corruption in 2.2 - Phillips's patch seems to work

On Thu, Jan 03, 2002 at 05:00:32PM -0700, you [Andreas Dilger] said:
> On Jan 02, 2002 11:53 +0000, Alan Cox wrote:
> > > Since you have done such a thorough job of documenting the whole thing, why
> > > not drop the other shoe and submit the patches?
> >
> > He did 8)
> >
> > I've asked him to run them past Stephen and the other ext2 folks
>
> Well, nobody else has spoken up on this, so I might as well. I remember
> when Daniel originally posted this fix, and at the time it was the right
> thing to do. Al's patch fixed more than Daniel's did,

"More", in what way? Fixed more problem cases or just better in sense that
he refromatted and splitted the long ext2_new_inode() function?

> but I think it is too much change to be adding to 2.2, so we should use
> the minimal fix from Daniel, unless there are objections.

Ok. The minimal approach is what I was after, too :).

Do you have any opionion about 2.0? The problem has been spotted on 2.0.39,
too, and the 2.0.40pre3 ext2_new_inode() is almost identical to 2.2 one
barring few trivialish changes. We plan to test the patch some more on 2.0
(we already know it compiles and boots, btw ;), but it would be nice if some
one who has played with 2.0 ext2 would give a "go ahead".


-- v --

[email protected]