2007-05-09 21:09:58

by Jeremy Fitzhardinge

Subject: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

I've had a couple of instances of a linux-2.6 mercurial repo getting
corrupted in some odd way this morning. It looks like files are being
truncated; not to size 0, but losing something off the end.

This is on an xfs filesystem. I haven't had any crashes/oops, and I
don't think it's the normal files getting filled with 0 problem. I saw
this before the most recent set of xfs updates, but it happened again
afterwards too.

Mercurial uses a strictly append-only model for updating its repo files,
but it looks like maybe an append operation didn't stick.

I'm repulling a fresh copy of the repo; I'll be able to compare
before/after. Update: yep, definitely truncated:

$ ls -l .hg-new/store/data/_documentation/pi-futex.txt.i .hg-broken/store/data/_documentation/pi-futex.txt.i
4 -rw-rw-r-- 1 jeremy jeremy 3309 May 9 09:43 .hg-broken/store/data/_documentation/pi-futex.txt.i
4 -rw-rw-r-- 1 jeremy jeremy 3797 May 9 13:38 .hg-new/store/data/_documentation/pi-futex.txt.i

also
3476 -rw-rw-r-- 1 jeremy jeremy 3558208 May 9 13:55 00manifest.i
3476 -rw-rw-r-- 1 jeremy jeremy 3555200 May 9 09:41 00manifest.i~


where 00manifest.i~ is the broken one. The files are identical up to the
truncation point.
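
One way to check the "identical up to the truncation point" claim is to compare the short file against the same number of leading bytes of the long one. This is only a minimal sketch: `is_truncated_copy` is a hypothetical helper name, not something from Mercurial or xfsprogs, and it assumes `head -c` is available (as in GNU coreutils).

```shell
# is_truncated_copy SHORT LONG
# Succeeds if SHORT is strictly smaller than LONG and matches LONG's
# leading bytes exactly, i.e. SHORT looks like a pure truncation of LONG.
is_truncated_copy() {
    short=$1
    long=$2
    # $(( ... )) strips any padding that wc may print around the number.
    short_size=$(($(wc -c < "$short")))
    long_size=$(($(wc -c < "$long")))
    [ "$short_size" -lt "$long_size" ] || return 1
    head -c "$short_size" "$long" | cmp -s - "$short"
}

# Possible use with the files listed above:
#   is_truncated_copy 00manifest.i~ 00manifest.i && echo "pure truncation"
```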

The repo passed "hg verify" just after I pulled it, so this corruption
came about after a while.

Hm, the other possibility is that nlinks is being misreported. When
cloning a repo, mercurial will generally hard-link files where possible,
and then break the link if it sees nlink > 1. If xfs is mis-reporting
the link count, then this will cause havoc. Is that possible? Seems
unlikely, but it would also explain the symptoms. I just did a linking
clone with an older kernel, and the link count is as expected.
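
The link-breaking idea can be sketched roughly as follows. This is not Mercurial's actual code; `link_count` and `break_link` are hypothetical helpers that only illustrate the principle. If the reported link count were wrongly 1, the copy step would be skipped and a later write would land in the file shared with the other clone.

```shell
# link_count FILE: print the hard-link count (second field of `ls -ld`).
link_count() {
    ls -ld "$1" | awk '{print $2}'
}

# break_link FILE: if FILE has more than one link, replace it with a
# private copy so that later writes cannot affect the other names.
break_link() {
    file=$1
    if [ "$(link_count "$file")" -gt 1 ]; then
        tmp="$file.tmp.$$"
        # Copy to a new inode, then rename it over the original name.
        cp -p "$file" "$tmp" && mv -f "$tmp" "$file"
    fi
}
```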

xfs_check passes without any output, which I presume is good.

J


2007-05-09 21:56:08

by Matt Mackall

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Wed, May 09, 2007 at 02:09:50PM -0700, Jeremy Fitzhardinge wrote:
> I've had a couple of instances of a linux-2.6 mercurial repo getting
> corrupted in some odd way this morning. It looks like files are being
> truncated; not to size 0, but losing something off the end.
>
> This is on an xfs filesystem. I haven't had any crashes/oops, and I
> don't think it's the normal files getting filled with 0 problem. I saw
> this before the most recent set of xfs updates, but it happened again
> afterwards too.
>
> Mercurial uses a strictly append-only model for updating its repo files,
> but it looks like maybe an append operation didn't stick.

(Unless you're using the mq extension, which regularly truncates
files. But you're definitely the first person to run into this sort of
thing in any case.)

> I'm repulling a fresh copy of the repo; I'll be able to compare
> before/after. Update: yep, definitely truncated:
>
> $ ls -l .hg-new/store/data/_documentation/pi-futex.txt.i .hg-broken/store/data/_documentation/pi-futex.txt.i
> 4 -rw-rw-r-- 1 jeremy jeremy 3309 May 9 09:43 .hg-broken/store/data/_documentation/pi-futex.txt.i
> 4 -rw-rw-r-- 1 jeremy jeremy 3797 May 9 13:38 .hg-new/store/data/_documentation/pi-futex.txt.i
>
> also
> 3476 -rw-rw-r-- 1 jeremy jeremy 3558208 May 9 13:55 00manifest.i
> 3476 -rw-rw-r-- 1 jeremy jeremy 3555200 May 9 09:41 00manifest.i~
>
> where 00manifest.i~ is the broken one. The files are identical up to the
> truncation point.
>
> The repo passed "hg verify" just after I pulled it, so this corruption
> came about after a while.
>
> Hm, the other possibility is that nlinks is being misreported.

I think if the files are identical up to the truncation point, we can
rule out nlink concerns. If for some reason Mercurial's COW logic got
fooled, the result would be a bit of a jumble at the end. And you'd be
unlikely to hit any sort of race as a single user on a laptop.

Can you use hg debugindex to determine if the truncation point
corresponds with a whole delta or whether it's in the middle of a
delta?

For noninterleaved revlogs like your manifest above, the .i file is an
index built from a collection of 64-byte entries. It's pretty hard to
imagine how you'd lose 8 bytes since all I/O is done in multiples of
64 bytes. Oh, I'm misreading that - they differ by 3008 bytes, which
is 47 * 64.
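
The arithmetic can be checked directly from the two sizes in the `ls -l` output. A small sketch, with the 64-byte entry size taken from the description above:

```shell
good=3558208      # 00manifest.i
broken=3555200    # 00manifest.i~ (the truncated copy)
entry=64          # bytes per index entry

lost=$((good - broken))
echo "lost $lost bytes = $((lost / entry)) entries, remainder $((lost % entry))"
# prints: lost 3008 bytes = 47 entries, remainder 0
```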

--
Mathematics is the supreme nostalgia of our time.

2007-05-09 22:18:22

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Matt Mackall wrote:
>> Mercurial uses a strictly append-only model for updating its repo files,
>> but it looks like maybe an append operation didn't stick.
>>
>
> (Unless you're using the mq extension, which regularly truncates
> files. But you're definitely the first person to run into this sort of
> thing in any case.)
>

Which I am, extensively, but not on the repo that got damaged. That's
why I was wondering about the nlink issues. If I qpop a bunch of
patches after just pushing them, won't it simply truncate the file?

The repo which got damaged is the one I pull kernel.org/linux-2.6 into,
and use it as a source for clone/pull into my actual working repos.

> I think if the files are identical up to the truncation point, we can
> rule out nlink concerns. If for some reason Mercurial's COW logic got
> fooled, the result would be a bit of a jumble at the end. And you'd be
> unlikely to hit any sort of race as a single user on a laptop.
>

OK. It looks

> Can you use hg debugindex to determine if the truncation point
> corresponds with a whole delta or whether it's in the middle of a
> delta?
>

Appears to be on a delta boundary, exactly 47 revisions short:

55547 201166570 72 55486 55561 fec059c8328e 4edafd81aa44 000000000000
55548 201166642 72 55486 55562 96c054b45299 fec059c8328e 000000000000
55549 201166714 108 55486 55563 93316f167674 96c054b45299 000000000000

vs

55594 201568678 78 55486 55608 38c05617f083 5fe2b556c548 000000000000
55595 201568756 68 55486 55609 4b89463f4bdd 38c05617f083 000000000000
55596 201568824 81 55486 55610 bcdb471d4ba1 4b89463f4bdd 000000000000
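
The last revision numbers above agree with the file sizes, assuming one 64-byte index entry per revision and revisions numbered from 0. A quick sketch (`last_rev` is just an illustrative helper name):

```shell
# last_rev SIZE: last revision number held by an index file of SIZE bytes,
# assuming 64-byte entries and revision numbering that starts at 0.
last_rev() {
    echo $(( $1 / 64 - 1 ))
}

echo "broken: last revision $(last_rev 3555200)"   # 55549
echo "good:   last revision $(last_rev 3558208)"   # 55596
```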


> For noninterleaved revlogs like your manifest above, the .i file is an
> index built from a collection of 64-byte entries. It's pretty hard to
> imagine how you'd lose 8 bytes since all I/O is done in multiples of
> 64 bytes. Oh, I'm misreading that - they differ by 3008 bytes, which
> is 47 * 64.
>

Yep.

J

2007-05-09 22:44:46

by Matt Mackall

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Wed, May 09, 2007 at 03:17:58PM -0700, Jeremy Fitzhardinge wrote:
> Matt Mackall wrote:
> >> Mercurial uses a strictly append-only model for updating its repo files,
> >> but it looks like maybe an append operation didn't stick.
> >>
> >
> > (Unless you're using the mq extension, which regularly truncates
> > files. But you're definitely the first person to run into this sort of
> > thing in any case.)
> >
>
> Which I am, extensively, but not on the repo that got damaged. That's
> why I was wondering about the nlink issues. If I qpop a bunch of
> patches after just pushing them, won't it simply truncate the file?

Yep. But it will break links before doing that. Basically all opens go
through a function that breaks links.

--
Mathematics is the supreme nostalgia of our time.

2007-05-09 22:50:47

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Matt Mackall wrote:
>> Which I am, extensively, but not on the repo that got damaged. That's
>> why I was wondering about the nlink issues. If I qpop a bunch of
>> patches after just pushing them, won't it simply truncate the file?
>>
>
> Yep. But it will break links before doing that. Basically all opens go
> through a function that breaks links.
>

Well, yes, but I was proposing the theory that xfs was misreporting the
linkcount, which would confuse the link-breaking logic, right?

J

2007-05-09 23:17:19

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Wed, May 09, 2007 at 02:09:50PM -0700, Jeremy Fitzhardinge wrote:
> I've had a couple of instances of a linux-2.6 mercurial repo getting
> corrupted in some odd way this morning. It looks like files are being
> truncated; not to size 0, but losing something off the end.
>
> This is on an xfs filesystem. I haven't had any crashes/oops, and I
> don't think it's the normal files getting filled with 0 problem. I saw
> this before the most recent set of xfs updates, but it happened again
> afterwards too.

It looks like the latest XFS changes haven't been pulled yet, so
it's not new code that is triggering this....

> Mercurial uses a strictly append-only model for updating its repo files,
> but it looks like maybe an append operation didn't stick.
>
> I'm repulling a fresh copy of the repo; I'll be able to compare
> before/after. Update: yep, definitely truncated:
>
> $ ls -l .hg-new/store/data/_documentation/pi-futex.txt.i .hg-broken/store/data/_documentation/pi-futex.txt.i
> 4 -rw-rw-r-- 1 jeremy jeremy 3309 May 9 09:43 .hg-broken/store/data/_documentation/pi-futex.txt.i
> 4 -rw-rw-r-- 1 jeremy jeremy 3797 May 9 13:38 .hg-new/store/data/_documentation/pi-futex.txt.i
>
> also
> 3476 -rw-rw-r-- 1 jeremy jeremy 3558208 May 9 13:55 00manifest.i
> 3476 -rw-rw-r-- 1 jeremy jeremy 3555200 May 9 09:41 00manifest.i~
>
>
> where 00manifest.i~ is the broken one. The files are identical up to the
> truncation point.

Hmmm - that is bizarre. What is the output of xfs_bmap -vvp <filename>
on each of those files?

What happens to these files after they are downloaded? Does it only
happen to append-only files or are other files affected as well?

BTW, what's the 'xfs_info <mntpt>' output for this filesystem?

> The repo passed "hg verify" just after I pulled it, so this corruption
> came about after a while.
>
> Hm, the other possibility is that nlinks is being misreported. When
> cloning a repo, mercurial will generally hard-link files where possible,
> and then break the link if it sees nlink > 1. If xfs is mis-reporting
> the link count, then this will cause havoc. Is that possible? Seems
> unlikely, but it would also explain the symptoms. I just did a linking
> clone with an older kernel, and the link count is as expected.

I'd be surprised if it was a link count problem - that would cause
all sorts of other problems as well....

> xfs_check passes without any output, which I presume is good.

Yes, it means everything is ok. You only have to worry when xfs_check
says something - it only brings bad news ;)

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-09 23:30:26

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> On Wed, May 09, 2007 at 02:09:50PM -0700, Jeremy Fitzhardinge wrote:
>
>> I've had a couple of instances of a linux-2.6 mercurial repo getting
>> corrupted in some odd way this morning. It looks like files are being
>> truncated; not to size 0, but losing something off the end.
>>
>> This is on an xfs filesystem. I haven't had any crashes/oops, and I
>> don't think it's the normal files getting filled with 0 problem. I saw
>> this before the most recent set of xfs updates, but it happened again
>> afterwards too.
>>
>
> It looks like the latest XFS changes haven't been pulled yet, so
> it's not new code that is triggering this....
>

A bunch of xfs changes appeared in git this morning, I thought. But all
this first happened from a kernel compiled yesterday.

>> Mercurial uses a strictly append-only model for updating its repo files,
>> but it looks like maybe an append operation didn't stick.
>>
>> I'm repulling a fresh copy of the repo; I'll be able to compare
>> before/after. Update: yep, definitely truncated:
>>
>> $ ls -l .hg-new/store/data/_documentation/pi-futex.txt.i .hg-broken/store/data/_documentation/pi-futex.txt.i
>> 4 -rw-rw-r-- 1 jeremy jeremy 3309 May 9 09:43 .hg-broken/store/data/_documentation/pi-futex.txt.i
>> 4 -rw-rw-r-- 1 jeremy jeremy 3797 May 9 13:38 .hg-new/store/data/_documentation/pi-futex.txt.i
>>
>> also
>> 3476 -rw-rw-r-- 1 jeremy jeremy 3558208 May 9 13:55 00manifest.i
>> 3476 -rw-rw-r-- 1 jeremy jeremy 3555200 May 9 09:41 00manifest.i~
>>
>>
>> where 00manifest.i~ is the broken one. The files are identical up to the
>> truncation point.
>>
>
> Hmmm - that is bizarre. What is the output of xfs_bmap -vvp <filename>
> on each of those files?
>
00manifest.i~ is linux-2.6-broken/.hg/store/00manifest.i

$ xfs_bmap -vvp linux-2.6/.hg/store/00manifest.i linux-2.6-broken/.hg/store/00manifest.i
linux-2.6/.hg/store/00manifest.i:
 EXT: FILE-OFFSET     BLOCK-RANGE          AG  AG-OFFSET            TOTAL
   0: [0..895]:       8135128..8136023      1  (270808..271703)       896
   1: [896..1407]:    8207424..8207935      1  (343104..343615)       512
   2: [1408..2047]:   8211520..8212159      1  (347200..347839)       640
   3: [2048..3071]:   8212904..8213927      1  (348584..349607)      1024
   4: [3072..4991]:   8215672..8217591      1  (351352..353271)      1920
   5: [4992..6143]:   8344408..8345559      1  (480088..481239)      1152
   6: [6144..6951]:   7930840..7931647      1  (66520..67327)         808
linux-2.6-broken/.hg/store/00manifest.i:
 EXT: FILE-OFFSET     BLOCK-RANGE          AG  AG-OFFSET            TOTAL
   0: [0..383]:       27132064..27132447    3  (3539104..3539487)     384
   1: [384..511]:     27132912..27133039    3  (3539952..3540079)     128
   2: [512..895]:     27136216..27136599    3  (3543256..3543639)     384
   3: [896..1151]:    27147816..27148071    3  (3554856..3555111)     256
   4: [1152..1535]:   27148680..27149063    3  (3555720..3556103)     384
   5: [1536..2175]:   27154152..27154791    3  (3561192..3561831)     640
   6: [2176..3711]:   27158944..27160479    3  (3565984..3567519)    1536
   7: [3712..4607]:   27161016..27161911    3  (3568056..3568951)     896
   8: [4608..5247]:   27162880..27163519    3  (3569920..3570559)     640
   9: [5248..5375]:   27164096..27164223    3  (3571136..3571263)     128
  10: [5376..5759]:   27165080..27165463    3  (3572120..3572503)     384
  11: [5760..5887]:   27166664..27166791    3  (3573704..3573831)     128
  12: [5888..6015]:   27171400..27171527    3  (3578440..3578567)     128
  13: [6016..6399]:   27172904..27173287    3  (3579944..3580327)     384
  14: [6400..6527]:   27173336..27173463    3  (3580376..3580503)     128
  15: [6528..6911]:   27173784..27174167    3  (3580824..3581207)     384
  16: [6912..6943]:   27174568..27174599    3  (3581608..3581639)      32


> What happens to these files after they are downloaded? Does it only
> happen to append-only files or are other files affected as well?
>

I saw similar damage in another repo, but I was using the "mq" extension
on that, which means the files are no longer append-only.

I explicitly checked that repo was OK after I downloaded it. It became
broken again after a while.

It was as if the dirty inode data was dropped without being written to
disk, so once it had to read back it got a stale file length. Or
something like that - I'm just guessing.

> BTW, what's the 'xfs_info <mntpt>' output for this filesystem?
>

meta-data=/dev/vg00/homexfs      isize=256    agcount=19, agsize=983040 blks
         =                       sectsz=512   attr=1
data     =                       bsize=4096   blocks=18350080, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=7680, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0


J

2007-05-10 00:01:40

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Wed, May 09, 2007 at 04:30:22PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > On Wed, May 09, 2007 at 02:09:50PM -0700, Jeremy Fitzhardinge wrote:
> >
> >> I've had a couple of instances of a linux-2.6 mercurial repo getting
> >> corrupted in some odd way this morning. It looks like files are being
> >> truncated; not to size 0, but losing something off the end.
> >>
> >> This is on an xfs filesystem. I haven't had any crashes/oops, and I
> >> don't think it's the normal files getting filled with 0 problem. I saw
> >> this before the most recent set of xfs updates, but it happened again
> >> afterwards too.
> >>
> >
> > It looks like the latest XFS changes haven't been pulled yet, so
> > it's not new code that is triggering this....
> >
>
> A bunch of xfs changes appeared in git this morning, I thought. But all
> this first happened from a kernel compiled yesterday.

Ah, yes so it did - damn browser caching....

> >> Mercurial uses a strictly append-only model for updating its repo files,
> >> but it looks like maybe an append operation didn't stick.
> >>
> >> I'm repulling a fresh copy of the repo; I'll be able to compare
> >> before/after. Update: yep, definitely truncated:
> >>
> >> $ ls -l .hg-new/store/data/_documentation/pi-futex.txt.i .hg-broken/store/data/_documentation/pi-futex.txt.i
> >> 4 -rw-rw-r-- 1 jeremy jeremy 3309 May 9 09:43 .hg-broken/store/data/_documentation/pi-futex.txt.i
> >> 4 -rw-rw-r-- 1 jeremy jeremy 3797 May 9 13:38 .hg-new/store/data/_documentation/pi-futex.txt.i
> >>
> >> also
> >> 3476 -rw-rw-r-- 1 jeremy jeremy 3558208 May 9 13:55 00manifest.i
> >> 3476 -rw-rw-r-- 1 jeremy jeremy 3555200 May 9 09:41 00manifest.i~
> >>
> >>
> >> where 00manifest.i~ is the broken one. The files are identical up to the
> >> truncation point.
> >>
> >
> > Hmmm - that is bizarre. What is the output of xfs_bmap -vvp <filename>
> > on each of those files?
> >
> 00manifest.i~ is linux-2.6-broken/.hg/store/00manifest.i
>
> $ xfs_bmap -vvp linux-2.6/.hg/store/00manifest.i linux-2.6-broken/.hg/store/00manifest.i
> linux-2.6/.hg/store/00manifest.i:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
......
> 6: [6144..6951]: 7930840..7931647 1 (66520..67327) 808
> linux-2.6-broken/.hg/store/00manifest.i:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
.....
> 16: [6912..6943]: 27174568..27174599 3 (3581608..3581639) 32

Yeah, there's one extra filesystem block in the good case compared
to the broken case. If that was once good, then something has had to
truncate the file to remove that block....
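
The one-block difference can be worked out from the numbers already posted. xfs_bmap reports offsets in 512-byte units and the filesystem block size is 4096 bytes (bsize in the xfs_info output). A small sketch, with hypothetical helper names:

```shell
# blocks_mapped LAST_OFFSET: filesystem blocks covered by extents whose
# final FILE-OFFSET (in 512-byte units, counted from 0) is LAST_OFFSET.
blocks_mapped() {
    echo $(( ($1 + 1) * 512 / 4096 ))
}

# blocks_needed SIZE: 4096-byte blocks needed to hold SIZE bytes (rounded up).
blocks_needed() {
    echo $(( ($1 + 4095) / 4096 ))
}

echo "good:   mapped $(blocks_mapped 6951), needed $(blocks_needed 3558208)"
echo "broken: mapped $(blocks_mapped 6943), needed $(blocks_needed 3555200)"
```

Both calculations give 869 blocks for the good file and 868 for the broken one.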

> > What happens to these files after they are downloaded? Does it only
> > happen to append-only files or are other files affected as well?
> >
>
> I saw similar damage in another repo, but I was using the "mq" extension
> on that, which means the files are no longer append-only.
>
> I explicitly checked that repo was OK after I downloaded it. It became
> broken again after a while.
>
> It was as if the dirty inode data was dropped without being written to
> disk, so once it had to read back it got a stale file length. Or
> something like that - I'm just guessing.

Seems very unlikely. Have you unmounted and mounted the filesystem
(or rebooted or suspended) between the files being seen good and
the files being seen bad?

> > BTW, what's the 'xfs_info <mntpt>' output for this filesystem?
> >
>
> meta-data=/dev/vg00/homexfs      isize=256    agcount=19, agsize=983040 blks
>          =                       sectsz=512   attr=1
> data     =                       bsize=4096   blocks=18350080, imaxpct=25
>          =                       sunit=0      swidth=0 blks, unwritten=1
> naming   =version 2              bsize=4096
> log      =internal               bsize=4096   blocks=7680, version=1
>          =                       sectsz=512   sunit=0 blks
> realtime =none                   extsz=65536  blocks=0, rtextents=0

Ok, nothing unusual there.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 00:04:38

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> Seems very unlikely. Have you unmounted and mounted the filesystem
> (or rebooted or suspended) between the files being seen good and
> the files being seen bad?
>

There was definitely a suspend-resume, and maybe a reboot. I'll try
again later on.

J

2007-05-10 00:49:43

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Wed, May 09, 2007 at 05:04:36PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > Seems very unlikely. Have you unmounted and mounted the filesystem
> > (or rebooted or suspended) between the files being seen good and
> > the files being seen bad?
> >
>
> There was definitely a suspend-resume, and maybe a reboot. I'll try
> again later on.

Suspend-resume, eh?

There's an immediate suspect. Can you test this specifically for us?
i.e. download a known good file set, do some stuff, suspend, resume,
then check the files? If it doesn't show up the first time, can
you do it a few times just to rule it out?

If suspend/resume does cause the problem, can you try again but this
time please run 'xfs_freeze -f <mtpt>' on the filesystem before
suspend, and then 'xfs_freeze -u <mtpt>' after the resume and see if
the problem still occurs?
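
The freeze/suspend/thaw sequence could be wrapped along these lines. This is an untested sketch, not an established script: `run` and `frozen_suspend` are hypothetical names, writing `mem` to /sys/power/state is assumed to be how suspend is triggered on the machine, and it needs to run as root from somewhere that does not write to the frozen filesystem.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# frozen_suspend MNT: freeze MNT, suspend to RAM, thaw MNT after resume.
frozen_suspend() {
    mnt=$1
    run xfs_freeze -f "$mnt" || return 1
    status=0
    # Writing "mem" to /sys/power/state requests suspend-to-RAM.
    run sh -c 'echo mem > /sys/power/state' || status=$?
    # Always thaw, even if the suspend request failed.
    run xfs_freeze -u "$mnt"
    return "$status"
}
```

Running `DRY_RUN=1 frozen_suspend /home` first shows the three commands without executing them.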

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 00:54:18

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> Suspend-resume, eh?
>
> There's an immediate suspect. Can you test this specifically for us?
> i.e. download a known good file set, do some stuff, suspend, resume,
> then check the files? If it doesn't show up the first time, can
> you do it a few times just to rule it out?
>

Well, I've been doing suspend-resume with xfs for a while without
problems; the problems seem to be recent and easily repeatable. Which
just means that it could be a new suspend-resume problem, of course.

> If suspend/resume does cause the problem, can you try again but this
> time please run 'xfs_freeze -f <mtpt>' on the filesystem before
> suspend, and then 'xfs_freeze -u <mtpt>' after the resume and see if
> the problem still occurs?

OK, but I tend to find that xfs_freeze ends up locking up large parts of
the system... (For example, I tried to do the xfs_freeze + lvm snapshot
thing, but the lvm snapshot just blocked on the frozen filesystem until
I unfroze it). But I'll try it out. Hm, is there some script I can
stick it into?

J

2007-05-10 01:26:32

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Wed, May 09, 2007 at 05:54:09PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > Suspend-resume, eh?
> >
> > There's an immediate suspect. Can you test this specifically for us?
> > i.e. download a known good file set, do some stuff, suspend, resume,
> > then check the files? If it doesn't show up the first time, can
> > you do it a few times just to rule it out?
>
> Well, I've been doing suspend-resume with xfs for a while without
> problems; the problems seem to be recent and easily repeatable. Which
> just means that it could be a new suspend-resume problem, of course.

Ok. I'm just trying to find a relatively simple test case for the
problem - seeing as you seem to be able to reliably reproduce this
we should be able to work out the trigger...

> > If suspend/resume does cause the problem, can you try again but this
> > time please run 'xfs_freeze -f <mtpt>' on the filesystem before
> > suspend, and then 'xfs_freeze -u <mtpt>' after the resume and see if
> > the problem still occurs?
>
> OK, but I tend to find that xfs_freeze ends up locking up large parts of
> the system... (For example, I tried to do the xfs_freeze + lvm snapshot
> thing, but the lvm snapshot just blocked on the frozen filesystem until
> I unfroze it).

Yes, because LVM snapshot freezes the filesystem for you - if you've
already frozen the filesystem the snapshot will block until you unfreeze
it and then it will freeze it itself to take the snapshot.

> But I'll try it out. Hm, is there some script I can
> stick it into?

No idea.....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 15:39:17

by Matt Mackall

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Thu, May 10, 2007 at 07:46:33AM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > On Wed, May 09, 2007 at 05:54:09PM -0700, Jeremy Fitzhardinge wrote:
> >
> >> David Chinner wrote:
> >>
> >>> Suspend-resume, eh?
> >>>
> >>> There's an immediate suspect. Can you test this specifically for us?
> >>> i.e. download a known good file set, do some stuff, suspend, resume,
> >>> then check the files? If it doesn't show up the first time, can
> >>> you do it a few times just to rule it out?
> >>>
> >> Well, I've been doing suspend-resume with xfs for a while without
> >> problems; the problems seem to be recent and easily repeatable. Which
> >> just means that it could be a new suspend-resume problem, of course.
> >>
> >
> > Ok. I'm just trying to find a relatively simple test case for the
> > problem - seeing as you seem to be able to reliably reproduce this
> > we should be able to work out the trigger...
> >
>
> OK, I was able to reproduce it reliably with a script which did basically:
>
> for i in `seq 20`; do
>     hg clone -U --pull a b-$i
>     hg verify b-$i # always OK
>     umount /home
>     sleep 5
>     mount /home
>     hg verify b-$i # often found truncated files
> done
>
>
> No suspend/resumes involved. The trees are linux kernel ones, so fairly
> large, but small enough to fit entirely in core. My script also
> captured xfs_bmap before/after output for files which had tended to be
> corrupted in the past, but unfortunately none of them got corrupted in
> these tests. But I do have all the trees lying around to extract more
> detail for if you like.
>
> Interestingly, the corruption happened in each case around the same
> place in the tree, often in the sata drivers. I wonder if that was just
> related to the timing of this script.

I guess this pins it as an XFS problem pretty solidly.

This test looks like it should consist solely of open-for-append and
write on about 20k files in the target directory. Because of the
--pull, no hardlinks are involved. It shouldn't be all that different
from doing tar cf - a | tar xf - b.

The files get visited in alphabetical order, so the start of the
corruption may be telling.
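
To see where the damage starts, one option is to record every file's size before the unmount and again after the remount, then list the files that shrank. A rough sketch with hypothetical helper names; the listing is sorted by path, which may not match Mercurial's visiting order exactly, and paths are assumed not to contain tabs or newlines.

```shell
# snapshot_sizes DIR: print "size<TAB>path" for each regular file, sorted by path.
snapshot_sizes() {
    find "$1" -type f | sort | while IFS= read -r f; do
        printf '%s\t%s\n' "$(($(wc -c < "$f")))" "$f"
    done
}

# shrunk_files BEFORE AFTER: list files that are smaller in AFTER than in BEFORE.
shrunk_files() {
    awk -F '\t' '
        NR == FNR { before[$2] = $1; next }
        ($2 in before) && ($1 + 0 < before[$2] + 0) {
            printf "%s: %d -> %d\n", $2, before[$2], $1
        }
    ' "$1" "$2"
}

# Possible use around the remount (snapshots kept off the tested filesystem):
#   snapshot_sizes b-1/.hg/store > /tmp/before.txt
#   umount /home; sleep 5; mount /home
#   snapshot_sizes b-1/.hg/store > /tmp/after.txt
#   shrunk_files /tmp/before.txt /tmp/after.txt
```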

--
Mathematics is the supreme nostalgia of our time.

2007-05-10 21:14:23

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Thu, May 10, 2007 at 07:46:33AM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > On Wed, May 09, 2007 at 05:54:09PM -0700, Jeremy Fitzhardinge wrote:
> >
> >> David Chinner wrote:
> >>
> >>> Suspend-resume, eh?
> >>>
> >>> There's an immediate suspect. Can you test this specifically for us?
> >>> i.e. download a known good file set, do some stuff, suspend, resume,
> >>> then check the files? If it doesn't show up the first time, can
> >>> you do it a few times just to rule it out?
> >>>
> >> Well, I've been doing suspend-resume with xfs for a while without
> >> problems; the problems seem to be recent and easily repeatable. Which
> >> just means that it could be a new suspend-resume problem, of course.
> >>
> >
> > Ok. I'm just trying to find a relatively simple test case for the
> > problem - seeing as you seem to be able to reliably reproduce this
> > we should be able to work out the trigger...
> >
>
> OK, I was able to reproduce it reliably with a script which did basically:
>
> for i in `seq 20`; do
>     hg clone -U --pull a b-$i
>     hg verify b-$i # always OK
>     umount /home
>     sleep 5
>     mount /home
>     hg verify b-$i # often found truncated files
> done
>
>
> No suspend/resumes involved. The trees are linux kernel ones, so fairly
> large, but small enough to fit entirely in core. My script also
> captured xfs_bmap before/after output for files which had tended to be
> corrupted in the past, but unfortunately none of them got corrupted in
> these tests. But I do have all the trees lying around to extract more
> detail for if you like.

Ok, so most of the integrity errors are processed by an
error like this:

drivers/scsi/sata_sil24.c index contains -98 extra bytes
unpacking file drivers/scsi/sata_sil24.c 5715cdfceaca: Error -5 while decompressing data

That's an -EIO and not a normal error to report. Are there any
errors in dmesg or syslog corresponding to this?

The errors tend to imply problems decompressing and patching files,
not that truncates are occurring once the files have been patched.
Can you check that what is being pulled from the repository is correct
before it gets uncompressed?

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 21:23:56

by Matt Mackall

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Fri, May 11, 2007 at 07:13:48AM +1000, David Chinner wrote:
> On Thu, May 10, 2007 at 07:46:33AM -0700, Jeremy Fitzhardinge wrote:
> > David Chinner wrote:
> > > On Wed, May 09, 2007 at 05:54:09PM -0700, Jeremy Fitzhardinge wrote:
> > >
> > >> David Chinner wrote:
> > >>
> > >>> Suspend-resume, eh?
> > >>>
> > >>> There's an immediate suspect. Can you test this specifically for us?
> > >>> i.e. download a known good file set, do some stuff, suspend, resume,
> > >>> then check the files? If it doesn't show up the first time, can
> > >>> you do it a few times just to rule it out?
> > >>>
> > >> Well, I've been doing suspend-resume with xfs for a while without
> > >> problems; the problems seem to be recent and easily repeatable. Which
> > >> just means that it could be a new suspend-resume problem, of course.
> > >>
> > >
> > > Ok. I'm just trying to find a relatively simple test case for the
> > > problem - seeing as you seem to be able to reliably reproduce this
> > > we should be able to work out the trigger...
> > >
> >
> > OK, I was able to reproduce it reliably with a script which did basically:
> >
> > for i in `seq 20`; do
> >     hg clone -U --pull a b-$i
> >     hg verify b-$i # always OK
> >     umount /home
> >     sleep 5
> >     mount /home
> >     hg verify b-$i # often found truncated files
> > done
> >
> >
> > No suspend/resumes involved. The trees are linux kernel ones, so fairly
> > large, but small enough to fit entirely in core. My script also
> > captured xfs_bmap before/after output for files which had tended to be
> > corrupted in the past, but unfortunately none of them got corrupted in
> > these tests. But I do have all the trees lying around to extract more
> > detail for if you like.
>
> Ok, so most of the integrity errors are processed by an
> error like this:
>
> drivers/scsi/sata_sil24.c index contains -98 extra bytes
> unpacking file drivers/scsi/sata_sil24.c 5715cdfceaca: Error -5 while decompressing data
>
> That's an -EIO and not a normal error to report. Are there any
> errors in dmesg or syslog corresponding to this?
>
> The errors tend to imply problems decompressing and patching files,
> not that truncates are occurring once the files have been patched.
> Can you check that what is being pulled from the repository is correct
> before it gets uncompressed?

Notice that verify gets run twice. Before unmount, it's fine, after
remount, it's not.

That message saying that the file contains -98 extra bytes is
Mercurial detecting the truncation before it tries to read and decompress the
truncated bit.

--
Mathematics is the supreme nostalgia of our time.

2007-05-10 21:32:33

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> Ok, so most of the integrity errors are processed by an
> error like this:
>
> drivers/scsi/sata_sil24.c index contains -98 extra bytes
> unpacking file drivers/scsi/sata_sil24.c 5715cdfceaca: Error -5 while decompressing data
>
> That's an -EIO and not a normal error to report. Are there any
> errors in dmesg or syslog corresponding to this?
>

No, that's an error code from zlib:
#define Z_BUF_ERROR (-5)

I think it means it got a truncated buffer while decompressing.
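
As a rough illustration of that point: Mercurial is written in Python, and Python's zlib binding reports a prematurely ended stream with the same "Error -5" wording seen in the verify output. The sketch below uses made-up sample data (not real revlog contents) and simply cuts a few bytes off the end of a compressed buffer.

```python
import zlib

# Made-up payload standing in for a compressed revlog chunk (assumption).
original = b"mercurial revlog chunk " * 200
compressed = zlib.compress(original)


def try_decompress(data):
    """Return (ok, message) for one decompression attempt."""
    try:
        zlib.decompress(data)
        return True, "ok"
    except zlib.error as exc:
        return False, str(exc)


# The intact buffer decompresses; the same buffer minus its tail does not.
full_ok, full_msg = try_decompress(compressed)
cut_ok, cut_msg = try_decompress(compressed[:-8])

print(full_ok, full_msg)
print(cut_ok, cut_msg)
```

No I/O error from the kernel is involved here: the -5 is zlib's own Z_BUF_ERROR, raised because the input ran out before the end of the stream.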

> The errors tend to imply problems decompressing and patching files,
> not that truncates are occurring once the files have been patched.
> Can you check that what is being pulled from the repository is correct
> before it gets uncompressed?
>

The hg verify checks the integrity of all the files by decompressing
them and making sure their sha1 hashes are correct. The fact that the
first hg verify passed is a very strong check that the whole repo's
integrity is sound, both in structure and content. The second failing
hg verify's messages are all related to truncation. I haven't checked
this comprehensively, but in every instance I've checked the files are
identical up to the truncation point. All the error messages are
consistent with pure truncation, not content differences or IO errors.
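
A minimal sketch of what "pure truncation" means as a check, assuming both copies of a file are small enough to read into memory. The helper name and the sample bytes are hypothetical; this is not the script that was actually run.

```python
def is_pure_truncation(good: bytes, bad: bytes) -> bool:
    """True if `bad` is strictly shorter than `good` and matches its start."""
    return len(bad) < len(good) and good.startswith(bad)


# Hypothetical sample data: an append-only file and three damaged variants.
good = b"rev0" + b"rev1" + b"rev2"
truncated = good[:8]             # lost the last append, contents otherwise equal
corrupted = b"rev0" + b"XXXX"    # shorter, but the contents differ

print(is_pure_truncation(good, truncated))   # truncation only
print(is_pure_truncation(good, corrupted))   # content difference
print(is_pure_truncation(good, good))        # identical, so not truncated
```

For real files, the two byte strings would come from reading the good and broken copies of the same path in binary mode.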

J

2007-05-10 21:42:06

by Chuck Ebbert

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Jeremy Fitzhardinge wrote:
> David Chinner wrote:
>> Seems very unlikely. Have you unmounted and mounted the filesystem
>> (or rebooted or suspended) between the files being seen good and
>> the files being seen bad?
>>
>
> There was definitely a suspend-resume, and maybe a reboot. I'll try
> again later on.
>

What CPU architecture is this happening on? Not i686 with PAE by
any chance?

2007-05-10 21:46:40

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Chuck Ebbert wrote:
> What CPU architecture is this happening on? Not i686 with PAE by
> any chance?

Yes. Why?

J

2007-05-10 21:49:52

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Jeremy Fitzhardinge wrote:
> I haven't checked
> this comprehensively

I just did. They're all pure truncations.

J

2007-05-10 21:51:51

by Chuck Ebbert

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Jeremy Fitzhardinge wrote:
> Chuck Ebbert wrote:
>> What CPU architecture is this happening on? Not i686 with PAE by
>> any chance?
>
> Yes. Why?

I have a bug report where NFS files are corrupted only with PAE clients.
Corruption is at the end of the (newly untarred) files. Doesn't happen
without PAE.

2007-05-10 21:54:34

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Chuck Ebbert wrote:
> Jeremy Fitzhardinge wrote:
>
>> Chuck Ebbert wrote:
>>
>>> What CPU architecture is this happening on? Not i686 with PAE by
>>> any chance?
>>>
>> Yes. Why?
>>
>
> I have a bug report where NFS files are corrupted only with PAE clients.
> Corruption is at the end of the (newly untarred) files. Doesn't happen
> without PAE.
>

Hm, suggestive, but I'm not convinced. Two differences to this situation:

1. Immediately after the clone ("untar"), the contents are completely
OK; it's only after a umount/mount cycle that problems appear
2. There's no corruption as such; the files are just too short. And
it seems they're at a previously OK length, not some random size.

J

2007-05-10 22:58:57

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Thu, May 10, 2007 at 02:54:25PM -0700, Jeremy Fitzhardinge wrote:
> Chuck Ebbert wrote:
> > Jeremy Fitzhardinge wrote:
> >
> >> Chuck Ebbert wrote:
> >>
> >>> What CPU architecture is this happening on? Not i686 with PAE by
> >>> any chance?
> >>>
> >> Yes. Why?
> >>
> >
> > I have a bug report where NFS files are corrupted only with PAE clients.
> > Corruption is at the end of the (newly untarred) files. Doesn't happen
> > without PAE.
> >
>
> Hm, suggestive, but I'm not convinced. Two differences to this situation:
>
> 1. Immediately after the clone ("untar"), the contents are completely
> OK; it's only after a umount/mount cycle that problems appear
> 2. There's no corruption as such; the files are just too short. And
> it seems they're at a previously OK length, not some random size.

Just to confirm this isn't a result of a recent change, can you reproduce
this on a 2.6.20 or 2.6.21 kernel? (sorry if you've already done this - I'm juggling
so many things at once it's easy to forget little things).

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 23:07:43

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> Just to confirm this isn't a result of a recent change, can you reproduce
> this on a 2.6.20 or 2.6.21 kernel? (sorry if you've already done this - I'm juggling
> so many things at once it's easy to forget little things).

It is the result of a recent change. I had seen no problem until around
2.6.21-git8-11. I will try again with a plain 2.6.21 kernel, just to
confirm.

J

2007-05-10 23:08:22

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Thu, May 10, 2007 at 05:51:29PM -0400, Chuck Ebbert wrote:
> Jeremy Fitzhardinge wrote:
> > Chuck Ebbert wrote:
> >> What CPU architecture is this happening on? Not i686 with PAE by
> >> any chance?
> >
> > Yes. Why?
>
> I have a bug report where NFS files are corrupted only with PAE clients.
> Corruption is at the end of the (newly untarred) files. Doesn't happen
> without PAE.

Chuck, can you post a pointer to this thread?

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 23:27:54

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Thu, May 10, 2007 at 04:07:30PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > Just to confirm this isn't a result of a recent change, can you reproduce
> > this on a 2.6.20 or 2.6.21 kernel? (sorry if you've already done this - I'm juggling
> > so many things at once it's easy to forget little things).
>
> It is the result of a recent change. I had seen no problem until around
> 2.6.21-git8-11. I will try again with a plain 2.6.21 kernel, just to
> confirm.

Ok, this is important to know because we merged a mod around that time
that changes the way we handle the updates to the file size i.e. the
fix for the NULL-files-on-crash problem:

http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba87ea699ebd9dd577bf055ebc4a98200e337542

and that means the size of the file is not updated to the incore
cached inode until after the data write is complete. The symptoms
being seen would match with an
inode-not-being-written-after-last-data-write bug in this mod....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-10 23:49:46

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> Ok, this is important to know because we merged a mod around that time
> that changes the way we handle the updates to the file size i.e. the
> fix for the NULL-files-on-crash problem:
>
> http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba87ea699ebd9dd577bf055ebc4a98200e337542
>
> and that means the size of the file is not updated to the incore
> cached inode until after the data write is complete. The symptoms
> being seen would match with an
> inode-not-being-written-after-last-data-write bug in this mod....
>

Yes, that does look like a good candidate. Should I test
before and after this change?

J

2007-05-11 00:33:22

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Thu, May 10, 2007 at 04:49:35PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > Ok, this is important to know because we merged a mod around that time
> > that changes the way we handle the updates to the file size i.e. the
> > fix for the NULL-files-on-crash problem:
> >
> > http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba87ea699ebd9dd577bf055ebc4a98200e337542
> >
> > and that means the size of the file is not updated to the incore
> > cached inode until after the data write is complete. The symptoms
> > being seen would match with an
> > inode-not-being-written-after-last-data-write bug in this mod....
> >
>
> Yes, that does look like a good candidate. Should I test
> before and after this change?

Yes please!

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-11 14:48:33

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
>> Yes, that does look like a good candidate. Should I test
>> before and after this change?
>>
>
> Yes please!
>

OK, definite result. Before ba87ea699ebd9dd577bf055ebc4a98200e337542:
all OK. After: truncated files.

I also got a bmap of a particular truncated file,
linux-clone-test-1/.hg/store/00manifest.i, diffing before with after:

--rw-r--r-- 1 root root 3558208 May 11 01:16 /home/jeremy/hg/linux-clone-test-1/.hg/store/00manifest.i
+-rw-r--r-- 1 root root 3541760 May 11 01:16 /home/jeremy/hg/linux-clone-test-1/.hg/store/00manifest.i

16: [6144..6271]: 18141808..18141935 2 (2413168..2413295) 128
17: [6272..6399]: 18140608..18140735 2 (2411968..2412095) 128
18: [6400..6911]: 18136464..18136975 2 (2407824..2408335) 512
- 19: [6912..6951]: 18136336..18136375 2 (2407696..2407735) 40
+ 19: [6912..6919]: 18136336..18136343 2 (2407696..2407703) 8


J

2007-05-12 07:56:48

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Fri, May 11, 2007 at 07:48:26AM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> >> Yes, that does look like a good candidate. Should I try to
> >> before-and-after this change?
> >>
> >
> > Yes please!
> >
>
> OK, definite result. Before ba87ea699ebd9dd577bf055ebc4a98200e337542:
> all OK. After: truncated files.
>
> I also got a bmap of a particular truncated file,
> linux-clone-test-1/.hg/store/00manifest.i, diffing before with after:
>
> --rw-r--r-- 1 root root 3558208 May 11 01:16 /home/jeremy/hg/linux-clone-test-1/.hg/store/00manifest.i
> +-rw-r--r-- 1 root root 3541760 May 11 01:16 /home/jeremy/hg/linux-clone-test-1/.hg/store/00manifest.i
>
> 16: [6144..6271]: 18141808..18141935 2 (2413168..2413295) 128
> 17: [6272..6399]: 18140608..18140735 2 (2411968..2412095) 128
> 18: [6400..6911]: 18136464..18136975 2 (2407824..2408335) 512
> - 19: [6912..6951]: 18136336..18136375 2 (2407696..2407735) 40
> + 19: [6912..6919]: 18136336..18136343 2 (2407696..2407703) 8

Ok, thanks for confirming the cause of the regression. I'll post a patch
when I've got something for you to try.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-12 11:25:48

by Jan Engelhardt

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?


On May 10 2007 10:38, Matt Mackall wrote:
>>
>> for i in `seq 20`; do
>> hg clone -U --pull a b-$i
>> hg verify b-$i # always OK
>> umount /home
>> sleep 5
>> mount /home
>> hg verify b-$i # often found truncated files
>> done
>>
[...]
>
>This test looks like it should consist solely of open-for-append and
>write on about 20k files in the target directory. Because of the
>--pull, no hardlinks are involved. It shouldn't be all that different
>from doing tar cf - a | tar xf - b.
>
>The files get visited in alphabetical order, so the start of the
>corruption may be telling.

You should not assume alphabetical order. Filesystems may be free to
reorder things and return them (1) randomly like in a hash (2) by
creation time during readdir().


Jan
--

2007-05-12 11:27:27

by Jan Engelhardt

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?


On May 10 2007 14:54, Jeremy Fitzhardinge wrote:
>>>> What CPU architecture is this happening on? Not i686 with PAE by
>>>> any chance?
>>>>
>>> Yes. Why?
>>
>> I have a bug report where NFS files are corrupted only with PAE clients.
>> Corruption is at the end of the (newly untarred) files. Doesn't happen
>> without PAE.
>
>Hm, suggestive, but I'm not convinced. Two differences to this situation:
>
> 1. Immediately after the clone ("untar"), the contents are completely
> OK; it's only after a umount/mount cycle that problems appear

And if you do a "sync" rather than umount/mount?

> 2. There's no corruption as such; the files are just too short. And
> it seems they're at a previously OK length, not some random size.


Jan
--

2007-05-12 12:47:16

by Matt Mackall

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Sat, May 12, 2007 at 01:21:41PM +0200, Jan Engelhardt wrote:
>
> On May 10 2007 10:38, Matt Mackall wrote:
> >>
> >> for i in `seq 20`; do
> >> hg clone -U --pull a b-$i
> >> hg verify b-$i # always OK
> >> umount /home
> >> sleep 5
> >> mount /home
> >> hg verify b-$i # often found truncated files
> >> done
> >>
> [...]
> >
> >This test looks like it should consist solely of open-for-append and
> >write on about 20k files in the target directory. Because of the
> >--pull, no hardlinks are involved. It shouldn't be all that different
> >from doing tar cf - a | tar xf - b.
> >
> >The files get visited in alphabetical order, so the start of the
> >corruption may be telling.
>
> You should not assume alphabetical order. Filesystems may be free to
> reorder things and return them (1) randomly like in a hash (2) by
> creation time during readdir().

There is no assumption. Mercurial explicitly visits files in
alphabetical order for the above commands.

--
Mathematics is the supreme nostalgia of our time.

2007-05-12 13:52:18

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Sat, May 12, 2007 at 01:23:27PM +0200, Jan Engelhardt wrote:
>
> On May 10 2007 14:54, Jeremy Fitzhardinge wrote:
> >>>> What CPU architecture is this happening on? Not i686 with PAE by
> >>>> any chance?
> >>>>
> >>> Yes. Why?
> >>
> >> I have a bug report where NFS files are corrupted only with PAE clients.
> >> Corruption is at the end of the (newly untarred) files. Doesn't happen
> >> without PAE.
> >
> >Hm, suggestive, but I'm not convinced. Two differences to this situation:
> >
> > 1. Immediately after the clone ("untar"), the contents are completely
> > OK; it's only after a umount/mount cycle that problems appear
>
> And if you do a "sync" rather than umount/mount?

I doubt it will matter - I don't think we are marking the inode dirty at
the right point.

The change that was at fault modifies the way we update the file
size on the inode. We added an in-memory copy of the file size to
the in-memory copy of the disk inode's file size that we already
keep. We now only update the disk inode's (in memory copy) file size
on I/O completion. Because the generic code writes the inode out
before waiting for I/O to complete, the old file size gets written
out instead of the new one.

If the write was extending the file into an existing block there
would be no delalloc transaction to redirty the inode (happens on
log I/O completion). Hence when the I/O completes and the file size
gets updated to the in-core disk inode (which is marked dirty), the
linux inode remains clean. As a result, a sync will never flush the
inode to get the updated file size to disk.

What I don't understand is that on unmount dirty xfs inodes get
written out. Clearly this is not happening - either there's a hole
in the writeback logic (unlikely - it was unchanged) or we've missed
some case where we need to update the filesize and mark the inode
dirty.

Hmmmm - if the write was just a short append to the file, then the
block that was written to should already be mapped. Then we'll just
look up the extent by doing a BMAPI_READ lookup, set the type to
IOMAP_READ and add the block to ioend we are building.

The type IOMAP_READ determines the I/O completion behaviour - in this case
it is xfs_end_bio_read(), which fails to update the file size....

Bingo.
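
The failure mode described above can be sketched as a toy model. This is not XFS code: the class, its fields and the helper function are hypothetical, and only the IOMAP_READ and IOMAP_NEW labels are taken from the discussion. The sizes are borrowed from the pi-futex.txt.i example at the start of the thread (3309 bytes broken, 3797 bytes intact).

```python
from dataclasses import dataclass

# Labels borrowed from the thread; everything else here is hypothetical.
IOMAP_READ = "IOMAP_READ"
IOMAP_NEW = "IOMAP_NEW"


@dataclass
class ToyInode:
    visible_size: int  # size applications see while the inode is cached
    disk_size: int     # size that would reach disk on writeback

    def append(self, nbytes, io_type):
        """Extend the file in memory and return a pending I/O record."""
        self.visible_size += nbytes
        return (io_type, self.visible_size)

    def complete(self, io):
        """I/O completion: only the IOMAP_NEW path updates the disk size."""
        io_type, end_offset = io
        if io_type == IOMAP_NEW:
            self.disk_size = max(self.disk_size, end_offset)

    def remount(self):
        """Drop cached state: only the on-disk size survives."""
        self.visible_size = self.disk_size


def size_after_remount(io_type):
    """Append 488 bytes to a 3309-byte file, complete the I/O, remount."""
    inode = ToyInode(visible_size=3309, disk_size=3309)
    inode.complete(inode.append(488, io_type))
    inode.remount()
    return inode.visible_size


buggy = size_after_remount(IOMAP_READ)  # append tagged as a read mapping
fixed = size_after_remount(IOMAP_NEW)   # append tagged as a new mapping
print(buggy, fixed)
```

In the model, the file looks fine until the remount, and then falls back to a previously valid length, which is the pattern reported throughout the thread.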

A patch for you to try, Jeremy. I've just started a test run on it...

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group


---
fs/xfs/linux-2.6/xfs_aops.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_aops.c 2007-05-11 16:03:59.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c 2007-05-12 23:35:42.691464799 +1000
@@ -973,8 +973,9 @@ xfs_page_state_convert(

bh = head = page_buffers(page);
offset = page_offset(page);
- flags = -1;
- type = IOMAP_READ;
+ iomap_valid = 0;
+ flags = BMAPI_READ;
+ type = IOMAP_NEW;

/* TODO: cleanup count and page_dirty */

@@ -1004,14 +1005,14 @@ xfs_page_state_convert(
*
* Third case, an unmapped buffer was found, and we are
* in a path where we need to write the whole page out.
- */
+ */
if (buffer_unwritten(bh) || buffer_delay(bh) ||
((buffer_uptodate(bh) || PageUptodate(page)) &&
!buffer_mapped(bh) && (unmapped || startio))) {
- /*
+ /*
* Make sure we don't use a read-only iomap
*/
- if (flags == BMAPI_READ)
+ if (flags == BMAPI_READ)
iomap_valid = 0;

if (buffer_unwritten(bh)) {
@@ -1060,7 +1061,7 @@ xfs_page_state_convert(
* That means it must already have extents allocated
* underneath it. Map the extent by reading it.
*/
- if (!iomap_valid || type != IOMAP_READ) {
+ if (!iomap_valid || flags != BMAPI_READ) {
flags = BMAPI_READ;
size = xfs_probe_cluster(inode, page, bh,
head, 1);
@@ -1071,7 +1072,15 @@ xfs_page_state_convert(
iomap_valid = xfs_iomap_valid(&iomap, offset);
}

- type = IOMAP_READ;
+ /*
+ * We set the type to IOMAP_NEW in case we are doing a
+ * small write at EOF that is extending the file but
+ * without needing an allocation. We need to update the
+ * file size on I/O completion in this case so it is
+ * the same case as having just allocated a new extent
+ * that we are writing into for the first time.
+ */
+ type = IOMAP_NEW;
if (!test_and_set_bit(BH_Lock, &bh->b_state)) {
ASSERT(buffer_mapped(bh));
if (iomap_valid)

2007-05-12 14:56:24

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> What I don't understand is that on unmount dirty xfs inodes get
> written out. Clearly this is not happening - either there's a hole
> in the writeback logic (unlikely - it was unchanged) or we've missed
> some case where we need to update the filesize and mark the inode
> dirty.
>
> Hmmmm - if the write was just a short append to the file, then the
> block that was written to should already be mapped. Then we'll just
> look up the extent by doing a BMAPI_READ lookup, set the type to
> IOMAP_READ and add the block to ioend we are building.
>

Well, that result I mailed you showed that the difference was just over
16k, and that there was a 32 block difference in the final extent
length. Does that fit with this theory?
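
The figures from the earlier bmap diff can be checked with a little arithmetic. xfs_bmap reports extents in 512-byte units; the 4096-byte filesystem block size below is an assumption (a common default), not something stated in the thread.

```python
SECTOR = 512      # xfs_bmap reports extents in 512-byte units
FS_BLOCK = 4096   # assumed filesystem block size (not stated in the thread)

size_before = 3558208   # 00manifest.i before the umount/mount cycle
size_after = 3541760    # 00manifest.i after it

# Last extent of the file in each bmap listing: [6912..6951] vs [6912..6919]
end_before = 6951 + 1   # sectors mapped before
end_after = 6919 + 1    # sectors mapped after


def sectors_needed(size_bytes):
    """Sectors needed to hold size_bytes, rounded up to whole fs blocks."""
    blocks = -(-size_bytes // FS_BLOCK)   # ceiling division
    return blocks * (FS_BLOCK // SECTOR)


size_diff = size_before - size_after             # bytes lost from the file
extent_diff = (end_before - end_after) * SECTOR  # bytes lost from the mapping

print(size_diff, extent_diff)
print(sectors_needed(size_before), end_before)
print(sectors_needed(size_after), end_after)
```

Under that block-size assumption, each mapping covers exactly the blocks needed for the corresponding file size, so the 32-sector difference is the space beyond the stale size being given back, and the "just over 16k" is 16384 bytes of whole blocks plus a 64-byte difference in the partial last block.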

> The type IOMAP_READ determines the I/O completion behaviour - in this case
> it is xfs_end_bio_read(), which fails to update the file size....
>
> Bingo.
>
> A patch for you to try, Jeremy. I've just started a test run on it...
>

Thanks, I'll give it a spin. Have you reproduced the bug yourself?


J

2007-05-14 20:20:31

by Jan Engelhardt

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?


On May 12 2007 07:46, Matt Mackall wrote:
>>
>> You should not assume alphabetical order. Filesystems may be free to
>> reorder things and return them (1) randomly like in a hash (2) by
>> creation time during readdir().
>
>There is no assumption. Mercurial explicitly visits files in
>alphabetical order for the above commands.

But who says that

for i in {a..z}; do ## {..} is a bash3 extension
touch $i;
done;

actually makes readdir() return them in the same order?

Jan
--

2007-05-14 20:27:54

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

Jan Engelhardt wrote:
> On May 12 2007 07:46, Matt Mackall wrote:
>
>>> You should not assume alphabetical order. Filesystems may be free to
>>> reorder things and return them (1) randomly like in a hash (2) by
>>> creation time during readdir().
>>>
>> There is no assumption. Mercurial explicitly visits files in
>> alphabetical order for the above commands.
>>
>
> But who says that
>
> for i in {a..z}; do ## {..} is a bash3 extension
> touch $i;
> done;
>
> actually makes readdir() return them in the same order?

Nobody. But doing a readdir, sorting the results and visiting the files
in that order does mean you'll visit them in alphabetical order. Hence
"explicitly visits".

J
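
The distinction can be shown in a few lines: whatever order the directory listing comes back in, sorting the names before visiting them fixes the traversal order. This is a minimal sketch with a hypothetical helper, not Mercurial's actual walking code.

```python
import os
import tempfile


def visit_sorted(path):
    """List a directory and return its entries in sorted order."""
    # os.listdir() makes no ordering promise; sorted() imposes one.
    return sorted(os.listdir(path))


with tempfile.TemporaryDirectory() as d:
    # Create the files in a deliberately non-alphabetical order.
    for name in ["m", "z", "a", "q"]:
        open(os.path.join(d, name), "w").close()
    visited = visit_sorted(d)

print(visited)
```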

2007-05-15 00:15:24

by David Chinner

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

On Sat, May 12, 2007 at 07:56:20AM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > What I don't understand is that on unmount dirty xfs inodes get
> > written out. Clearly this is not happening - either there's a hole
> > in the writeback logic (unlikely - it was unchanged) or we've missed
> > some case where we need to update the filesize and mark the inode
> > dirty.
> >
> > Hmmmm - if the write was just a short append to the file, then the
> > block that was written to should already be mapped. Then we'll just
> > look up the extent by doing a BMAPI_READ lookup, set the type to
> > IOMAP_READ and add the block to ioend we are building.
> >
>
> Well, that result I mailed you showed that the difference was just over
> 16k, and that there was a 32 block difference in the final extent
> length. Does that fit with this theory?

Yes - because we do speculative allocation of 64k beyond EOF
by default on appends....

> > The type IOMAP_READ determines the I/O completion behaviour - in this case
> > it is xfs_end_bio_read(), which fails to update the file size....
> >
> > Bingo.
> >
> > A patch for you to try, Jeremy. I've just started a test run on it...
> >
>
> Thanks, I'll give it a spin. Have you reproduced the bug yourself?

No, not yet. I haven't had chance because I'm travelling at the moment....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-05-15 19:24:28

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git10/11: files getting truncated on xfs? or maybe an nlink problem?

David Chinner wrote:
> A patch for you to try, Jeremy. I've just started a test run on it...
>

OK, it seems to work. I haven't given it an overnight run, but it's run
longer without failing than it did before.

J