2013-06-08 03:13:58

by Zhao Hongjiang

[permalink] [raw]
Subject: xfstests failure generic/239

Hi all,

I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
case a hundred times, it certainly hit the failure several times. The failure msg is as follow:

FSTYP -- ext4
PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline

generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
--- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
+++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
@@ -1,2 +1,515 @@
QA output created by 239
+hostname: Host name lookup failure
Silence is golden
+0: 0x0
+1: 0x0
+2: 0x0
+3: 0x0
...

I found that from 2.6.39 the test case occur this failure and it is is introduced by this
commit: 198868f35de99e7197829314076e5465c37e4cc5
ext4: Use single thread to perform DIO unwritten convertion

When i revert this patch, the failure seems go away. But when i see
http://oss.sgi.com/archives/xfs/2010-06/msg00269.html, it said that
"This contains core direct I/O changes to allow the filesystem to delay AIO completion,
as well as a patch to fix XFS. ext4 also has the same issue, and from a quick look also
doesn't properly complete unwritten extent conversions for synchronous direct I/O, but
I'll leave that for someone more familar to figure out."

I also run the test against 3.4.24 rt-tree with revert the patch which i found before, but
the hundred times tests all failure.

So, things maybe not simple, i doubt the patch i found isn't the root cause.

Regards,
- Hongjiang



2013-06-08 22:30:46

by Theodore Ts'o

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>
> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>
> FSTYP -- ext4
> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>
> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
> @@ -1,2 +1,515 @@
> QA output created by 239
> +hostname: Host name lookup failure

OK, so this hostname failure is weird; I'm not sure what's causing
this, but this I presume unrelated to the failure at hand.

> Silence is golden
> +0: 0x0
> +1: 0x0
> +2: 0x0
> +3: 0x0

This indicates a problem. Test generic/239 is running
aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
4k write with a buffer containing non-zero contents to a sparse file,
and once the I/O has completed, it uses pread to read it back, using
the same descriptor, so it is doing the read using direct I/O. It
then checks to see if the read returns zero or not.

The "XX: 0x0" lines indicates that buffer is zero, which implies that
somehow aio_complete() is getting called before the uninitialized to
initialized conversion is taking place. I'm not seeing how this is
happening, though, so I'm a bit puzzled. If there are any unwritten
extents, we don't call aio_complete() in ext4_end_io_dio(), but
instead the conversion is queued via a call to ext4_add_compete_io(),
and and aio_done() is only called on the iocb after the conversion is
complete.

Can anyone see something that I might be missing?

- Ted

P.S. Zhao, what was the hardware that you using to find this failure?
I'm not seeing it, but then again if the failure is only happening
once every few hundred runs that might explain it. I'm perhaps
wondering if we should add a mode to aio-dio-hole-filling-race.c which
allows it to try the race a large number of times, instead of just
once.

P.P.S. One thought.... perhaps it might be useful to have a debug
mode where we use queue_delayed_work() to submit the conversion
request to the workqueue. It will of course make certain workloads
run slow as molasses, but it might expose some races so we can see
them more easily.

2013-06-09 02:38:12

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On 2013/6/9 6:30, Theodore Ts'o wrote:
> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>
>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>
>> FSTYP -- ext4
>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>
>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>> @@ -1,2 +1,515 @@
>> QA output created by 239
>> +hostname: Host name lookup failure
>
> OK, so this hostname failure is weird; I'm not sure what's causing
> this, but this I presume unrelated to the failure at hand.
>
>> Silence is golden
>> +0: 0x0
>> +1: 0x0
>> +2: 0x0
>> +3: 0x0
>
> This indicates a problem. Test generic/239 is running
> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> 4k write with a buffer containing non-zero contents to a sparse file,
> and once the I/O has completed, it uses pread to read it back, using
> the same descriptor, so it is doing the read using direct I/O. It
> then checks to see if the read returns zero or not.
>
> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> somehow aio_complete() is getting called before the uninitialized to
> initialized conversion is taking place. I'm not seeing how this is
> happening, though, so I'm a bit puzzled. If there are any unwritten
> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> instead the conversion is queued via a call to ext4_add_compete_io(),
> and and aio_done() is only called on the iocb after the conversion is
> complete.
>
> Can anyone see something that I might be missing?
>
> - Ted
>
> P.S. Zhao, what was the hardware that you using to find this failure?

I'am use x86 and start a qumu-kvm to run the test.

> I'm not seeing it, but then again if the failure is only happening
> once every few hundred runs that might explain it. I'm perhaps

And as Christoph Hellwig said "the race is very easy to hit by using QEMU with
native AIO support on a sparse image, and the result is filesystem corruption
in the guest", i also run the test on the host, but nerver see the failure.

> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> allows it to try the race a large number of times, instead of just
> once.

This seems necessary, i'll give a patch for this.
- Zhao
>
> P.P.S. One thought.... perhaps it might be useful to have a debug
> mode where we use queue_delayed_work() to submit the conversion
> request to the workqueue. It will of course make certain workloads
> run slow as molasses, but it might expose some races so we can see
> them more easily.
>
> .
>



2013-06-09 03:29:24

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On 2013/6/9 10:37, Zhao Hongjiang wrote:
> On 2013/6/9 6:30, Theodore Ts'o wrote:
>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>>
>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>>
>>> FSTYP -- ext4
>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>>
>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>>> @@ -1,2 +1,515 @@
>>> QA output created by 239
>>> +hostname: Host name lookup failure
>>
>> OK, so this hostname failure is weird; I'm not sure what's causing
>> this, but this I presume unrelated to the failure at hand.
>>
>>> Silence is golden
>>> +0: 0x0
>>> +1: 0x0
>>> +2: 0x0
>>> +3: 0x0
>>
>> This indicates a problem. Test generic/239 is running
>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
>> 4k write with a buffer containing non-zero contents to a sparse file,
>> and once the I/O has completed, it uses pread to read it back, using
>> the same descriptor, so it is doing the read using direct I/O. It
>> then checks to see if the read returns zero or not.
>>
>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
>> somehow aio_complete() is getting called before the uninitialized to
>> initialized conversion is taking place. I'm not seeing how this is
>> happening, though, so I'm a bit puzzled. If there are any unwritten
>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
>> instead the conversion is queued via a call to ext4_add_compete_io(),
>> and and aio_done() is only called on the iocb after the conversion is
>> complete.
>>
>> Can anyone see something that I might be missing?
>>
>> - Ted
>>
>> P.S. Zhao, what was the hardware that you using to find this failure?
>
> I'am use x86 and start a qumu-kvm to run the test.
>
>> I'm not seeing it, but then again if the failure is only happening
>> once every few hundred runs that might explain it. I'm perhaps
>
> And as Christoph Hellwig said "the race is very easy to hit by using QEMU with
> native AIO support on a sparse image, and the result is filesystem corruption
> in the guest", i also run the test on the host, but nerver see the failure.

Sorry, i run the test on host a hundred times again, there are six failures.

>
>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
>> allows it to try the race a large number of times, instead of just
>> once.
>
> This seems necessary, i'll give a patch for this.
> - Zhao
>>
>> P.P.S. One thought.... perhaps it might be useful to have a debug
>> mode where we use queue_delayed_work() to submit the conversion
>> request to the workqueue. It will of course make certain workloads
>> run slow as molasses, but it might expose some races so we can see
>> them more easily.
>>
>> .
>>
>



2013-06-09 04:42:37

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On 2013/6/9 11:29, Zhao Hongjiang wrote:
> On 2013/6/9 10:37, Zhao Hongjiang wrote:
>> On 2013/6/9 6:30, Theodore Ts'o wrote:
>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>>>
>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>>>
>>>> FSTYP -- ext4
>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>>>
>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>>>> @@ -1,2 +1,515 @@
>>>> QA output created by 239
>>>> +hostname: Host name lookup failure
>>>
>>> OK, so this hostname failure is weird; I'm not sure what's causing
>>> this, but this I presume unrelated to the failure at hand.
>>>
>>>> Silence is golden
>>>> +0: 0x0
>>>> +1: 0x0
>>>> +2: 0x0
>>>> +3: 0x0
>>>
>>> This indicates a problem. Test generic/239 is running
>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
>>> 4k write with a buffer containing non-zero contents to a sparse file,
>>> and once the I/O has completed, it uses pread to read it back, using
>>> the same descriptor, so it is doing the read using direct I/O. It
>>> then checks to see if the read returns zero or not.
>>>
>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
>>> somehow aio_complete() is getting called before the uninitialized to
>>> initialized conversion is taking place. I'm not seeing how this is
>>> happening, though, so I'm a bit puzzled. If there are any unwritten
>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
>>> instead the conversion is queued via a call to ext4_add_compete_io(),
>>> and and aio_done() is only called on the iocb after the conversion is
>>> complete.
>>>
>>> Can anyone see something that I might be missing?
>>>
>>> - Ted
>>>
>>> P.S. Zhao, what was the hardware that you using to find this failure?
>>
>> I'am use x86 and start a qumu-kvm to run the test.
more informations:

8 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz

Host: scsi0 Channel: 02 Id: 01 Lun: 00
Vendor: LSI Model: MegaRAID SAS RMB Rev: 1.40
Type: Direct-Access ANSI SCSI revision: 05

>>
>>> I'm not seeing it, but then again if the failure is only happening
>>> once every few hundred runs that might explain it. I'm perhaps
>>
>> And as Christoph Hellwig said "the race is very easy to hit by using QEMU with
>> native AIO support on a sparse image, and the result is filesystem corruption
>> in the guest", i also run the test on the host, but nerver see the failure.
>
> Sorry, i run the test on host a hundred times again, there are six failures.
>
>>
>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
>>> allows it to try the race a large number of times, instead of just
>>> once.
>>
>> This seems necessary, i'll give a patch for this.
>> - Zhao
>>>
>>> P.P.S. One thought.... perhaps it might be useful to have a debug
>>> mode where we use queue_delayed_work() to submit the conversion
>>> request to the workqueue. It will of course make certain workloads
>>> run slow as molasses, but it might expose some races so we can see
>>> them more easily.
>>>
>>> .
>>>
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>



2013-06-25 07:18:39

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

Hi, Ted

How about this problem? Is there some progress?

Hongjiang

On 2013/6/9 6:30, Theodore Ts'o wrote:
> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>
>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>
>> FSTYP -- ext4
>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>
>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>> @@ -1,2 +1,515 @@
>> QA output created by 239
>> +hostname: Host name lookup failure
>
> OK, so this hostname failure is weird; I'm not sure what's causing
> this, but this I presume unrelated to the failure at hand.
>
>> Silence is golden
>> +0: 0x0
>> +1: 0x0
>> +2: 0x0
>> +3: 0x0
>
> This indicates a problem. Test generic/239 is running
> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> 4k write with a buffer containing non-zero contents to a sparse file,
> and once the I/O has completed, it uses pread to read it back, using
> the same descriptor, so it is doing the read using direct I/O. It
> then checks to see if the read returns zero or not.
>
> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> somehow aio_complete() is getting called before the uninitialized to
> initialized conversion is taking place. I'm not seeing how this is
> happening, though, so I'm a bit puzzled. If there are any unwritten
> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> instead the conversion is queued via a call to ext4_add_compete_io(),
> and and aio_done() is only called on the iocb after the conversion is
> complete.
>
> Can anyone see something that I might be missing?
>
> - Ted
>
> P.S. Zhao, what was the hardware that you using to find this failure?
> I'm not seeing it, but then again if the failure is only happening
> once every few hundred runs that might explain it. I'm perhaps
> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> allows it to try the race a large number of times, instead of just
> once.
>
> P.P.S. One thought.... perhaps it might be useful to have a debug
> mode where we use queue_delayed_work() to submit the conversion
> request to the workqueue. It will of course make certain workloads
> run slow as molasses, but it might expose some races so we can see
> them more easily.
>
> .
>



2013-07-30 03:29:11

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

Hi, jack

I test the latest kernel 3.11-rc2 and it seems the problem is fix by the follow patch:
commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4: use io_end for multiple bios).
But it's so difficult to backport to kernel 3.4-stable, any suggestion for this?

Regards,
Zhao


On 2013/6/9 6:30, Theodore Ts'o wrote:
> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>
>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>
>> FSTYP -- ext4
>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>
>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>> @@ -1,2 +1,515 @@
>> QA output created by 239
>> +hostname: Host name lookup failure
>
> OK, so this hostname failure is weird; I'm not sure what's causing
> this, but this I presume unrelated to the failure at hand.
>
>> Silence is golden
>> +0: 0x0
>> +1: 0x0
>> +2: 0x0
>> +3: 0x0
>
> This indicates a problem. Test generic/239 is running
> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> 4k write with a buffer containing non-zero contents to a sparse file,
> and once the I/O has completed, it uses pread to read it back, using
> the same descriptor, so it is doing the read using direct I/O. It
> then checks to see if the read returns zero or not.
>
> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> somehow aio_complete() is getting called before the uninitialized to
> initialized conversion is taking place. I'm not seeing how this is
> happening, though, so I'm a bit puzzled. If there are any unwritten
> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> instead the conversion is queued via a call to ext4_add_compete_io(),
> and and aio_done() is only called on the iocb after the conversion is
> complete.
>
> Can anyone see something that I might be missing?
>
> - Ted
>
> P.S. Zhao, what was the hardware that you using to find this failure?
> I'm not seeing it, but then again if the failure is only happening
> once every few hundred runs that might explain it. I'm perhaps
> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> allows it to try the race a large number of times, instead of just
> once.
>
> P.P.S. One thought.... perhaps it might be useful to have a debug
> mode where we use queue_delayed_work() to submit the conversion
> request to the workqueue. It will of course make certain workloads
> run slow as molasses, but it might expose some races so we can see
> them more easily.
>
> .
>



2013-07-30 15:48:03

by Jan Kara

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
> Hi, jack
>
> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
> use io_end for multiple bios). But it's so difficult to backport to
> kernel 3.4-stable, any suggestion for this?
Backporting that patch to stable kernels is no-go. It is far to intrusive
for stable kernels. I was looking for a while how that patch could fix the
problem you were observing. I think there is a subtle race possible when
AIO DIO write completes before __blockdev_direct_IO() returns. In that case
we set iocb->private to NULL in ext4_end_io_dio() but we also key off
iocb->private in ext4_ext_direct_IO() as:
if (iocb->private)
ext4_inode_aio_set(inode, NULL);

So in the case above we forget to reset inode's AIO pointer. That can then
cause strange effects with unwritten extent handling (although I admit I'm
not sure whether it can also cause the failure you observe) and
97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
easily check whether you are hitting that bug or not by changing the above
condition from testing iocb->private to testing some private variable...
E.g. you could declare io_end and set it to NULL one level up in
ext4_ext_direct_IO() and then test io_end != NULL in that condition.

Honza

> On 2013/6/9 6:30, Theodore Ts'o wrote:
> > On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
> >>
> >> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
> >> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
> >>
> >> FSTYP -- ext4
> >> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
> >>
> >> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
> >> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
> >> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
> >> @@ -1,2 +1,515 @@
> >> QA output created by 239
> >> +hostname: Host name lookup failure
> >
> > OK, so this hostname failure is weird; I'm not sure what's causing
> > this, but this I presume unrelated to the failure at hand.
> >
> >> Silence is golden
> >> +0: 0x0
> >> +1: 0x0
> >> +2: 0x0
> >> +3: 0x0
> >
> > This indicates a problem. Test generic/239 is running
> > aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> > 4k write with a buffer containing non-zero contents to a sparse file,
> > and once the I/O has completed, it uses pread to read it back, using
> > the same descriptor, so it is doing the read using direct I/O. It
> > then checks to see if the read returns zero or not.
> >
> > The "XX: 0x0" lines indicates that buffer is zero, which implies that
> > somehow aio_complete() is getting called before the uninitialized to
> > initialized conversion is taking place. I'm not seeing how this is
> > happening, though, so I'm a bit puzzled. If there are any unwritten
> > extents, we don't call aio_complete() in ext4_end_io_dio(), but
> > instead the conversion is queued via a call to ext4_add_compete_io(),
> > and and aio_done() is only called on the iocb after the conversion is
> > complete.
> >
> > Can anyone see something that I might be missing?
> >
> > - Ted
> >
> > P.S. Zhao, what was the hardware that you using to find this failure?
> > I'm not seeing it, but then again if the failure is only happening
> > once every few hundred runs that might explain it. I'm perhaps
> > wondering if we should add a mode to aio-dio-hole-filling-race.c which
> > allows it to try the race a large number of times, instead of just
> > once.
> >
> > P.P.S. One thought.... perhaps it might be useful to have a debug
> > mode where we use queue_delayed_work() to submit the conversion
> > request to the workqueue. It will of course make certain workloads
> > run slow as molasses, but it might expose some races so we can see
> > them more easily.
> >
> > .
> >
>
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2013-07-31 02:42:45

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On 2013/7/30 23:48, Jan Kara wrote:
> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
>> Hi, jack
>>
>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
>> use io_end for multiple bios). But it's so difficult to backport to
>> kernel 3.4-stable, any suggestion for this?
> Backporting that patch to stable kernels is no-go. It is far to intrusive
> for stable kernels. I was looking for a while how that patch could fix the
> problem you were observing. I think there is a subtle race possible when
> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
> iocb->private in ext4_ext_direct_IO() as:
> if (iocb->private)
> ext4_inode_aio_set(inode, NULL);
>
> So in the case above we forget to reset inode's AIO pointer. That can then
> cause strange effects with unwritten extent handling (although I admit I'm
> not sure whether it can also cause the failure you observe) and
> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
> easily check whether you are hitting that bug or not by changing the above
> condition from testing iocb->private to testing some private variable...
> E.g. you could declare io_end and set it to NULL one level up in
> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
>
Thanks for your reply first.
I change the code like the follow:

@@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
struct inode *inode = file->f_mapping->host;
ssize_t ret;
size_t count = iov_length(iov, nr_segs);
+ ext4_io_end_t *io_end = NULL;

loff_t final_size = offset + count;
if (rw == WRITE && final_size <= inode->i_size) {
@@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
iocb->private = NULL;
EXT4_I(inode)->cur_aio_dio = NULL;
if (!is_sync_kiocb(iocb)) {
- ext4_io_end_t *io_end =
- ext4_init_io_end(inode, GFP_NOFS);
+ io_end = ext4_init_io_end(inode, GFP_NOFS);
if (!io_end)
return -ENOMEM;
io_end->flag |= EXT4_IO_END_DIRECT;
@@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
ext4_end_io_dio,
NULL,
DIO_LOCKING);
- if (iocb->private)
+ if (io_end != NULL) {
+ printk("Zhao Hongjiang Ext4 test!\n");
EXT4_I(inode)->cur_aio_dio = NULL;
+ }
/*
* The io_end structure takes a reference to the inode,
* that structure needs to be destroyed and the

And the print come out when i run the test everytime. So i think the test hit the bug that you
mentioned, Am i right or miss something?

Regards,
Zhao

>> On 2013/6/9 6:30, Theodore Ts'o wrote:
>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>>>
>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>>>
>>>> FSTYP -- ext4
>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>>>
>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>>>> @@ -1,2 +1,515 @@
>>>> QA output created by 239
>>>> +hostname: Host name lookup failure
>>>
>>> OK, so this hostname failure is weird; I'm not sure what's causing
>>> this, but this I presume unrelated to the failure at hand.
>>>
>>>> Silence is golden
>>>> +0: 0x0
>>>> +1: 0x0
>>>> +2: 0x0
>>>> +3: 0x0
>>>
>>> This indicates a problem. Test generic/239 is running
>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
>>> 4k write with a buffer containing non-zero contents to a sparse file,
>>> and once the I/O has completed, it uses pread to read it back, using
>>> the same descriptor, so it is doing the read using direct I/O. It
>>> then checks to see if the read returns zero or not.
>>>
>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
>>> somehow aio_complete() is getting called before the uninitialized to
>>> initialized conversion is taking place. I'm not seeing how this is
>>> happening, though, so I'm a bit puzzled. If there are any unwritten
>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
>>> instead the conversion is queued via a call to ext4_add_compete_io(),
>>> and and aio_done() is only called on the iocb after the conversion is
>>> complete.
>>>
>>> Can anyone see something that I might be missing?
>>>
>>> - Ted
>>>
>>> P.S. Zhao, what was the hardware that you using to find this failure?
>>> I'm not seeing it, but then again if the failure is only happening
>>> once every few hundred runs that might explain it. I'm perhaps
>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
>>> allows it to try the race a large number of times, instead of just
>>> once.
>>>
>>> P.P.S. One thought.... perhaps it might be useful to have a debug
>>> mode where we use queue_delayed_work() to submit the conversion
>>> request to the workqueue. It will of course make certain workloads
>>> run slow as molasses, but it might expose some races so we can see
>>> them more easily.
>>>
>>> .




2013-07-31 14:13:42

by Jan Kara

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
> On 2013/7/30 23:48, Jan Kara wrote:
> > On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
> >> Hi, jack
> >>
> >> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
> >> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
> >> use io_end for multiple bios). But it's so difficult to backport to
> >> kernel 3.4-stable, any suggestion for this?
> > Backporting that patch to stable kernels is no-go. It is far to intrusive
> > for stable kernels. I was looking for a while how that patch could fix the
> > problem you were observing. I think there is a subtle race possible when
> > AIO DIO write completes before __blockdev_direct_IO() returns. In that case
> > we set iocb->private to NULL in ext4_end_io_dio() but we also key off
> > iocb->private in ext4_ext_direct_IO() as:
> > if (iocb->private)
> > ext4_inode_aio_set(inode, NULL);
> >
> > So in the case above we forget to reset inode's AIO pointer. That can then
> > cause strange effects with unwritten extent handling (although I admit I'm
> > not sure whether it can also cause the failure you observe) and
> > 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
> > easily check whether you are hitting that bug or not by changing the above
> > condition from testing iocb->private to testing some private variable...
> > E.g. you could declare io_end and set it to NULL one level up in
> > ext4_ext_direct_IO() and then test io_end != NULL in that condition.
> >
> Thanks for your reply first.
> I change the code like the follow:
>
> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> struct inode *inode = file->f_mapping->host;
> ssize_t ret;
> size_t count = iov_length(iov, nr_segs);
> + ext4_io_end_t *io_end = NULL;
>
> loff_t final_size = offset + count;
> if (rw == WRITE && final_size <= inode->i_size) {
> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> iocb->private = NULL;
> EXT4_I(inode)->cur_aio_dio = NULL;
> if (!is_sync_kiocb(iocb)) {
> - ext4_io_end_t *io_end =
> - ext4_init_io_end(inode, GFP_NOFS);
> + io_end = ext4_init_io_end(inode, GFP_NOFS);
> if (!io_end)
> return -ENOMEM;
> io_end->flag |= EXT4_IO_END_DIRECT;
> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> ext4_end_io_dio,
> NULL,
> DIO_LOCKING);
> - if (iocb->private)
> + if (io_end != NULL) {
> + printk("Zhao Hongjiang Ext4 test!\n");
> EXT4_I(inode)->cur_aio_dio = NULL;
> + }
> /*
> * The io_end structure takes a reference to the inode,
> * that structure needs to be destroyed and the
>
> And the print come out when i run the test everytime. So i think the test
> hit the bug that you mentioned, Am i right or miss something?
It is not a bug that you hit the branch with printk(). It would be a bug
if the debug check looked like:
if (io_end != NULL) {
if (iocb->private == NULL)
printk("Bug happened!\n");
EXT4_I(inode)->cur_aio_dio = NULL;
}

Honza

> >> On 2013/6/9 6:30, Theodore Ts'o wrote:
> >>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
> >>>>
> >>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
> >>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
> >>>>
> >>>> FSTYP -- ext4
> >>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
> >>>>
> >>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
> >>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
> >>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
> >>>> @@ -1,2 +1,515 @@
> >>>> QA output created by 239
> >>>> +hostname: Host name lookup failure
> >>>
> >>> OK, so this hostname failure is weird; I'm not sure what's causing
> >>> this, but this I presume unrelated to the failure at hand.
> >>>
> >>>> Silence is golden
> >>>> +0: 0x0
> >>>> +1: 0x0
> >>>> +2: 0x0
> >>>> +3: 0x0
> >>>
> >>> This indicates a problem. Test generic/239 is running
> >>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> >>> 4k write with a buffer containing non-zero contents to a sparse file,
> >>> and once the I/O has completed, it uses pread to read it back, using
> >>> the same descriptor, so it is doing the read using direct I/O. It
> >>> then checks to see if the read returns zero or not.
> >>>
> >>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> >>> somehow aio_complete() is getting called before the uninitialized to
> >>> initialized conversion is taking place. I'm not seeing how this is
> >>> happening, though, so I'm a bit puzzled. If there are any unwritten
> >>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> >>> instead the conversion is queued via a call to ext4_add_compete_io(),
> >>> and and aio_done() is only called on the iocb after the conversion is
> >>> complete.
> >>>
> >>> Can anyone see something that I might be missing?
> >>>
> >>> - Ted
> >>>
> >>> P.S. Zhao, what was the hardware that you using to find this failure?
> >>> I'm not seeing it, but then again if the failure is only happening
> >>> once every few hundred runs that might explain it. I'm perhaps
> >>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> >>> allows it to try the race a large number of times, instead of just
> >>> once.
> >>>
> >>> P.P.S. One thought.... perhaps it might be useful to have a debug
> >>> mode where we use queue_delayed_work() to submit the conversion
> >>> request to the workqueue. It will of course make certain workloads
> >>> run slow as molasses, but it might expose some races so we can see
> >>> them more easily.
> >>>
> >>> .
>
>
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2013-08-01 02:06:00

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

Hi, Jan

It hit this bug, the "Bug happened!" is come out everytime while the test is fail.
Any suggestion for fix this?

Regards,
Zhao

On 2013/7/31 22:13, Jan Kara wrote:
> On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
>> On 2013/7/30 23:48, Jan Kara wrote:
>>> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
>>>> Hi, jack
>>>>
>>>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
>>>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
>>>> use io_end for multiple bios). But it's so difficult to backport to
>>>> kernel 3.4-stable, any suggestion for this?
>>> Backporting that patch to stable kernels is no-go. It is far to intrusive
>>> for stable kernels. I was looking for a while how that patch could fix the
>>> problem you were observing. I think there is a subtle race possible when
>>> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
>>> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
>>> iocb->private in ext4_ext_direct_IO() as:
>>> if (iocb->private)
>>> ext4_inode_aio_set(inode, NULL);
>>>
>>> So in the case above we forget to reset inode's AIO pointer. That can then
>>> cause strange effects with unwritten extent handling (although I admit I'm
>>> not sure whether it can also cause the failure you observe) and
>>> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
>>> easily check whether you are hitting that bug or not by changing the above
>>> condition from testing iocb->private to testing some private variable...
>>> E.g. you could declare io_end and set it to NULL one level up in
>>> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
>>>
>> Thanks for your reply first.
>> I change the code like the follow:
>>
>> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>> struct inode *inode = file->f_mapping->host;
>> ssize_t ret;
>> size_t count = iov_length(iov, nr_segs);
>> + ext4_io_end_t *io_end = NULL;
>>
>> loff_t final_size = offset + count;
>> if (rw == WRITE && final_size <= inode->i_size) {
>> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>> iocb->private = NULL;
>> EXT4_I(inode)->cur_aio_dio = NULL;
>> if (!is_sync_kiocb(iocb)) {
>> - ext4_io_end_t *io_end =
>> - ext4_init_io_end(inode, GFP_NOFS);
>> + io_end = ext4_init_io_end(inode, GFP_NOFS);
>> if (!io_end)
>> return -ENOMEM;
>> io_end->flag |= EXT4_IO_END_DIRECT;
>> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>> ext4_end_io_dio,
>> NULL,
>> DIO_LOCKING);
>> - if (iocb->private)
>> + if (io_end != NULL) {
>> + printk("Zhao Hongjiang Ext4 test!\n");
>> EXT4_I(inode)->cur_aio_dio = NULL;
>> + }
>> /*
>> * The io_end structure takes a reference to the inode,
>> * that structure needs to be destroyed and the
>>
>> And the print come out when i run the test everytime. So i think the test
>> hit the bug that you mentioned, Am i right or miss something?
> It is not a bug that you hit the branch with printk(). It would be a bug
> if the debug check looked like:
> if (io_end != NULL) {
> if (iocb->private == NULL)
> printk("Bug happened!\n");
> EXT4_I(inode)->cur_aio_dio = NULL;
> }
>
> Honza
>
>>>> On 2013/6/9 6:30, Theodore Ts'o wrote:
>>>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>>>>>
>>>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>>>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>>>>>
>>>>>> FSTYP -- ext4
>>>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>>>>>
>>>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>>>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>>>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>>>>>> @@ -1,2 +1,515 @@
>>>>>> QA output created by 239
>>>>>> +hostname: Host name lookup failure
>>>>>
>>>>> OK, so this hostname failure is weird; I'm not sure what's causing
>>>>> this, but this I presume unrelated to the failure at hand.
>>>>>
>>>>>> Silence is golden
>>>>>> +0: 0x0
>>>>>> +1: 0x0
>>>>>> +2: 0x0
>>>>>> +3: 0x0
>>>>>
>>>>> This indicates a problem. Test generic/239 is running
>>>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
>>>>> 4k write with a buffer containing non-zero contents to a sparse file,
>>>>> and once the I/O has completed, it uses pread to read it back, using
>>>>> the same descriptor, so it is doing the read using direct I/O. It
>>>>> then checks to see if the read returns zero or not.
>>>>>
>>>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
>>>>> somehow aio_complete() is getting called before the uninitialized to
>>>>> initialized conversion is taking place. I'm not seeing how this is
>>>>> happening, though, so I'm a bit puzzled. If there are any unwritten
>>>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
>>>>> instead the conversion is queued via a call to ext4_add_compete_io(),
>>>>> and and aio_done() is only called on the iocb after the conversion is
>>>>> complete.
>>>>>
>>>>> Can anyone see something that I might be missing?
>>>>>
>>>>> - Ted
>>>>>
>>>>> P.S. Zhao, what was the hardware that you using to find this failure?
>>>>> I'm not seeing it, but then again if the failure is only happening
>>>>> once every few hundred runs that might explain it. I'm perhaps
>>>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
>>>>> allows it to try the race a large number of times, instead of just
>>>>> once.
>>>>>
>>>>> P.P.S. One thought.... perhaps it might be useful to have a debug
>>>>> mode where we use queue_delayed_work() to submit the conversion
>>>>> request to the workqueue. It will of course make certain workloads
>>>>> run slow as molasses, but it might expose some races so we can see
>>>>> them more easily.
>>>>>
>>>>> .



2013-08-01 08:49:43

by Jan Kara

[permalink] [raw]
Subject: Re: xfstests failure generic/239

Hi,

On Thu 01-08-13 10:05:08, Zhao Hongjiang wrote:
> It hit this bug, the "Bug happened!" is come out everytime while the test
> is fail. Any suggestion for fix this?
OK, so the test is still failing after using io_end instead of
iocb->private? If yes, I'm not sure where the problem exactly is, sorry.

Honza

> On 2013/7/31 22:13, Jan Kara wrote:
> > On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
> >> On 2013/7/30 23:48, Jan Kara wrote:
> >>> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
> >>>> Hi, jack
> >>>>
> >>>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
> >>>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
> >>>> use io_end for multiple bios). But it's so difficult to backport to
> >>>> kernel 3.4-stable, any suggestion for this?
> >>> Backporting that patch to stable kernels is no-go. It is far to intrusive
> >>> for stable kernels. I was looking for a while how that patch could fix the
> >>> problem you were observing. I think there is a subtle race possible when
> >>> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
> >>> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
> >>> iocb->private in ext4_ext_direct_IO() as:
> >>> if (iocb->private)
> >>> ext4_inode_aio_set(inode, NULL);
> >>>
> >>> So in the case above we forget to reset inode's AIO pointer. That can then
> >>> cause strange effects with unwritten extent handling (although I admit I'm
> >>> not sure whether it can also cause the failure you observe) and
> >>> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
> >>> easily check whether you are hitting that bug or not by changing the above
> >>> condition from testing iocb->private to testing some private variable...
> >>> E.g. you could declare io_end and set it to NULL one level up in
> >>> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
> >>>
> >> Thanks for your reply first.
> >> I change the code like the follow:
> >>
> >> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> >> struct inode *inode = file->f_mapping->host;
> >> ssize_t ret;
> >> size_t count = iov_length(iov, nr_segs);
> >> + ext4_io_end_t *io_end = NULL;
> >>
> >> loff_t final_size = offset + count;
> >> if (rw == WRITE && final_size <= inode->i_size) {
> >> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> >> iocb->private = NULL;
> >> EXT4_I(inode)->cur_aio_dio = NULL;
> >> if (!is_sync_kiocb(iocb)) {
> >> - ext4_io_end_t *io_end =
> >> - ext4_init_io_end(inode, GFP_NOFS);
> >> + io_end = ext4_init_io_end(inode, GFP_NOFS);
> >> if (!io_end)
> >> return -ENOMEM;
> >> io_end->flag |= EXT4_IO_END_DIRECT;
> >> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> >> ext4_end_io_dio,
> >> NULL,
> >> DIO_LOCKING);
> >> - if (iocb->private)
> >> + if (io_end != NULL) {
> >> + printk("Zhao Hongjiang Ext4 test!\n");
> >> EXT4_I(inode)->cur_aio_dio = NULL;
> >> + }
> >> /*
> >> * The io_end structure takes a reference to the inode,
> >> * that structure needs to be destroyed and the
> >>
> >> And the print come out when i run the test everytime. So i think the test
> >> hit the bug that you mentioned, Am i right or miss something?
> > It is not a bug that you hit the branch with printk(). It would be a bug
> > if the debug check looked like:
> > if (io_end != NULL) {
> > if (iocb->private == NULL)
> > printk("Bug happened!\n");
> > EXT4_I(inode)->cur_aio_dio = NULL;
> > }
> >
> > Honza
> >
> >>>> On 2013/6/9 6:30, Theodore Ts'o wrote:
> >>>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
> >>>>>>
> >>>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
> >>>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
> >>>>>>
> >>>>>> FSTYP -- ext4
> >>>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
> >>>>>>
> >>>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
> >>>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
> >>>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
> >>>>>> @@ -1,2 +1,515 @@
> >>>>>> QA output created by 239
> >>>>>> +hostname: Host name lookup failure
> >>>>>
> >>>>> OK, so this hostname failure is weird; I'm not sure what's causing
> >>>>> this, but this I presume unrelated to the failure at hand.
> >>>>>
> >>>>>> Silence is golden
> >>>>>> +0: 0x0
> >>>>>> +1: 0x0
> >>>>>> +2: 0x0
> >>>>>> +3: 0x0
> >>>>>
> >>>>> This indicates a problem. Test generic/239 is running
> >>>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> >>>>> 4k write with a buffer containing non-zero contents to a sparse file,
> >>>>> and once the I/O has completed, it uses pread to read it back, using
> >>>>> the same descriptor, so it is doing the read using direct I/O. It
> >>>>> then checks to see if the read returns zero or not.
> >>>>>
> >>>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> >>>>> somehow aio_complete() is getting called before the uninitialized to
> >>>>> initialized conversion is taking place. I'm not seeing how this is
> >>>>> happening, though, so I'm a bit puzzled. If there are any unwritten
> >>>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> >>>>> instead the conversion is queued via a call to ext4_add_compete_io(),
> >>>>> and and aio_done() is only called on the iocb after the conversion is
> >>>>> complete.
> >>>>>
> >>>>> Can anyone see something that I might be missing?
> >>>>>
> >>>>> - Ted
> >>>>>
> >>>>> P.S. Zhao, what was the hardware that you using to find this failure?
> >>>>> I'm not seeing it, but then again if the failure is only happening
> >>>>> once every few hundred runs that might explain it. I'm perhaps
> >>>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> >>>>> allows it to try the race a large number of times, instead of just
> >>>>> once.
> >>>>>
> >>>>> P.P.S. One thought.... perhaps it might be useful to have a debug
> >>>>> mode where we use queue_delayed_work() to submit the conversion
> >>>>> request to the workqueue. It will of course make certain workloads
> >>>>> run slow as molasses, but it might expose some races so we can see
> >>>>> them more easily.
> >>>>>
> >>>>> .
>
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2013-08-01 09:10:48

by Jan Kara

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On Thu 01-08-13 10:49:41, Jan Kara wrote:
> Hi,
>
> On Thu 01-08-13 10:05:08, Zhao Hongjiang wrote:
> > It hit this bug, the "Bug happened!" is come out everytime while the test
> > is fail. Any suggestion for fix this?
> OK, so the test is still failing after using io_end instead of
> iocb->private? If yes, I'm not sure where the problem exactly is, sorry.
Just one addition: But from the fact that debug printk triggers we see
that really the problem happens when DIO is completed before
__blockdev_direct_IO() returns and thus extent conversion can race with
parts of __blockdev_direct_IO() or ext4_ext_direct_IO(). That is impossible
after my patch because ext4_ext_direct_IO() holds a reference to io_end so
it gets queued for completion only after __blockdev_direct_IO() returns. So
at least I somewhat understand why my patch makes a difference.


Honza

> > On 2013/7/31 22:13, Jan Kara wrote:
> > > On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
> > >> On 2013/7/30 23:48, Jan Kara wrote:
> > >>> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
> > >>>> Hi, jack
> > >>>>
> > >>>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
> > >>>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
> > >>>> use io_end for multiple bios). But it's so difficult to backport to
> > >>>> kernel 3.4-stable, any suggestion for this?
> > >>> Backporting that patch to stable kernels is no-go. It is far to intrusive
> > >>> for stable kernels. I was looking for a while how that patch could fix the
> > >>> problem you were observing. I think there is a subtle race possible when
> > >>> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
> > >>> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
> > >>> iocb->private in ext4_ext_direct_IO() as:
> > >>> if (iocb->private)
> > >>> ext4_inode_aio_set(inode, NULL);
> > >>>
> > >>> So in the case above we forget to reset inode's AIO pointer. That can then
> > >>> cause strange effects with unwritten extent handling (although I admit I'm
> > >>> not sure whether it can also cause the failure you observe) and
> > >>> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
> > >>> easily check whether you are hitting that bug or not by changing the above
> > >>> condition from testing iocb->private to testing some private variable...
> > >>> E.g. you could declare io_end and set it to NULL one level up in
> > >>> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
> > >>>
> > >> Thanks for your reply first.
> > >> I change the code like the follow:
> > >>
> > >> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> > >> struct inode *inode = file->f_mapping->host;
> > >> ssize_t ret;
> > >> size_t count = iov_length(iov, nr_segs);
> > >> + ext4_io_end_t *io_end = NULL;
> > >>
> > >> loff_t final_size = offset + count;
> > >> if (rw == WRITE && final_size <= inode->i_size) {
> > >> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> > >> iocb->private = NULL;
> > >> EXT4_I(inode)->cur_aio_dio = NULL;
> > >> if (!is_sync_kiocb(iocb)) {
> > >> - ext4_io_end_t *io_end =
> > >> - ext4_init_io_end(inode, GFP_NOFS);
> > >> + io_end = ext4_init_io_end(inode, GFP_NOFS);
> > >> if (!io_end)
> > >> return -ENOMEM;
> > >> io_end->flag |= EXT4_IO_END_DIRECT;
> > >> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> > >> ext4_end_io_dio,
> > >> NULL,
> > >> DIO_LOCKING);
> > >> - if (iocb->private)
> > >> + if (io_end != NULL) {
> > >> + printk("Zhao Hongjiang Ext4 test!\n");
> > >> EXT4_I(inode)->cur_aio_dio = NULL;
> > >> + }
> > >> /*
> > >> * The io_end structure takes a reference to the inode,
> > >> * that structure needs to be destroyed and the
> > >>
> > >> And the print come out when i run the test everytime. So i think the test
> > >> hit the bug that you mentioned, Am i right or miss something?
> > > It is not a bug that you hit the branch with printk(). It would be a bug
> > > if the debug check looked like:
> > > if (io_end != NULL) {
> > > if (iocb->private == NULL)
> > > printk("Bug happened!\n");
> > > EXT4_I(inode)->cur_aio_dio = NULL;
> > > }
> > >
> > > Honza
> > >
> > >>>> On 2013/6/9 6:30, Theodore Ts'o wrote:
> > >>>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
> > >>>>>>
> > >>>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
> > >>>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
> > >>>>>>
> > >>>>>> FSTYP -- ext4
> > >>>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
> > >>>>>>
> > >>>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
> > >>>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
> > >>>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
> > >>>>>> @@ -1,2 +1,515 @@
> > >>>>>> QA output created by 239
> > >>>>>> +hostname: Host name lookup failure
> > >>>>>
> > >>>>> OK, so this hostname failure is weird; I'm not sure what's causing
> > >>>>> this, but this I presume unrelated to the failure at hand.
> > >>>>>
> > >>>>>> Silence is golden
> > >>>>>> +0: 0x0
> > >>>>>> +1: 0x0
> > >>>>>> +2: 0x0
> > >>>>>> +3: 0x0
> > >>>>>
> > >>>>> This indicates a problem. Test generic/239 is running
> > >>>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> > >>>>> 4k write with a buffer containing non-zero contents to a sparse file,
> > >>>>> and once the I/O has completed, it uses pread to read it back, using
> > >>>>> the same descriptor, so it is doing the read using direct I/O. It
> > >>>>> then checks to see if the read returns zero or not.
> > >>>>>
> > >>>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> > >>>>> somehow aio_complete() is getting called before the uninitialized to
> > >>>>> initialized conversion is taking place. I'm not seeing how this is
> > >>>>> happening, though, so I'm a bit puzzled. If there are any unwritten
> > >>>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> > >>>>> instead the conversion is queued via a call to ext4_add_compete_io(),
> > >>>>> and and aio_done() is only called on the iocb after the conversion is
> > >>>>> complete.
> > >>>>>
> > >>>>> Can anyone see something that I might be missing?
> > >>>>>
> > >>>>> - Ted
> > >>>>>
> > >>>>> P.S. Zhao, what was the hardware that you using to find this failure?
> > >>>>> I'm not seeing it, but then again if the failure is only happening
> > >>>>> once every few hundred runs that might explain it. I'm perhaps
> > >>>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> > >>>>> allows it to try the race a large number of times, instead of just
> > >>>>> once.
> > >>>>>
> > >>>>> P.P.S. One thought.... perhaps it might be useful to have a debug
> > >>>>> mode where we use queue_delayed_work() to submit the conversion
> > >>>>> request to the workqueue. It will of course make certain workloads
> > >>>>> run slow as molasses, but it might expose some races so we can see
> > >>>>> them more easily.
> > >>>>>
> > >>>>> .
> >
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR
--
Jan Kara <[email protected]>
SUSE Labs, CR

2013-08-01 09:28:49

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On 2013/8/1 16:49, Jan Kara wrote:
> Hi,
>
> On Thu 01-08-13 10:05:08, Zhao Hongjiang wrote:
>> It hit this bug, the "Bug happened!" is come out everytime while the test
>> is fail. Any suggestion for fix this?
> OK, so the test is still failing after using io_end instead of
> iocb->private? If yes, I'm not sure where the problem exactly is, sorry.
>
I hit the bug just with the follow code that you give out:

if (io_end != NULL) {
if (iocb->private == NULL)
printk("Bug happened!\n");
EXT4_I(inode)->cur_aio_dio = NULL;
}

With this the "Bug happened!" is come out everytime while the test is fail.
But if the test case is pass, the "Bug happened!" never come out!
>
>> On 2013/7/31 22:13, Jan Kara wrote:
>>> On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
>>>> On 2013/7/30 23:48, Jan Kara wrote:
>>>>> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
>>>>>> Hi, jack
>>>>>>
>>>>>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
>>>>>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
>>>>>> use io_end for multiple bios). But it's so difficult to backport to
>>>>>> kernel 3.4-stable, any suggestion for this?
>>>>> Backporting that patch to stable kernels is no-go. It is far to intrusive
>>>>> for stable kernels. I was looking for a while how that patch could fix the
>>>>> problem you were observing. I think there is a subtle race possible when
>>>>> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
>>>>> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
>>>>> iocb->private in ext4_ext_direct_IO() as:
>>>>> if (iocb->private)
>>>>> ext4_inode_aio_set(inode, NULL);
>>>>>
>>>>> So in the case above we forget to reset inode's AIO pointer. That can then
>>>>> cause strange effects with unwritten extent handling (although I admit I'm
>>>>> not sure whether it can also cause the failure you observe) and
>>>>> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
>>>>> easily check whether you are hitting that bug or not by changing the above
>>>>> condition from testing iocb->private to testing some private variable...
>>>>> E.g. you could declare io_end and set it to NULL one level up in
>>>>> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
>>>>>
>>>> Thanks for your reply first.
>>>> I change the code like the follow:
>>>>
>>>> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>>>> struct inode *inode = file->f_mapping->host;
>>>> ssize_t ret;
>>>> size_t count = iov_length(iov, nr_segs);
>>>> + ext4_io_end_t *io_end = NULL;
>>>>
>>>> loff_t final_size = offset + count;
>>>> if (rw == WRITE && final_size <= inode->i_size) {
>>>> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>>>> iocb->private = NULL;
>>>> EXT4_I(inode)->cur_aio_dio = NULL;
>>>> if (!is_sync_kiocb(iocb)) {
>>>> - ext4_io_end_t *io_end =
>>>> - ext4_init_io_end(inode, GFP_NOFS);
>>>> + io_end = ext4_init_io_end(inode, GFP_NOFS);
>>>> if (!io_end)
>>>> return -ENOMEM;
>>>> io_end->flag |= EXT4_IO_END_DIRECT;
>>>> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>>>> ext4_end_io_dio,
>>>> NULL,
>>>> DIO_LOCKING);
>>>> - if (iocb->private)
>>>> + if (io_end != NULL) {
>>>> + printk("Zhao Hongjiang Ext4 test!\n");
>>>> EXT4_I(inode)->cur_aio_dio = NULL;
>>>> + }
>>>> /*
>>>> * The io_end structure takes a reference to the inode,
>>>> * that structure needs to be destroyed and the
>>>>
>>>> And the print come out when i run the test everytime. So i think the test
>>>> hit the bug that you mentioned, Am i right or miss something?
>>> It is not a bug that you hit the branch with printk(). It would be a bug
>>> if the debug check looked like:
>>> if (io_end != NULL) {
>>> if (iocb->private == NULL)
>>> printk("Bug happened!\n");
>>> EXT4_I(inode)->cur_aio_dio = NULL;
>>> }
>>>
>>> Honza
>>>
>>>>>> On 2013/6/9 6:30, Theodore Ts'o wrote:
>>>>>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>>>>>>>
>>>>>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>>>>>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>>>>>>>
>>>>>>>> FSTYP -- ext4
>>>>>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>>>>>>>
>>>>>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>>>>>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>>>>>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>>>>>>>> @@ -1,2 +1,515 @@
>>>>>>>> QA output created by 239
>>>>>>>> +hostname: Host name lookup failure
>>>>>>>
>>>>>>> OK, so this hostname failure is weird; I'm not sure what's causing
>>>>>>> this, but this I presume unrelated to the failure at hand.
>>>>>>>
>>>>>>>> Silence is golden
>>>>>>>> +0: 0x0
>>>>>>>> +1: 0x0
>>>>>>>> +2: 0x0
>>>>>>>> +3: 0x0
>>>>>>>
>>>>>>> This indicates a problem. Test generic/239 is running
>>>>>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
>>>>>>> 4k write with a buffer containing non-zero contents to a sparse file,
>>>>>>> and once the I/O has completed, it uses pread to read it back, using
>>>>>>> the same descriptor, so it is doing the read using direct I/O. It
>>>>>>> then checks to see if the read returns zero or not.
>>>>>>>
>>>>>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
>>>>>>> somehow aio_complete() is getting called before the uninitialized to
>>>>>>> initialized conversion is taking place. I'm not seeing how this is
>>>>>>> happening, though, so I'm a bit puzzled. If there are any unwritten
>>>>>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
>>>>>>> instead the conversion is queued via a call to ext4_add_compete_io(),
>>>>>>> and and aio_done() is only called on the iocb after the conversion is
>>>>>>> complete.
>>>>>>>
>>>>>>> Can anyone see something that I might be missing?
>>>>>>>
>>>>>>> - Ted
>>>>>>>
>>>>>>> P.S. Zhao, what was the hardware that you using to find this failure?
>>>>>>> I'm not seeing it, but then again if the failure is only happening
>>>>>>> once every few hundred runs that might explain it. I'm perhaps
>>>>>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
>>>>>>> allows it to try the race a large number of times, instead of just
>>>>>>> once.
>>>>>>>
>>>>>>> P.P.S. One thought.... perhaps it might be useful to have a debug
>>>>>>> mode where we use queue_delayed_work() to submit the conversion
>>>>>>> request to the workqueue. It will of course make certain workloads
>>>>>>> run slow as molasses, but it might expose some races so we can see
>>>>>>> them more easily.
>>>>>>>
>>>>>>> .
>>
>>



2013-08-01 10:28:51

by Zhao Hongjiang

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On 2013/8/1 17:10, Jan Kara wrote:
> On Thu 01-08-13 10:49:41, Jan Kara wrote:
>> Hi,
>>
>> On Thu 01-08-13 10:05:08, Zhao Hongjiang wrote:
>>> It hit this bug, the "Bug happened!" is come out everytime while the test
>>> is fail. Any suggestion for fix this?
>> OK, so the test is still failing after using io_end instead of
>> iocb->private? If yes, I'm not sure where the problem exactly is, sorry.
> Just one addition: But from the fact that debug printk triggers we see
> that really the problem happens when DIO is completed before
> __blockdev_direct_IO() returns and thus extent conversion can race with
> parts of __blockdev_direct_IO() or ext4_ext_direct_IO(). That is impossible
> after my patch because ext4_ext_direct_IO() holds a reference to io_end so
> it gets queued for completion only after __blockdev_direct_IO() returns. So
> at least I somewhat understand why my patch makes a difference.
>
So is this might be the solution to fix the problem?

Regards,
Zhao
>
>>> On 2013/7/31 22:13, Jan Kara wrote:
>>>> On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
>>>>> On 2013/7/30 23:48, Jan Kara wrote:
>>>>>> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
>>>>>>> Hi, jack
>>>>>>>
>>>>>>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
>>>>>>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
>>>>>>> use io_end for multiple bios). But it's so difficult to backport to
>>>>>>> kernel 3.4-stable, any suggestion for this?
>>>>>> Backporting that patch to stable kernels is no-go. It is far to intrusive
>>>>>> for stable kernels. I was looking for a while how that patch could fix the
>>>>>> problem you were observing. I think there is a subtle race possible when
>>>>>> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
>>>>>> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
>>>>>> iocb->private in ext4_ext_direct_IO() as:
>>>>>> if (iocb->private)
>>>>>> ext4_inode_aio_set(inode, NULL);
>>>>>>
>>>>>> So in the case above we forget to reset inode's AIO pointer. That can then
>>>>>> cause strange effects with unwritten extent handling (although I admit I'm
>>>>>> not sure whether it can also cause the failure you observe) and
>>>>>> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
>>>>>> easily check whether you are hitting that bug or not by changing the above
>>>>>> condition from testing iocb->private to testing some private variable...
>>>>>> E.g. you could declare io_end and set it to NULL one level up in
>>>>>> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
>>>>>>
>>>>> Thanks for your reply first.
>>>>> I change the code like the follow:
>>>>>
>>>>> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>>>>> struct inode *inode = file->f_mapping->host;
>>>>> ssize_t ret;
>>>>> size_t count = iov_length(iov, nr_segs);
>>>>> + ext4_io_end_t *io_end = NULL;
>>>>>
>>>>> loff_t final_size = offset + count;
>>>>> if (rw == WRITE && final_size <= inode->i_size) {
>>>>> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>>>>> iocb->private = NULL;
>>>>> EXT4_I(inode)->cur_aio_dio = NULL;
>>>>> if (!is_sync_kiocb(iocb)) {
>>>>> - ext4_io_end_t *io_end =
>>>>> - ext4_init_io_end(inode, GFP_NOFS);
>>>>> + io_end = ext4_init_io_end(inode, GFP_NOFS);
>>>>> if (!io_end)
>>>>> return -ENOMEM;
>>>>> io_end->flag |= EXT4_IO_END_DIRECT;
>>>>> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
>>>>> ext4_end_io_dio,
>>>>> NULL,
>>>>> DIO_LOCKING);
>>>>> - if (iocb->private)
>>>>> + if (io_end != NULL) {
>>>>> + printk("Zhao Hongjiang Ext4 test!\n");
>>>>> EXT4_I(inode)->cur_aio_dio = NULL;
>>>>> + }
>>>>> /*
>>>>> * The io_end structure takes a reference to the inode,
>>>>> * that structure needs to be destroyed and the
>>>>>
>>>>> And the print come out when i run the test everytime. So i think the test
>>>>> hit the bug that you mentioned, Am i right or miss something?
>>>> It is not a bug that you hit the branch with printk(). It would be a bug
>>>> if the debug check looked like:
>>>> if (io_end != NULL) {
>>>> if (iocb->private == NULL)
>>>> printk("Bug happened!\n");
>>>> EXT4_I(inode)->cur_aio_dio = NULL;
>>>> }
>>>>
>>>> Honza
>>>>
>>>>>>> On 2013/6/9 6:30, Theodore Ts'o wrote:
>>>>>>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
>>>>>>>>>
>>>>>>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
>>>>>>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
>>>>>>>>>
>>>>>>>>> FSTYP -- ext4
>>>>>>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
>>>>>>>>>
>>>>>>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
>>>>>>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
>>>>>>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
>>>>>>>>> @@ -1,2 +1,515 @@
>>>>>>>>> QA output created by 239
>>>>>>>>> +hostname: Host name lookup failure
>>>>>>>>
>>>>>>>> OK, so this hostname failure is weird; I'm not sure what's causing
>>>>>>>> this, but this I presume unrelated to the failure at hand.
>>>>>>>>
>>>>>>>>> Silence is golden
>>>>>>>>> +0: 0x0
>>>>>>>>> +1: 0x0
>>>>>>>>> +2: 0x0
>>>>>>>>> +3: 0x0
>>>>>>>>
>>>>>>>> This indicates a problem. Test generic/239 is running
>>>>>>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
>>>>>>>> 4k write with a buffer containing non-zero contents to a sparse file,
>>>>>>>> and once the I/O has completed, it uses pread to read it back, using
>>>>>>>> the same descriptor, so it is doing the read using direct I/O. It
>>>>>>>> then checks to see if the read returns zero or not.
>>>>>>>>
>>>>>>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
>>>>>>>> somehow aio_complete() is getting called before the uninitialized to
>>>>>>>> initialized conversion is taking place. I'm not seeing how this is
>>>>>>>> happening, though, so I'm a bit puzzled. If there are any unwritten
>>>>>>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
>>>>>>>> instead the conversion is queued via a call to ext4_add_compete_io(),
>>>>>>>> and and aio_done() is only called on the iocb after the conversion is
>>>>>>>> complete.
>>>>>>>>
>>>>>>>> Can anyone see something that I might be missing?
>>>>>>>>
>>>>>>>> - Ted
>>>>>>>>
>>>>>>>> P.S. Zhao, what was the hardware that you using to find this failure?
>>>>>>>> I'm not seeing it, but then again if the failure is only happening
>>>>>>>> once every few hundred runs that might explain it. I'm perhaps
>>>>>>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
>>>>>>>> allows it to try the race a large number of times, instead of just
>>>>>>>> once.
>>>>>>>>
>>>>>>>> P.P.S. One thought.... perhaps it might be useful to have a debug
>>>>>>>> mode where we use queue_delayed_work() to submit the conversion
>>>>>>>> request to the workqueue. It will of course make certain workloads
>>>>>>>> run slow as molasses, but it might expose some races so we can see
>>>>>>>> them more easily.
>>>>>>>>
>>>>>>>> .
>>>
>>>
>> --
>> Jan Kara <[email protected]>
>> SUSE Labs, CR



2013-08-01 21:03:27

by Jan Kara

[permalink] [raw]
Subject: Re: xfstests failure generic/239

On Thu 01-08-13 17:28:38, Zhao Hongjiang wrote:
> On 2013/8/1 16:49, Jan Kara wrote:
> > Hi,
> >
> > On Thu 01-08-13 10:05:08, Zhao Hongjiang wrote:
> >> It hit this bug, the "Bug happened!" is come out everytime while the test
> >> is fail. Any suggestion for fix this?
> > OK, so the test is still failing after using io_end instead of
> > iocb->private? If yes, I'm not sure where the problem exactly is, sorry.
> >
> I hit the bug just with the follow code that you give out:
>
> if (io_end != NULL) {
> if (iocb->private == NULL)
> printk("Bug happened!\n");
> EXT4_I(inode)->cur_aio_dio = NULL;
> }
>
> With this the "Bug happened!" is come out everytime while the test is fail.
> But if the test case is pass, the "Bug happened!" never come out!
Yes, so we know early completion of direct IO triggers the bug. The above
condition fixes a bug in ext4_ext_direct_IO() but as you are still getting
test failures, there must be some other bug still present. And I don't know
where it is...

Honza

> >> On 2013/7/31 22:13, Jan Kara wrote:
> >>> On Wed 31-07-13 10:42:37, Zhao Hongjiang wrote:
> >>>> On 2013/7/30 23:48, Jan Kara wrote:
> >>>>> On Tue 30-07-13 11:28:58, Zhao Hongjiang wrote:
> >>>>>> Hi, jack
> >>>>>>
> >>>>>> I test the latest kernel 3.11-rc2 and it seems the problem is fix by the
> >>>>>> follow patch: commit id:97a851ed71cd9cc2542955e92a001c6ea3d21d35 (ext4:
> >>>>>> use io_end for multiple bios). But it's so difficult to backport to
> >>>>>> kernel 3.4-stable, any suggestion for this?
> >>>>> Backporting that patch to stable kernels is no-go. It is far to intrusive
> >>>>> for stable kernels. I was looking for a while how that patch could fix the
> >>>>> problem you were observing. I think there is a subtle race possible when
> >>>>> AIO DIO write completes before __blockdev_direct_IO() returns. In that case
> >>>>> we set iocb->private to NULL in ext4_end_io_dio() but we also key off
> >>>>> iocb->private in ext4_ext_direct_IO() as:
> >>>>> if (iocb->private)
> >>>>> ext4_inode_aio_set(inode, NULL);
> >>>>>
> >>>>> So in the case above we forget to reset inode's AIO pointer. That can then
> >>>>> cause strange effects with unwritten extent handling (although I admit I'm
> >>>>> not sure whether it can also cause the failure you observe) and
> >>>>> 97a851ed71cd9cc2542955e92a001c6ea3d21d35 actually fixes that bug. You can
> >>>>> easily check whether you are hitting that bug or not by changing the above
> >>>>> condition from testing iocb->private to testing some private variable...
> >>>>> E.g. you could declare io_end and set it to NULL one level up in
> >>>>> ext4_ext_direct_IO() and then test io_end != NULL in that condition.
> >>>>>
> >>>> Thanks for your reply first.
> >>>> I change the code like the follow:
> >>>>
> >>>> @@ -2921,6 +2921,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> >>>> struct inode *inode = file->f_mapping->host;
> >>>> ssize_t ret;
> >>>> size_t count = iov_length(iov, nr_segs);
> >>>> + ext4_io_end_t *io_end = NULL;
> >>>>
> >>>> loff_t final_size = offset + count;
> >>>> if (rw == WRITE && final_size <= inode->i_size) {
> >>>> @@ -2947,8 +2948,7 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> >>>> iocb->private = NULL;
> >>>> EXT4_I(inode)->cur_aio_dio = NULL;
> >>>> if (!is_sync_kiocb(iocb)) {
> >>>> - ext4_io_end_t *io_end =
> >>>> - ext4_init_io_end(inode, GFP_NOFS);
> >>>> + io_end = ext4_init_io_end(inode, GFP_NOFS);
> >>>> if (!io_end)
> >>>> return -ENOMEM;
> >>>> io_end->flag |= EXT4_IO_END_DIRECT;
> >>>> @@ -2970,8 +2970,10 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> >>>> ext4_end_io_dio,
> >>>> NULL,
> >>>> DIO_LOCKING);
> >>>> - if (iocb->private)
> >>>> + if (io_end != NULL) {
> >>>> + printk("Zhao Hongjiang Ext4 test!\n");
> >>>> EXT4_I(inode)->cur_aio_dio = NULL;
> >>>> + }
> >>>> /*
> >>>> * The io_end structure takes a reference to the inode,
> >>>> * that structure needs to be destroyed and the
> >>>>
> >>>> And the print come out when i run the test everytime. So i think the test
> >>>> hit the bug that you mentioned, Am i right or miss something?
> >>> It is not a bug that you hit the branch with printk(). It would be a bug
> >>> if the debug check looked like:
> >>> if (io_end != NULL) {
> >>> if (iocb->private == NULL)
> >>> printk("Bug happened!\n");
> >>> EXT4_I(inode)->cur_aio_dio = NULL;
> >>> }
> >>>
> >>> Honza
> >>>
> >>>>>> On 2013/6/9 6:30, Theodore Ts'o wrote:
> >>>>>>> On Sat, Jun 08, 2013 at 11:13:35AM +0800, Zhao Hongjiang wrote:
> >>>>>>>>
> >>>>>>>> I run xfstests #239 against mainline 3.10.0-rc3, unfortunately it failure in my QEMU. I run the
> >>>>>>>> case a hundred times, it certainly hit the failure several times. The failure msg is as follow:
> >>>>>>>>
> >>>>>>>> FSTYP -- ext4
> >>>>>>>> PLATFORM -- Linux/x86_64 3.10.0-rc3-mainline
> >>>>>>>>
> >>>>>>>> generic/239 1s ... - output mismatch (see /home/zhj/xfstests/results/generic/239.out.bad)
> >>>>>>>> --- tests/generic/239.out 2013-06-07 22:04:09.000000000 -0400
> >>>>>>>> +++ /home/zff/xfstests/results/generic/239.out.bad 2013-06-07 22:04:09.000000000 -0400
> >>>>>>>> @@ -1,2 +1,515 @@
> >>>>>>>> QA output created by 239
> >>>>>>>> +hostname: Host name lookup failure
> >>>>>>>
> >>>>>>> OK, so this hostname failure is weird; I'm not sure what's causing
> >>>>>>> this, but this I presume unrelated to the failure at hand.
> >>>>>>>
> >>>>>>>> Silence is golden
> >>>>>>>> +0: 0x0
> >>>>>>>> +1: 0x0
> >>>>>>>> +2: 0x0
> >>>>>>>> +3: 0x0
> >>>>>>>
> >>>>>>> This indicates a problem. Test generic/239 is running
> >>>>>>> aio-dio-hole-filling-race.c, which submits an asynchronous, direct I/O
> >>>>>>> 4k write with a buffer containing non-zero contents to a sparse file,
> >>>>>>> and once the I/O has completed, it uses pread to read it back, using
> >>>>>>> the same descriptor, so it is doing the read using direct I/O. It
> >>>>>>> then checks to see if the read returns zero or not.
> >>>>>>>
> >>>>>>> The "XX: 0x0" lines indicates that buffer is zero, which implies that
> >>>>>>> somehow aio_complete() is getting called before the uninitialized to
> >>>>>>> initialized conversion is taking place. I'm not seeing how this is
> >>>>>>> happening, though, so I'm a bit puzzled. If there are any unwritten
> >>>>>>> extents, we don't call aio_complete() in ext4_end_io_dio(), but
> >>>>>>> instead the conversion is queued via a call to ext4_add_compete_io(),
> >>>>>>> and and aio_done() is only called on the iocb after the conversion is
> >>>>>>> complete.
> >>>>>>>
> >>>>>>> Can anyone see something that I might be missing?
> >>>>>>>
> >>>>>>> - Ted
> >>>>>>>
> >>>>>>> P.S. Zhao, what was the hardware that you using to find this failure?
> >>>>>>> I'm not seeing it, but then again if the failure is only happening
> >>>>>>> once every few hundred runs that might explain it. I'm perhaps
> >>>>>>> wondering if we should add a mode to aio-dio-hole-filling-race.c which
> >>>>>>> allows it to try the race a large number of times, instead of just
> >>>>>>> once.
> >>>>>>>
> >>>>>>> P.P.S. One thought.... perhaps it might be useful to have a debug
> >>>>>>> mode where we use queue_delayed_work() to submit the conversion
> >>>>>>> request to the workqueue. It will of course make certain workloads
> >>>>>>> run slow as molasses, but it might expose some races so we can see
> >>>>>>> them more easily.
> >>>>>>>
> >>>>>>> .
> >>
> >>
>
>
--
Jan Kara <[email protected]>
SUSE Labs, CR