2017-01-24 15:44:23

by Jeff Layton

Subject: regression in DIO write behavior

I've noticed a probable regression in recent kernels. When I run the
attached program on an older kernel (I used 2.6.32-642.6.2.el6.x86_64),
the kernel generates wsize-sized WRITE calls on the wire.

When I run the same program on a more modern kernel (mainline as of
today), it generates a ton of page-sized I/Os instead. I've verified
that iov_iter_get_pages_alloc is returning a wsize-sized array of pages;
it just seems like the request handling code isn't stitching them
together like it should.

Is this an expected change or a regression? I'm guessing the latter, and
that it might have crept in during the pageio rework from a couple of
years ago.

Any idea where the bug might be?
--
Jeff Layton <[email protected]>


Attachments:
diotest2.c (814.00 B)

2017-01-24 17:23:14

by Weston Andros Adamson

Subject: Re: regression in DIO write behavior

Hey Jeff,

That sounds like a regression to me. I don't think it's been around since the
pgio rework, but maybe?

-dros



2017-01-24 17:50:25

by Jeff Layton

Subject: Re: regression in DIO write behavior

On Tue, 2017-01-24 at 12:23 -0500, Weston Andros Adamson wrote:
> Hey Jeff,
>
> That sounds like a regression to me. I don't think it's been around since the
> pgio rework, but maybe?
>
> -dros
>

I certainly could be wrong. :)

I did open this bug, and we'll track it down there:

https://bugzilla.redhat.com/show_bug.cgi?id=1416127

Looks like Scott bisected it down in RHEL7 kernels so we should be able
to ID it from there.

Cheers,
Jeff


--
Jeff Layton <[email protected]>

2017-01-24 19:46:31

by Jeff Layton

Subject: Re: regression in DIO write behavior

On Tue, 2017-01-24 at 12:23 -0500, Weston Andros Adamson wrote:
> Hey Jeff,
>
> That sounds like a regression to me. I don't think it's been around since the
> pgio rework, but maybe?
>
> -dros
>

Ahh, I think I might get it now and it's not as bad as I had originally
feared...

If you dirty all of the pages before writing, the client seems to
coalesce them correctly. The reproducer allocates a buffer, but never
actually dirties it before writing it out. Apparently the mapping is
set up such that every page-sized offset in the untouched allocation
points to the same physical page. I imagine that page is then set up
for CoW.

Every request then carries the same wb_page, so we end up in this test
in nfs_can_coalesce_requests and hit the return false:

                if (req->wb_page == prev->wb_page) {
                        if (req->wb_pgbase != prev->wb_pgbase + prev->wb_bytes)
                                return false;

I think that's in place to handle sub-page write requests, but maybe we
should consider doing that a different way for DIO?
--
Jeff Layton <[email protected]>