2009-10-22 14:40:22

by Peter Horton

Subject: [PATCH] prevent AoE causing cache aliases

This patch prevents the AoE block driver from creating cache aliases of
page cache pages on machines with virtually indexed caches.

Building kernels on an AT91SAM9G20 board without this patch fails with
segmentation faults after a couple of passes.

Signed-off-by: Peter Horton <[email protected]>

Index: linux-2.6.31/drivers/block/aoe/aoecmd.c
===================================================================
--- linux-2.6.31.orig/drivers/block/aoe/aoecmd.c 2009-09-09 23:13:59.000000000 +0100
+++ linux-2.6.31/drivers/block/aoe/aoecmd.c 2009-10-22 10:24:50.000000000 +0100
@@ -735,6 +735,21 @@
 	part_stat_unlock();
 }

+/*
+ * Ensure we don't create aliases in VI caches
+ */
+static inline void
+killalias(struct bio *bio)
+{
+	struct bio_vec *bv;
+	int i;
+
+	if (bio_data_dir(bio) == READ)
+		__bio_for_each_segment(bv, bio, i, 0) {
+			flush_dcache_page(bv->bv_page);
+		}
+}
+
 void
 aoecmd_ata_rsp(struct sk_buff *skb)
 {
@@ -853,8 +868,12 @@

 	if (buf && --buf->nframesout == 0 && buf->resid == 0) {
 		diskstats(d->gd, buf->bio, jiffies - buf->stime, buf->sector);
-		n = (buf->flags & BUFFL_FAIL) ? -EIO : 0;
-		bio_endio(buf->bio, n);
+		if (buf->flags & BUFFL_FAIL)
+			bio_endio(buf->bio, -EIO);
+		else {
+			killalias(buf->bio);
+			bio_endio(buf->bio, 0);
+		}
 		mempool_free(buf, d->bufpool);
 	}


2009-11-04 00:37:59

by Andrew Morton

Subject: Re: [PATCH] prevent AoE causing cache aliases

On Thu, 22 Oct 2009 15:22:28 +0100
[email protected] (Peter Horton) wrote:

> To: [email protected]

Have you heard back from Ed on this?

> Cc: [email protected]
> Subject: [PATCH] prevent AoE causing cache aliases
> Date: Thu, 22 Oct 2009 15:22:28 +0100
> Sender: [email protected]
> User-Agent: Mutt/1.5.9i
>
> This patch prevents the AoE block driver from creating cache aliases of
> page cache pages on machines with virtually indexed caches.
>
> Building kernels on an AT91SAM9G20 board without this patch fails with
> segmentation faults after a couple of passes.
>
>
> Index: linux-2.6.31/drivers/block/aoe/aoecmd.c
> ===================================================================
> --- linux-2.6.31.orig/drivers/block/aoe/aoecmd.c 2009-09-09 23:13:59.000000000 +0100
> +++ linux-2.6.31/drivers/block/aoe/aoecmd.c 2009-10-22 10:24:50.000000000 +0100
> @@ -735,6 +735,21 @@
>  	part_stat_unlock();
>  }
>
> +/*
> + * Ensure we don't create aliases in VI caches
> + */
> +static inline void
> +killalias(struct bio *bio)
> +{
> +	struct bio_vec *bv;
> +	int i;
> +
> +	if (bio_data_dir(bio) == READ)
> +		__bio_for_each_segment(bv, bio, i, 0) {
> +			flush_dcache_page(bv->bv_page);
> +		}
> +}
> +
>  void
>  aoecmd_ata_rsp(struct sk_buff *skb)
>  {
> @@ -853,8 +868,12 @@
>
>  	if (buf && --buf->nframesout == 0 && buf->resid == 0) {
>  		diskstats(d->gd, buf->bio, jiffies - buf->stime, buf->sector);
> -		n = (buf->flags & BUFFL_FAIL) ? -EIO : 0;
> -		bio_endio(buf->bio, n);
> +		if (buf->flags & BUFFL_FAIL)
> +			bio_endio(buf->bio, -EIO);
> +		else {
> +			killalias(buf->bio);
> +			bio_endio(buf->bio, 0);
> +		}
>  		mempool_free(buf, d->bufpool);
>  	}

Looks OK.

This bugfix will cause a pointless __bio_for_each_segment() busywait
loop to be executed on architectures for which flush_dcache_page() is a
no-op.

We don't have infrastructure to fix that.

2009-11-04 10:53:46

by Peter Horton

Subject: Re: [PATCH] prevent AoE causing cache aliases

Andrew Morton wrote:
> On Thu, 22 Oct 2009 15:22:28 +0100
> [email protected] (Peter Horton) wrote:
>
>> To: [email protected]
>
> Have you heard back from Ed on this?
>

No.

> Looks OK.
>
> This bugfix will cause a pointless __bio_for_each_segment() busywait
> loop to be executed on architectures for which flush_dcache_page() is a
> no-op.
>
> We don't have infrastructure to fix that.

Couldn't we add a flag to the bio that users could set to indicate that
they are not house trained with respect to the D-cache (i.e. non-DMA
drivers)? Architectures that need to could then flush the relevant
pages somewhere in the bio_endio() path. At the moment all the non-DMA
block drivers need to be aware of the cache aliasing issue, which means
this problem keeps arising ...

P.
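
The flag idea above can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: the types, the
MOCK_BIO_CPU_FILLED flag and mock_bio_endio() are hypothetical stand-ins
invented for illustration, not kernel definitions, and the "flush" merely
counts calls. It shows a driver marking a bio as filled by the CPU and the
completion path flushing each page only for successful reads.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; every name here is hypothetical. */
struct mock_page { unsigned int flushed; };        /* flush call counter */
struct mock_bio_vec { struct mock_page *bv_page; };

enum { MOCK_READ = 0, MOCK_WRITE = 1 };

/* Hypothetical flag: "data was copied in by the CPU, not by DMA". */
#define MOCK_BIO_CPU_FILLED (1u << 0)

struct mock_bio {
	unsigned int flags;
	int dir;                     /* MOCK_READ or MOCK_WRITE */
	size_t vcnt;                 /* number of segments */
	struct mock_bio_vec *vecs;
};

/* Stand-in for flush_dcache_page(): only records that it was called. */
static void mock_flush_dcache_page(struct mock_page *pg)
{
	pg->flushed++;
}

/*
 * Completion path. Pages are flushed only when the bio completed without
 * error, was a read, and the driver set the flag. Returns the number of
 * pages flushed so the behaviour can be checked.
 */
static size_t mock_bio_endio(struct mock_bio *bio, int error)
{
	size_t i, n = 0;

	if (error || bio->dir != MOCK_READ ||
	    !(bio->flags & MOCK_BIO_CPU_FILLED))
		return 0;

	for (i = 0; i < bio->vcnt; i++) {
		mock_flush_dcache_page(bio->vecs[i].bv_page);
		n++;
	}
	return n;
}
```

With this shape a DMA-based driver never sets the flag and pays nothing
beyond one test, while a PIO-style driver sets it once at submission time.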

2009-11-04 13:28:33

by Ed L. Cashin

Subject: Re: [PATCH] prevent AoE causing cache aliases

On Tue, Nov 03, 2009 at 04:37:55PM -0800, Andrew Morton wrote:
> On Thu, 22 Oct 2009 15:22:28 +0100
> [email protected] (Peter Horton) wrote:
>
> > To: [email protected]
>
> Have you heard back from Ed on this?

Sorry, I didn't comment because I don't have much experience
with virtually indexed caches, and while the fix seems to
make sense, I was hoping to learn more from the discussion
to follow.

I do think that it would be helpful to spell out "virtually
indexed" in the comment. "VI" might not be enough to jog
the memory of folks who haven't used such an architecture
recently.

--
Ed

2009-11-04 15:35:28

by Andrew Morton

Subject: Re: [PATCH] prevent AoE causing cache aliases

On Wed, 04 Nov 2009 10:54:34 +0000 Peter Horton <[email protected]> wrote:

> Andrew Morton wrote:
> > On Thu, 22 Oct 2009 15:22:28 +0100
> > [email protected] (Peter Horton) wrote:
> >
> >> To: [email protected]
> >
> > Have you heard back from Ed on this?
> >
>
> No.
>
> > Looks OK.
> >
> > This bugfix will cause a pointless __bio_for_each_segment() busywait
> > loop to be executed on architectures for which flush_dcache_page() is a
> > no-op.
> >
> > We don't have infrastructure to fix that.
>
> Couldn't we add a flag to the bio that users could set to indicate that
> they are not house trained with respect to the D-cache (i.e non-DMA
> drivers). Architectures that needed to could then flush the relevant
> pages in the bio_endio() path somewhere. At the moment all the non-DMA
> block drivers need to be aware of the cache aliasing issue which means
> this problem keeps arising ...
>

Could. We'll need to change each arch _somehow_. Even if it's a
matter of adding `#define i_am_not_house_trained' to the troublesome
ones or something, then ifdeffing existing code.

I was thinking that a general bio_flush_dcache_pages() in block core
(or in each arch) would be a suitable way to handle this but I was
unable to find other drivers which needed it after a brief search.
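
A general helper of the kind described above might look roughly like the
following userspace sketch. The sketch_ types and functions are
hypothetical stand-ins for struct bio, flush_dcache_page() and a driver
completion routine; only the control flow mirrors the patch, where the
flush happens for successful reads before the bio is completed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct sketch_page { unsigned int flush_count; };
struct sketch_bio_vec { struct sketch_page *bv_page; };

struct sketch_bio {
	int is_read;                       /* non-zero for a read request */
	size_t bi_vcnt;                    /* number of segments */
	struct sketch_bio_vec *bi_io_vec;
	int completed_with;                /* error code seen at completion */
};

/* Stand-in for flush_dcache_page(): counts calls instead of flushing. */
static void sketch_flush_dcache_page(struct sketch_page *page)
{
	page->flush_count++;
}

/* Generic helper: flush every page referenced by the bio's segments. */
static void sketch_bio_flush_dcache_pages(struct sketch_bio *bio)
{
	size_t i;

	for (i = 0; i < bio->bi_vcnt; i++)
		sketch_flush_dcache_page(bio->bi_io_vec[i].bv_page);
}

/*
 * Driver completion in the shape of the patched aoecmd_ata_rsp() tail:
 * failed requests complete with -EIO and are not flushed, successful
 * reads are flushed before completion.
 */
static void sketch_driver_complete(struct sketch_bio *bio, int failed)
{
	if (failed) {
		bio->completed_with = -EIO;
		return;
	}
	if (bio->is_read)
		sketch_bio_flush_dcache_pages(bio);
	bio->completed_with = 0;
}
```

Keeping the loop in one shared helper means a driver only has to decide
when to call it, not how to walk the segments.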

2009-11-04 15:51:52

by Peter Horton

Subject: Re: [PATCH] prevent AoE causing cache aliases

Andrew Morton wrote:
> On Wed, 04 Nov 2009 10:54:34 +0000 Peter Horton <[email protected]> wrote:
>
>> Andrew Morton wrote:
>>> On Thu, 22 Oct 2009 15:22:28 +0100
>>> [email protected] (Peter Horton) wrote:
>>>
>>>> To: [email protected]
>>> Have you heard back from Ed on this?
>>>
>> No.
>>
>>> Looks OK.
>>>
>>> This bugfix will cause a pointless __bio_for_each_segment() busywait
>>> loop to be executed on architectures for which flush_dcache_page() is a
>>> no-op.
>>>
>>> We don't have infrastructure to fix that.
>> Couldn't we add a flag to the bio that users could set to indicate that
>> they are not house trained with respect to the D-cache (i.e non-DMA
>> drivers). Architectures that needed to could then flush the relevant
>> pages in the bio_endio() path somewhere. At the moment all the non-DMA
>> block drivers need to be aware of the cache aliasing issue which means
>> this problem keeps arising ...
>>
>
> Could. We'll need to change each arch _somehow_. Even if it's a
> matter of adding `#define i_am_not_house_trained' to the troublesome
> ones or something, then ifdeffing existing code.
>
> I was thinking that a general bio_flush_dcache_pages() in block core
> (or in each arch) would be a suitable way to handle this but I was
> unable to find other drivers which needed it after a brief search.
>
>

IDE does it at a lower level (arch/mips/include/asm/mach-generic/ide.h
for example).

Looks like the generic PIO ops in drivers/ata/libata-sff.c could cause
problems too.

I think the problem is often masked by the small cache sizes on the
platforms with VI caches. The original AoE problem I see is on an ARM926
with a 32K D-cache; I don't see the problem at all on another ARM926
with a 16K D-cache.

P.

2009-11-04 17:35:29

by Jens Axboe

Subject: Re: [PATCH] prevent AoE causing cache aliases

On Wed, Nov 04 2009, Andrew Morton wrote:
> On Wed, 04 Nov 2009 10:54:34 +0000 Peter Horton <[email protected]> wrote:
>
> > Andrew Morton wrote:
> > > On Thu, 22 Oct 2009 15:22:28 +0100
> > > [email protected] (Peter Horton) wrote:
> > >
> > >> To: [email protected]
> > >
> > > Have you heard back from Ed on this?
> > >
> >
> > No.
> >
> > > Looks OK.
> > >
> > > This bugfix will cause a pointless __bio_for_each_segment() busywait
> > > loop to be executed on architectures for which flush_dcache_page() is a
> > > no-op.
> > >
> > > We don't have infrastructure to fix that.
> >
> > Couldn't we add a flag to the bio that users could set to indicate that
> > they are not house trained with respect to the D-cache (i.e non-DMA
> > drivers). Architectures that needed to could then flush the relevant
> > pages in the bio_endio() path somewhere. At the moment all the non-DMA
> > block drivers need to be aware of the cache aliasing issue which means
> > this problem keeps arising ...
> >
>
> Could. We'll need to change each arch _somehow_. Even if it's a
> matter of adding `#define i_am_not_house_trained' to the troublesome
> ones or something, then ifdeffing existing code.
>
> I was thinking that a general bio_flush_dcache_pages() in block core
> (or in each arch) would be a suitable way to handle this but I was
> unable to find other drivers which needed it after a brief search.

Indeed, we should have such a helper. I can't find any ARCH define that
tells us when we need to do this. Easiest is probably to grep in arch/
for non-empty definitions of flush_dcache_page() and add such a define.

I'll hack one up.

--
Jens Axboe
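
One way the per-architecture define could work is sketched below as a
userspace model. The macro name SKETCH_ARCH_FLUSHES_DCACHE and the
sketch_ types are assumptions made up for this illustration, not names
taken from the kernel. An architecture with a non-empty
flush_dcache_page() would set the macro to 1; everywhere else the helper
compiles to an empty inline function, so no segment loop is executed at
all.

```c
#include <stddef.h>

/*
 * Hypothetical per-architecture switch. In this sketch it defaults to 0,
 * meaning "flush_dcache_page() is a no-op on this architecture".
 */
#ifndef SKETCH_ARCH_FLUSHES_DCACHE
#define SKETCH_ARCH_FLUSHES_DCACHE 0
#endif

/* Hypothetical userspace stand-ins for struct page and struct bio. */
struct sketch_page { unsigned int flush_count; };

struct sketch_bio {
	size_t nr_pages;
	struct sketch_page **pages;
};

#if SKETCH_ARCH_FLUSHES_DCACHE
/* Architectures that need it: walk the pages and flush each one. */
static inline void sketch_bio_flush_dcache_pages(struct sketch_bio *bio)
{
	size_t i;

	for (i = 0; i < bio->nr_pages; i++)
		bio->pages[i]->flush_count++;  /* stand-in for the real flush */
}
#else
/* Everyone else: an empty inline, so callers pay for no loop. */
static inline void sketch_bio_flush_dcache_pages(struct sketch_bio *bio)
{
	(void)bio;
}
#endif
```

Building the same file with -DSKETCH_ARCH_FLUSHES_DCACHE=1 selects the
looping variant, which is roughly what grepping arch/ for non-empty
flush_dcache_page() definitions would decide per architecture.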