2006-10-17 21:13:08

by Mike Miller (OS Dev)

Subject: [PATCH 2/2] cciss: disable dma prefetch for P600

PATCH 2/2
Turned off DMA prefetch for the P600 on systems which may present
discontiguous memory.

---
commit 68e76156e7a203a86996ac99c1326f098d3191f6
tree b191a99ae1bfa6588860136265f11f9ef789683a
parent 499cc64fc708f3a25985bea3b77b40c3448ccbf8
author Mike Miller <[email protected]> Tue, 17 Oct 2006 16:02:22 -0500
committer Mike Miller <[email protected]> Tue, 17 Oct 2006 16:02:22 -0500

Signed-off-by: Mike Miller <[email protected]>

drivers/block/cciss.c | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index a0a1dd9..b445528 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -2982,6 +2982,21 @@ #ifdef CONFIG_X86
}
#endif

+#if defined CONFIG_IA64 || if defined CONFIG_X86_64
+ {
+ /* DMA prefetch must be disabled on P600 on platforms that may
+ * present noncontiguous memory.
+ */
+
+ __u32 dma_prefetch;
+ if(board_id == 0x3225103C) {
+ dma_prefetch = readl(c->vaddr + I2O0_DMA1_CFG);
+ dma_prefetch |= 0x8000;
+ writel(dma_prefetch, c->vaddr + I2O0_DMA1_CFG);
+ }
+ }
+#endif /* CONFIG_IA64 || CONFIG_X86_64 */
+
#ifdef CCISS_DEBUG
printk("Trying to put board into Simple mode\n");
#endif /* CCISS_DEBUG */
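
The hunk above is a read-modify-write of one controller register: read the current value, OR in bit 0x8000, and write the result back. A minimal user-space sketch of that pattern follows. Everything prefixed `demo_` or `DEMO_` is hypothetical; in particular the register offset is an assumption made up for illustration, since the thread does not give a definition of I2O0_DMA1_CFG. The stand-in accessors mirror the argument order of the kernel's writel(), which takes the value first and the address second.

```c
#include <stdint.h>

/* Hypothetical values for illustration only. The offset is an assumption,
 * not the real location of the DMA configuration register. */
#define DEMO_DMA1_CFG_OFFSET  0x10u
#define DEMO_PREFETCH_DISABLE 0x8000u      /* bit set by the patch */
#define DEMO_BOARD_ID_P600    0x3225103Cu  /* board ID tested by the patch */

/* Fake register file standing in for the controller's mapped registers. */
static uint32_t demo_regs[16];

/* User-space stand-ins for readl()/writel(). As with the kernel's
 * writel(), the value comes first and the address second. */
static uint32_t demo_readl(const volatile uint32_t *addr)
{
    return *addr;
}

static void demo_writel(uint32_t value, volatile uint32_t *addr)
{
    *addr = value;
}

/* Set the prefetch-disable bit on a P600 and leave all other bits alone.
 * Boards with any other ID are not touched. */
static void demo_disable_dma_prefetch(uint32_t board_id, uint32_t *base)
{
    volatile uint32_t *reg = base + DEMO_DMA1_CFG_OFFSET / sizeof(uint32_t);
    uint32_t dma_prefetch;

    if (board_id != DEMO_BOARD_ID_P600)
        return;

    dma_prefetch = demo_readl(reg);
    dma_prefetch |= DEMO_PREFETCH_DISABLE;
    demo_writel(dma_prefetch, reg);
}
```

Calling `demo_disable_dma_prefetch(DEMO_BOARD_ID_P600, demo_regs)` sets only bit 0x8000 in the fake register; any bits already set there are preserved.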


2006-10-18 00:14:05

by Andrew Morton

Subject: Re: [PATCH 2/2] cciss: disable dma prefetch for P600

On Tue, 17 Oct 2006 16:13:03 -0500
"Mike Miller (OS Dev)" <[email protected]> wrote:

> PATCH 2/2
> Turned off DMA prefetch for the P600 on systems which may present
> discontiguous memory.
>

What do you mean by "discontiguous memory"? CONFIG_DISCONTIGMEM?

What is the actual problem which is being fixed here?

> +#if defined CONFIG_IA64 || if defined CONFIG_X86_64

hm, does that work?

I'll change it to

#if defined(CONFIG_IA64) || defined(CONFIG_X86_64)
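
For reference, the `defined` operator may be written as `defined NAME` or `defined(NAME)`, but each use is an operand of the `#if` expression, so a second `if` inside the expression does not belong there and a compiler is expected to reject it. A small sketch of the corrected form follows. The `#define CONFIG_X86_64` line is a hypothetical stand-in for the kernel's generated configuration, and the `DEMO_`/`demo_` names are made up for illustration.

```c
/* Hypothetical stand-in for the kernel's generated configuration. In a
 * real build this symbol comes from Kconfig, not from the source file. */
#define CONFIG_X86_64 1

/* Corrected guard: two defined() tests joined by ||. */
#if defined(CONFIG_IA64) || defined(CONFIG_X86_64)
#define DEMO_PREFETCH_QUIRK 1
#else
#define DEMO_PREFETCH_QUIRK 0
#endif

/* Returns 1 when the guarded code was compiled in, 0 otherwise. */
static int demo_prefetch_quirk_compiled_in(void)
{
    return DEMO_PREFETCH_QUIRK;
}
```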


2006-10-20 19:56:21

by Mike Miller (OS Dev)

Subject: Re: [PATCH 2/2] cciss: disable dma prefetch for P600

On Wed, Oct 18, 2006 at 02:37:23PM -0700, Andrew Morton wrote:
>
> argh, you removed the mailing list from cc.

Sorry, I'm still lacking proper etiquette.

>
> On Wed, 18 Oct 2006 11:54:53 -0500
> "Mike Miller (OS Dev)" <[email protected]> wrote:
>
> > On Tue, Oct 17, 2006 at 05:10:21PM -0700, Andrew Morton wrote:
> > > On Tue, 17 Oct 2006 16:13:03 -0500
> > > "Mike Miller (OS Dev)" <[email protected]> wrote:
> > >
> > > > PATCH 2/2
> > > > Turned off DMA prefetch for the P600 on systems which may present
> > > > discontiguous memory.
> > > >
> > >
> > > What do you mean by "discontiguous memory"? CONFIG_DISCONTIGMEM?
> >
> > The IPF memory map can have holes between the different regions. I've
> > been told by our HW guys that AMD may also have holes.
>
> Pretty much all platforms/architectures have holes in their physical memory
> map.
>
>
> > >
> > > What is the actual problem which is being fixed here?
> >
> > Sorry, I should have been clearer. There is a bug in the DMA engine that
> > may result in prefetching data from beyond the end of memory or
> > falling off into one of the holes on IPF and AMD. It causes a machine check
> > when that happens.
> > It doesn't happen on Proliant because the last 4kB (or so) of memory is
> > mapped out by the BIOS and Pentium guarantees contiguous memory.
>
> I think that this:
>
> > > #if defined(CONFIG_IA64) || defined(CONFIG_X86_64)
>
> is nowhere near strong enough and is probably inappropriate.
>
> It _could_ be that CONFIG_DISCONTIGMEM|CONFIG_SPARSEMEM will be closer, but
> even CONFIG_FLATMEM systems can have holes.

I'm poking around on some IPF platforms. It looks like CONFIG_DISCONTIGMEM is
set on them, but not the others you mention. Would that be sufficient?

>
> On what machines can/does this card exist? Things like powerpc?

This problem was found on Itanium. We don't try to support powerpc.

Thanks,
mikem

2006-10-20 20:27:13

by Andrew Morton

Subject: Re: [PATCH 2/2] cciss: disable dma prefetch for P600

On Fri, 20 Oct 2006 14:56:18 -0500
"Mike Miller (OS Dev)" <[email protected]> wrote:

> ..
> >
> > > >
> > > > What is the actual problem which is being fixed here?
> > >
> > > Sorry, I should have been clearer. There is a bug in the DMA engine that
> > > may result in prefetching data from beyond the end of memory or
> > > falling off into one of the holes on IPF and AMD. It causes a machine check
> > > when that happens.
> > > It doesn't happen on Proliant because the last 4kB (or so) of memory is
> > > mapped out by the BIOS and Pentium guarantees contiguous memory.
> >
> > I think that this:
> >
> > > > #if defined(CONFIG_IA64) || defined(CONFIG_X86_64)
> >
> > is nowhere near strong enough and is probably inappropriate.
> >
> > It _could_ be that CONFIG_DISCONTIGMEM|CONFIG_SPARSEMEM will be closer, but
> > even CONFIG_FLATMEM systems can have holes.
>
> I'm poking around on some IPF platforms. It looks like CONFIG_DISCONTIGMEM is
> set on them, but not the others you mention. Would that be sufficient?

I don't think so. All machines in all memory models can and do have holes
in their memory map. I think the problem is that some machines object to
having those holes read from and others do not. It could be that this
problem is purely an ia64 thing.

And it's not just holes: we had a problem a year or so back where CPU
prefetching was walking off the end of real memory and into the AGP region
and was causing weird cache coherency problems on x86_64 (or something like
that).

> >
> > On what machines can/does this card exist? Things like powerpc?
>
> This problem was found on Itanium. We don't try to support powerpc.

Well the CCISS driver presently has no architecture Kconfig dependencies,
so anyone can build it on anything. I don't know whether it's physically
possible to put a cciss controller into a power/sparc/whatever machine -
are these controllers only ever integrated onto the main board?

Anyway, I'd suggest the best way of sorting this out is to come up with a
complete description of the problem, decide which architectures are
affected and to then ask the relevant architecture maintainers to recommend
a solution.

I think the description would be

There is a bug in the DMA engine that may result in prefetching
data from beyond the end of memory or falling off into one of the holes on
IPF and AMD. It causes a machine check when that happens.

It doesn't happen on Proliant because the last 4kB (or so) of memory is
mapped out by the BIOS and Pentium guarantees contiguous memory.

If the platform is vulnerable to this then the driver's prefetching needs
to be disabled at compile-time or, preferably, initialization-time. What
is the best means by which we can determine whether the platform needs
this treatment?


(the patch didn't compile, btw: there's no definition of I2O0_DMA1_CFG)
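
The initialization-time alternative suggested above could be sketched roughly as follows: instead of an `#if` on the architecture, the driver would ask at probe time whether the platform is affected. This is only an illustration under stated assumptions. The `demo_platform` structure and its `faults_on_stray_dma_reads` flag are hypothetical, and how a real kernel would fill in such a flag is exactly the open question in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BOARD_ID_P600 0x3225103Cu /* board ID tested by the patch */

/* Hypothetical description of the platform, filled in at init time.
 * Where this information would come from is not settled in the thread. */
struct demo_platform {
    bool faults_on_stray_dma_reads; /* e.g. machine check on reading a hole */
};

/* Decide at initialization time whether DMA prefetch must be turned off:
 * only for the P600, and only on platforms that object to stray reads. */
static bool demo_must_disable_prefetch(uint32_t board_id,
                                       const struct demo_platform *platform)
{
    return board_id == DEMO_BOARD_ID_P600 &&
           platform->faults_on_stray_dma_reads;
}
```

With a check of this shape, one driver binary could serve both affected and unaffected machines, at the cost of needing a reliable way to detect the affected ones.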