2009-04-24 18:35:29

by Ira W. Snyder

[permalink] [raw]
Subject: [PATCH] fsldma: use PCI Read Multiple command

By default, the Freescale 83xx DMA controller uses the PCI Read Line
command when reading data over the PCI bus. Setting the controller to use
the PCI Read Multiple command instead allows the controller to read much
larger bursts of data, which provides a drastic speed increase.

The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI
bridge was between the devices trying to communicate.

A simple test driver showed an increase from 4MB/sec to 116MB/sec when
performing DMA over the PCI bus. Using DMA to transfer between blocks of
local SDRAM showed no change in performance with this patch. The dmatest
driver was also used to verify the correctness of the transfers, and showed
no errors.

Signed-off-by: Ira W. Snyder <[email protected]>
---
drivers/dma/fsldma.c | 5 +++--
drivers/dma/fsldma.h | 1 +
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index da8a8ed..638d00e 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -49,9 +49,10 @@ static void dma_init(struct fsl_dma_chan *fsl_chan)
case FSL_DMA_IP_83XX:
/* Set the channel to below modes:
* EOTIE - End-of-transfer interrupt enable
+ * PRC_RM - PCI read multiple
*/
- DMA_OUT(fsl_chan, &fsl_chan->reg_base->mr, FSL_DMA_MR_EOTIE,
- 32);
+ DMA_OUT(fsl_chan, &fsl_chan->reg_base->mr, FSL_DMA_MR_EOTIE
+ | FSL_DMA_MR_PRC_RM, 32);
break;
}

diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index 4f21a51..dc7f268 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -38,6 +38,7 @@

/* Special MR definition for MPC8349 */
#define FSL_DMA_MR_EOTIE 0x00000080
+#define FSL_DMA_MR_PRC_RM 0x00000800

#define FSL_DMA_SR_CH 0x00000020
#define FSL_DMA_SR_PE 0x00000010
--
1.5.4.3


2009-04-27 07:48:44

by Li Yang

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Sat, Apr 25, 2009 at 2:35 AM, Ira Snyder <[email protected]> wrote:
> By default, the Freescale 83xx DMA controller uses the PCI Read Line
> command when reading data over the PCI bus. Setting the controller to use
> the PCI Read Multiple command instead allows the controller to read much
> larger bursts of data, which provides a drastic speed increase.
>
> The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI
> bridge was between the devices trying to communicate.
>
> A simple test driver showed an increase from 4MB/sec to 116MB/sec when
> performing DMA over the PCI bus. Using DMA to transfer between blocks of
> local SDRAM showed no change in performance with this patch. The dmatest
> driver was also used to verify the correctness of the transfers, and showed
> no errors.
>
> Signed-off-by: Ira W. Snyder <[email protected]>

Acked-by: Li Yang <[email protected]>

Thanks,
Leo

2009-04-27 09:09:30

by Liu Dave-R63238

[permalink] [raw]
Subject: RE: [PATCH] fsldma: use PCI Read Multiple command

> By default, the Freescale 83xx DMA controller uses the PCI Read Line
> command when reading data over the PCI bus. Setting the
> controller to use the PCI Read Multiple command instead allows the
> controller to read much larger bursts of data, which provides a
drastic
> speed increase.

IIRC, the default for 83xx DMA controller uses the PCI mem read
command, not mem read line.

You are assuming the PCI memory space is prefetchable( no side effect)
for DMA.
Is it possible that DMA is from non-prefetchable memory space?

> The slowdown due to using PCI Read Line was only observed
> when a PCI-to-PCI bridge was between the devices trying to
communicate.
>
> A simple test driver showed an increase from 4MB/sec to 116MB/sec when
> performing DMA over the PCI bus. Using DMA to transfer between blocks
> of local SDRAM showed no change in performance with this patch.
> The dmatest driver was also used to verify the correctness of the
transfers,
> and showed no errors.

2009-04-27 10:16:31

by Li Yang

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Mon, Apr 27, 2009 at 5:09 PM, Liu Dave-R63238 <[email protected]> wrote:
>> By default, the Freescale 83xx DMA controller uses the PCI Read Line
>> command when reading data over the PCI bus. Setting the
>> controller to use the PCI Read Multiple command instead allows the
>> controller to read much larger bursts of data, which provides a
> drastic
>> speed increase.
>
> IIRC, the default for 83xx DMA controller uses the PCI mem read
> command, not mem read line.
>
> You are assuming the PCI memory space is prefetchable( no side effect)
> for DMA.
> Is it possible that DMA is from non-prefetchable memory space?

I guess it's not reasonable to use DMA from non-prefetchable memory.
So it's up to the driver which uses the DMA engine to prevent from
using DMA API on non-prefetchable memory.

- Leo

2009-04-27 14:39:23

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Mon, Apr 27, 2009 at 4:09 AM, Liu Dave-R63238 <[email protected]> wrote:

> You are assuming the PCI memory space is prefetchable( no side effect)
> for DMA.
> Is it possible that DMA is from non-prefetchable memory space?

This should be a safe assumption for this driver. Remember, this
driver just does offload memcpy, from one region to another. So the
PCI memory that you are reading from should be just a buffer of data,
and there should be side-effect of reading it.

However, I would like to see a comment at the top of the file warning
people that copying from PCI memory will result in prefetched reads.

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-27 19:34:31

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Hi all,

>> You are assuming the PCI memory space is prefetchable( no side effect)
>> for DMA. Is it possible that DMA is from non-prefetchable memory space?
>
> This should be a safe assumption for this driver. Remember, this
> driver just does offload memcpy, from one region to another. So the
> PCI memory that you are reading from should be just a buffer of data,
> and there should be side-effect of reading it.
>
> However, I would like to see a comment at the top of the file warning
> people that copying from PCI memory will result in prefetched reads.
>

Here's a few results from DMA tests between two
MPC8349EA boards housed in a CPCI chassis.

1. DMA mode register (DMAMRn)
PCI read command (PRC, bits 11:10)

a) DMAMRn[PRC] = 00 = PCI Read

The PCI read command is 6h on the PCI bus.
For DMA lengths of less than 1 cache line (32-bytes)
the DMA controller will generate a PCI 6h command.
However, for lengths of 32-bytes and higher, the
DMA controller actually generates a PCI Read Line (Eh)
command.

Freescale indicated that this 'change of PCI command code'
functionality is an undocumented 'feature', there to
improve performance for longer read lengths.

b) DMAMRn[PRC] = 01 = PCI Read Line

Generated the PCI command code for PCI read line (Eh),
regardless of DMA length.

c) DMAMRn[PRC] = 10 = PCI Read Multiple

Generated the PCI command code for PCI Read Multiple (Ch),
regardless of DMA length.

2. DMA from areas marked as non-prefetchable.

We setup two test cases:

a) Two boards in the same PCI segment with no intervening
bridges.

b) Two boards in separate PCI segments with intervening
bridges (in this case, there was two bridges on SBS
CPCI-to-CPI backplane bridging cards).

The PCI window to the IMMR registers on the boards is marked
as non-prefetchable. DMA from that area (reads) appeared on the
PCI bus as single 32-bit read transactions, i.e., the target
device effectively ignores the PCI read command, and the target
returns read data as single reads, i.e., the target acts as
as non-prefetchable memory.

We tested with the PCI Read Multiple command, and no data
was ever prefetched from the target. The bridges did not
prefetch and discard data, eg. we tried 36-bytes, and there
were 9 separate 4-byte transactions (we were using a 32-bit
PCI host for this test). Hence, at least for this example
target device, there are no side effects to using PCI
read multiple on the bus master, to a target PCI memory
region marked as non-prefetchable.

So even though the bridges were seeing transactions that
indicated read multiple, only single read cycles were
seen to be used.

Would you like some sort of summary of this info for a commit
message?

Would you like us to check any other transaction/register combos?

Cheers,
Dave




2009-04-27 19:41:11

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

David Hawkins wrote:

> Would you like some sort of summary of this info for a commit
> message?

That's probably overkill. I just want a sentence or two that tells
someone looking at the code casually that the behavior of reading PCI
memory might be different than what they expect.

> Would you like us to check any other transaction/register combos?

Yes, could you try this on non-PCI memory?

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-27 19:48:21

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command



>> Would you like some sort of summary of this info for a commit
>> message?
>
> That's probably overkill. I just want a sentence or two that tells
> someone looking at the code casually that the behavior of reading PCI
> memory might be different than what they expect.

Ok, will-do.

>> Would you like us to check any other transaction/register combos?
>
> Yes, could you try this on non-PCI memory?

We've been using it to DMA between the x86 host main memory and
the MPC8349EA boards (PCI targets). The reason we changed to
Read Multiple was that it had a dramatic improvement in
efficiency through bridges. However, the x86 host memory
is prefetchable, so is consistent with the use of Read Multiple.

Can you give me an example of non-PCI memory that would be
non-prefetchable that you'd like us to try? We can see if our
host CPUs have an area like that ... we just need to know
what device to look for first :)

Cheers,
Dave


2009-04-27 19:56:50

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


On Apr 27, 2009, at 2:48 PM, David Hawkins wrote:

> Can you give me an example of non-PCI memory that would be
> non-prefetchable that you'd like us to try? We can see if our
> host CPUs have an area like that ... we just need to know
> what device to look for first :)

You can mark the pci inbound window on the 83xx as non-prefetchable
(assuming 83xx is host). On a x86 host I doubt there is any easy way
to get non-prefetchable memory.

- k

2009-04-27 20:00:31

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


>> Can you give me an example of non-PCI memory that would be
>> non-prefetchable that you'd like us to try? We can see if our
>> host CPUs have an area like that ... we just need to know
>> what device to look for first :)
>
> You can mark the pci inbound window on the 83xx as non-prefetchable
> (assuming 83xx is host). On a x86 host I doubt there is any easy way to
> get non-prefetchable memory.

Yep, we were going to do that, but chose to use the
1MB region already setup for the IMMRs since its already
marked as non-prefetchable. We were only doing reads, so
it wasn't going to hurt anything.

I doubt that marking one of the other BAR regions
as non-prefetchable will give a different result.
However, we're more than happy to double-check if
you'd like.

Cheers,
Dave

2009-04-27 20:01:45

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command



>> You can mark the pci inbound window on the 83xx as non-prefetchable
>> (assuming 83xx is host). On a x86 host I doubt there is any easy way
>> to get non-prefetchable memory.

One more note; we don't have access to a host-mode MPC8349EA,
our boards are all targets.

Cheers,
Dave

2009-04-27 20:04:29

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


On Apr 27, 2009, at 3:00 PM, David Hawkins wrote:

>
>>> Can you give me an example of non-PCI memory that would be
>>> non-prefetchable that you'd like us to try? We can see if our
>>> host CPUs have an area like that ... we just need to know
>>> what device to look for first :)
>> You can mark the pci inbound window on the 83xx as non-prefetchable
>> (assuming 83xx is host). On a x86 host I doubt there is any easy
>> way to get non-prefetchable memory.
>
> Yep, we were going to do that, but chose to use the
> 1MB region already setup for the IMMRs since its already
> marked as non-prefetchable. We were only doing reads, so
> it wasn't going to hurt anything.
>
> I doubt that marking one of the other BAR regions
> as non-prefetchable will give a different result.
> However, we're more than happy to double-check if
> you'd like.

Its possible you'll get a different result since IMMR is a register
space internal and thats normally a completely different bus than
memory would be (internal to the 83xx). I'd suggest double-checking w/
a BAR marked non-prefetch pointing to real memory.

- k

2009-04-27 20:05:12

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

David Hawkins wrote:

> Can you give me an example of non-PCI memory that would be
> non-prefetchable that you'd like us to try? We can see if our
> host CPUs have an area like that ... we just need to know
> what device to look for first :)

Hmmmm.... I was going to say any SOC device in the IMMR, but I don't see
anything there that would constitute a memory buffer.

I test this change on an 8610 and DMA to a register I/O, where this bit
isn't even defined, and it made no difference. So I guess this change
is okay.

Acked-by: Timur Tabi <[email protected]>

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-27 20:13:10

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


>>>> Can you give me an example of non-PCI memory that would be
>>>> non-prefetchable that you'd like us to try? We can see if our
>>>> host CPUs have an area like that ... we just need to know
>>>> what device to look for first :)
>>> You can mark the pci inbound window on the 83xx as non-prefetchable
>>> (assuming 83xx is host). On a x86 host I doubt there is any easy way
>>> to get non-prefetchable memory.
>>
>> Yep, we were going to do that, but chose to use the
>> 1MB region already setup for the IMMRs since its already
>> marked as non-prefetchable. We were only doing reads, so
>> it wasn't going to hurt anything.
>>
>> I doubt that marking one of the other BAR regions
>> as non-prefetchable will give a different result.
>> However, we're more than happy to double-check if
>> you'd like.
>
> Its possible you'll get a different result since IMMR is a register
> space internal and thats normally a completely different bus than memory
> would be (internal to the 83xx). I'd suggest double-checking w/a BAR
> marked non-prefetch pointing to real memory.

We had a 4k BAR1 setup to point to DDR memory.

With prefetchable set, a 36-byte transfer generated a
burst-of-8 32-bit words followed by a single transaction.

With non-prefetchable set, the transfers were all single.

So it works like we'd expect.

Cheers,
Dave

2009-04-27 20:23:00

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Hi Timur,

>> Would you like some sort of summary of this info for a commit
>> message?
>
> That's probably overkill. I just want a sentence or two that tells
> someone looking at the code casually that the behavior of reading PCI
> memory might be different than what they expect.

Could you help us with the wording you'd like to see in the code.
Did you want to see something in the header comments, or something
near the register settings?

How about something like this in place of the existing PCI_RM
comment:

PRC_RM - PCI read multiple
The default PCI read command used by the DMA controller is
PCI Read (PCI command 6h). When the burst length is 32-bytes
or longer, PCI Read Line (PCI command Eh) is used (undocumented
feature of the controller). Using PCI read multiple
(PCI command Ch) results in high-performance across PCI
bridges. DMA transfers to non-prefetchable PCI registers
should not result in prefetched reads, even when using
the PCI read multiple command.


Cheers,
Dave


2009-04-27 20:26:54

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

David Hawkins wrote:

> PRC_RM - PCI read multiple
> The default PCI read command used by the DMA controller is
> PCI Read (PCI command 6h). When the burst length is 32-bytes
> or longer, PCI Read Line (PCI command Eh) is used (undocumented
> feature of the controller). Using PCI read multiple
> (PCI command Ch) results in high-performance across PCI
> bridges. DMA transfers to non-prefetchable PCI registers
> should not result in prefetched reads, even when using
> the PCI read multiple command.

I was thinking more along the lines of:

"This driver tells the DMA controller to use the PCI Read Multiple
command, instead of the PCI Read Line command, for PCI read operations.
Please be aware that this setting may result in read pre-fetching on
some platforms."

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-27 20:41:50

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Hi Timur,

>> PRC_RM - PCI read multiple
>> The default PCI read command used by the DMA controller is
>> PCI Read (PCI command 6h). When the burst length is 32-bytes
>> or longer, PCI Read Line (PCI command Eh) is used (undocumented
>> feature of the controller). Using PCI read multiple
>> (PCI command Ch) results in high-performance across PCI
>> bridges. DMA transfers to non-prefetchable PCI registers
>> should not result in prefetched reads, even when using
>> the PCI read multiple command.
>
> I was thinking more along the lines of:
>
> "This driver tells the DMA controller to use the PCI Read Multiple
> command, instead of the PCI Read Line command, for PCI read operations.
> Please be aware that this setting may result in read pre-fetching on
> some platforms."

Ok, thanks.

Ira will add your comment to the body of the code near
the PRC_RM command and submit a new patch.

Cheers,
Dave

2009-04-27 20:42:28

by Ira W. Snyder

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Mon, Apr 27, 2009 at 03:26:36PM -0500, Timur Tabi wrote:
> David Hawkins wrote:
>
> > PRC_RM - PCI read multiple
> > The default PCI read command used by the DMA controller is
> > PCI Read (PCI command 6h). When the burst length is 32-bytes
> > or longer, PCI Read Line (PCI command Eh) is used (undocumented
> > feature of the controller). Using PCI read multiple
> > (PCI command Ch) results in high-performance across PCI
> > bridges. DMA transfers to non-prefetchable PCI registers
> > should not result in prefetched reads, even when using
> > the PCI read multiple command.
>
> I was thinking more along the lines of:
>
> "This driver tells the DMA controller to use the PCI Read Multiple
> command, instead of the PCI Read Line command, for PCI read operations.
> Please be aware that this setting may result in read pre-fetching on
> some platforms."
>

How about the following revised patch? Is it ok to inline it this way,
or should I send another email to the list containing just this patch?

>From 73e42fa58c93de8d4d429ba8e069b60c42037b58 Mon Sep 17 00:00:00 2001
From: Ira W. Snyder <[email protected]>
Date: Thu, 23 Apr 2009 16:17:54 -0700
Subject: [PATCH] fsldma: use PCI Read Multiple command

By default, the Freescale 83xx DMA controller uses the PCI Read Line
command when reading data over the PCI bus. Setting the controller to use
the PCI Read Multiple command instead allows the controller to read much
larger bursts of data, which provides a drastic speed increase.

The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI
bridge was between the devices trying to communicate.

A simple test driver showed an increase from 4MB/sec to 116MB/sec when
performing DMA over the PCI bus. Using DMA to transfer between blocks of
local SDRAM showed no change in performance with this patch. The dmatest
driver was also used to verify the correctness of the transfers, and showed
no errors.

Signed-off-by: Ira W. Snyder <[email protected]>
---

v1 -> v2:
* Added comment warning about possible prefetching

drivers/dma/fsldma.c | 10 ++++++++--
drivers/dma/fsldma.h | 1 +
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index da8a8ed..5943ca8 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -12,6 +12,11 @@
* also fit for MPC8560, MPC8555, MPC8548, MPC8641, and etc.
* The support for MPC8349 DMA contorller is also added.
*
+ * This driver instructs the DMA controller to issue the PCI Read Multiple
+ * command for PCI read operations, instead of using the default PCI Read Line
+ * command. Please be aware that this setting may result in read pre-fetching
+ * on some platforms.
+ *
* This is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
@@ -49,9 +54,10 @@ static void dma_init(struct fsl_dma_chan *fsl_chan)
case FSL_DMA_IP_83XX:
/* Set the channel to below modes:
* EOTIE - End-of-transfer interrupt enable
+ * PRC_RM - PCI read multiple
*/
- DMA_OUT(fsl_chan, &fsl_chan->reg_base->mr, FSL_DMA_MR_EOTIE,
- 32);
+ DMA_OUT(fsl_chan, &fsl_chan->reg_base->mr, FSL_DMA_MR_EOTIE
+ | FSL_DMA_MR_PRC_RM, 32);
break;
}

diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index 4f21a51..dc7f268 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -38,6 +38,7 @@

/* Special MR definition for MPC8349 */
#define FSL_DMA_MR_EOTIE 0x00000080
+#define FSL_DMA_MR_PRC_RM 0x00000800

#define FSL_DMA_SR_CH 0x00000020
#define FSL_DMA_SR_PE 0x00000010
--
1.5.4.3

2009-04-27 20:43:21

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

David Hawkins wrote:

> Ira will add your comment to the body of the code near
> the PRC_RM command and submit a new patch.

I'd rather have it near the top where people can see it.

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-27 20:44:55

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Timur Tabi wrote:
> David Hawkins wrote:
>
>> Ira will add your comment to the body of the code near
>> the PRC_RM command and submit a new patch.
>
> I'd rather have it near the top where people can see it.

Looks like Ira had the same thought :)

Dave

2009-04-27 20:47:21

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Adding Kumar to the CC: list, since he might pick up the patch.

Ira Snyder wrote:

> From 73e42fa58c93de8d4d429ba8e069b60c42037b58 Mon Sep 17 00:00:00 2001
> From: Ira W. Snyder <[email protected]>
> Date: Thu, 23 Apr 2009 16:17:54 -0700
> Subject: [PATCH] fsldma: use PCI Read Multiple command
>
> By default, the Freescale 83xx DMA controller uses the PCI Read Line
> command when reading data over the PCI bus. Setting the controller to use
> the PCI Read Multiple command instead allows the controller to read much
> larger bursts of data, which provides a drastic speed increase.
>
> The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI
> bridge was between the devices trying to communicate.
>
> A simple test driver showed an increase from 4MB/sec to 116MB/sec when
> performing DMA over the PCI bus. Using DMA to transfer between blocks of
> local SDRAM showed no change in performance with this patch. The dmatest
> driver was also used to verify the correctness of the transfers, and showed
> no errors.
>
> Signed-off-by: Ira W. Snyder <[email protected]>

Acked-by: Timur Tabi <[email protected]>

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-27 20:48:20

by Scott Wood

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Mon, Apr 27, 2009 at 03:04:49PM -0500, Timur Tabi wrote:
> David Hawkins wrote:
>
> > Can you give me an example of non-PCI memory that would be
> > non-prefetchable that you'd like us to try? We can see if our
> > host CPUs have an area like that ... we just need to know
> > what device to look for first :)
>
> Hmmmm.... I was going to say any SOC device in the IMMR, but I don't see
> anything there that would constitute a memory buffer.
>
> I test this change on an 8610 and DMA to a register I/O, where this bit
> isn't even defined, and it made no difference. So I guess this change
> is okay.

I thought the driver only used the bit if the device tree indicated it
was an 83xx-era DMA controller?

That said, the bits are documented as specifically for PCI, so it would
be surprising if it had any effect elsewhere.

-Scott

2009-04-27 20:49:48

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]> wrote:
> Adding Kumar to the CC: list, since he might pick up the patch.
>

Acked-by: Dan Williams <[email protected]>

I agree with taking this through Kumar's tree.

2009-04-27 20:50:24

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Scott Wood wrote:

> I thought the driver only used the bit if the device tree indicated it
> was an 83xx-era DMA controller?

I just wanted to make sure it didn't do anything weird. It was the only
test I could think of that didn't involve PCI.

> That said, the bits are documented as specifically for PCI, so it would
> be surprising if it had any effect elsewhere.

Surely you wouldn't really be surprised by incorrect documentation of
our parts. :-)

--
Timur Tabi
Linux kernel developer at Freescale

2009-04-28 01:31:40

by Liu Dave-R63238

[permalink] [raw]
Subject: RE: [PATCH] fsldma: use PCI Read Multiple command

> > You are assuming the PCI memory space is prefetchable( no
> > side effect) for DMA.
> > Is it possible that DMA is from non-prefetchable memory space?
>
> This should be a safe assumption for this driver. Remember, this
> driver just does offload memcpy, from one region to another. So the
> PCI memory that you are reading from should be just a buffer of data,
> and there should be side-effect of reading it.
>
> However, I would like to see a comment at the top of the file warning
> people that copying from PCI memory will result in prefetched reads.

How about FIFO RAM case?

2009-04-28 01:37:21

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


> How about FIFO RAM case?

If the FIFO has a fixed address, then according to
the user guide, the DMA controller won't generate
a burst transaction to it.

We can try confirming this if you'd like.

Cheers,
Dave



2009-04-28 01:49:22

by Liu Dave-R63238

[permalink] [raw]
Subject: RE: [PATCH] fsldma: use PCI Read Multiple command

> Here's a few results from DMA tests between two
> MPC8349EA boards housed in a CPCI chassis.
>
> 1. DMA mode register (DMAMRn)
> PCI read command (PRC, bits 11:10)
>
> a) DMAMRn[PRC] = 00 = PCI Read
>
> The PCI read command is 6h on the PCI bus.
> For DMA lengths of less than 1 cache line (32-bytes)
> the DMA controller will generate a PCI 6h command.
> However, for lengths of 32-bytes and higher, the
> DMA controller actually generates a PCI Read Line (Eh)
> command.
>
> Freescale indicated that this 'change of PCI command code'
> functionality is an undocumented 'feature', there to
> improve performance for longer read lengths.
>
> b) DMAMRn[PRC] = 01 = PCI Read Line
>
> Generated the PCI command code for PCI read line (Eh),
> regardless of DMA length.
>
> c) DMAMRn[PRC] = 10 = PCI Read Multiple
>
> Generated the PCI command code for PCI Read Multiple (Ch),
> regardless of DMA length.

Good summary!

For the DMA PCI read/line/multi-line is outbound transaction.
So according to your experiment, the 8349 PCI controller(as master)
attemp to streaming/combining the outbound transaction(treated as
prefetchable
space).
IIRC, the early 8349EUM has the bit[2]-SE in the POCMRn
register, and is removed now. Not sure if it does function.

2009-04-28 02:07:44

by Liu Dave-R63238

[permalink] [raw]
Subject: RE: [PATCH] fsldma: use PCI Read Multiple command

> >> You can mark the pci inbound window on the 83xx as
> >> non-prefetchable(assuming 83xx is host). On a x86 host
> >> I doubt there is any easy way to get non-prefetchable memory.

> One more note; we don't have access to a host-mode MPC8349EA,
> our boards are all targets.

Actually we also can use the inbound window(w/wo prefetchable)
in the agent mode. It doesn't care if it is host or agent.

2009-04-28 02:08:52

by David Hawkins

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Hi Dave,

> For the DMA PCI read/line/multi-line is outbound transaction.
> So according to your experiment, the 8349 PCI controller(as master)
> attemp to streaming/combining the outbound transaction(treated as
> prefetchable space).

Yep, with the MPC8349EA configured as a PCI Target,
and operating as a Bus Master, DMA transfers between
two MPC8349EA targets to prefetchable memory on the
slave will burst at pretty much full-speed over the
PCI bus. The same goes for DMA to the host memory.
However, reading from the host is slower, as the bursts
get chopped up more than they do between two target
boards. At some point I'll get a bunch of screen
captures and put them in a document.

> IIRC, the early 8349EUM has the bit[2]-SE in the POCMRn
> register, and is removed now. Not sure if it does function.

Hey yeah, I looked in the 8349E manual, and it is defined.
I'm not sure why it would be defined there though. I can't
think of why the master would want to disable streaming
based on a bit setting; it should burst until the IOS
says its full. Anyway, the bit is gone now, so we'll
just ignore the fact it existed :)

Cheers,
Dave

2009-04-28 13:45:18

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

David Hawkins wrote:
>> How about FIFO RAM case?
>
> If the FIFO has a fixed address, then according to
> the user guide, the DMA controller won't generate
> a burst transaction to it.
>
> We can try confirming this if you'd like.

Like I said earlier, this driver does not support copying data to/from a
single register address. It's just a off-loaded memcpy. So we don't
need to worry about what happens to a fixed address.

--
Timur Tabi
Linux kernel developer at Freescale

2009-06-11 02:47:57

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


On Apr 27, 2009, at 3:49 PM, Dan Williams wrote:

> On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]>
> wrote:
>> Adding Kumar to the CC: list, since he might pick up the patch.
>>
>
> Acked-by: Dan Williams <[email protected]>
>
> I agree with taking this through Kumar's tree.

I'm going through patches for .31.. Should I still pick this up?
Going forward should I pick up fsldma patches?

- k

2009-06-11 15:17:56

by Ira W. Snyder

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote:
>
> On Apr 27, 2009, at 3:49 PM, Dan Williams wrote:
>
>> On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]>
>> wrote:
>>> Adding Kumar to the CC: list, since he might pick up the patch.
>>>
>>
>> Acked-by: Dan Williams <[email protected]>
>>
>> I agree with taking this through Kumar's tree.
>
> I'm going through patches for .31.. Should I still pick this up? Going
> forward should I pick up fsldma patches?
>

I'm fine with that, but you should probably talk to Li Yang (added to
CC). He's gotten in contact with me a few times recently.

In addition to the PCI Read Multiple patch, I've posted a few more
fsldma patches to ppcdev recently:

fsldma: enable external start for the 83xx controller
fsldma: do not clear bandwidth control bits on the 83xx controller
fsldma: Add DMA_SLAVE support

The first two are very straightforward. The third probably needs some
review before it can be picked up, though. It is well tested, I've been
using it for a few weeks without issues.

Thanks,
Ira

2009-06-12 09:23:59

by Li Yang

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyder<[email protected]> wrote:
> On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote:
>>
>> On Apr 27, 2009, at 3:49 PM, Dan Williams wrote:
>>
>>> On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]>
>>> wrote:
>>>> Adding Kumar to the CC: list, since he might pick up the patch.
>>>>
>>>
>>> Acked-by: Dan Williams <[email protected]>
>>>
>>> I agree with taking this through Kumar's tree.
>>
>> I'm going through patches for .31.. Should I still pick this up?  Going
>> forward should I pick up fsldma patches?
>>
>
> I'm fine with that, but you should probably talk to Li Yang (added to
> CC). He's gotten in contact with me a few times recently.

I am fine with both ways for this patch as it is only related to
Freescale register details. But in general I think patches should go
through functional subsystem, as they usually would need insight of
the subsystem architecture. I prefer the way that the patch acked or
signed-off by Freescale guys and push upstream through Dan's tree as
most other subsystems did. Unless Dan prefers to ack the subsystem
architectural part of each patch and have them pushed other way.

- Leo

2009-06-12 15:05:57

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


On Jun 12, 2009, at 4:23 AM, Li Yang wrote:

> On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyder<[email protected]>
> wrote:
>> On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote:
>>>
>>> On Apr 27, 2009, at 3:49 PM, Dan Williams wrote:
>>>
>>>> On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]>
>>>> wrote:
>>>>> Adding Kumar to the CC: list, since he might pick up the patch.
>>>>>
>>>>
>>>> Acked-by: Dan Williams <[email protected]>
>>>>
>>>> I agree with taking this through Kumar's tree.
>>>
>>> I'm going through patches for .31.. Should I still pick this up?
>>> Going
>>> forward should I pick up fsldma patches?
>>>
>>
>> I'm fine with that, but you should probably talk to Li Yang (added to
>> CC). He's gotten in contact with me a few times recently.
>
> I am fine with both ways for this patch as it is only related to
> Freescale register details. But in general I think patches should go
> through functional subsystem, as they usually would need insight of
> the subsystem architecture. I prefer the way that the patch acked or
> signed-off by Freescale guys and push upstream through Dan's tree as
> most other subsystems did. Unless Dan prefers to ack the subsystem
> architectural part of each patch and have them pushed other way.

I agree w/this and just wanting to see what Dan's preference is.

- k

2009-06-12 17:38:14

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command

Kumar Gala wrote:
> On Jun 12, 2009, at 4:23 AM, Li Yang wrote:
>
>> On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyder<[email protected]>
>> wrote:
>>> On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote:
>>>> On Apr 27, 2009, at 3:49 PM, Dan Williams wrote:
>>>>
>>>>> On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]>
>>>>> wrote:
>>>>>> Adding Kumar to the CC: list, since he might pick up the patch.
>>>>>>
>>>>> Acked-by: Dan Williams <[email protected]>
>>>>>
>>>>> I agree with taking this through Kumar's tree.
>>>> I'm going through patches for .31.. Should I still pick this up?
>>>> Going
>>>> forward should I pick up fsldma patches?
>>>>
>>> I'm fine with that, but you should probably talk to Li Yang (added to
>>> CC). He's gotten in contact with me a few times recently.
>> I am fine with both ways for this patch as it is only related to
>> Freescale register details. But in general I think patches should go
>> through functional subsystem, as they usually would need insight of
>> the subsystem architecture. I prefer the way that the patch acked or
>> signed-off by Freescale guys and push upstream through Dan's tree as
>> most other subsystems did. Unless Dan prefers to ack the subsystem
>> architectural part of each patch and have them pushed other way.
>
> I agree w/this and just wanting to see what Dan's preference is.

I'll take fsldma patches through the dmaengine tree with Leo's
ack/sign-off. That last request was a one-off because I had nothing
else to push and the discussion was very architecture specific.

Thanks,
Dan

2009-06-12 18:04:22

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH] fsldma: use PCI Read Multiple command


On Jun 12, 2009, at 12:38 PM, Dan Williams wrote:

> Kumar Gala wrote:
>> On Jun 12, 2009, at 4:23 AM, Li Yang wrote:
>>> On Thu, Jun 11, 2009 at 11:17 PM, Ira
>>> Snyder<[email protected]> wrote:
>>>> On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote:
>>>>> On Apr 27, 2009, at 3:49 PM, Dan Williams wrote:
>>>>>
>>>>>> On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi <[email protected]>
>>>>>> wrote:
>>>>>>> Adding Kumar to the CC: list, since he might pick up the patch.
>>>>>>>
>>>>>> Acked-by: Dan Williams <[email protected]>
>>>>>>
>>>>>> I agree with taking this through Kumar's tree.
>>>>> I'm going through patches for .31.. Should I still pick this
>>>>> up? Going
>>>>> forward should I pick up fsldma patches?
>>>>>
>>>> I'm fine with that, but you should probably talk to Li Yang
>>>> (added to
>>>> CC). He's gotten in contact with me a few times recently.
>>> I am fine with both ways for this patch as it is only related to
>>> Freescale register details. But in general I think patches should
>>> go
>>> through functional subsystem, as they usually would need insight of
>>> the subsystem architecture. I prefer the way that the patch acked
>>> or
>>> signed-off by Freescale guys and push upstream through Dan's tree as
>>> most other subsystems did. Unless Dan prefers to ack the subsystem
>>> architectural part of each patch and have them pushed other way.
>> I agree w/this and just wanting to see what Dan's preference is.
>
> I'll take fsldma patches through the dmaengine tree with Leo's ack/
> sign-off. That last request was a one-off because I had nothing
> else to push and the discussion was very architecture specific.

Sounds good to me. I expect you to pick up this patch for .31

- k