2005-12-20 10:11:31

by Deepak Saxena

Subject: IOAT comments


Andy & Chris,

Sorry for the very very delayed response on your patch. Some comments below.

- Embedded chipsets have had DMA engines on them for a long time, and
having a single cross-platform API that can be used to offload standard
kernel functions (memcpy, user_copy, etc.) is a good starting point for
making use of these. However, in addition to simple memory-to-memory
copying, the engines on embedded devices often support memory <-> I/O
space copying (accelerated memcpy_toio/memcpy_fromio).

What I would ideally like to see is an API that allows me to allocate a
DMA channel between system memory and a device struct:

dma_client_alloc_chan(struct dma_client *client, struct device *dev);

The core would then search the DMA controller list, calling some function
(dma_device->device_supported(dev)?) to determine whether a controller
can DMA to/from this device. Currently we have lots of CPU-specific DMA
APIs in the embedded architectures, and it would be nice to have a
well-defined API that all drivers across all architectures could use.

Passing a NULL dev would signify that we just want to do mem <-> mem DMA.
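
To make that a bit more concrete, here is a rough sketch of the sort of
thing I'm picturing (struct layouts and names are just placeholders, not
anything from your patch):

struct dma_client {
        struct device *dev;     /* NULL means plain mem <-> mem */
        struct dma_chan *chan;  /* filled in by the core */
};

struct dma_device {
        /* can this controller DMA to/from buffers owned by @dev? */
        int (*device_supported)(struct dma_device *dma_dev,
                                struct device *dev);
        /* ... */
};

/*
 * Walk the registered controllers and hand back a channel that can
 * reach @dev; dev == NULL just asks for any mem-to-mem channel.
 */
struct dma_chan *dma_client_alloc_chan(struct dma_client *client,
                                       struct device *dev);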

- Make the various copy functions static inlines since they are just doing
an extra function pointer dereference.
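
For example, something along these lines (purely illustrative, assuming
each channel points back at its controller's op table):

static inline dma_cookie_t dma_memcpy_buf_to_buf(struct dma_chan *chan,
                                                 void *dest, void *src,
                                                 size_t len)
{
        /* just forwards through the controller's function pointer, so
         * there is nothing to gain from a separate call frame */
        return chan->device->memcpy_buf_to_buf(chan, dest, src, len);
}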

- The API currently supports only 1 client per DMA channel. I think a
client should be able to ask for exclusive or non-exclusive usage of
a DMA channel, and the core can mark channels as unavailable when
exclusive use is required. This could be useful in systems with just
1 DMA channel but multiple applications wanting to use it.
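
One way to express that would be to grow the allocation call sketched
above with a mode argument (again, just an illustration):

enum dma_chan_mode {
        DMA_CHAN_SHARED,        /* other clients may use the channel too */
        DMA_CHAN_EXCLUSIVE,     /* mark the channel unavailable to others */
};

struct dma_chan *dma_client_alloc_chan(struct dma_client *client,
                                       struct device *dev,
                                       enum dma_chan_mode mode);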

- Add an interface that takes scatter gather lists as both source and
destination.
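
Something with a signature roughly like this (illustrative only):

/*
 * Copy from one scatterlist to another, returning a cookie the caller
 * can later poll or wait on for completion.
 */
dma_cookie_t dma_memcpy_sg_to_sg(struct dma_chan *chan,
                                 struct scatterlist *dst, int dst_nents,
                                 struct scatterlist *src, int src_nents);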

- DMA engine buffer alignment requirements? I've seen engines that
handle any source/destination alignment and ones that handle
only 32-bit or 64-bit aligned buffers. When a transfer doesn't meet
the DMA engine's alignment requirement, does the DMA device driver
handle this or does the DMA core handle this?
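
One possibility (purely illustrative) would be a capability field on the
controller struct, so either the core or the client can split the
transfer, bounce it, or fall back to a plain CPU memcpy() when a request
doesn't meet the constraint:

struct dma_device {
        /* ... */
        unsigned int copy_align;        /* required buffer alignment in
                                         * bytes, 1 == no restriction */
};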

- Can we get rid of the "async" in the various function names? I don't
know of any HW that implements a synchronous DMA engine! It would sort
of defeat the purpose. :)

- The user_dma code should be generic, not in net/ but in kernel/ or
in drivers/dma, as other subsystems and applications can probably
make use of this functionality.

~Deepak



--
Deepak Saxena - [email protected] - http://www.plexity.net


2005-12-20 17:13:12

by Chris Leech

Subject: Re: IOAT comments

On Tue, 2005-12-20 at 02:11 -0800, Deepak Saxena wrote:
>
> Andy & Chris,
>
> Sorry for the very very delayed response on your patch. Some comments
> below.

Thanks for the feedback. I'm working on fixing up a few things and
getting an updated set of patches out, so your timing's not bad at all.

> What I would ideally like to see is an API that allows me to allocate
> a DMA channel between system memory and a device struct:
>
> dma_client_alloc_chan(struct dma_client *client, struct device *dev);

I had something similar in my initial design; the I/OAT DMA engine can
operate on any memory mapped region "downstream" from the PCI device.
That's easy enough to add to the API, and we can flesh out the device
matching code in the ioatdma driver later.

> Make the various copy functions static inlines since they are just
> doing an extra function pointer dereference.

OK

> - The API currently supports only 1 client per DMA channel. I think a
> client should be able to ask for exclusive or non-exclusive usage of
> a DMA channel, and the core can mark channels as unavailable when
> exclusive use is required. This could be useful in systems with just
> 1 DMA channel but multiple applications wanting to use it.

We were trying not to share them, for simplicity. But with the way we
ended up using the DMA engine for TCP, we needed to get the locking
right for requests from multiple threads anyway. Shared channel use
shouldn't be hard to add; I'll look into it.
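
Roughly, the idea would be to serialize descriptor submission on a
per-channel lock so multiple clients can issue copies on the same
channel. A hand-wavy sketch (none of these names are in the current
patches):

dma_cookie_t dma_memcpy_buf_to_buf(struct dma_chan *chan,
                                   void *dest, void *src, size_t len)
{
        dma_cookie_t cookie;
        unsigned long flags;

        /* only the descriptor ring manipulation needs the lock;
         * completion status can still be polled per cookie */
        spin_lock_irqsave(&chan->desc_lock, flags);
        cookie = __dma_submit_memcpy(chan, dest, src, len);
        spin_unlock_irqrestore(&chan->desc_lock, flags);

        return cookie;
}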

> - Add an interface that takes scatter gather lists as both source and
> destination.

OK

> - DMA engine buffer alignment requirements? I've seen engines that
> handle any source/destination alignment and ones that handle
> only 32-bit or 64-bit aligned buffers. When a transfer doesn't meet
> the DMA engine's alignment requirement, does the DMA device driver
> handle this or does the DMA core handle this?

Any ideas of what this would look like?

> - Can we get rid of the "async" in the various function names? I don't
> know of any HW that implements a synchronous DMA engine! It would
> sort of defeat the purpose. :)

We tossed the async in after an early comment that the dma namespace
wasn't unique enough, especially with the DMA mapping APIs already in
place. Maybe hwdma? I'm open to suggestions.

> - The user_dma code should be generic, not in net/ but in kernel/ or
> in drivers/dma, as other subsystems and applications can probably
> make use of this functionality.

You're right. That file started out with dma_skb_copy_datagram_iovec,
and dma_memcpy_[pg_]toiovec were created to support that. Anything not
specifically tied to sk_buffs should be moved.

-Chris