Hi. I'm writing a new DMA driver, using dw_dmac driver in kernel
2.6.28 as a reference implementation.
I can see an outline of the dw_dmac driver and framework (well,
hopefully), but have some questions in details.
1. map/unmap DMA buffers for slave transfer
For slave-DMA, it seems dmac driver is responsible for mapping DMA
buffers, and client is responsible for unmapping them. Is it right?
Now dwc_descriptor_complete() unmaps DMA buffer unless
DMA_COMPL_SKIP_XXX_UNMAP set. These unmapping should be done only for
non-slave transfers? I.e. I suppose "if (!dwc->dws)" is required
around dma_unmap_page() pair.
I also wonder why dwc_prep_slave_sg() calculates total_len. The
total_len is stored in first descriptor of the chain and it is only
referenced for unmapping DMA buffers. But the scatterlist may contain
uncontinuous buffers, so they can not be unmaped at a time.
2. dwc_is_tx_complete() and dwc->lock
The function dwc_is_tx_complete() calls dwc_scan_descriptors() without
dwc->lock (and disabling bh). Is it safe? All other callers of
dwc_scan_descriptors() take dwc->lock and disable bh.
3. dwc->queue list management
The function dwc_tx_submit() add the descriptor to dwc->queue list if
active list was not empty. But it does not manage lli.llp list. And
all descriptors in the queue list will be moved to active list at a
time. So it seems non-first descriptors in queue list will never
processed by the hardware.
The dwc_tx_submit() should rewrite lli.llp of the last descriptor in
queue list (it it had children, the last children of it) by txd.phys
of newly queued descriptor. Or, only first elements of queue list
should be moved to active list at a time.
Is my analysis correct?
BTW, I found some redundant code in the driver.
* memset(dw, 0, sizeof *dw) seems redundant while dw is allocated
kzalloc().
* platform_get_drvdata(pdev) is called twice in dw_shutdown,
dw_suspend_late.
Both of them harmless.
---
Atsushi Nemoto
On Mon, 29 Dec 2008 23:39:32 +0900 (JST), Atsushi Nemoto <[email protected]> wrote:
> Hi. I'm writing a new DMA driver, using dw_dmac driver in kernel
> 2.6.28 as a reference implementation.
>
> I can see an outline of the dw_dmac driver and framework (well,
> hopefully), but have some questions in details.
One more.
4. This is a comment on head of dwc_handle_error().
/*
* The descriptor currently at the head of the active list is
* borked. Since we don't have any way to report errors, we'll
* just have to scream loudly and try to carry on.
*/
But, the bad descriptor can be at any place of the active list, no ?
For example, if the active list contained two descriptor and latter
was broken and tasklet was delayed by some reason, the head of the
list should be good.
---
Atsushi Nemoto
Atsushi Nemoto wrote:
> Hi. I'm writing a new DMA driver, using dw_dmac driver in kernel
> 2.6.28 as a reference implementation.
>
> I can see an outline of the dw_dmac driver and framework (well,
> hopefully), but have some questions in details.
>
>
> 1. map/unmap DMA buffers for slave transfer
>
> For slave-DMA, it seems dmac driver is responsible for mapping DMA
> buffers, and client is responsible for unmapping them. Is it right?
No, it's the other way around, unless DMA_COMPL_SKIP_*_UNMAP is set.
But I think the dw_dmac driver wrongly maps the buffers before queuing
them.
> Now dwc_descriptor_complete() unmaps DMA buffer unless
> DMA_COMPL_SKIP_XXX_UNMAP set. These unmapping should be done only for
> non-slave transfers? I.e. I suppose "if (!dwc->dws)" is required
> around dma_unmap_page() pair.
Yes, perhaps. Or we could just forcibly set those flags in
prep_slave_sg().
> I also wonder why dwc_prep_slave_sg() calculates total_len. The
> total_len is stored in first descriptor of the chain and it is only
> referenced for unmapping DMA buffers. But the scatterlist may contain
> uncontinuous buffers, so they can not be unmaped at a time.
Yes, it is indeed a bit pointless to keep the total length around for
scatterlists.
> 2. dwc_is_tx_complete() and dwc->lock
>
> The function dwc_is_tx_complete() calls dwc_scan_descriptors() without
> dwc->lock (and disabling bh). Is it safe? All other callers of
> dwc_scan_descriptors() take dwc->lock and disable bh.
No, that does indeed look unsafe. We should probably take dwc->lock and
disable bh around the call to dwc_scan_descriptors().
> 3. dwc->queue list management
>
> The function dwc_tx_submit() add the descriptor to dwc->queue list if
> active list was not empty. But it does not manage lli.llp list. And
> all descriptors in the queue list will be moved to active list at a
> time. So it seems non-first descriptors in queue list will never
> processed by the hardware.
>
> The dwc_tx_submit() should rewrite lli.llp of the last descriptor in
> queue list (it it had children, the last children of it) by txd.phys
> of newly queued descriptor. Or, only first elements of queue list
> should be moved to active list at a time.
>
> Is my analysis correct?
Yes, I think you're right. lli.llp of the last queued entry should be
updated when adding new entries to the queue.
>
> BTW, I found some redundant code in the driver.
>
> * memset(dw, 0, sizeof *dw) seems redundant while dw is allocated
> kzalloc().
Indeed.
> * platform_get_drvdata(pdev) is called twice in dw_shutdown,
> dw_suspend_late.
>
> Both of them harmless.
Right.
> 4. This is a comment on head of dwc_handle_error().
>
> /*
> * The descriptor currently at the head of the active list is
> * borked. Since we don't have any way to report errors, we'll
> * just have to scream loudly and try to carry on.
> */
>
> But, the bad descriptor can be at any place of the active list, no ?
> For example, if the active list contained two descriptor and latter
> was broken and tasklet was delayed by some reason, the head of the
> list should be good.
Since dwc_scan_descriptors() was just called, all descriptors that were
completed successfully have been removed from the active list. So the
first entry must be the broken one.
Haavard
On Mon, 5 Jan 2009 15:30:24 +0100, Haavard Skinnemoen <[email protected]> wrote:
> > 1. map/unmap DMA buffers for slave transfer
> >
> > For slave-DMA, it seems dmac driver is responsible for mapping DMA
> > buffers, and client is responsible for unmapping them. Is it right?
>
> No, it's the other way around, unless DMA_COMPL_SKIP_*_UNMAP is set.
> But I think the dw_dmac driver wrongly maps the buffers before queuing
> them.
Well, I'm confused... I reference atmel-mci for client and dw_dmac
for dma engine. Currently, dw_dmac calls dma_map_sg() and atmel-mci
calls dma_unmap_sg(). Do you mean atmel-mci will be changed to call
dma_map_sg()?
> > 4. This is a comment on head of dwc_handle_error().
> >
> > /*
> > * The descriptor currently at the head of the active list is
> > * borked. Since we don't have any way to report errors, we'll
> > * just have to scream loudly and try to carry on.
> > */
> >
> > But, the bad descriptor can be at any place of the active list, no ?
> > For example, if the active list contained two descriptor and latter
> > was broken and tasklet was delayed by some reason, the head of the
> > list should be good.
>
> Since dwc_scan_descriptors() was just called, all descriptors that were
> completed successfully have been removed from the active list. So the
> first entry must be the broken one.
But the removal from the active list is done in tasklet too. If the
CPU was slow and DMA engine was fast enough, DMA engine will process
multiple descriptors before the tasklet called, no ?
---
Atsushi Nemoto
Atsushi Nemoto wrote:
> On Mon, 5 Jan 2009 15:30:24 +0100, Haavard Skinnemoen <[email protected]> wrote:
>
> > > 1. map/unmap DMA buffers for slave transfer
> > >
> > > For slave-DMA, it seems dmac driver is responsible for mapping DMA
> > > buffers, and client is responsible for unmapping them. Is it right?
> >
> > No, it's the other way around, unless DMA_COMPL_SKIP_*_UNMAP is set.
> > But I think the dw_dmac driver wrongly maps the buffers before queuing
> > them.
>
> Well, I'm confused... I reference atmel-mci for client and dw_dmac
> for dma engine. Currently, dw_dmac calls dma_map_sg() and atmel-mci
> calls dma_unmap_sg(). Do you mean atmel-mci will be changed to call
> dma_map_sg()?
Yes, I think it should.
Preferably, I'd like the client to do both mapping and unmapping in all
cases since it knows best what's actually needed. For example, some
drivers may be using DMA-coherent buffers, which don't need to be
mapped at all.
> > > 4. This is a comment on head of dwc_handle_error().
> > >
> > > /*
> > > * The descriptor currently at the head of the active list is
> > > * borked. Since we don't have any way to report errors, we'll
> > > * just have to scream loudly and try to carry on.
> > > */
> > >
> > > But, the bad descriptor can be at any place of the active list, no ?
> > > For example, if the active list contained two descriptor and latter
> > > was broken and tasklet was delayed by some reason, the head of the
> > > list should be good.
> >
> > Since dwc_scan_descriptors() was just called, all descriptors that were
> > completed successfully have been removed from the active list. So the
> > first entry must be the broken one.
>
> But the removal from the active list is done in tasklet too. If the
> CPU was slow and DMA engine was fast enough, DMA engine will process
> multiple descriptors before the tasklet called, no ?
The DMA engine stops if it encounters a bad address.
Haavard
On Mon, 5 Jan 2009 16:48:47 +0100, Haavard Skinnemoen <[email protected]> wrote:
> > Well, I'm confused... I reference atmel-mci for client and dw_dmac
> > for dma engine. Currently, dw_dmac calls dma_map_sg() and atmel-mci
> > calls dma_unmap_sg(). Do you mean atmel-mci will be changed to call
> > dma_map_sg()?
>
> Yes, I think it should.
>
> Preferably, I'd like the client to do both mapping and unmapping in all
> cases since it knows best what's actually needed. For example, some
> drivers may be using DMA-coherent buffers, which don't need to be
> mapped at all.
I see, thanks.
> > > > 4. This is a comment on head of dwc_handle_error().
> > > >
> > > > /*
> > > > * The descriptor currently at the head of the active list is
> > > > * borked. Since we don't have any way to report errors, we'll
> > > > * just have to scream loudly and try to carry on.
> > > > */
> > > >
> > > > But, the bad descriptor can be at any place of the active list, no ?
> > > > For example, if the active list contained two descriptor and latter
> > > > was broken and tasklet was delayed by some reason, the head of the
> > > > list should be good.
> > >
> > > Since dwc_scan_descriptors() was just called, all descriptors that were
> > > completed successfully have been removed from the active list. So the
> > > first entry must be the broken one.
> >
> > But the removal from the active list is done in tasklet too. If the
> > CPU was slow and DMA engine was fast enough, DMA engine will process
> > multiple descriptors before the tasklet called, no ?
>
> The DMA engine stops if it encounters a bad address.
Yes. And it does not mean the bad descriptor is at head of the active
queue.
1) driver enqueue desc A to active list and start DMA engine
2) driver enqueue desc B to queue list
3) driver enqueue desc C to queue list (desc C contains bad address)
4) DMA engine finish desc A and raise interrupt
5) tasklet remove desc A from active list and move desc B and C to active list
6) DMA engine finish desc B and raise interrupt
7) DMA engine abort desc C and raise interrupt
8) tasklet detect error
The point is the order of (7) and (8) cannot be expected. If (7)
comes first, the head of the active list contains desc B at (8) and
that is not a bad descriptor.
---
Atsushi Nemoto
Atsushi Nemoto wrote:
> > > > Since dwc_scan_descriptors() was just called, all descriptors that were
> > > > completed successfully have been removed from the active list. So the
> > > > first entry must be the broken one.
> > >
> > > But the removal from the active list is done in tasklet too. If the
> > > CPU was slow and DMA engine was fast enough, DMA engine will process
> > > multiple descriptors before the tasklet called, no ?
> >
> > The DMA engine stops if it encounters a bad address.
>
> Yes. And it does not mean the bad descriptor is at head of the active
> queue.
>
> 1) driver enqueue desc A to active list and start DMA engine
> 2) driver enqueue desc B to queue list
> 3) driver enqueue desc C to queue list (desc C contains bad address)
> 4) DMA engine finish desc A and raise interrupt
> 5) tasklet remove desc A from active list and move desc B and C to active list
> 6) DMA engine finish desc B and raise interrupt
> 7) DMA engine abort desc C and raise interrupt
> 8) tasklet detect error
>
> The point is the order of (7) and (8) cannot be expected. If (7)
> comes first, the head of the active list contains desc B at (8) and
> that is not a bad descriptor.
The tasklet won't detect any errors unless the DMA controller flags it,
so (7) must happen before (8). That does not necessarily mean that the
interrupts from (6) and (7) get handled before (8), but I don't think
it matters because dwc_handle_error() calls dwc_scan_descriptors(),
which will remove desc B from the active list if it is finished. And it
must be finished if desc C has failed, so the failed descriptor will
always be at the head of the queue after dwc_scan_descriptors() returns.
Haavard
On Tue, 6 Jan 2009 11:14:03 +0100, Haavard Skinnemoen <[email protected]> wrote:
> > 1) driver enqueue desc A to active list and start DMA engine
> > 2) driver enqueue desc B to queue list
> > 3) driver enqueue desc C to queue list (desc C contains bad address)
> > 4) DMA engine finish desc A and raise interrupt
> > 5) tasklet remove desc A from active list and move desc B and C to active list
> > 6) DMA engine finish desc B and raise interrupt
> > 7) DMA engine abort desc C and raise interrupt
> > 8) tasklet detect error
> >
> > The point is the order of (7) and (8) cannot be expected. If (7)
> > comes first, the head of the active list contains desc B at (8) and
> > that is not a bad descriptor.
>
> The tasklet won't detect any errors unless the DMA controller flags it,
> so (7) must happen before (8). That does not necessarily mean that the
> interrupts from (6) and (7) get handled before (8), but I don't think
> it matters because dwc_handle_error() calls dwc_scan_descriptors(),
> which will remove desc B from the active list if it is finished. And it
> must be finished if desc C has failed, so the failed descriptor will
> always be at the head of the queue after dwc_scan_descriptors() returns.
Oh I understand. I overlooked dwc_scan_descriptors() call at
beginning of dwc_handle_error(). Thank you for explanation.
---
Atsushi Nemoto
On Mon, 05 Jan 2009 23:02:30 +0900 (JST), Atsushi Nemoto <[email protected]> wrote:
> On Mon, 29 Dec 2008 23:39:32 +0900 (JST), Atsushi Nemoto <[email protected]> wrote:
> > Hi. I'm writing a new DMA driver, using dw_dmac driver in kernel
> > 2.6.28 as a reference implementation.
> >
> > I can see an outline of the dw_dmac driver and framework (well,
> > hopefully), but have some questions in details.
>
> One more.
One more again.
5. dwc_issue_pending()
Is this needed? The dwc_tx_submit might queue the descriptor, but it
should be issued automatically after completion of all descriptors in
active list.
It yes, it seems implementation like this would be possible:
* dwc_tx_submit() just always add the descriptor to queue list even if
active list was empty.
* dwc_issue_pending() call dwc_dostart() and move queued descriptor(s)
to active list, if the active list was empty.
But I'm not sure "chain as many as possible at a time" is really
better then "start engine as soon as possible".
---
Atsushi Nemoto