2015-12-01 09:15:30

by Stanimir Varbanov

[permalink] [raw]
Subject: [PATCH 0/4] bam dma fixes and one dt extension

Here are three bam dma fixes, which I made during testing qcrypto
driver. The first one is a real fix, the second one is related to
peripherals on which the BAM IP is initialized by remote execution
environment, and the third one is a sleeping bug IMO.

The last patch is a extension again for peripherals which BAM is
remotely controlled by other execution environment.

All patches has been tested on db410c board with apq8016.

Comments are welcome!

regards,
Stan

Stanimir Varbanov (4):
dmaengine: qcom_bam_dma: fix dma free memory on remove
dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised
dmaengine: qcom_bam_dma: use correct pipe FIFO size
dmaengine: qcom_bam_dma: add controlled remotely dt property

.../devicetree/bindings/dma/qcom_bam_dma.txt | 2 ++
drivers/dma/qcom_bam_dma.c | 24 ++++++++++++++++----
2 files changed, 21 insertions(+), 5 deletions(-)

--
1.7.9.5


2015-12-01 09:16:09

by Stanimir Varbanov

[permalink] [raw]
Subject: [PATCH 1/4] dmaengine: qcom_bam_dma: fix dma free memory on remove

Building the driver as a module and when removing the already
inserted module gives below:

[ 1389.392788] Unable to handle kernel paging request at virtual address ffffffbdc000001c
[ 1389.421321] pgd = ffffffc02fa87000
[ 1389.447899] [ffffffbdc000001c] *pgd=0000000000000000, *pud=0000000000000000
[ 1389.460142] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[ 1389.466963] Modules linked in: qcom_bam_dma(-)
[ 1389.486608] CPU: 2 PID: 2442 Comm: rmmod Not tainted 4.2.0+ #407
[ 1389.493885] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 1389.501196] task: ffffffc035bae2c0 ti: ffffffc0368a8000 task.ti: ffffffc0368a8000
[ 1389.508566] PC is at __free_pages+0xc/0x40
[ 1389.515893] LR is at free_pages.part.93+0x30/0x38
[ 1389.523141] pc : [<ffffffc00016180c>] lr : [<ffffffc00016197c>] pstate: 80000145
[ 1389.530602] sp : ffffffc0368abc20
[ 1389.537931] x29: ffffffc0368abc20 x28: ffffffc0368a8000
[ 1389.549153] x27: 0000000000000000 x26: 0000000000000000
[ 1389.560412] x25: ffffffc000cb2000 x24: 0000000000000170
[ 1389.571530] x23: 0000000000000004 x22: ffffffc036bc5010
[ 1389.582721] x21: ffffffc036bc5010 x20: 0000000000000000
[ 1389.593981] x19: 0000000000000002 x18: 0000007fcbc8e8b0
[ 1389.605301] x17: 0000007f9b8226ec x16: ffffffc0002089e8
[ 1389.616647] x15: 0000007f9b8a0588 x14: 0ffffffffffffffc
[ 1389.628039] x13: 0000000000000030 x12: 0000000000000000
[ 1389.639436] x11: 0000000000000008 x10: ffffffc000ecc000
[ 1389.650872] x9 : ffffffc035bae2c0 x8 : ffffffc035bae9a8
[ 1389.662367] x7 : ffffffc035bae9a0 x6 : 0000000000000000
[ 1389.673906] x5 : ffffffbdc000001c x4 : 0000000080000000
[ 1389.685475] x3 : ffffffbdc0000000 x2 : 0000004080000000
[ 1389.697049] x1 : 0000000000000003 x0 : ffffffbdc0000000

The memory has been already freed by bam_free_chan() so fix this
by skiping already freed memory.

Signed-off-by: Stanimir Varbanov <[email protected]>
---
drivers/dma/qcom_bam_dma.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
index 5a250cdc8376..dc9da477eb69 100644
--- a/drivers/dma/qcom_bam_dma.c
+++ b/drivers/dma/qcom_bam_dma.c
@@ -1231,6 +1231,9 @@ static int bam_dma_remove(struct platform_device *pdev)
bam_dma_terminate_all(&bdev->channels[i].vc.chan);
tasklet_kill(&bdev->channels[i].vc.task);

+ if (!bdev->channels[i].fifo_virt)
+ continue;
+
dma_free_writecombine(bdev->dev, BAM_DESC_FIFO_SIZE,
bdev->channels[i].fifo_virt,
bdev->channels[i].fifo_phys);
--
1.7.9.5

2015-12-01 09:17:57

by Stanimir Varbanov

[permalink] [raw]
Subject: [PATCH 2/4] dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised

Currently we write BAM_IRQ_CLR register with zero even when no
BAM_IRQ occured. This write has some bad side effects when the
BAM instance is for the crypto engine. In case of crypto engine
some of the BAM registers are xPU protected and they cannot be
controlled by the driver.

Signed-off-by: Stanimir Varbanov <[email protected]>
---
drivers/dma/qcom_bam_dma.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
index dc9da477eb69..0f06f3b7a72b 100644
--- a/drivers/dma/qcom_bam_dma.c
+++ b/drivers/dma/qcom_bam_dma.c
@@ -800,13 +800,17 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
if (srcs & P_IRQ)
tasklet_schedule(&bdev->task);

- if (srcs & BAM_IRQ)
+ if (srcs & BAM_IRQ) {
clr_mask = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_STTS));

- /* don't allow reorder of the various accesses to the BAM registers */
- mb();
+ /*
+ * don't allow reorder of the various accesses to the BAM
+ * registers
+ */
+ mb();

- writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
+ writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
+ }

return IRQ_HANDLED;
}
--
1.7.9.5

2015-12-01 09:16:56

by Stanimir Varbanov

[permalink] [raw]
Subject: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

The pipe fifo size register must instruct the bam hw
how many hw descriptors can be pushed to fifo. Currently
we isntruct the hw with 32KBytes but wrap the tail in
bam_start_dma in BAM_P_EVNT_REG on 4095 i.e. 32760. This
leads to stalled transactions when the tail wraps.

Fix this by use the correct fifo size in BAM_P_FIFO_SIZES
register i.e. 32K - 8.

Signed-off-by: Stanimir Varbanov <[email protected]>
---
drivers/dma/qcom_bam_dma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
index 0f06f3b7a72b..6d290de9ab2b 100644
--- a/drivers/dma/qcom_bam_dma.c
+++ b/drivers/dma/qcom_bam_dma.c
@@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
*/
writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
- writel_relaxed(BAM_DESC_FIFO_SIZE,
+ writel_relaxed(BAM_MAX_DATA_SIZE,
bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));

/* enable the per pipe interrupts, enable EOT, ERR, and INT irqs */
--
1.7.9.5

2015-12-01 09:16:15

by Stanimir Varbanov

[permalink] [raw]
Subject: [PATCH 4/4] dmaengine: qcom_bam_dma: add controlled remotely dt property

Some of the peripherals has bam which is controlled by remote
processor, thus the bam dma driver must avoid register writes
which initialise bam hw block. Those registers are protected
from xPU block and any writes to them will lead to secure
violation and system reboot.

Adding the contolled_remotely flag in bam driver to avoid
not permitted register writes in bam_init function.

Signed-off-by: Stanimir Varbanov <[email protected]>
---
.../devicetree/bindings/dma/qcom_bam_dma.txt | 2 ++
drivers/dma/qcom_bam_dma.c | 7 +++++++
2 files changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
index 1c9d48ea4914..87b6b2bf5e1e 100644
--- a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
+++ b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
@@ -13,6 +13,8 @@ Required properties:
- clock-names: must contain "bam_clk" entry
- qcom,ee : indicates the active Execution Environment identifier (0-7) used in
the secure world.
+- qcom,controlled-remotely : optional, indicates that the bam is controled by
+ remote proccessor i.e. execution enviroment.

Example:

diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
index 6d290de9ab2b..5dedab77180a 100644
--- a/drivers/dma/qcom_bam_dma.c
+++ b/drivers/dma/qcom_bam_dma.c
@@ -387,6 +387,7 @@ struct bam_device {

/* execution environment ID, from DT */
u32 ee;
+ bool controlled_remotely;

const struct reg_offset_data *layout;

@@ -1039,6 +1040,9 @@ static int bam_init(struct bam_device *bdev)
val = readl_relaxed(bam_addr(bdev, 0, BAM_NUM_PIPES));
bdev->num_channels = val & BAM_NUM_PIPES_MASK;

+ if (bdev->controlled_remotely)
+ return 0;
+
/* s/w reset bam */
/* after reset all pipes are disabled and idle */
val = readl_relaxed(bam_addr(bdev, 0, BAM_CTRL));
@@ -1126,6 +1130,9 @@ static int bam_dma_probe(struct platform_device *pdev)
return ret;
}

+ bdev->controlled_remotely = of_property_read_bool(pdev->dev.of_node,
+ "qcom,controlled-remotely");
+
bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
if (IS_ERR(bdev->bamclk))
return PTR_ERR(bdev->bamclk);
--
1.7.9.5

2015-12-01 10:29:22

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On Tuesday 01 December 2015 11:14:58 Stanimir Varbanov wrote:
>
> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> index 0f06f3b7a72b..6d290de9ab2b 100644
> --- a/drivers/dma/qcom_bam_dma.c
> +++ b/drivers/dma/qcom_bam_dma.c
> @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
> */
> writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
> bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
> - writel_relaxed(BAM_DESC_FIFO_SIZE,
> + writel_relaxed(BAM_MAX_DATA_SIZE,
> bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));
>
> /* enable the per pipe interrupts, enable EOT, ERR, and INT irqs */

I'm looking at that now and fail to see why these have to use writel_relaxed().

Could you add a patch to use readl/writel by default?

Arnd

2015-12-01 10:30:26

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised

On Tuesday 01 December 2015 11:14:57 Stanimir Varbanov wrote:
> + if (srcs & BAM_IRQ) {
> clr_mask = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_STTS));
>
> - /* don't allow reorder of the various accesses to the BAM registers */
> - mb();
> + /*
> + * don't allow reorder of the various accesses to the BAM
> + * registers
> + */
> + mb();
>
> - writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
> + writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
> + }
>

I think the comment here should be moved: change the writel_relaxed()
to writel(), which already includes the appropriate barriers, and
add a comment at the readl_relaxed() to explain why it doesn't need
a barrier.

Arnd

2015-12-01 17:23:13

by Andy Gross

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On Tue, Dec 01, 2015 at 11:14:58AM +0200, Stanimir Varbanov wrote:
> The pipe fifo size register must instruct the bam hw
> how many hw descriptors can be pushed to fifo. Currently
> we isntruct the hw with 32KBytes but wrap the tail in
> bam_start_dma in BAM_P_EVNT_REG on 4095 i.e. 32760. This
> leads to stalled transactions when the tail wraps.
>
> Fix this by use the correct fifo size in BAM_P_FIFO_SIZES
> register i.e. 32K - 8.
>
> Signed-off-by: Stanimir Varbanov <[email protected]>
> ---
> drivers/dma/qcom_bam_dma.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> index 0f06f3b7a72b..6d290de9ab2b 100644
> --- a/drivers/dma/qcom_bam_dma.c
> +++ b/drivers/dma/qcom_bam_dma.c
> @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
> */
> writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
> bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
> - writel_relaxed(BAM_DESC_FIFO_SIZE,
> + writel_relaxed(BAM_MAX_DATA_SIZE,
> bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));

This is just using the #define. That is ok, but if you use this instead of the
BAM_P_FIFO_SIZES then you need to fix your comment. Or actually use the
register value.... otherwise looks fine.

2015-12-01 17:25:41

by Andy Gross

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On Tue, Dec 01, 2015 at 11:28:32AM +0100, Arnd Bergmann wrote:
> On Tuesday 01 December 2015 11:14:58 Stanimir Varbanov wrote:
> >
> > diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> > index 0f06f3b7a72b..6d290de9ab2b 100644
> > --- a/drivers/dma/qcom_bam_dma.c
> > +++ b/drivers/dma/qcom_bam_dma.c
> > @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
> > */
> > writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
> > bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
> > - writel_relaxed(BAM_DESC_FIFO_SIZE,
> > + writel_relaxed(BAM_MAX_DATA_SIZE,
> > bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));
> >
> > /* enable the per pipe interrupts, enable EOT, ERR, and INT irqs */
>
> I'm looking at that now and fail to see why these have to use writel_relaxed().

At some point I believe I got a comment about using (readl/writel)_relaxed
instead of readl/writel. So I used these instead. Has the wind direction
changed? =)

Using the readl/writel is nice w.r.t. having the implicit barriers, especially
with the funky 1K boundary on reordering of operations that can occur on Kraits.
This can hit you on accesses even within the same IP block.

2015-12-01 17:28:46

by Andy Gross

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised

On Tue, Dec 01, 2015 at 11:14:57AM +0200, Stanimir Varbanov wrote:
> Currently we write BAM_IRQ_CLR register with zero even when no
> BAM_IRQ occured. This write has some bad side effects when the
> BAM instance is for the crypto engine. In case of crypto engine
> some of the BAM registers are xPU protected and they cannot be
> controlled by the driver.
>
> Signed-off-by: Stanimir Varbanov <[email protected]>
> ---
> drivers/dma/qcom_bam_dma.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> index dc9da477eb69..0f06f3b7a72b 100644
> --- a/drivers/dma/qcom_bam_dma.c
> +++ b/drivers/dma/qcom_bam_dma.c
> @@ -800,13 +800,17 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
> if (srcs & P_IRQ)
> tasklet_schedule(&bdev->task);
>
> - if (srcs & BAM_IRQ)
> + if (srcs & BAM_IRQ) {
> clr_mask = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_STTS));
>
> - /* don't allow reorder of the various accesses to the BAM registers */
> - mb();
> + /*
> + * don't allow reorder of the various accesses to the BAM
> + * registers
> + */
> + mb();
>
> - writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
> + writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
> + }

Looks good. We shouldn't be accessing this unless there is actually an irq
shown in the srcs.


Thanks for catching this.


Reviewed-by: Andy Gross <[email protected]>

2015-12-01 17:30:32

by Andy Gross

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine: qcom_bam_dma: add controlled remotely dt property

On Tue, Dec 01, 2015 at 11:14:59AM +0200, Stanimir Varbanov wrote:
> Some of the peripherals has bam which is controlled by remote
> processor, thus the bam dma driver must avoid register writes
> which initialise bam hw block. Those registers are protected
> from xPU block and any writes to them will lead to secure
> violation and system reboot.
>
> Adding the contolled_remotely flag in bam driver to avoid
> not permitted register writes in bam_init function.
>
> Signed-off-by: Stanimir Varbanov <[email protected]>
> ---
> .../devicetree/bindings/dma/qcom_bam_dma.txt | 2 ++
> drivers/dma/qcom_bam_dma.c | 7 +++++++
> 2 files changed, 9 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> index 1c9d48ea4914..87b6b2bf5e1e 100644
> --- a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> +++ b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> @@ -13,6 +13,8 @@ Required properties:
> - clock-names: must contain "bam_clk" entry
> - qcom,ee : indicates the active Execution Environment identifier (0-7) used in
> the secure world.
> +- qcom,controlled-remotely : optional, indicates that the bam is controled by
> + remote proccessor i.e. execution enviroment.

Please fix the spelling. controlled, environment. Otherwise looks ok.

>
> Example:
>
> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> index 6d290de9ab2b..5dedab77180a 100644
> --- a/drivers/dma/qcom_bam_dma.c
> +++ b/drivers/dma/qcom_bam_dma.c
> @@ -387,6 +387,7 @@ struct bam_device {
>
> /* execution environment ID, from DT */
> u32 ee;
> + bool controlled_remotely;
>
> const struct reg_offset_data *layout;
>
> @@ -1039,6 +1040,9 @@ static int bam_init(struct bam_device *bdev)
> val = readl_relaxed(bam_addr(bdev, 0, BAM_NUM_PIPES));
> bdev->num_channels = val & BAM_NUM_PIPES_MASK;
>
> + if (bdev->controlled_remotely)
> + return 0;
> +
> /* s/w reset bam */
> /* after reset all pipes are disabled and idle */
> val = readl_relaxed(bam_addr(bdev, 0, BAM_CTRL));
> @@ -1126,6 +1130,9 @@ static int bam_dma_probe(struct platform_device *pdev)
> return ret;
> }
>
> + bdev->controlled_remotely = of_property_read_bool(pdev->dev.of_node,
> + "qcom,controlled-remotely");
> +
> bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
> if (IS_ERR(bdev->bamclk))
> return PTR_ERR(bdev->bamclk);


Reviewed-by: Andy Gross <[email protected]>

2015-12-01 20:23:09

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On Tuesday 01 December 2015 11:25:35 Andy Gross wrote:
> On Tue, Dec 01, 2015 at 11:28:32AM +0100, Arnd Bergmann wrote:
> > On Tuesday 01 December 2015 11:14:58 Stanimir Varbanov wrote:
> > >
> > > diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> > > index 0f06f3b7a72b..6d290de9ab2b 100644
> > > --- a/drivers/dma/qcom_bam_dma.c
> > > +++ b/drivers/dma/qcom_bam_dma.c
> > > @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
> > > */
> > > writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
> > > bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
> > > - writel_relaxed(BAM_DESC_FIFO_SIZE,
> > > + writel_relaxed(BAM_MAX_DATA_SIZE,
> > > bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));
> > >
> > > /* enable the per pipe interrupts, enable EOT, ERR, and INT irqs */
> >
> > I'm looking at that now and fail to see why these have to use writel_relaxed().
>
> At some point I believe I got a comment about using (readl/writel)_relaxed
> instead of readl/writel. So I used these instead. Has the wind direction
> changed? =)

Yes.

> Using the readl/writel is nice w.r.t. having the implicit barriers, especially
> with the funky 1K boundary on reordering of operations that can occur on Kraits.
> This can hit you on accesses even within the same IP block.

We had a couple of bugs that we should not have had when drivers were mindlessly
converted, so generally speaking at least I try to get people to only use
the relaxed functions for the hot path when they can show an advantage as well
as the fact that it's safe to use.

Arnd

2015-12-02 12:57:10

by Stanimir Varbanov

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised

On 12/01/2015 12:29 PM, Arnd Bergmann wrote:
> On Tuesday 01 December 2015 11:14:57 Stanimir Varbanov wrote:
>> + if (srcs & BAM_IRQ) {
>> clr_mask = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_STTS));
>>
>> - /* don't allow reorder of the various accesses to the BAM registers */
>> - mb();
>> + /*
>> + * don't allow reorder of the various accesses to the BAM
>> + * registers
>> + */
>> + mb();
>>
>> - writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
>> + writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
>> + }
>>
>
> I think the comment here should be moved: change the writel_relaxed()
> to writel(), which already includes the appropriate barriers, and

If we agree with such a change it should be subject to another patch.

> add a comment at the readl_relaxed() to explain why it doesn't need
> a barrier.

Infact I'm not sure that readl_relaxed(BAM_IRQ_STTS) does not need
barrier. If I read the code above correctly the mb() should guarantee
that all load and store operations before it are happened before the
write to BAM_IRQ_CLR register, and on the other hand if we replace
writel_relaxed with writel, the writel has wmb() which guarantees only
store operations. Did I miss something?

--
regards,
Stan

2015-12-02 13:06:12

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised

On Wednesday 02 December 2015 14:56:57 Stanimir Varbanov wrote:
> On 12/01/2015 12:29 PM, Arnd Bergmann wrote:
> > On Tuesday 01 December 2015 11:14:57 Stanimir Varbanov wrote:
> >> + if (srcs & BAM_IRQ) {
> >> clr_mask = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_STTS));
> >>
> >> - /* don't allow reorder of the various accesses to the BAM registers */
> >> - mb();
> >> + /*
> >> + * don't allow reorder of the various accesses to the BAM
> >> + * registers
> >> + */
> >> + mb();
> >>
> >> - writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
> >> + writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
> >> + }
> >>
> >
> > I think the comment here should be moved: change the writel_relaxed()
> > to writel(), which already includes the appropriate barriers, and
>
> If we agree with such a change it should be subject to another patch.

Correct.

> > add a comment at the readl_relaxed() to explain why it doesn't need
> > a barrier.
>
> Infact I'm not sure that readl_relaxed(BAM_IRQ_STTS) does not need
> barrier. If I read the code above correctly the mb() should guarantee
> that all load and store operations before it are happened before the
> write to BAM_IRQ_CLR register, and on the other hand if we replace
> writel_relaxed with writel, the writel has wmb() which guarantees only
> store operations. Did I miss something?

You are right, we only guarantee that stores to memory are complete
before we writel() an MMIO register.

What do you gain from synchronizing reads before an MMIO write?

Arnd

2015-12-02 16:44:25

by Stanimir Varbanov

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On 12/01/2015 07:23 PM, Andy Gross wrote:
> On Tue, Dec 01, 2015 at 11:14:58AM +0200, Stanimir Varbanov wrote:
>> The pipe fifo size register must instruct the bam hw
>> how many hw descriptors can be pushed to fifo. Currently
>> we isntruct the hw with 32KBytes but wrap the tail in
>> bam_start_dma in BAM_P_EVNT_REG on 4095 i.e. 32760. This
>> leads to stalled transactions when the tail wraps.
>>
>> Fix this by use the correct fifo size in BAM_P_FIFO_SIZES
>> register i.e. 32K - 8.
>>
>> Signed-off-by: Stanimir Varbanov <[email protected]>
>> ---
>> drivers/dma/qcom_bam_dma.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
>> index 0f06f3b7a72b..6d290de9ab2b 100644
>> --- a/drivers/dma/qcom_bam_dma.c
>> +++ b/drivers/dma/qcom_bam_dma.c
>> @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
>> */
>> writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
>> bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
>> - writel_relaxed(BAM_DESC_FIFO_SIZE,
>> + writel_relaxed(BAM_MAX_DATA_SIZE,
>> bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));
>
> This is just using the #define. That is ok, but if you use this instead of the
> BAM_P_FIFO_SIZES then you need to fix your comment. Or actually use the
> register value.... otherwise looks fine.

I did not follow your comment, but the intension of the patch is to set
the proper FIFO size in BAM_P_FIFO_SIZES register, i.e. 32K - 8.

--
regards,
Stan

2015-12-02 16:47:28

by Stanimir Varbanov

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: qcom_bam_dma: clear BAM interrupt only if it is rised

On 12/02/2015 03:05 PM, Arnd Bergmann wrote:
> On Wednesday 02 December 2015 14:56:57 Stanimir Varbanov wrote:
>> On 12/01/2015 12:29 PM, Arnd Bergmann wrote:
>>> On Tuesday 01 December 2015 11:14:57 Stanimir Varbanov wrote:
>>>> + if (srcs & BAM_IRQ) {
>>>> clr_mask = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_STTS));
>>>>
>>>> - /* don't allow reorder of the various accesses to the BAM registers */
>>>> - mb();
>>>> + /*
>>>> + * don't allow reorder of the various accesses to the BAM
>>>> + * registers
>>>> + */
>>>> + mb();
>>>>
>>>> - writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
>>>> + writel_relaxed(clr_mask, bam_addr(bdev, 0, BAM_IRQ_CLR));
>>>> + }
>>>>
>>>
>>> I think the comment here should be moved: change the writel_relaxed()
>>> to writel(), which already includes the appropriate barriers, and
>>
>> If we agree with such a change it should be subject to another patch.
>
> Correct.
>
>>> add a comment at the readl_relaxed() to explain why it doesn't need
>>> a barrier.
>>
>> Infact I'm not sure that readl_relaxed(BAM_IRQ_STTS) does not need
>> barrier. If I read the code above correctly the mb() should guarantee
>> that all load and store operations before it are happened before the
>> write to BAM_IRQ_CLR register, and on the other hand if we replace
>> writel_relaxed with writel, the writel has wmb() which guarantees only
>> store operations. Did I miss something?
>
> You are right, we only guarantee that stores to memory are complete
> before we writel() an MMIO register.
>
> What do you gain from synchronizing reads before an MMIO write?

I don't know just tried to understand the meaning of mb() above.

--
regards,
Stan

2015-12-02 17:23:07

by Andy Gross

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On Wed, Dec 02, 2015 at 06:44:11PM +0200, Stanimir Varbanov wrote:
> On 12/01/2015 07:23 PM, Andy Gross wrote:
> > On Tue, Dec 01, 2015 at 11:14:58AM +0200, Stanimir Varbanov wrote:
> >> The pipe fifo size register must instruct the bam hw
> >> how many hw descriptors can be pushed to fifo. Currently
> >> we isntruct the hw with 32KBytes but wrap the tail in
> >> bam_start_dma in BAM_P_EVNT_REG on 4095 i.e. 32760. This
> >> leads to stalled transactions when the tail wraps.
> >>
> >> Fix this by use the correct fifo size in BAM_P_FIFO_SIZES
> >> register i.e. 32K - 8.
> >>
> >> Signed-off-by: Stanimir Varbanov <[email protected]>
> >> ---
> >> drivers/dma/qcom_bam_dma.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
> >> index 0f06f3b7a72b..6d290de9ab2b 100644
> >> --- a/drivers/dma/qcom_bam_dma.c
> >> +++ b/drivers/dma/qcom_bam_dma.c
> >> @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
> >> */
> >> writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
> >> bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
> >> - writel_relaxed(BAM_DESC_FIFO_SIZE,
> >> + writel_relaxed(BAM_MAX_DATA_SIZE,
> >> bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));
> >
> > This is just using the #define. That is ok, but if you use this instead of the
> > BAM_P_FIFO_SIZES then you need to fix your comment. Or actually use the
> > register value.... otherwise looks fine.
>
> I did not follow your comment, but the intension of the patch is to set
> the proper FIFO size in BAM_P_FIFO_SIZES register, i.e. 32K - 8.

Sorry, I mixed up the usage and was thinking there was something you read out
that told you the size. That's not how it works, unfortunately. The
MAX_DATA_SIZE is fine, but the name is a little misleading. Perhaps just
BAM_FIFO_SIZE?


Regards,

Andy

2015-12-10 13:18:39

by Stanimir Varbanov

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: qcom_bam_dma: use correct pipe FIFO size

On 12/02/2015 07:22 PM, Andy Gross wrote:
> On Wed, Dec 02, 2015 at 06:44:11PM +0200, Stanimir Varbanov wrote:
>> On 12/01/2015 07:23 PM, Andy Gross wrote:
>>> On Tue, Dec 01, 2015 at 11:14:58AM +0200, Stanimir Varbanov wrote:
>>>> The pipe fifo size register must instruct the bam hw
>>>> how many hw descriptors can be pushed to fifo. Currently
>>>> we isntruct the hw with 32KBytes but wrap the tail in
>>>> bam_start_dma in BAM_P_EVNT_REG on 4095 i.e. 32760. This
>>>> leads to stalled transactions when the tail wraps.
>>>>
>>>> Fix this by use the correct fifo size in BAM_P_FIFO_SIZES
>>>> register i.e. 32K - 8.
>>>>
>>>> Signed-off-by: Stanimir Varbanov <[email protected]>
>>>> ---
>>>> drivers/dma/qcom_bam_dma.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom_bam_dma.c
>>>> index 0f06f3b7a72b..6d290de9ab2b 100644
>>>> --- a/drivers/dma/qcom_bam_dma.c
>>>> +++ b/drivers/dma/qcom_bam_dma.c
>>>> @@ -458,7 +458,7 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
>>>> */
>>>> writel_relaxed(ALIGN(bchan->fifo_phys, sizeof(struct bam_desc_hw)),
>>>> bam_addr(bdev, bchan->id, BAM_P_DESC_FIFO_ADDR));
>>>> - writel_relaxed(BAM_DESC_FIFO_SIZE,
>>>> + writel_relaxed(BAM_MAX_DATA_SIZE,
>>>> bam_addr(bdev, bchan->id, BAM_P_FIFO_SIZES));
>>>
>>> This is just using the #define. That is ok, but if you use this instead of the
>>> BAM_P_FIFO_SIZES then you need to fix your comment. Or actually use the
>>> register value.... otherwise looks fine.
>>
>> I did not follow your comment, but the intension of the patch is to set
>> the proper FIFO size in BAM_P_FIFO_SIZES register, i.e. 32K - 8.
>
> Sorry, I mixed up the usage and was thinking there was something you read out
> that told you the size. That's not how it works, unfortunately. The
> MAX_DATA_SIZE is fine, but the name is a little misleading. Perhaps just
> BAM_FIFO_SIZE?

OK I can rename BAM_MAX_DATA_SIZE to BAM_FIFO_SIZE, and use it when
setting BAM_P_FIFO_SIZES register. Is that fine to you?


--
regards,
Stan

2016-04-05 21:33:30

by Andy Gross

[permalink] [raw]
Subject: Re: [PATCH 0/4] bam dma fixes and one dt extension

On Tue, Dec 01, 2015 at 11:14:55AM +0200, Stanimir Varbanov wrote:
> Here are three bam dma fixes, which I made during testing qcrypto
> driver. The first one is a real fix, the second one is related to
> peripherals on which the BAM IP is initialized by remote execution
> environment, and the third one is a sleeping bug IMO.
>
> The last patch is a extension again for peripherals which BAM is
> remotely controlled by other execution environment.
>
> All patches has been tested on db410c board with apq8016.
>

Stan,

Can you resend these patches w/ updates? Or at least the remote control patch
if you can get that out faster. And also include the dmaengine list.

Regards,

Andy

2016-04-05 23:06:54

by Stanimir Varbanov

[permalink] [raw]
Subject: Re: [PATCH 0/4] bam dma fixes and one dt extension

Hi,

On 04/06/2016 12:33 AM, Andy Gross wrote:
> On Tue, Dec 01, 2015 at 11:14:55AM +0200, Stanimir Varbanov wrote:
>> Here are three bam dma fixes, which I made during testing qcrypto
>> driver. The first one is a real fix, the second one is related to
>> peripherals on which the BAM IP is initialized by remote execution
>> environment, and the third one is a sleeping bug IMO.
>>
>> The last patch is a extension again for peripherals which BAM is
>> remotely controlled by other execution environment.
>>
>> All patches has been tested on db410c board with apq8016.
>>
>
> Stan,
>
> Can you resend these patches w/ updates? Or at least the remote control patch
> if you can get that out faster. And also include the dmaengine list.

Done. Thanks for pining.

regards,
Stan