2023-01-04 09:07:45

by Icenowy Zheng

Subject: Re: [RFC PATCH 2/3] riscv: use VA+PA variant of CMO macros for DMA synchorization

On Wed, 2023-01-04 at 16:50 +0800, Guo Ren wrote:
> On Wed, Jan 4, 2023 at 3:43 PM Icenowy Zheng <[email protected]> wrote:
> >
> > DMA synchorization is done on PA and the VA is calculated from the
> > PA.
> >
> > Use the alternative macro variant that takes both VA and PA as
> > parameters, thus in case the ISA extension used support PA
> > directly, the
> > overhead for re-converting VA to PA can be omitted.
> >
> > Suggested-by: Guo Ren <[email protected]>
> > Signed-off-by: Icenowy Zheng <[email protected]>
> > ---
> >  arch/riscv/mm/dma-noncoherent.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-
> > noncoherent.c
> > index d919efab6eba..a751f4aece62 100644
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
> > @@ -19,13 +19,13 @@ void arch_sync_dma_for_device(phys_addr_t
> > paddr, size_t size,
> >
> >         switch (dir) {
> >         case DMA_TO_DEVICE:
> > -               ALT_CMO_OP(clean, vaddr, size,
> > riscv_cbom_block_size);
> > +               ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > riscv_cbom_block_size);
> ALT_CMO_OP -> ALT_CMO_OP_VPA, is the renaming necessary?

I didn't rename the original ALT_CMO_OP; ALT_CMO_OP_VPA is a new
macro.
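
To illustrate the difference between the two variants, here is a minimal
user-space sketch. It is not the kernel's macro code: every name prefixed
with sketch_/SKETCH_ is hypothetical, and it assumes a backend whose cache
operation wants a physical address and a simple linear VA = PA + offset
mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical linear mapping: VA = PA + offset. Not a kernel constant. */
#define SKETCH_LINEAR_OFFSET 0x80000000UL

static unsigned long sketch_translations; /* number of VA->PA conversions */
static uintptr_t sketch_last_pa;          /* last PA handed to the backend */

/* Convert a virtual address back to a physical one and count the call. */
static uintptr_t sketch_va_to_pa(uintptr_t va)
{
	assert(va >= SKETCH_LINEAR_OFFSET);
	sketch_translations++;
	return va - SKETCH_LINEAR_OFFSET;
}

/* Stand-in for a cache operation that takes a physical address. */
static void sketch_pa_clean(uintptr_t pa, size_t size)
{
	(void)size;
	sketch_last_pa = pa;
}

/* VA-only variant: the PA has to be recomputed from the VA. */
#define SKETCH_CMO_OP(va, size)					\
	do {							\
		sketch_pa_clean(sketch_va_to_pa(va), (size));	\
	} while (0)

/* VA+PA variant: the caller already has the PA, so no conversion. */
#define SKETCH_CMO_OP_VPA(va, pa, size)				\
	do {							\
		(void)(va);					\
		sketch_pa_clean((pa), (size));			\
	} while (0)
```

In this sketch both variants reach the backend with the same physical
address, but only the VA-only variant pays for the extra conversion.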

>
> Others:
> Reviewed-by: Guo Ren <[email protected]>
>
> >                 break;
> >         case DMA_FROM_DEVICE:
> > -               ALT_CMO_OP(clean, vaddr, size,
> > riscv_cbom_block_size);
> > +               ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > riscv_cbom_block_size);
> >                 break;
> >         case DMA_BIDIRECTIONAL:
> > -               ALT_CMO_OP(flush, vaddr, size,
> > riscv_cbom_block_size);
> > +               ALT_CMO_OP_VPA(flush, vaddr, paddr, size,
> > riscv_cbom_block_size);
> >                 break;
> >         default:
> >                 break;
> > @@ -42,7 +42,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr,
> > size_t size,
> >                 break;
> >         case DMA_FROM_DEVICE:
> >         case DMA_BIDIRECTIONAL:
> > -               ALT_CMO_OP(flush, vaddr, size,
> > riscv_cbom_block_size);
> > +               ALT_CMO_OP_VPA(flush, vaddr, paddr, size,
> > riscv_cbom_block_size);
> >                 break;
> >         default:
> >                 break;
> > --
> > 2.38.1
> >
>
>


2023-01-04 09:36:52

by Guo Ren

Subject: Re: [RFC PATCH 2/3] riscv: use VA+PA variant of CMO macros for DMA synchorization

On Wed, Jan 4, 2023 at 4:59 PM Icenowy Zheng <[email protected]> wrote:
>
> On Wed, 2023-01-04 at 16:50 +0800, Guo Ren wrote:
> > On Wed, Jan 4, 2023 at 3:43 PM Icenowy Zheng <[email protected]> wrote:
> > >
> > > DMA synchorization is done on PA and the VA is calculated from the
> > > PA.
> > >
> > > Use the alternative macro variant that takes both VA and PA as
> > > parameters, thus in case the ISA extension used support PA
> > > directly, the
> > > overhead for re-converting VA to PA can be omitted.
> > >
> > > Suggested-by: Guo Ren <[email protected]>
> > > Signed-off-by: Icenowy Zheng <[email protected]>
> > > ---
> > > arch/riscv/mm/dma-noncoherent.c | 8 ++++----
> > > 1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-
> > > noncoherent.c
> > > index d919efab6eba..a751f4aece62 100644
> > > --- a/arch/riscv/mm/dma-noncoherent.c
> > > +++ b/arch/riscv/mm/dma-noncoherent.c
> > > @@ -19,13 +19,13 @@ void arch_sync_dma_for_device(phys_addr_t
> > > paddr, size_t size,
> > >
> > > switch (dir) {
> > > case DMA_TO_DEVICE:
> > > - ALT_CMO_OP(clean, vaddr, size,
> > > riscv_cbom_block_size);
> > > + ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > > riscv_cbom_block_size);
> > ALT_CMO_OP -> ALT_CMO_OP_VPA, is the renaming necessary?
>
> I didn't rename the original ALT_CMO_OP, ALT_CMO_OP_VPA is something
> new.
The _VPA suffix looks really strange.

How about:
ALT_CMO_OP -> ALT_CMO_OP_VA
ALT_CMO_OP_VPA -> ALT_CMO_OP

>
> >
> > Others:
> > Reviewed-by: Guo Ren <[email protected]>
> >
> > > break;
> > > case DMA_FROM_DEVICE:
> > > - ALT_CMO_OP(clean, vaddr, size,
> > > riscv_cbom_block_size);
> > > + ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > > riscv_cbom_block_size);
> > > break;
> > > case DMA_BIDIRECTIONAL:
> > > - ALT_CMO_OP(flush, vaddr, size,
> > > riscv_cbom_block_size);
> > > + ALT_CMO_OP_VPA(flush, vaddr, paddr, size,
> > > riscv_cbom_block_size);
> > > break;
> > > default:
> > > break;
> > > @@ -42,7 +42,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr,
> > > size_t size,
> > > break;
> > > case DMA_FROM_DEVICE:
> > > case DMA_BIDIRECTIONAL:
> > > - ALT_CMO_OP(flush, vaddr, size,
> > > riscv_cbom_block_size);
> > > + ALT_CMO_OP_VPA(flush, vaddr, paddr, size,
> > > riscv_cbom_block_size);
> > > break;
> > > default:
> > > break;
> > > --
> > > 2.38.1
> > >
> >
> >
>


--
Best Regards
Guo Ren

2023-01-04 09:45:47

by Icenowy Zheng

Subject: Re: [RFC PATCH 2/3] riscv: use VA+PA variant of CMO macros for DMA synchorization

On Wed, 2023-01-04 at 17:24 +0800, Guo Ren wrote:
> On Wed, Jan 4, 2023 at 4:59 PM Icenowy Zheng <[email protected]> wrote:
> >
> > On Wed, 2023-01-04 at 16:50 +0800, Guo Ren wrote:
> > > On Wed, Jan 4, 2023 at 3:43 PM Icenowy Zheng <[email protected]>
> > > wrote:
> > > >
> > > > DMA synchorization is done on PA and the VA is calculated from
> > > > the
> > > > PA.
> > > >
> > > > Use the alternative macro variant that takes both VA and PA as
> > > > parameters, thus in case the ISA extension used support PA
> > > > directly, the
> > > > overhead for re-converting VA to PA can be omitted.
> > > >
> > > > Suggested-by: Guo Ren <[email protected]>
> > > > Signed-off-by: Icenowy Zheng <[email protected]>
> > > > ---
> > > >  arch/riscv/mm/dma-noncoherent.c | 8 ++++----
> > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/arch/riscv/mm/dma-noncoherent.c
> > > > b/arch/riscv/mm/dma-
> > > > noncoherent.c
> > > > index d919efab6eba..a751f4aece62 100644
> > > > --- a/arch/riscv/mm/dma-noncoherent.c
> > > > +++ b/arch/riscv/mm/dma-noncoherent.c
> > > > @@ -19,13 +19,13 @@ void arch_sync_dma_for_device(phys_addr_t
> > > > paddr, size_t size,
> > > >
> > > >         switch (dir) {
> > > >         case DMA_TO_DEVICE:
> > > > -               ALT_CMO_OP(clean, vaddr, size,
> > > > riscv_cbom_block_size);
> > > > +               ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > > > riscv_cbom_block_size);
> > > ALT_CMO_OP -> ALT_CMO_OP_VPA, is the renaming necessary?
> >
> > I didn't rename the original ALT_CMO_OP, ALT_CMO_OP_VPA is
> > something
> > new.
> The ##_VPA is really strange.
>
> How about:
> ALT_CMO_OP          -> ALT_CMO_OP_VA
> ALT_CMO_OP_VPA -> ALT_CMO_OP

That would be a much bigger change.

If you are not fond of _VPA, I can rename it to _VA_PA.

>
> >
> > >
> > > Others:
> > > Reviewed-by: Guo Ren <[email protected]>
> > >
> > > >                 break;
> > > >         case DMA_FROM_DEVICE:
> > > > -               ALT_CMO_OP(clean, vaddr, size,
> > > > riscv_cbom_block_size);
> > > > +               ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > > > riscv_cbom_block_size);
> > > >                 break;
> > > >         case DMA_BIDIRECTIONAL:
> > > > -               ALT_CMO_OP(flush, vaddr, size,
> > > > riscv_cbom_block_size);
> > > > +               ALT_CMO_OP_VPA(flush, vaddr, paddr, size,
> > > > riscv_cbom_block_size);
> > > >                 break;
> > > >         default:
> > > >                 break;
> > > > @@ -42,7 +42,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr,
> > > > size_t size,
> > > >                 break;
> > > >         case DMA_FROM_DEVICE:
> > > >         case DMA_BIDIRECTIONAL:
> > > > -               ALT_CMO_OP(flush, vaddr, size,
> > > > riscv_cbom_block_size);
> > > > +               ALT_CMO_OP_VPA(flush, vaddr, paddr, size,
> > > > riscv_cbom_block_size);
> > > >                 break;
> > > >         default:
> > > >                 break;
> > > > --
> > > > 2.38.1
> > > >
> > >
> > >
> >
>
>

2023-01-04 12:20:38

by Heiko Stübner

Subject: Re: [RFC PATCH 2/3] riscv: use VA+PA variant of CMO macros for DMA synchorization

Hi,

On Wednesday, 4 January 2023, 10:27:53 CET, Icenowy Zheng wrote:
> On Wed, 2023-01-04 at 17:24 +0800, Guo Ren wrote:
> > On Wed, Jan 4, 2023 at 4:59 PM Icenowy Zheng <[email protected]> wrote:
> > >
> > > On Wed, 2023-01-04 at 16:50 +0800, Guo Ren wrote:
> > > > On Wed, Jan 4, 2023 at 3:43 PM Icenowy Zheng <[email protected]>
> > > > wrote:
> > > > >
> > > > > DMA synchorization is done on PA and the VA is calculated from
> > > > > the
> > > > > PA.
> > > > >
> > > > > Use the alternative macro variant that takes both VA and PA as
> > > > > parameters, thus in case the ISA extension used support PA
> > > > > directly, the
> > > > > overhead for re-converting VA to PA can be omitted.
> > > > >
> > > > > Suggested-by: Guo Ren <[email protected]>
> > > > > Signed-off-by: Icenowy Zheng <[email protected]>
> > > > > ---
> > > > > arch/riscv/mm/dma-noncoherent.c | 8 ++++----
> > > > > 1 file changed, 4 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/arch/riscv/mm/dma-noncoherent.c
> > > > > b/arch/riscv/mm/dma-
> > > > > noncoherent.c
> > > > > index d919efab6eba..a751f4aece62 100644
> > > > > --- a/arch/riscv/mm/dma-noncoherent.c
> > > > > +++ b/arch/riscv/mm/dma-noncoherent.c
> > > > > @@ -19,13 +19,13 @@ void arch_sync_dma_for_device(phys_addr_t
> > > > > paddr, size_t size,
> > > > >
> > > > > switch (dir) {
> > > > > case DMA_TO_DEVICE:
> > > > > - ALT_CMO_OP(clean, vaddr, size,
> > > > > riscv_cbom_block_size);
> > > > > + ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > > > > riscv_cbom_block_size);
> > > > ALT_CMO_OP -> ALT_CMO_OP_VPA, is the renaming necessary?
> > >
> > > I didn't rename the original ALT_CMO_OP, ALT_CMO_OP_VPA is
> > > something
> > > new.
> > The ##_VPA is really strange.
> >
> > How about:
> > ALT_CMO_OP -> ALT_CMO_OP_VA
> > ALT_CMO_OP_VPA -> ALT_CMO_OP
>
> It's thus a much bigger change.
>
> If you are not fond of _VPA, I can rename it to _VA_PA.

Before you spend too much time on this, there is currently a parallel
discussion running about including all the other vendor-specific
cache management.

See [0] and the thread before that for reference.

The consensus seems to be that cache handling itself is not fast anyway,
so the plan is to reduce the complexity of the cache handling and move
non-zicbom cache handling into an indirect function call that can then
be overridden at runtime.
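
A rough user-space sketch of that kind of indirection follows. It is not
the code from the thread in [0]; the struct, the function names and the
no-op default are all assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of cache-maintenance callbacks. */
struct sketch_cache_ops {
	void (*clean)(uintptr_t paddr, size_t size);
	void (*flush)(uintptr_t paddr, size_t size);
};

static int sketch_vendor_calls; /* counts calls into the vendor hooks */

/* Default: do nothing, as if no vendor-specific handling was registered. */
static void sketch_noop(uintptr_t paddr, size_t size)
{
	(void)paddr;
	(void)size;
}

/* Stand-in for a vendor-specific cache operation. */
static void sketch_vendor_op(uintptr_t paddr, size_t size)
{
	(void)paddr;
	(void)size;
	sketch_vendor_calls++;
}

static const struct sketch_cache_ops sketch_vendor_ops = {
	.clean = sketch_vendor_op,
	.flush = sketch_vendor_op,
};

/* The active table; starts with the no-op defaults. */
static struct sketch_cache_ops sketch_ops = {
	.clean = sketch_noop,
	.flush = sketch_noop,
};

/* Override the active callbacks at runtime, e.g. during early setup. */
static void sketch_register_cache_ops(const struct sketch_cache_ops *ops)
{
	assert(ops && ops->clean && ops->flush);
	sketch_ops = *ops;
}

/* Callers only ever go through the indirect calls. */
static void sketch_sync_for_device(uintptr_t paddr, size_t size)
{
	sketch_ops.clean(paddr, size);
}

static void sketch_sync_for_cpu(uintptr_t paddr, size_t size)
{
	sketch_ops.flush(paddr, size);
}
```

The cost is one indirect call per operation, which matters little if the
cache maintenance itself dominates the time spent.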


Heiko

[0] https://lore.kernel.org/all/[email protected]/


2023-01-06 07:58:01

by Icenowy Zheng

Subject: Re: [RFC PATCH 2/3] riscv: use VA+PA variant of CMO macros for DMA synchorization

On Wed, 2023-01-04 at 13:16 +0100, Heiko Stübner wrote:
> Hi,
>
> On Wednesday, 4 January 2023, 10:27:53 CET, Icenowy Zheng wrote:
> > On Wed, 2023-01-04 at 17:24 +0800, Guo Ren wrote:
> > > On Wed, Jan 4, 2023 at 4:59 PM Icenowy Zheng <[email protected]>
> > > wrote:
> > > >
> > > > On Wed, 2023-01-04 at 16:50 +0800, Guo Ren wrote:
> > > > > On Wed, Jan 4, 2023 at 3:43 PM Icenowy Zheng <[email protected]>
> > > > > wrote:
> > > > > >
> > > > > > DMA synchorization is done on PA and the VA is calculated
> > > > > > from
> > > > > > the
> > > > > > PA.
> > > > > >
> > > > > > Use the alternative macro variant that takes both VA and PA
> > > > > > as
> > > > > > parameters, thus in case the ISA extension used support PA
> > > > > > directly, the
> > > > > > overhead for re-converting VA to PA can be omitted.
> > > > > >
> > > > > > Suggested-by: Guo Ren <[email protected]>
> > > > > > Signed-off-by: Icenowy Zheng <[email protected]>
> > > > > > ---
> > > > > >  arch/riscv/mm/dma-noncoherent.c | 8 ++++----
> > > > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/riscv/mm/dma-noncoherent.c
> > > > > > b/arch/riscv/mm/dma-
> > > > > > noncoherent.c
> > > > > > index d919efab6eba..a751f4aece62 100644
> > > > > > --- a/arch/riscv/mm/dma-noncoherent.c
> > > > > > +++ b/arch/riscv/mm/dma-noncoherent.c
> > > > > > @@ -19,13 +19,13 @@ void
> > > > > > arch_sync_dma_for_device(phys_addr_t
> > > > > > paddr, size_t size,
> > > > > >
> > > > > >         switch (dir) {
> > > > > >         case DMA_TO_DEVICE:
> > > > > > -               ALT_CMO_OP(clean, vaddr, size,
> > > > > > riscv_cbom_block_size);
> > > > > > +               ALT_CMO_OP_VPA(clean, vaddr, paddr, size,
> > > > > > riscv_cbom_block_size);
> > > > > ALT_CMO_OP -> ALT_CMO_OP_VPA, is the renaming necessary?
> > > >
> > > > I didn't rename the original ALT_CMO_OP, ALT_CMO_OP_VPA is
> > > > something
> > > > new.
> > > The ##_VPA is really strange.
> > >
> > > How about:
> > > ALT_CMO_OP          -> ALT_CMO_OP_VA
> > > ALT_CMO_OP_VPA -> ALT_CMO_OP
> >
> > It's thus a much bigger change.
> >
> > If you are not fond of _VPA, I can rename it to _VA_PA.
>
> before you spend too much time on this, there is currently a parallel
> discussion running about including all the other different vendor-
> specific cache management.
>
> See [0] and the thread before that for reference.

The code shown here seems to be only a draft, and not even testable.

>
> The consensus seems to be that cache-handling itself is not fast
> anyway,
> and therefore to reduce complexity for the cache handling and move
> non-zicbom cache-handling into a indirect function call that the can
> be
> overridden at runtime.

Well, yes. I tested this patchset on my LiteX system with OpenC906, and
it does not help LiteETH throughput at all. So maybe this is only a
theoretical gain.

>
>
> Heiko
>
> [0]
> https://lore.kernel.org/all/[email protected]/
>
>