2024-01-09 18:50:40

by Yangyu Chen

[permalink] [raw]
Subject: [PATCH] RISC-V: only flush icache when it has VM_EXEC set

As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
in the system is very costly, limiting flush_icache_mm to be called only
when vma->vm_flags has VM_EXEC can help minimize the frequency of these
operations. It improves performance and reduces disturbances when
copy_from_user_page is needed such as profiling with perf.

For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
flags in the future since we have checked it in the __set_pte_at function.

Signed-off-by: Yangyu Chen <[email protected]>
---
arch/riscv/include/asm/cacheflush.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index 3cb53c4df27c..915f532dc336 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -33,8 +33,11 @@ static inline void flush_dcache_page(struct page *page)
* so instead we just flush the whole thing.
*/
#define flush_icache_range(start, end) flush_icache_all()
-#define flush_icache_user_page(vma, pg, addr, len) \
- flush_icache_mm(vma->vm_mm, 0)
+#define flush_icache_user_page(vma, pg, addr, len) \
+do { \
+ if (vma->vm_flags & VM_EXEC) \
+ flush_icache_mm(vma->vm_mm, 0); \
+} while (0)

#ifdef CONFIG_64BIT
#define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)
--
2.43.0



2024-01-10 06:48:09

by Alexandre Ghiti

[permalink] [raw]
Subject: Re: [PATCH] RISC-V: only flush icache when it has VM_EXEC set

Hi Yangyu,

On 09/01/2024 19:48, Yangyu Chen wrote:
> As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
> in the system is very costly, limiting flush_icache_mm to be called only
> when vma->vm_flags has VM_EXEC can help minimize the frequency of these
> operations. It improves performance and reduces disturbances when


Which platform did you test this patchset? Do you have any measurement
of the performance improvements?


> copy_from_user_page is needed such as profiling with perf.
>
> For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
> flags in the future since we have checked it in the __set_pte_at function.
>
> Signed-off-by: Yangyu Chen <[email protected]>
> ---
> arch/riscv/include/asm/cacheflush.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index 3cb53c4df27c..915f532dc336 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -33,8 +33,11 @@ static inline void flush_dcache_page(struct page *page)
> * so instead we just flush the whole thing.
> */
> #define flush_icache_range(start, end) flush_icache_all()
> -#define flush_icache_user_page(vma, pg, addr, len) \
> - flush_icache_mm(vma->vm_mm, 0)
> +#define flush_icache_user_page(vma, pg, addr, len) \
> +do { \
> + if (vma->vm_flags & VM_EXEC) \
> + flush_icache_mm(vma->vm_mm, 0); \
> +} while (0)
>
> #ifdef CONFIG_64BIT
> #define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)


Otherwise, it looks good to me, so you can add:

Reviewed-by: Alexandre Ghiti <[email protected]>

Thanks,

Alex


2024-01-15 09:41:00

by Jisheng Zhang

[permalink] [raw]
Subject: Re: [PATCH] RISC-V: only flush icache when it has VM_EXEC set

On Wed, Jan 10, 2024 at 02:48:59AM +0800, Yangyu Chen wrote:
> As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
> in the system is very costly, limiting flush_icache_mm to be called only
> when vma->vm_flags has VM_EXEC can help minimize the frequency of these
> operations. It improves performance and reduces disturbances when
> copy_from_user_page is needed such as profiling with perf.

Hi Yangyu,

Since this is for "performance", can you please add the benchmark, test
platform and performance numbers in your commit msg?

thanks

>
> For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
> flags in the future since we have checked it in the __set_pte_at function.
>
> Signed-off-by: Yangyu Chen <[email protected]>
> ---
> arch/riscv/include/asm/cacheflush.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index 3cb53c4df27c..915f532dc336 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -33,8 +33,11 @@ static inline void flush_dcache_page(struct page *page)
> * so instead we just flush the whole thing.
> */
> #define flush_icache_range(start, end) flush_icache_all()
> -#define flush_icache_user_page(vma, pg, addr, len) \
> - flush_icache_mm(vma->vm_mm, 0)
> +#define flush_icache_user_page(vma, pg, addr, len) \
> +do { \
> + if (vma->vm_flags & VM_EXEC) \
> + flush_icache_mm(vma->vm_mm, 0); \
> +} while (0)
>
> #ifdef CONFIG_64BIT
> #define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)
> --
> 2.43.0
>

2024-03-20 00:48:23

by Palmer Dabbelt

[permalink] [raw]
Subject: Re: [PATCH] RISC-V: only flush icache when it has VM_EXEC set

On Tue, 09 Jan 2024 10:48:59 PST (-0800), [email protected] wrote:
> As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
> in the system is very costly, limiting flush_icache_mm to be called only
> when vma->vm_flags has VM_EXEC can help minimize the frequency of these
> operations. It improves performance and reduces disturbances when
> copy_from_user_page is needed such as profiling with perf.
>
> For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
> flags in the future since we have checked it in the __set_pte_at function.
>
> Signed-off-by: Yangyu Chen <[email protected]>
> ---
> arch/riscv/include/asm/cacheflush.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index 3cb53c4df27c..915f532dc336 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -33,8 +33,11 @@ static inline void flush_dcache_page(struct page *page)
> * so instead we just flush the whole thing.
> */
> #define flush_icache_range(start, end) flush_icache_all()
> -#define flush_icache_user_page(vma, pg, addr, len) \
> - flush_icache_mm(vma->vm_mm, 0)
> +#define flush_icache_user_page(vma, pg, addr, len) \
> +do { \
> + if (vma->vm_flags & VM_EXEC) \
> + flush_icache_mm(vma->vm_mm, 0); \
> +} while (0)
>
> #ifdef CONFIG_64BIT
> #define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)

I'm not super worried about the benchmarks, I think we can just
open-loop assume this is faster by avoiding the flushes. I do think we
need a hook into at least tlb_update_vma_flags(), though, to insert the
fence.i when upgrading a mapping to include VM_EXEC.

2024-03-22 15:51:12

by Alexandre Ghiti

[permalink] [raw]
Subject: Re: [PATCH] RISC-V: only flush icache when it has VM_EXEC set

On Wed, Mar 20, 2024 at 1:48 AM Palmer Dabbelt <[email protected]> wrote:
>
> On Tue, 09 Jan 2024 10:48:59 PST (-0800), [email protected] wrote:
> > As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
> > in the system is very costly, limiting flush_icache_mm to be called only
> > when vma->vm_flags has VM_EXEC can help minimize the frequency of these
> > operations. It improves performance and reduces disturbances when
> > copy_from_user_page is needed such as profiling with perf.
> >
> > For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
> > flags in the future since we have checked it in the __set_pte_at function.
> >
> > Signed-off-by: Yangyu Chen <[email protected]>
> > ---
> > arch/riscv/include/asm/cacheflush.h | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> > index 3cb53c4df27c..915f532dc336 100644
> > --- a/arch/riscv/include/asm/cacheflush.h
> > +++ b/arch/riscv/include/asm/cacheflush.h
> > @@ -33,8 +33,11 @@ static inline void flush_dcache_page(struct page *page)
> > * so instead we just flush the whole thing.
> > */
> > #define flush_icache_range(start, end) flush_icache_all()
> > -#define flush_icache_user_page(vma, pg, addr, len) \
> > - flush_icache_mm(vma->vm_mm, 0)
> > +#define flush_icache_user_page(vma, pg, addr, len) \
> > +do { \
> > + if (vma->vm_flags & VM_EXEC) \
> > + flush_icache_mm(vma->vm_mm, 0); \
> > +} while (0)
> >
> > #ifdef CONFIG_64BIT
> > #define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)
>
> I'm not super worried about the benchmarks, I think we can just
> open-loop assume this is faster by avoiding the flushes. I do think we
> need a hook into at least tlb_update_vma_flags(), though, to insert the
> fence.i when upgrading a mapping to include VM_EXEC.

I'd say Yangyu is right when he mentions in the commit log: "For I-D
coherence concerns, it will not fail if such a page adds VM_EXEC flags
in the future since we have checked it in the __set_pte_at function.".
If a region indeed becomes executable, the page table will be modified
to reflect that and then will pass in __set_pte_at() which, as Yangyu
mentions, does the right thing. Or am I missing something?

Thanks,

Alex

2024-03-23 12:40:01

by Yangyu Chen

[permalink] [raw]
Subject: Re: [PATCH] RISC-V: only flush icache when it has VM_EXEC set



> On Mar 22, 2024, at 23:50, Alexandre Ghiti <[email protected]> wrote:
>
> On Wed, Mar 20, 2024 at 1:48 AM Palmer Dabbelt <[email protected]> wrote:
>>
>> On Tue, 09 Jan 2024 10:48:59 PST (-0800), [email protected] wrote:
>>> As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
>>> in the system is very costly, limiting flush_icache_mm to be called only
>>> when vma->vm_flags has VM_EXEC can help minimize the frequency of these
>>> operations. It improves performance and reduces disturbances when
>>> copy_from_user_page is needed such as profiling with perf.
>>>
>>> For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
>>> flags in the future since we have checked it in the __set_pte_at function.
>>>
>>> Signed-off-by: Yangyu Chen <[email protected]>
>>> ---
>>> arch/riscv/include/asm/cacheflush.h | 7 +++++--
>>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
>>> index 3cb53c4df27c..915f532dc336 100644
>>> --- a/arch/riscv/include/asm/cacheflush.h
>>> +++ b/arch/riscv/include/asm/cacheflush.h
>>> @@ -33,8 +33,11 @@ static inline void flush_dcache_page(struct page *page)
>>> * so instead we just flush the whole thing.
>>> */
>>> #define flush_icache_range(start, end) flush_icache_all()
>>> -#define flush_icache_user_page(vma, pg, addr, len) \
>>> - flush_icache_mm(vma->vm_mm, 0)
>>> +#define flush_icache_user_page(vma, pg, addr, len) \
>>> +do { \
>>> + if (vma->vm_flags & VM_EXEC) \
>>> + flush_icache_mm(vma->vm_mm, 0); \
>>> +} while (0)
>>>
>>> #ifdef CONFIG_64BIT
>>> #define flush_cache_vmap(start, end) flush_tlb_kernel_range(start, end)
>>
>> I'm not super worried about the benchmarks, I think we can just
>> open-loop assume this is faster by avoiding the flushes. I do think we
>> need a hook into at least tlb_update_vma_flags(), though, to insert the
>> fence.i when upgrading a mapping to include VM_EXEC.
>
> I'd say Yangyu is right when he mentions in the commit log: "For I-D
> coherence concerns, it will not fail if such a page adds VM_EXEC flags
> in the future since we have checked it in the __set_pte_at function.".
> If a region indeed becomes executable, the page table will be modified
> to reflect that and then will pass in __set_pte_at() which, as Yangyu
> mentions, does the right thing. Or am I missing something?
>

I think so. Unless we have any other way to update PTE rather than
using __set_pte_at, I think it’s safe. I’m too busy these days to
provide a micro enough benchmark.

Thanks,
Yangyu Chen


Subject: Re: [PATCH] RISC-V: only flush icache when it has VM_EXEC set

Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <[email protected]>:

On Wed, 10 Jan 2024 02:48:59 +0800 you wrote:
> As I-Cache flush on current RISC-V needs to send IPIs to every CPU cores
> in the system is very costly, limiting flush_icache_mm to be called only
> when vma->vm_flags has VM_EXEC can help minimize the frequency of these
> operations. It improves performance and reduces disturbances when
> copy_from_user_page is needed such as profiling with perf.
>
> For I-D coherence concerns, it will not fail if such a page adds VM_EXEC
> flags in the future since we have checked it in the __set_pte_at function.
>
> [...]

Here is the summary with links:
- RISC-V: only flush icache when it has VM_EXEC set
https://git.kernel.org/riscv/c/542124fc0d5c

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html