2023-04-05 13:24:31

by Heiko Carstens

Subject: [PATCH 0/2] stackleak: allow to specify arch specific stackleak poison function

Factor out the code that fills the stack with the stackleak poison value in
order to allow architectures to provide a faster implementation.

Use this to provide an s390 specific implementation which can fill the
stack with the poison value much faster (factor of ~10 compared to the
current version).

Note that the s390 stackleak support is currently only available via
linux-next (as of today), and the s390 kernel tree at kernel.org[1].
Therefore, if there are no objections, I'd like to add these two patches to
the s390 tree, so they can go upstream via the next merge window together
with the s390 support.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=b94c0ebb1ec752016a3e41bfb66bb51ea905e533

Thanks,
Heiko

Heiko Carstens (2):
stackleak: allow to specify arch specific stackleak poison function
s390/stackleak: provide fast __stackleak_poison() implementation

arch/s390/include/asm/processor.h | 35 +++++++++++++++++++++++++++++++
kernel/stackleak.c | 17 +++++++++++----
2 files changed, 48 insertions(+), 4 deletions(-)

--
2.37.2
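
For readers without patch 1/2 at hand, the override mechanism described in the cover letter can be sketched roughly as follows. This is a minimal userspace illustration, not the code from kernel/stackleak.c: the name stackleak_poison_demo is a stand-in chosen for this sketch, and the loop body is an assumption about what a generic word-by-word fill looks like. It also assumes that a pointer fits into an unsigned long, as it does in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * An architecture header that wants its own implementation would define
 * both the function and a macro of the same name before this point, e.g.:
 *
 *   #define stackleak_poison_demo stackleak_poison_demo
 *   static inline void stackleak_poison_demo(...) { ...fast variant... }
 *
 * The #ifndef below then skips the generic fallback.
 */
#ifndef stackleak_poison_demo
/* Generic fallback (illustrative): one word-sized store per iteration. */
static inline void stackleak_poison_demo(unsigned long erase_low,
					 unsigned long erase_high,
					 unsigned long poison)
{
	while (erase_low < erase_high) {
		*(unsigned long *)erase_low = poison;
		erase_low += sizeof(unsigned long);
	}
}
#endif
```

Calling stackleak_poison_demo() with the addresses of the first word to erase and of the word just past the last one fills exactly that half-open range and leaves the neighbouring words untouched.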


2023-04-05 13:24:41

by Heiko Carstens

Subject: [PATCH 2/2] s390/stackleak: provide fast __stackleak_poison() implementation

Provide an s390 specific __stackleak_poison() implementation which is
faster than the generic variant.

For the original implementation with an enforced 4kb stackframe for the
getpid() system call the system call overhead increases by a factor of 3 if
the stackleak feature is enabled. Using the s390 mvc based variant this is
reduced to an increase of 25% instead.

This is within the expected area, since the mvc based implementation is
more or less a memset64() variant which comes with similar results. See
commit 0b77d6701cf8 ("s390: implement memset16, memset32 & memset64").

Reviewed-by: Vasily Gorbik <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
---
arch/s390/include/asm/processor.h | 35 +++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index efffc28cbad8..dc17896a001a 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -118,6 +118,41 @@ unsigned long vdso_size(void);

#define HAVE_ARCH_PICK_MMAP_LAYOUT

+#define __stackleak_poison __stackleak_poison
+static __always_inline void __stackleak_poison(unsigned long erase_low,
+					       unsigned long erase_high,
+					       unsigned long poison)
+{
+	unsigned long tmp, count;
+
+	count = erase_high - erase_low;
+	if (!count)
+		return;
+	asm volatile(
+		"	cghi	%[count],8\n"
+		"	je	2f\n"
+		"	aghi	%[count],-(8+1)\n"
+		"	srlg	%[tmp],%[count],8\n"
+		"	ltgr	%[tmp],%[tmp]\n"
+		"	jz	1f\n"
+		"0:	stg	%[poison],0(%[addr])\n"
+		"	mvc	8(256-8,%[addr]),0(%[addr])\n"
+		"	la	%[addr],256(%[addr])\n"
+		"	brctg	%[tmp],0b\n"
+		"1:	stg	%[poison],0(%[addr])\n"
+		"	larl	%[tmp],3f\n"
+		"	ex	%[count],0(%[tmp])\n"
+		"	j	4f\n"
+		"2:	stg	%[poison],0(%[addr])\n"
+		"	j	4f\n"
+		"3:	mvc	8(1,%[addr]),0(%[addr])\n"
+		"4:\n"
+		: [addr] "+&a" (erase_low), [count] "+&d" (count), [tmp] "=&a" (tmp)
+		: [poison] "d" (poison)
+		: "memory", "cc"
+	);
+}
+
/*
* Thread structure
*/
--
2.37.2
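
The control flow of the inline assembly can be hard to follow, so here is a portable C model of what it appears to do. It is a sketch for illustration only, under these assumptions: mvc with overlapping operands behaves like a byte-by-byte forward copy, so copying from addr to addr + 8 replicates the first eight bytes; ex ORs the low eight bits of count into the mvc length field, which encodes "length minus one"; and the region size is a multiple of 8. The names mvc_model and poison_model are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte forward copy, modelling mvc with overlapping operands. */
static void mvc_model(unsigned char *dst, const unsigned char *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i];
}

/* C model of the assembly: fill count bytes at addr with the 8-byte poison. */
static void poison_model(unsigned char *addr, size_t count, uint64_t poison)
{
	size_t blocks;

	assert(count % 8 == 0);		/* assumption: whole 8-byte words only */
	if (!count)
		return;
	if (count == 8) {		/* "cghi ...,8; je 2f": a single store */
		memcpy(addr, &poison, 8);
		return;
	}
	count -= 8 + 1;			/* "aghi": 8 stored bytes, length-1 encoding */
	blocks = count >> 8;		/* "srlg": number of full 256-byte blocks */
	while (blocks--) {		/* loop at label 0 */
		memcpy(addr, &poison, 8);		/* stg */
		mvc_model(addr + 8, addr, 256 - 8);	/* mvc 8(256-8,...) */
		addr += 256;				/* la */
	}
	memcpy(addr, &poison, 8);	/* label 1: seed the remainder */
	/* "ex" of the mvc at label 3: copies (count & 0xff) + 1 bytes */
	mvc_model(addr + 8, addr, (count & 0xff) + 1);
}
```

In this model every block costs one 8-byte store plus one long copy, instead of 32 separate 8-byte stores per 256 bytes, which is presumably where most of the reported speedup comes from.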

2023-04-12 09:24:03

by Mark Rutland

Subject: Re: [PATCH 2/2] s390/stackleak: provide fast __stackleak_poison() implementation

On Wed, Apr 05, 2023 at 03:08:41PM +0200, Heiko Carstens wrote:
> Provide an s390 specific __stackleak_poison() implementation which is
> faster than the generic variant.
>
> For the original implementation with an enforced 4kb stackframe for the
> getpid() system call the system call overhead increases by a factor of 3 if
> the stackleak feature is enabled. Using the s390 mvc based variant this is
> reduced to an increase of 25% instead.
>
> This is within the expected area, since the mvc based implementation is
> more or less a memset64() variant which comes with similar results. See
> commit 0b77d6701cf8 ("s390: implement memset16, memset32 & memset64").

With that in mind, could we use memset64() directly (if we made it
noinstr-safe)?

Mark.
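
To make the suggestion concrete, a memset64()-based poison helper might look roughly like the sketch below. This is only an illustration of the idea, not a proposed patch: memset64_standin is a plain userspace stand-in for the kernel's memset64(), poison_via_memset64 is a hypothetical name, and the sketch says nothing about whether memset64() can actually be made noinstr-safe, which is the open question here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's memset64(); count is in 64-bit words. */
static void *memset64_standin(uint64_t *s, uint64_t v, size_t count)
{
	uint64_t *p = s;

	while (count--)
		*p++ = v;
	return s;
}

/* Hypothetical helper: poison the half-open range [erase_low, erase_high). */
static void poison_via_memset64(uint64_t *erase_low, uint64_t *erase_high,
				uint64_t poison)
{
	assert(erase_high >= erase_low);
	memset64_standin(erase_low, poison, (size_t)(erase_high - erase_low));
}
```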

>
> Reviewed-by: Vasily Gorbik <[email protected]>
> Signed-off-by: Heiko Carstens <[email protected]>
> ---
> arch/s390/include/asm/processor.h | 35 +++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
> index efffc28cbad8..dc17896a001a 100644
> --- a/arch/s390/include/asm/processor.h
> +++ b/arch/s390/include/asm/processor.h
> @@ -118,6 +118,41 @@ unsigned long vdso_size(void);
>
> #define HAVE_ARCH_PICK_MMAP_LAYOUT
>
> +#define __stackleak_poison __stackleak_poison
> +static __always_inline void __stackleak_poison(unsigned long erase_low,
> +					       unsigned long erase_high,
> +					       unsigned long poison)
> +{
> +	unsigned long tmp, count;
> +
> +	count = erase_high - erase_low;
> +	if (!count)
> +		return;
> +	asm volatile(
> +		"	cghi	%[count],8\n"
> +		"	je	2f\n"
> +		"	aghi	%[count],-(8+1)\n"
> +		"	srlg	%[tmp],%[count],8\n"
> +		"	ltgr	%[tmp],%[tmp]\n"
> +		"	jz	1f\n"
> +		"0:	stg	%[poison],0(%[addr])\n"
> +		"	mvc	8(256-8,%[addr]),0(%[addr])\n"
> +		"	la	%[addr],256(%[addr])\n"
> +		"	brctg	%[tmp],0b\n"
> +		"1:	stg	%[poison],0(%[addr])\n"
> +		"	larl	%[tmp],3f\n"
> +		"	ex	%[count],0(%[tmp])\n"
> +		"	j	4f\n"
> +		"2:	stg	%[poison],0(%[addr])\n"
> +		"	j	4f\n"
> +		"3:	mvc	8(1,%[addr]),0(%[addr])\n"
> +		"4:\n"
> +		: [addr] "+&a" (erase_low), [count] "+&d" (count), [tmp] "=&a" (tmp)
> +		: [poison] "d" (poison)
> +		: "memory", "cc"
> +	);
> +}
> +
> /*
> * Thread structure
> */
> --
> 2.37.2
>

2023-04-18 17:29:15

by Heiko Carstens

Subject: Re: [PATCH 0/2] stackleak: allow to specify arch specific stackleak poison function

On Wed, Apr 05, 2023 at 03:08:39PM +0200, Heiko Carstens wrote:
> Factor out the code that fills the stack with the stackleak poison value in
> order to allow architectures to provide a faster implementation.
>
> Use this to provide an s390 specific implementation which can fill the
> stack with the poison value much faster (factor of ~10 compared to the
> current version).
>
> Note that the s390 stackleak support is currently only available via
> linux-next (as of today), and the s390 kernel tree at kernel.org[1].
> Therefore, if there are no objections, I'd like to add these two patches to
> the s390 tree, so they can go upstream via the next merge window together
> with the s390 support.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features&id=b94c0ebb1ec752016a3e41bfb66bb51ea905e533
>
> Thanks,
> Heiko
>
> Heiko Carstens (2):
> stackleak: allow to specify arch specific stackleak poison function
> s390/stackleak: provide fast __stackleak_poison() implementation
>
> arch/s390/include/asm/processor.h | 35 +++++++++++++++++++++++++++++++
> kernel/stackleak.c | 17 +++++++++++----
> 2 files changed, 48 insertions(+), 4 deletions(-)

Given that this series seems to be straightforward, and Mark already gave
his Ack, we're going to put these two patches on the s390 git tree, even
though there has been no response from Kees yet.

If there are any complaints, I'm sure we can easily resolve them.