2020-03-28 00:05:42

by Jason A. Donenfeld

[permalink] [raw]
Subject: [PATCH] drm/i915: check to see if the FPU is available before using it

It's not safe to just grab the FPU willy nilly without first checking to
see if it's available. This patch adds the usual call to may_use_simd()
and falls back to boring memcpy if it's not available.

Signed-off-by: Jason A. Donenfeld <[email protected]>
---
drivers/gpu/drm/i915/i915_memcpy.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
index fdd550405fd3..7c0e022586bc 100644
--- a/drivers/gpu/drm/i915/i915_memcpy.c
+++ b/drivers/gpu/drm/i915/i915_memcpy.c
@@ -24,6 +24,7 @@

#include <linux/kernel.h>
#include <asm/fpu/api.h>
+#include <asm/simd.h>

#include "i915_memcpy.h"

@@ -38,6 +39,12 @@ static DEFINE_STATIC_KEY_FALSE(has_movntdqa);
#ifdef CONFIG_AS_MOVNTDQA
static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
{
+ if (unlikely(!may_use_simd())) {
+ memcpy(dst, src, len);
+ return;
+ }
+
+
kernel_fpu_begin();

while (len >= 4) {
@@ -67,6 +74,11 @@ static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)

static void __memcpy_ntdqu(void *dst, const void *src, unsigned long len)
{
+ if (unlikely(!may_use_simd())) {
+ memcpy(dst, src, len);
+ return;
+ }
+
kernel_fpu_begin();

while (len >= 4) {
--
2.26.0


2020-03-28 07:59:39

by Chris Wilson

[permalink] [raw]
Subject: Re: [PATCH] drm/i915: check to see if the FPU is available before using it

Quoting Jason A. Donenfeld (2020-03-28 00:04:22)
> It's not safe to just grab the FPU willy nilly without first checking to
> see if it's available. This patch adds the usual call to may_use_simd()
> and falls back to boring memcpy if it's not available.

These instructions do not use the fpu, nor are these registers aliased
over the fpu stack. This description and the may_use_simd() do not
look like they express the right granularity as to which simd state are
included

> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> drivers/gpu/drm/i915/i915_memcpy.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
> index fdd550405fd3..7c0e022586bc 100644
> --- a/drivers/gpu/drm/i915/i915_memcpy.c
> +++ b/drivers/gpu/drm/i915/i915_memcpy.c
> @@ -24,6 +24,7 @@
>
> #include <linux/kernel.h>
> #include <asm/fpu/api.h>
> +#include <asm/simd.h>
>
> #include "i915_memcpy.h"
>
> @@ -38,6 +39,12 @@ static DEFINE_STATIC_KEY_FALSE(has_movntdqa);
> #ifdef CONFIG_AS_MOVNTDQA
> static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
> {
> + if (unlikely(!may_use_simd())) {
> + memcpy(dst, src, len);
> + return;

Look at caller, return the error and let them decide if they can avoid
the read from WC, which quite often they can. And no, this is not done
from interrupt context, we would be crucified if we did.
-Chris

2020-04-08 00:41:23

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [PATCH] drm/i915: check to see if the FPU is available before using it

On Sat, Mar 28, 2020 at 1:59 AM Chris Wilson <[email protected]> wrote:
>
> Quoting Jason A. Donenfeld (2020-03-28 00:04:22)
> > It's not safe to just grab the FPU willy nilly without first checking to
> > see if it's available. This patch adds the usual call to may_use_simd()
> > and falls back to boring memcpy if it's not available.
>
> These instructions do not use the fpu, nor are these registers aliased
> over the fpu stack. This description and the may_use_simd() do not
> look like they express the right granularity as to which simd state are
> included

Most of the time when discussing vector instructions in the kernel
with x86, "FPU" is used to denote the whole shebang, because of
similar XSAVE semantics and requirements with actual floating point
instructions and SIMD instructions. So when I say "grab the FPU", I'm
really referring to the act of messing with any registers that aren't
saved and restored by default during context switch and need the
explicit marking for XSAVE to come in -- the kernel_fpu_begin/end
calls that you already have.

With regards to the granularity here, you are in fact touching xmm
registers. That means you need kernel_fpu_begin/end, which you
correctly have. However, it also means that before using those, you
should check to see if that's okay by using the may_use_simd()
instruction.

Now you may claim that at the moment
may_use_simd()-->irq_fpu_usable()-->(!in_interrupt() ||
interrupted_user_mode() || interrupted_kernel_fpu_idle()) always holds
true, and you're a keen follower of the (recently changed) kernel fpu
x86 semantics in case those conditions change, and that your driver is
so strictly written that you know exactly the context this fancy
memcpy will run in, always, and you'll never deviate from it, and
therefore it's okay to depart from the rules and omit the check and
safe fallback code. But c'mon - i915 is complex, and mixed context
bugs abound, and the rules for using those registers might in fact
change without you noticing.

So why not apply this to have a safe fallback for when the numerous
assumptions no longer hold? (If you're extra worried I suppose you
could make it a `if (WARN_ON(!may_use_simd()))` instead or something,
but that seems like a bit much.)

> Look at caller, return the error and let them decide if they can avoid
> the read from WC, which quite often they can. And no, this is not done
> from interrupt context, we would be crucified if we did.

Ahh, now, reading this comment here I realize maybe I've misunderstood
you. Do you mean to say that checking for may_use_simd() is a thing
that you'd like to do after all, but you don't like it falling back to
slow memcpy. Instead, you'd like for the code to return an error
value, and then caller can just optionally skip the memcpy under some
complicated driver circumstances?

Jason