2019-02-28 15:08:07

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable

AKA. you don't need user_access_end() if user_access_begin() fails.

Cc: Chris Wilson <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1605,7 +1605,6 @@ static int eb_copy_relocations(const str
(char __user *)urelocs + copied,
len)) {
end_user:
- user_access_end();
kvfree(relocs);
err = -EFAULT;
goto err;
@@ -2628,8 +2627,8 @@ i915_gem_execbuffer2_ioctl(struct drm_de
&user_exec_list[i].offset,
end_user);
}
-end_user:
user_access_end();
+end_user:;
}

args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;




2019-02-28 15:27:14

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 03:10:44PM +0000, Chris Wilson wrote:
> Quoting Peter Zijlstra (2019-02-28 14:54:56)
> > drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
> > drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable
> >
> > AKA. you don't need user_access_end() if user_access_begin() fails.
>
> Cool, I had no idea if it was required or not. The closest to
> instructions on how to use user_access_begin() is in
> arch/x86/include/asm/uaccess.h ... which I guess is earlier in this
> series!

Ah, I don't think I put the rules in a comment; that would've been
useful I suppose ;-)

I did teach the rules to objtool though, and that is in the next patch:

https://lkml.kernel.org/r/[email protected]

so at least it will yell at you during compile time when you get it
wrong.

> > Cc: Chris Wilson <[email protected]>
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>
> Reviewed-by: Chris Wilson <[email protected]>

Thanks!

2019-02-28 16:57:40

by Chris Wilson

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

Quoting Peter Zijlstra (2019-02-28 14:54:56)
> drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
> drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable
>
> AKA. you don't need user_access_end() if user_access_begin() fails.

Cool, I had no idea if it was required or not. The closest to
instructions on how to use user_access_begin() is in
arch/x86/include/asm/uaccess.h ... which I guess is earlier in this
series!

> Cc: Chris Wilson <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

Reviewed-by: Chris Wilson <[email protected]>
-Chris

2019-02-28 18:39:34

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 08:49:57AM -0800, Linus Torvalds wrote:
> Yes, it is redundant (but harmlessly so) if no fault occurs.

Oooh, there's a label hidden in unsafe_put_user()...

And yes, objtool seems to miss that, damn. I'll go stare at that.

2019-02-28 19:03:34

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 10:29:25AM -0800, Linus Torvalds wrote:
> On Thu, Feb 28, 2019 at 10:02 AM Peter Zijlstra <[email protected]> wrote:
> >
> > Weird, that jump is from C, not from a .fixup table. objtool _should_
> > see that and complain if there is a AC=1 path that reaches RET.
>
> No, unsafe_put_user() actually does the "asm goto" thing, so the jump
> is literally hidden as an exception entry. And apparently objtool
> doesn't follow exceptions (which *normally* doesn't matter for code
> liveness analysis since they normally jump back to right after the
> excepting instruction, but maybe it misses some exception handling
> code because of it?).
>
> You may have looked at unsafe_get_user(), which does indeed make the
> branch as C code, because gcc currently does not allow outputs from
> "asm goto" statements (which "get" obviously needs).

Indeed I did. But it looks like objtool actually does parse .fixup. What
appears to go wrong is the 'visited' marker for backward jumps.

If we've been there with AC=0 first, and then backjump with AC=1, things
go missing.

I've also now confused myself on how it branches from alternatives. It
looks like it now considers paths that take the STAC alternative, and
exit through the NOP alternative (which should be CLAC) and then hit
RET with AC=1.

I'll get this sorted, eventually..


2019-02-28 20:01:11

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 7:05 AM Peter Zijlstra <[email protected]> wrote:
>
> drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
> drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable
>
> AKA. you don't need user_access_end() if user_access_begin() fails.

NOOO!

This is complete garbage, and will end up running with AC set forever after.

PeterZ, you need to remove that "redundant UACCESS disable" check, or
make it a whole lot smarter. Because as it is, it's horribly horribly
wrong.

We absolutely _have_ to have that "user_access_end()" there.

Yes, it is redundant (but harmlessly so) if no fault occurs.

But if a fault occurs, that "user_access_end()" is what clears AC on
the faulting path. That's absolutely required, because we don't clear
it on return from exception (and we shouldn't - one common pattern for
user space exceptions is "try to do a big access, if that fails go
back to using small accesses until it fails again").

Your reachability clearly doesn't take exception handling into account.

That patch must die, and your incorrect reachability model must be fixed.

Right now the objtool rules seem to be worse than not using objtool,
if it causes these kinds of false positives.

Linus

2019-02-28 20:20:47

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 06:51:14PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 28, 2019 at 08:49:57AM -0800, Linus Torvalds wrote:
> > Yes, it is redundant (but harmlessly so) if no fault occurs.
>
> Oooh, there's a label hidden in unsafe_put_user()...
>
> And yes, objtool seems to miss that, damn. I'll go stare at that.

Weird, that jump is from C, not from a .fixup table. objtool _should_
see that and complain if there is a AC=1 path that reaches RET.

AFAIU validate_branch() should recurse down every single branch, right
Josh?

2019-02-28 20:23:56

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 10:02 AM Peter Zijlstra <[email protected]> wrote:
>
> Weird, that jump is from C, not from a .fixup table. objtool _should_
> see that and complain if there is a AC=1 path that reaches RET.

No, unsafe_put_user() actually does the "asm goto" thing, so the jump
is literally hidden as an exception entry. And apparently objtool
doesn't follow exceptions (which *normally* doesn't matter for code
liveness analysis since they normally jump back to right after the
excepting instruction, but maybe it misses some exception handling
code because of it?).

You may have looked at unsafe_get_user(), which does indeed make the
branch as C code, because gcc currently does not allow outputs from
"asm goto" statements (which "get" obviously needs).

[ One of these days I really should look at the gcc sources to try to
figure out why gcc doesn't like them. I wish we could have a rule like
"it's an output only for the fallthrough case, not for the goto
cases". Because I wonder if the gcc peoples aversion to "asm goto" and
outputs comes from "we can't set outputs in multiple places". But my
gcc-foo is not strong enough that I've felt confident enough to really
go take a deep dive into something that feels pretty subtle, so I've
_thought_ about doing it for a long time, but have never actually
built up the confidence to do so ]

Linus

2019-03-01 10:36:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Thu, Feb 28, 2019 at 08:01:11PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 28, 2019 at 10:29:25AM -0800, Linus Torvalds wrote:
> > On Thu, Feb 28, 2019 at 10:02 AM Peter Zijlstra <[email protected]> wrote:
> > >
> > > Weird, that jump is from C, not from a .fixup table. objtool _should_
> > > see that and complain if there is a AC=1 path that reaches RET.
> >
> > No, unsafe_put_user() actually does the "asm goto" thing, so the jump
> > is literally hidden as an exception entry. And apparently objtool
> > doesn't follow exceptions (which *normally* doesn't matter for code
> > liveness analysis since they normally jump back to right after the
> > excepting instruction, but maybe it misses some exception handling
> > code because of it?).
> >
> > You may have looked at unsafe_get_user(), which does indeed make the
> > branch as C code, because gcc currently does not allow outputs from
> > "asm goto" statements (which "get" obviously needs).
>
> Indeed I did. But it looks like objtool actually does parse .fixup. What
> appears to go wrong is the 'visited' marker for backward jumps.
>
> If we've been there with AC=0 first, and then backjump with AC=1, things
> go missing.
>
> I've also now confused myself on how it branches from alternatives. It
> looks like it now considers paths that take the STAC alternative, and
> exit through the NOP alternative (which should be CLAC) and then hit
> RET with AC=1.
>
> I'll get this sorted, eventually..

Ha!

Original file:

CC drivers/gpu/drm/i915/i915_gem_execbuffer.o
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x3c: redundant UACCESS disable
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x66: redundant UACCESS disable

With the dodgy patch:

CC drivers/gpu/drm/i915/i915_gem_execbuffer.o
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: eb_relocate_slow()+0x1f9: call to kvfree() with UACCESS enabled
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x315: call to kvfree() with UACCESS enabled


Let me do an allmodconfig build to see how much pain is caused by that
redundant CLAC warning.

2019-03-01 12:53:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Fri, Mar 01, 2019 at 11:34:52AM +0100, Peter Zijlstra wrote:

> Let me do an allmodconfig build to see how much pain is caused by that
> redundant CLAC warning.

arch/x86/lib/usercopy_64.o: warning: objtool: .altinstr_replacement+0x30: redundant UACCESS disable
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x22: redundant UACCESS disable
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: .altinstr_replacement+0x64: redundant UACCESS disable
drivers/xen/privcmd.o: warning: objtool: privcmd_ioctl()+0x1c0: call to {dynamic}() with UACCESS enabled

The usercopy one is difficult, that's copy_user_handle_tail(), it is
buggered though, because that lacks notrace and thus has a __fentry__
call in.

Also, afaict all exception jumps into copy_user_handle_tail() will have
AC=1, but the __{get,put}_user_nocheck() things do STAC/CLAC all over
again.

So what do we do? Annotate that we start with AC=1 and then immediately
do the clac, and then let __{get,put}_user_nocheck() do their own thing?
or make it use the unsafe stuff?

---
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index ee42bb0cbeb3..e1ab9a50937c 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -59,7 +59,7 @@ EXPORT_SYMBOL(clear_user);
* Since protection fault in copy_from/to_user is not a normal situation,
* it is not necessary to optimize tail handling.
*/
-__visible unsigned long
+__visible notrace unsigned long
copy_user_handle_tail(char *to, char *from, unsigned len)
{
for (; len; --len, to++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 485b259127c3..695212c5bd07 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1606,6 +1606,7 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb)
len)) {
end_user:
user_access_end();
+end:
kvfree(relocs);
err = -EFAULT;
goto err;
@@ -1625,7 +1626,7 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb)
* relocations were valid.
*/
if (!user_access_begin(urelocs, size))
- goto end_user;
+ goto end;

for (copied = 0; copied < nreloc; copied++)
unsafe_put_user(-1,
@@ -2616,7 +2617,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
* when we did the "copy_from_user()" above.
*/
if (!user_access_begin(user_exec_list, count * sizeof(*user_exec_list)))
- goto end_user;
+ goto end;

for (i = 0; i < args->buffer_count; i++) {
if (!(exec2_list[i].offset & UPDATE))
@@ -2630,6 +2631,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
}
end_user:
user_access_end();
+end:;
}

args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;

2019-03-01 13:24:50

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Fri, Mar 01, 2019 at 01:27:45PM +0100, Peter Zijlstra wrote:
> arch/x86/lib/usercopy_64.o: warning: objtool: .altinstr_replacement+0x30: redundant UACCESS disable

> The usercopy one is difficult, that's copy_user_handle_tail(), it is
> buggered though, because that lacks notrace and thus has a __fentry__
> call in.
>
> Also, afaict all exception jumps into copy_user_handle_tail() will have
> AC=1, but the __{get,put}_user_nocheck() things do STAC/CLAC all over
> again.
>
> So what do we do? Annotate that we start with AC=1 and then immediately
> do the clac, and then let __{get,put}_user_nocheck() do their own thing?
> or make it use the unsafe stuff?

Or.. we move the thing to assembly. Of course, I suck at (writing) asm,
so the below is probably broken in various ways.

--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -208,9 +208,6 @@ __copy_from_user_flushcache(void *dst, c
}

unsigned long
-copy_user_handle_tail(char *to, char *from, unsigned len);
-
-unsigned long
mcsafe_handle_tail(char *to, char *from, unsigned len);

#endif /* _ASM_X86_UACCESS_64_H */
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -194,6 +194,29 @@ ENDPROC(copy_user_enhanced_fast_string)
EXPORT_SYMBOL(copy_user_enhanced_fast_string)

/*
+ * Try to copy last bytes and clear the rest if needed.
+ * Since protection fault in copy_from/to_user is not a normal situation,
+ * it is not necessary to optimize tail handling.
+ *
+ * Input:
+ * rdi destination
+ * rsi source
+ * rdx count
+ *
+ * Output:
+ * eax uncopied bytes or 0 if successful.
+ */
+ENTRY(copy_user_handle_tail)
+ movl %edx,%ecx
+1: rep movsb
+2: mov %ecx,%eax
+ ASM_CLAC
+ ret
+
+ _ASM_EXTABLE_UA(1b, 2b)
+END(copy_user_handle_tail)
+
+/*
* copy_user_nocache - Uncached memory copy with exception handling
* This will force destination out of cache for more performance.
*
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -55,26 +55,6 @@ unsigned long clear_user(void __user *to
EXPORT_SYMBOL(clear_user);

/*
- * Try to copy last bytes and clear the rest if needed.
- * Since protection fault in copy_from/to_user is not a normal situation,
- * it is not necessary to optimize tail handling.
- */
-__visible unsigned long
-copy_user_handle_tail(char *to, char *from, unsigned len)
-{
- for (; len; --len, to++) {
- char c;
-
- if (__get_user_nocheck(c, from++, sizeof(char)))
- break;
- if (__put_user_nocheck(c, to, sizeof(char)))
- break;
- }
- clac();
- return len;
-}
-
-/*
* Similar to copy_user_handle_tail, probe for the write fault point,
* but reuse __memcpy_mcsafe in case a new read error is encountered.
* clac() is handled in _copy_to_iter_mcsafe().

2019-03-01 14:53:50

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Fri, Mar 01, 2019 at 01:57:18PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 01, 2019 at 01:27:45PM +0100, Peter Zijlstra wrote:
> > arch/x86/lib/usercopy_64.o: warning: objtool: .altinstr_replacement+0x30: redundant UACCESS disable
>
> > The usercopy one is difficult, that's copy_user_handle_tail(), it is
> > buggered though, because that lacks notrace and thus has a __fentry__
> > call in.
> >
> > Also, afaict all exception jumps into copy_user_handle_tail() will have
> > AC=1, but the __{get,put}_user_nocheck() things do STAC/CLAC all over
> > again.
> >
> > So what do we do? Annotate that we start with AC=1 and then immediately
> > do the clac, and then let __{get,put}_user_nocheck() do their own thing?
> > or make it use the unsafe stuff?
>
> Or.. we move the thing to assembly. Of course, I suck at (writing) asm,
> so the below is probably broken in various ways.

The advantage is that it now all lives in the same .o file and objtool
can actually follow and find the complete control flow.

I've made it ENDPROC() such that it becomes STT_FUNC and objtool does
all the normal things. I've also moved the ALIGN_DESTINATION macro into
the .S file.

Andy, do we have a sensible self-test for this path?

2019-03-01 15:39:18

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC



> On Mar 1, 2019, at 6:38 AM, Peter Zijlstra <[email protected]> wrote:
>
>> On Fri, Mar 01, 2019 at 01:57:18PM +0100, Peter Zijlstra wrote:
>>> On Fri, Mar 01, 2019 at 01:27:45PM +0100, Peter Zijlstra wrote:
>>> arch/x86/lib/usercopy_64.o: warning: objtool: .altinstr_replacement+0x30: redundant UACCESS disable
>>
>>> The usercopy one is difficult, that's copy_user_handle_tail(), it is
>>> buggered though, because that lacks notrace and thus has a __fentry__
>>> call in.
>>>
>>> Also, afaict all exception jumps into copy_user_handle_tail() will have
>>> AC=1, but the __{get,put}_user_nocheck() things do STAC/CLAC all over
>>> again.
>>>
>>> So what do we do? Annotate that we start with AC=1 and then immediately
>>> do the clac, and then let __{get,put}_user_nocheck() do their own thing?
>>> or make it use the unsafe stuff?
>>
>> Or.. we move the thing to assembly. Of course, I suck at (writing) asm,
>> so the below is probably broken in various ways.
>
> The advantage is that it now all lives in the same .o file and objtool
> can actually follow and find the complete control flow.
>
> I've made it ENDPROC() such that it becomes STT_FUNC and objtool does
> all the normal things. I've also moved the ALIGN_DESTINATION macro into
> the .S file.
>
> Andy, do we have a sensible self-test for this path?

Not that I know of. Something like my (rejected) strncpy_from_user test could probably be added fairly easily.

2019-03-01 16:16:30

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Fri, Mar 1, 2019 at 4:57 AM Peter Zijlstra <[email protected]> wrote:
>
> Or.. we move the thing to assembly. Of course, I suck at (writing) asm,
> so the below is probably broken in various ways.

Looks sane, and makes sense. I feel that the old "mixed C and asm" was
much worse.

My only comment is that I think you should remove the ENTRY() one,
because you don't actually want that symbol to be global any more,
because I think it's all from inside copy_user_64.S now.

So just

ALIGN
copy_user_handle_tail:
.. code goes here..

Hmm?

Linus

2019-03-01 17:44:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 6/8] i915,uaccess: Fix redundant CLAC

On Fri, Mar 1, 2019 at 2:35 AM Peter Zijlstra <[email protected]> wrote:
>
> With the dodgy patch:
>
> CC drivers/gpu/drm/i915/i915_gem_execbuffer.o
> drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: eb_relocate_slow()+0x1f9: call to kvfree() with UACCESS enabled
> drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x315: call to kvfree() with UACCESS enabled

Ok, so it looks like your patch does the right thing. Your fixups to
i915_gem_execbuffer.c (and the already commented-upon copy_user_tail
thing) also look sane.

Linus