With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:

  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled

This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
-- and indirectly, sign_extend64() -- from the user_access_begin/end
critical region (i.e., with SMAP disabled).

While it's probably harmless in this case, in general we like to avoid
extra function calls in SMAP-disabled regions because it can open up
inadvertent security holes.

Fix it by moving the gen8_canonical_addr() conversion to a separate loop
before user_access_begin() is called.

Note that gen8_canonical_addr() is now called *before* masking off the
PIN_OFFSET_MASK bits. That should be ok because it just does a sign
extension and ignores the masked lower bits anyway.

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
---
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d5a0f5ae4a8b..183cab13e028 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2947,6 +2947,13 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			u64_to_user_ptr(args->buffers_ptr);
 		unsigned int i;
 
+		/*
+		 * Do the call to gen8_canonical_addr() outside the
+		 * uaccess-enabled region to minimize uaccess exposure.
+		 */
+		for (i = 0; i < args->buffer_count; i++)
+			exec2_list[i].offset = gen8_canonical_addr(exec2_list[i].offset);
+
 		/* Copy the new buffer offsets back to the user's exec list. */
 		/*
 		 * Note: count * sizeof(*user_exec_list) does not overflow,
@@ -2962,9 +2969,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			if (!(exec2_list[i].offset & UPDATE))
 				continue;
 
-			exec2_list[i].offset =
-				gen8_canonical_addr(exec2_list[i].offset & PIN_OFFSET_MASK);
-			unsafe_put_user(exec2_list[i].offset,
+			unsafe_put_user(exec2_list[i].offset & PIN_OFFSET_MASK,
 					&user_exec_list[i].offset,
 					end_user);
 		}
--
2.21.1
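
As a sanity check on the reordering argument, here is a minimal user-space sketch (not kernel code). It assumes the sign extension starts at bit 47 and that the mask clears only the low 12 bits; both constants are stand-ins picked for illustration, since the thread does not show the real values of GEN8_HIGH_ADDRESS_BIT or PIN_OFFSET_MASK. Under those assumptions, masking before or after the sign extension gives the same result, because the extension only rewrites bits above the high address bit.

```c
#include <stdint.h>

/* Assumption: stand-in for GEN8_HIGH_ADDRESS_BIT (value not shown in the thread). */
#define HIGH_ADDRESS_BIT 47
/* Assumption: stand-in for PIN_OFFSET_MASK, clearing only the low 12 bits. */
#define OFFSET_MASK (~UINT64_C(0xfff))

/*
 * Same shape as the kernel helper quoted later in the thread, with stdint
 * types. Relies on arithmetic right shift of negative values, which gcc
 * and clang provide.
 */
static inline int64_t sign_extend64(uint64_t value, int index)
{
	uint8_t shift = 63 - index;

	return (int64_t)(value << shift) >> shift;
}

static inline uint64_t canonical_addr(uint64_t address)
{
	return (uint64_t)sign_extend64(address, HIGH_ADDRESS_BIT);
}

/* Returns 1 when "mask, then extend" equals "extend, then mask". */
static int orders_agree(uint64_t offset)
{
	uint64_t mask_then_extend = canonical_addr(offset & OFFSET_MASK);
	uint64_t extend_then_mask = canonical_addr(offset) & OFFSET_MASK;

	return mask_then_extend == extend_then_mask;
}
```

The low bits pass through the shift pair unchanged, so any flag stored there is still visible after the conversion and can be tested and masked off later.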
Quoting Josh Poimboeuf (2020-02-27 22:08:26)
> With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
>
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
>
> This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
> -- and indirectly, sign_extend64() -- from the user_access_begin/end
> critical region (i.e., with SMAP disabled).
>
> While it's probably harmless in this case, in general we like to avoid
> extra function calls in SMAP-disabled regions because it can open up
> inadvertent security holes.
>
> Fix it by moving the gen8_canonical_addr() conversion to a separate loop
> before user_access_begin() is called.
>
> Note that gen8_canonical_addr() is now called *before* masking off the
> PIN_OFFSET_MASK bits. That should be ok because it just does a sign
> extension and ignores the masked lower bits anyway.
>
> Reported-by: Randy Dunlap <[email protected]>
> Signed-off-by: Josh Poimboeuf <[email protected]>
> ---
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d5a0f5ae4a8b..183cab13e028 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2947,6 +2947,13 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
>  			u64_to_user_ptr(args->buffers_ptr);
>  		unsigned int i;
>
> +		/*
> +		 * Do the call to gen8_canonical_addr() outside the
> +		 * uaccess-enabled region to minimize uaccess exposure.
> +		 */
> +		for (i = 0; i < args->buffer_count; i++)
> +			exec2_list[i].offset = gen8_canonical_addr(exec2_list[i].offset);
Another loop over all the objects, where we intentionally try and skip
unmodified entries? To save 2 instructions from inside the second loop?
Colour me skeptical.
-Chris
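
To make the cost concern concrete, here is a rough user-space sketch of the two loop shapes. Everything in it is a hypothetical stand-in: `UPDATE_FLAG`, `OFFSET_MASK`, `struct entry`, the `out` array (playing the role of the user's exec list), and the assumed sign-extension bit of 47. It only counts how many conversions each shape performs and shows that the copied-back values agree.

```c
#include <stddef.h>
#include <stdint.h>

#define UPDATE_FLAG UINT64_C(0x1)        /* hypothetical stand-in for UPDATE */
#define OFFSET_MASK (~UINT64_C(0xfff))   /* hypothetical stand-in for PIN_OFFSET_MASK */

struct entry {
	uint64_t offset;
};

/* Sign-extend from an assumed bit 47 (shift of 63 - 47 = 16). */
static uint64_t canonical(uint64_t address)
{
	return (uint64_t)((int64_t)(address << 16) >> 16);
}

/* Original shape: convert inside the copy-back loop, flagged entries only. */
static size_t copy_back_one_pass(struct entry *list, uint64_t *out, size_t n)
{
	size_t conversions = 0;

	for (size_t i = 0; i < n; i++) {
		if (!(list[i].offset & UPDATE_FLAG))
			continue;

		list[i].offset = canonical(list[i].offset & OFFSET_MASK);
		conversions++;
		out[i] = list[i].offset;
	}
	return conversions;
}

/* Patched shape: convert every entry first, then copy back flagged ones. */
static size_t copy_back_two_pass(struct entry *list, uint64_t *out, size_t n)
{
	size_t conversions = 0;

	for (size_t i = 0; i < n; i++) {
		list[i].offset = canonical(list[i].offset);
		conversions++;
	}

	/* The uaccess-enabled region would start here in the real code. */
	for (size_t i = 0; i < n; i++) {
		if (!(list[i].offset & UPDATE_FLAG))
			continue;

		out[i] = list[i].offset & OFFSET_MASK;
	}
	return conversions;
}
```

With three entries of which two carry the flag, the one-pass shape converts two entries and the two-pass shape converts all three, while the values written to `out` are identical.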
On Thu, Feb 27, 2020 at 04:08:26PM -0600, Josh Poimboeuf wrote:
> With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
>
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
>
> This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
> -- and indirectly, sign_extend64() -- from the user_access_begin/end
> critical region (i.e., with SMAP disabled).
>
> While it's probably harmless in this case, in general we like to avoid
> extra function calls in SMAP-disabled regions because it can open up
> inadvertent security holes.
>
> Fix it by moving the gen8_canonical_addr() conversion to a separate loop
> before user_access_begin() is called.
>
> Note that gen8_canonical_addr() is now called *before* masking off the
> PIN_OFFSET_MASK bits. That should be ok because it just does a sign
> extension and ignores the masked lower bits anyway.
How painful would it be to inline the damn thing?
<looks>
static inline u64 gen8_canonical_addr(u64 address)
{
	return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
}

static inline __s64 sign_extend64(__u64 value, int index)
{
	__u8 shift = 63 - index;
	return (__s64)(value << shift) >> shift;
}
What the hell? Josh, what kind of .config do you have that these are
_not_ inlined? And why not mark gen8_canonical_addr() __always_inline?
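
For comparison, a user-space sketch of what the __always_inline route could look like. `ALWAYS_INLINE` is a local stand-in for the kernel's `__always_inline` macro, the stdint types replace `u64`/`__s64`, and `HIGH_ADDRESS_BIT` is an assumed value; this is an illustration, not the actual i915 change.

```c
#include <stdint.h>

/* Local stand-in for the kernel's __always_inline (gcc/clang attribute). */
#define ALWAYS_INLINE inline __attribute__((__always_inline__))

/* Assumption: stand-in for GEN8_HIGH_ADDRESS_BIT (value not shown in the thread). */
#define HIGH_ADDRESS_BIT 47

/* Forced inline, so no call is emitted even when optimizing for size. */
static ALWAYS_INLINE int64_t sign_extend64(uint64_t value, int index)
{
	uint8_t shift = 63 - index;

	return (int64_t)(value << shift) >> shift;
}

static ALWAYS_INLINE uint64_t canonical_addr(uint64_t address)
{
	return (uint64_t)sign_extend64(address, HIGH_ADDRESS_BIT);
}
```

Both helpers need the attribute: forcing only the outer one would still leave a possible call to the inner one.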
On Thu, Feb 27, 2020 at 10:35:42PM +0000, Al Viro wrote:
> On Thu, Feb 27, 2020 at 04:08:26PM -0600, Josh Poimboeuf wrote:
> > With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
> >
> > drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
> >
> > This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
> > -- and indirectly, sign_extend64() -- from the user_access_begin/end
> > critical region (i.e., with SMAP disabled).
> >
> > While it's probably harmless in this case, in general we like to avoid
> > extra function calls in SMAP-disabled regions because it can open up
> > inadvertent security holes.
> >
> > Fix it by moving the gen8_canonical_addr() conversion to a separate loop
> > before user_access_begin() is called.
> >
> > Note that gen8_canonical_addr() is now called *before* masking off the
> > PIN_OFFSET_MASK bits. That should be ok because it just does a sign
> > extension and ignores the masked lower bits anyway.
>
> How painful would it be to inline the damn thing?
> <looks>
> static inline u64 gen8_canonical_addr(u64 address)
> {
> 	return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
> }
>
> static inline __s64 sign_extend64(__u64 value, int index)
> {
> 	__u8 shift = 63 - index;
> 	return (__s64)(value << shift) >> shift;
> }
>
> What the hell? Josh, what kind of .config do you have that these are
> _not_ inlined?
I think this was seen with CONFIG_CC_OPTIMIZE_FOR_SIZE, which tends to
ignore inline.
> And why not mark gen8_canonical_addr() __always_inline?
Right, marking those two functions as __always_inline is the other
option. The problem is, if you keep doing it, eventually you end up
with __always_inline-itis spreading all over the place. And it affects
all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
At least this fix is localized.
But I agree my patch isn't ideal either.
--
Josh
On 2/27/20 5:03 PM, Josh Poimboeuf wrote:
> On Thu, Feb 27, 2020 at 10:35:42PM +0000, Al Viro wrote:
>> On Thu, Feb 27, 2020 at 04:08:26PM -0600, Josh Poimboeuf wrote:
>>> With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
>>>
>>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
>>>
>>> This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
>>> -- and indirectly, sign_extend64() -- from the user_access_begin/end
>>> critical region (i.e., with SMAP disabled).
>>>
>>> While it's probably harmless in this case, in general we like to avoid
>>> extra function calls in SMAP-disabled regions because it can open up
>>> inadvertent security holes.
>>>
>>> Fix it by moving the gen8_canonical_addr() conversion to a separate loop
>>> before user_access_begin() is called.
>>>
>>> Note that gen8_canonical_addr() is now called *before* masking off the
>>> PIN_OFFSET_MASK bits. That should be ok because it just does a sign
>>> extension and ignores the masked lower bits anyway.
>>
>> How painful would it be to inline the damn thing?
>> <looks>
>> static inline u64 gen8_canonical_addr(u64 address)
>> {
>> 	return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
>> }
>>
>> static inline __s64 sign_extend64(__u64 value, int index)
>> {
>> 	__u8 shift = 63 - index;
>> 	return (__s64)(value << shift) >> shift;
>> }
>>
>> What the hell? Josh, what kind of .config do you have that these are
>> _not_ inlined?
>
> I think this was seen with CONFIG_CC_OPTIMIZE_FOR_SIZE, which tends to
> ignore inline.
So the commit message correctly says.
>
>> And why not mark gen8_canonical_addr() __always_inline?
>
> Right, marking those two functions as __always_inline is the other
> option. The problem is, if you keep doing it, eventually you end up
> with __always_inline-itis spreading all over the place. And it affects
> all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
> At least this fix is localized.
>
> But I agree my patch isn't ideal either.
fwiw,
Acked-by: Randy Dunlap <[email protected]> # build-tested
thanks.
--
~Randy
On Thu, Feb 27, 2020 at 10:26:00PM +0000, Chris Wilson wrote:
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -2947,6 +2947,13 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
> >  			u64_to_user_ptr(args->buffers_ptr);
> >  		unsigned int i;
> >
> > +		/*
> > +		 * Do the call to gen8_canonical_addr() outside the
> > +		 * uaccess-enabled region to minimize uaccess exposure.
> > +		 */
> > +		for (i = 0; i < args->buffer_count; i++)
> > +			exec2_list[i].offset = gen8_canonical_addr(exec2_list[i].offset);
>
>
> Another loop over all the objects, where we intentionally try and skip
> unmodified entries? To save 2 instructions from inside the second loop?
>
> Colour me skeptical.
So are you saying these arrays can be large and that you have
performance concerns?
--
Josh
On Thu, Feb 27, 2020 at 07:03:42PM -0600, Josh Poimboeuf wrote:
> > And why not mark gen8_canonical_addr() __always_inline?
>
> Right, marking those two functions as __always_inline is the other
> option. The problem is, if you keep doing it, eventually you end up
> with __always_inline-itis spreading all over the place. And it affects
> all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
> At least this fix is localized.
I'm all for __always_inline in this case, the compiler not inlining sign
extension is just retarded.
On Fri, Feb 28, 2020 at 07:04:41PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 27, 2020 at 07:03:42PM -0600, Josh Poimboeuf wrote:
> > > And why not mark gen8_canonical_addr() __always_inline?
> >
> > Right, marking those two functions as __always_inline is the other
> > option. The problem is, if you keep doing it, eventually you end up
> > with __always_inline-itis spreading all over the place. And it affects
> > all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
> > At least this fix is localized.
>
> I'm all for __always_inline in this case, the compiler not inlining sign
> extension is just retarded.
FWIW, in this case it's
	salq	$8, %rax
	sarq	$8, %rax
i.e. 8 bytes. Sure, that's 3 bytes longer than call, but really, WTF?
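
The byte counts can be spelled out with a small sketch. It assumes the usual x86-64 encodings: a REX.W prefix plus `C1 /4 ib` for salq, `C1 /7 ib` for sarq, and `E8` followed by a 32-bit displacement for a near call. The array names and the zeroed call displacement are placeholders for illustration only.

```c
#include <stddef.h>

/* salq $8, %rax: REX.W (0x48), opcode 0xC1, ModRM 0xE0 (/4, rax), imm8 */
static const unsigned char salq_8_rax[] = { 0x48, 0xc1, 0xe0, 0x08 };

/* sarq $8, %rax: REX.W (0x48), opcode 0xC1, ModRM 0xF8 (/7, rax), imm8 */
static const unsigned char sarq_8_rax[] = { 0x48, 0xc1, 0xf8, 0x08 };

/* call rel32: opcode 0xE8 plus a 4-byte displacement (zeroed placeholder) */
static const unsigned char call_rel32[] = { 0xe8, 0x00, 0x00, 0x00, 0x00 };

/* Size of the inlined shift pair in bytes. */
static size_t inlined_size(void)
{
	return sizeof(salq_8_rax) + sizeof(sarq_8_rax);
}

/* Size of the out-of-line call in bytes. */
static size_t call_size(void)
{
	return sizeof(call_rel32);
}
```

This counts only the call site; the out-of-line copy of the function adds its own bytes elsewhere in the object.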