Weisbecker <[email protected]>,Stanislav Kinsburskiy <[email protected]>,Ingo Molnar <[email protected]>,Paolo Bonzini <[email protected]>,Dmitry Safonov <[email protected]>,Borislav Petkov <[email protected]>,Josh Poimboeuf <[email protected]>,Brian Gerst <[email protected]>,Jan Beulich <[email protected]>,Christian Borntraeger <[email protected]>,Fenghua Yu <[email protected]>,He Chen <[email protected]>,Russell King <[email protected]>,Vladimir Murzin <[email protected]>,Will Deacon <[email protected]>,Catalin Marinas <[email protected]>,Mark Rutland <[email protected]>,James Morse <[email protected]>,"David A . Long" <[email protected]>,Pratyush Anand <[email protected]>,Laura Abbott <[email protected]>,Andre Przywara <[email protected]>,Chris Metcalf <[email protected]>,linux-s390 <[email protected]>,LKML <[email protected]>,Linux API <[email protected]>,the arch/x86 maintainers
<[email protected]>,"[email protected]" <[email protected]>,Kernel Hardening <[email protected]>
From: [email protected]
Message-ID: <[email protected]>
On March 22, 2017 2:11:12 PM PDT, Thomas Garnier <[email protected]> wrote:
>On Wed, Mar 22, 2017 at 1:49 PM, H. Peter Anvin <[email protected]> wrote:
>> On 03/22/17 13:41, Thomas Garnier wrote:
>>>>> with the change below for additional feedback.
>>>>
>>>> Can you specify what that means?
>>>
>>> If I set inline by default, the compiler chose not to inline it on
>>> x86. If I force inline the size impact was actually bigger (without
>>> the architecture specific code).
>>>
>>
>> That's utterly bizarre. Something strange is going on there. I suspect
>> the right thing to do is to out-of-line the error case only, but even
>> that seems strange. It should be something like four instructions inline.
>>
>
>The compiler seemed to often inline other functions called by the
>syscall handlers. I assume the growth was due to changes in code
>optimization because the function is much larger at the end.
>
>>>>
>>>> On x86, where there is only one caller of this, it really seems like it
>>>> ought to reduce the overhead to almost zero (since it most likely is
>>>> hidden in the pipeline.)
>>>>
>>>> I would like to suggest defining it inline if
>>>> CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE is set; I really don't
>>>> care about an architecture which doesn't have it.
>>>
>>> But if there is only one caller, isn't the compiler supposed to
>>> inline the function based on options?
>>
>> If it is marked static in the same file, yes, but you have it in a
>> different file from what I can tell.
>
>If we do global optimization, it should. Having it as a static inline
>makes it easier on all types of builds.
>
>>
>>> The assembly will call it too, so I would need an inline and a
>>> non-inline based on the caller.
>>
>> Where? I don't see that anywhere, at least for x86.
>
>After the latest changes on x86, yes. On arm/arm64, we call it with
>the CHECK_DATA_CORRUPTION config.
>
>>
>> -hpa
>>
If we do global optimization, yes, but global optimization (generally called link-time optimization, LTO, on Linux) is very much the exception and not the rule for the Linux kernel at this time.
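
To make the non-LTO case concrete, here is a minimal user-space sketch of the shape discussed above: a static inline fast path that every caller can see, with only the error case kept out of line. It is not taken from the patch. The names verify_state(), report_bad_state(), USER_LIMIT and slow_path_hits are hypothetical, and the likely()/unlikely() macros are local stand-ins built on the GCC/Clang __builtin_expect() builtin.

```c
#include <stdbool.h>
#include <stdio.h>

/* Local stand-ins for the usual branch hints (GCC/Clang builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical expected value; the real check would compare real state. */
#define USER_LIMIT 0x7ffffffff000UL

/* Counts slow-path entries so the sketch can be checked. */
static unsigned long slow_path_hits;

/*
 * Cold path: kept out of line so that each call site only carries a
 * compare and a conditional branch. In a multi-file build this would be
 * an ordinary extern function defined once in a .c file.
 */
static __attribute__((noinline, cold)) void report_bad_state(unsigned long limit)
{
	slow_path_hits++;
	fprintf(stderr, "unexpected limit %#lx\n", limit);
}

/*
 * Hot path: defined static inline (typically in a header) so the compiler
 * sees the body in every translation unit and can inline it without any
 * link-time optimization.
 */
static inline bool verify_state(unsigned long current_limit)
{
	if (unlikely(current_limit != USER_LIMIT)) {
		report_bad_state(current_limit);
		return false;
	}
	return true;
}
```

With this split, a caller in another file still gets the check inlined in an ordinary per-file build, and an assembly caller (as on arm/arm64) would need a separate non-inline wrapper around the same body.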