2017-03-09 22:44:56

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code

From: "Steven Rostedt (Red Hat)" <[email protected]>

Linus mentioned that doing two compares can be replaced by a single
compare. That is, instead of:

movq $repeat_nmi, %rdx
cmpq 8(%rsp), %rdx
ja not_in_region
movq $end_repeat_nmi, %rdx
cmpq 8(%rsp), %rdx
ja in_region

we can replace that with:

movq 8(%rsp), %rdx
subq $repeat_nmi, %rdx
cmpq $end_repeat_nmi-repeat_nmi, %rdx
jb in_region

Inspired-by: Linus Torvalds <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
---
arch/x86/entry/entry_64.S | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 044d18ebc43c..3aad759aace2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1330,13 +1330,10 @@ ENTRY(nmi)
* resume the outer NMI.
*/

- movq $repeat_nmi, %rdx
- cmpq 8(%rsp), %rdx
- ja 1f
- movq $end_repeat_nmi, %rdx
- cmpq 8(%rsp), %rdx
- ja nested_nmi_out
-1:
+ movq 8(%rsp), %rdx
+ subq $repeat_nmi, %rdx
+ cmpq $end_repeat_nmi-repeat_nmi, %rdx
+ jb nested_nmi_out

/*
* Now check "NMI executing". If it's set, then we're nested.
--
2.10.2



2017-03-10 02:43:20

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code

On Thu, Mar 9, 2017 at 2:42 PM, Steven Rostedt <[email protected]> wrote:
> From: "Steven Rostedt (Red Hat)" <[email protected]>
>
> Linus mentioned that doing two compares can be replaced by a single
> compare. That is, instead of:
>
> movq $repeat_nmi, %rdx
> cmpq 8(%rsp), %rdx
> ja not_in_region
> movq $end_repeat_nmi, %rdx
> cmpq 8(%rsp), %rdx
> ja in_region
>
> we can replace that with:
>
> movq 8(%rsp), %rdx
> subq $repeat_nmi, %rdx
> cmpq $end_repeat_nmi-repeat_nmi, %rdx
> jb in_region

Seems reasonable to me. Good luck ever noticing the speedup :)

--Andy

2017-03-10 03:49:57

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code



On March 9, 2017 9:42:57 PM EST, Andy Lutomirski <[email protected]> wrote:
>On Thu, Mar 9, 2017 at 2:42 PM, Steven Rostedt <[email protected]>
>wrote:
>> From: "Steven Rostedt (Red Hat)" <[email protected]>
>>
>> Linus mentioned that doing two compares can be replaced by a single
>> compare. That is, instead of:
>>
>> movq $repeat_nmi, %rdx
>> cmpq 8(%rsp), %rdx
>> ja not_in_region
>> movq $end_repeat_nmi, %rdx
>> cmpq 8(%rsp), %rdx
>> ja in_region
>>
>> we can replace that with:
>>
>> movq 8(%rsp), %rdx
>> subq $repeat_nmi, %rdx
>> cmpq $end_repeat_nmi-repeat_nmi, %rdx
>> jb in_region
>
>Seems reasonable to me. Good luck ever noticing the speedup :)
>

It had nothing to do with speedup. Linus said that the current code makes the assembly programmer in him die a little. I want to cure that.

-- Steve

>--Andy

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2017-03-10 03:51:20

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code

On Thu, Mar 9, 2017 at 7:49 PM, Steven Rostedt <[email protected]> wrote:
>
>
> On March 9, 2017 9:42:57 PM EST, Andy Lutomirski <[email protected]> wrote:
>>On Thu, Mar 9, 2017 at 2:42 PM, Steven Rostedt <[email protected]>
>>wrote:
>>> From: "Steven Rostedt (Red Hat)" <[email protected]>
>>>
>>> Linus mentioned that doing two compares can be replaced by a single
>>> compare. That is, instead of:
>>>
>>> movq $repeat_nmi, %rdx
>>> cmpq 8(%rsp), %rdx
>>> ja not_in_region
>>> movq $end_repeat_nmi, %rdx
>>> cmpq 8(%rsp), %rdx
>>> ja in_region
>>>
>>> we can replace that with:
>>>
>>> movq 8(%rsp), %rdx
>>> subq $repeat_nmi, %rdx
>>> cmpq $end_repeat_nmi-repeat_nmi, %rdx
>>> jb in_region
>>
>>Seems reasonable to me. Good luck ever noticing the speedup :)
>>
>
> It had nothing to do with speedup. Linus said that the current code makes the assembly programmer in him die a little. I want to cure that.
>

One might argue that the world would be a better place if the assembly
programmer in some people died a little.

2017-03-10 07:21:02

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code


* Andy Lutomirski <[email protected]> wrote:

> > It had nothing to do with speedup. Linus said that the current code makes the
> > assembly programmer in him die a little. I want to cure that.
>
> One might argue that the world would be a better place if the assembly
> programmer in some people died a little.

Joking aside, I'll bite: while in the kernel we try to avoid ever actually
_writing_ new assembly code, assembly programming is still an invaluable skill,
because it indirectly improves all the other 99% of non-assembly .c code:

- Looking at the C compiler's assembly output tells us how close the code is to
optimal.

- Being able to tell whether our C abstractions are too far removed from how the
compiler will map it to machine instructions is invaluable.

- Being able to shape data structures and code in a machine-friendly way.

Much would be lost if the assembly programmer went extinct and it's no
accident that annotated assembly output is just two <Enter> keys away
after launching 'perf top' or 'perf report'. The more developers know
assembly the better, IMHO.

Thanks,

Ingo

2017-03-10 19:00:51

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code

On Thu, Mar 9, 2017 at 11:20 PM, Ingo Molnar <[email protected]> wrote:
>
> Joking aside, I'll bite: while in the kernel we try to avoid ever actually
> _writing_ new assembly code

.. also, when we do, I think we should care about it.

If you write asm, and the end result is noticeably worse than what
your average compiler would generate, exactly why are you writing it
in asm in the first place?

So I think people should aim to avoid asm. Andy certainly knows that,
and I loved his "rewrite a lot of the low-level system call code"
patches.

But the corollary to that is that if you _do_ write assembler, please
have some pride in the code, and don't half-arse it.

Linus

2017-03-10 19:04:16

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/nmi: Optimize the check for being in the repeat_nmi code

On Fri, Mar 10, 2017 at 11:00 AM, Linus Torvalds
<[email protected]> wrote:
> On Thu, Mar 9, 2017 at 11:20 PM, Ingo Molnar <[email protected]> wrote:
>>
>> Joking aside, I'll bite: while in the kernel we try to avoid ever actually
>> _writing_ new assembly code
>
> .. also, when we do, I think we should care about it.
>
> If you write asm, and the end result is noticeably worse than what
> your average compiler would generate, exactly why are you writing it
> in asm in the first place?
>
> So I think people should aim to avoid asm. Andy certainly knows that,
> and I loved his "rewrite a lot of the low-level system call code"
> patches.
>
> But the corollary to that is that if you _do_ write assembler, please
> have some pride in the code, and don't half-arse it.
>

Geez, I didn't expect anyone to take my silly comment remotely
seriously :) And I do like Steven's patches.

--Andy, who just looked at binutils source to figure out WTF "nobits"
meant. Take that, asm!