Avoid calling handlers on empty rmap entries and skip to the next
non-empty rmap entry.

Empty rmap entries are a no-op in handlers.

Signed-off-by: Vipin Sharma <[email protected]>
Suggested-by: Sean Christopherson <[email protected]>
Change-Id: I8abf0f4d82a2aae4c5d58b80bcc17ffc30785ffc
---
arch/x86/kvm/mmu/mmu.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 51671cb34fb6..f296340803ba 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1499,11 +1499,14 @@ static bool slot_rmap_walk_okay(struct slot_rmap_walk_iterator *iterator)
 	return !!iterator->rmap;
 }

-static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
+static noinline void
+slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
 {
-	if (++iterator->rmap <= iterator->end_rmap) {
+	while (++iterator->rmap <= iterator->end_rmap) {
 		iterator->gfn += (1UL << KVM_HPAGE_GFN_SHIFT(iterator->level));
-		return;
+
+		if (iterator->rmap->val)
+			return;
 	}

 	if (++iterator->level > iterator->end_level) {

base-commit: c9b8fecddb5bb4b67e351bbaeaa648a6f7456912
--
2.35.1.1021.g381101b075-goog
On Fri, Mar 25, 2022 at 4:31 PM Vipin Sharma <[email protected]> wrote:
>
> Avoid calling handlers on empty rmap entries and skip to the next non
> empty rmap entry.
>
> Empty rmap entries are noop in handlers.
>
> Signed-off-by: Vipin Sharma <[email protected]>
> Suggested-by: Sean Christopherson <[email protected]>
> Change-Id: I8abf0f4d82a2aae4c5d58b80bcc17ffc30785ffc
nit: Omit Change-Id tags from upstream commits.
> ---
> arch/x86/kvm/mmu/mmu.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 51671cb34fb6..f296340803ba 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1499,11 +1499,14 @@ static bool slot_rmap_walk_okay(struct slot_rmap_walk_iterator *iterator)
> return !!iterator->rmap;
> }
>
> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> +static noinline void
What is the reason to add noinline?
> +slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> {
> - if (++iterator->rmap <= iterator->end_rmap) {
> + while (++iterator->rmap <= iterator->end_rmap) {
> iterator->gfn += (1UL << KVM_HPAGE_GFN_SHIFT(iterator->level));
> - return;
> +
> + if (iterator->rmap->val)
> + return;
> }
>
> if (++iterator->level > iterator->end_level) {
>
> base-commit: c9b8fecddb5bb4b67e351bbaeaa648a6f7456912
> --
> 2.35.1.1021.g381101b075-goog
>
On Fri, Mar 25, 2022 at 4:53 PM David Matlack <[email protected]> wrote:
>
> On Fri, Mar 25, 2022 at 4:31 PM Vipin Sharma <[email protected]> wrote:
> >
> > Avoid calling handlers on empty rmap entries and skip to the next non
> > empty rmap entry.
> >
> > Empty rmap entries are noop in handlers.
> >
> > Signed-off-by: Vipin Sharma <[email protected]>
> > Suggested-by: Sean Christopherson <[email protected]>
> > Change-Id: I8abf0f4d82a2aae4c5d58b80bcc17ffc30785ffc
>
> nit: Omit Change-Id tags from upstream commits.
Thanks for catching it.
>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 9 ++++++---
> > 1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 51671cb34fb6..f296340803ba 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1499,11 +1499,14 @@ static bool slot_rmap_walk_okay(struct slot_rmap_walk_iterator *iterator)
> > return !!iterator->rmap;
> > }
> >
> > -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> > +static noinline void
>
> What is the reason to add noinline?
My understanding is that since this function is called from
__always_inline functions, noinline will keep gcc from inlining
slot_rmap_walk_next into each of them and so generate smaller code.
On 3/26/22 01:31, Vipin Sharma wrote:
>>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
>>> +static noinline void
>>
>> What is the reason to add noinline?
>
> My understanding is that since this method is called from
> __always_inline methods, noinline will avoid gcc inlining the
> slot_rmap_walk_next in those functions and generate smaller code.
>
Iterators are written in such a way that it's way more beneficial to
inline them. After inlining, compilers replace the aggregates (in this
case, struct slot_rmap_walk_iterator) with one variable per field and
that in turn enables a lot of optimizations, so the iterators should
actually be always_inline if anything.
For the same reason I'd guess the effect on the generated code should be
small (next time please include the output of "size mmu.o"), but should
still be there. I'll do a quick check of the generated code and apply
the patch.
Paolo
Thank you David and Paolo, for checking this patch carefully. With
hindsight, I should have explicitly mentioned adding "noinline" in my
patch email.
On Sun, Mar 27, 2022 at 3:41 AM Paolo Bonzini <[email protected]> wrote:
>
> On 3/26/22 01:31, Vipin Sharma wrote:
> >>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> >>> +static noinline void
> >>
> >> What is the reason to add noinline?
> >
> > My understanding is that since this method is called from
> > __always_inline methods, noinline will avoid gcc inlining the
> > slot_rmap_walk_next in those functions and generate smaller code.
> >
>
> Iterators are written in such a way that it's way more beneficial to
> inline them. After inlining, compilers replace the aggregates (in this
> case, struct slot_rmap_walk_iterator) with one variable per field and
> that in turn enables a lot of optimizations, so the iterators should
> actually be always_inline if anything.
>
> For the same reason I'd guess the effect on the generated code should be
> small (next time please include the output of "size mmu.o"), but should
> still be there. I'll do a quick check of the generated code and apply
> the patch.
Yeah, I should have added the "size mmu.o" output. Here is what I have found:
size arch/x86/kvm/mmu/mmu.o

Without noinline:
   text    data     bss     dec     hex filename
  89938   15793      72  105803   19d4b arch/x86/kvm/mmu/mmu.o

With noinline:
   text    data     bss     dec     hex filename
  90058   15793      72  105923   19dc3 arch/x86/kvm/mmu/mmu.o

With noinline, increase in size = 120 bytes

Out of curiosity, I also checked the file size with the "ls -l" command.
File size:
Without noinline: 1394272 bytes
With noinline: 1381216 bytes

With noinline, decrease in size = 13056 bytes

I also disassembled mmu.o via "objdump -d" and found the following.
Total lines in the generated assembly:
Without noinline: 23438
With noinline: 23393

With noinline, decrease in assembly code = 45 lines

I can see in the assembly that there are multiple "call" instructions
in the "with noinline" object file, which is expected, and that it has
fewer lines of code than "without noinline". I am not sure why the
size command shows an increase in the text segment for "with
noinline", or what to infer from all of this data.

Thanks
Vipin
On Mon, Mar 28, 2022, Vipin Sharma wrote:
> Thank you David and Paolo, for checking this patch carefully. With
> hindsight, I should have explicitly mentioned adding "noinline" in my
> patch email.
>
> On Sun, Mar 27, 2022 at 3:41 AM Paolo Bonzini <[email protected]> wrote:
> >
> > On 3/26/22 01:31, Vipin Sharma wrote:
> > >>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> > >>> +static noinline void
> > >>
> > >> What is the reason to add noinline?
> > >
> > > My understanding is that since this method is called from
> > > __always_inline methods, noinline will avoid gcc inlining the
> > > slot_rmap_walk_next in those functions and generate smaller code.
> > >
> >
> > Iterators are written in such a way that it's way more beneficial to
> > inline them. After inlining, compilers replace the aggregates (in this
> > case, struct slot_rmap_walk_iterator) with one variable per field and
> > that in turn enables a lot of optimizations, so the iterators should
> > actually be always_inline if anything.
> >
> > For the same reason I'd guess the effect on the generated code should be
> > small (next time please include the output of "size mmu.o"), but should
> > still be there. I'll do a quick check of the generated code and apply
> > the patch.
>
> Yeah, I should have added the "size mmu.o" output. Here is what I have found:
>
> size arch/x86/kvm/mmu/mmu.o
>
> Without noinline:
> text data bss dec hex filename
> 89938 15793 72 105803 19d4b arch/x86/kvm/mmu/mmu.o
>
> With noinline:
> text data bss dec hex filename
> 90058 15793 72 105923 19dc3 arch/x86/kvm/mmu/mmu.o
>
> With noinline, increase in size = 120
>
> Curiously, I also checked file size with "ls -l" command
> File size:
> Without noinline: 1394272 bytes
> With noinline: 1381216 bytes
>
> With noinline, decrease in size = 13056 bytes
>
> I also disassembled mmu.o via "objdump -d" and found following
> Total lines in the generated assembly:
> Without noinline: 23438
> With noinline: 23393
>
> With noinline, decrease in assembly code = 45
>
> I can see in assembly code that there are multiple "call" operations
> in the "with noinline" object file, which is expected and has less
> lines of code compared to "without noinline". I am not sure why the
> size command is showing an increase in text segment for "with
> noinline" and what to infer with all of this data.
The most common takeaway from these types of exercises is that trying to be smarter
than the compiler is usually a fool's errand. Smaller code footprint doesn't
necessarily equate to better runtime performance. And conversely, inlining may
not always be a win, which is why tagging static helpers (not in headers) with
"inline" is generally discouraged.

IMO, unless there's an explicit side effect we want (or want to avoid), we should
never use "noinline". E.g. the VMX <insn>_error() handlers use noinline so that
KVM only WARNs once per failure of instruction type, and fxregs_fixup() uses it
to keep the stack size manageable.
On Sun, Mar 27, 2022 at 3:41 AM Paolo Bonzini <[email protected]> wrote:
>
> On 3/26/22 01:31, Vipin Sharma wrote:
> >>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> >>> +static noinline void
> >>
> >> What is the reason to add noinline?
> >
> > My understanding is that since this method is called from
> > __always_inline methods, noinline will avoid gcc inlining the
> > slot_rmap_walk_next in those functions and generate smaller code.
> >
>
> Iterators are written in such a way that it's way more beneficial to
> inline them. After inlining, compilers replace the aggregates (in this
> case, struct slot_rmap_walk_iterator) with one variable per field and
> that in turn enables a lot of optimizations, so the iterators should
> actually be always_inline if anything.
>
> For the same reason I'd guess the effect on the generated code should be
> small (next time please include the output of "size mmu.o"), but should
> still be there. I'll do a quick check of the generated code and apply
> the patch.
>
> Paolo
>
Let me know if you are still planning to modify the current patch by
removing "noinline" and merging it, or if you would prefer a v2 without
noinline.
Thanks
Vipin
On Fri, Apr 8, 2022 at 12:31 PM Vipin Sharma <[email protected]> wrote:
>
> On Sun, Mar 27, 2022 at 3:41 AM Paolo Bonzini <[email protected]> wrote:
> >
> > On 3/26/22 01:31, Vipin Sharma wrote:
> > >>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> > >>> +static noinline void
> > >>
> > >> What is the reason to add noinline?
> > >
> > > My understanding is that since this method is called from
> > > __always_inline methods, noinline will avoid gcc inlining the
> > > slot_rmap_walk_next in those functions and generate smaller code.
> > >
> >
> > Iterators are written in such a way that it's way more beneficial to
> > inline them. After inlining, compilers replace the aggregates (in this
> > case, struct slot_rmap_walk_iterator) with one variable per field and
> > that in turn enables a lot of optimizations, so the iterators should
> > actually be always_inline if anything.
> >
> > For the same reason I'd guess the effect on the generated code should be
> > small (next time please include the output of "size mmu.o"), but should
> > still be there. I'll do a quick check of the generated code and apply
> > the patch.
> >
> > Paolo
> >
>
> Let me know if you are still planning to modify the current patch by
> removing "noinline" and merge or if you prefer a v2 without noinline.
Hi Paolo,
Any update on this patch?
Thanks
Vipin
On Mon, Apr 18, 2022 at 9:29 AM Vipin Sharma <[email protected]> wrote:
>
> On Fri, Apr 8, 2022 at 12:31 PM Vipin Sharma <[email protected]> wrote:
> >
> > On Sun, Mar 27, 2022 at 3:41 AM Paolo Bonzini <[email protected]> wrote:
> > >
> > > On 3/26/22 01:31, Vipin Sharma wrote:
> > > >>> -static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> > > >>> +static noinline void
> > > >>
> > > >> What is the reason to add noinline?
> > > >
> > > > My understanding is that since this method is called from
> > > > __always_inline methods, noinline will avoid gcc inlining the
> > > > slot_rmap_walk_next in those functions and generate smaller code.
> > > >
> > >
> > > Iterators are written in such a way that it's way more beneficial to
> > > inline them. After inlining, compilers replace the aggregates (in this
> > > case, struct slot_rmap_walk_iterator) with one variable per field and
> > > that in turn enables a lot of optimizations, so the iterators should
> > > actually be always_inline if anything.
> > >
> > > For the same reason I'd guess the effect on the generated code should be
> > > small (next time please include the output of "size mmu.o"), but should
> > > still be there. I'll do a quick check of the generated code and apply
> > > the patch.
> > >
> > > Paolo
> > >
> >
> > Let me know if you are still planning to modify the current patch by
> > removing "noinline" and merge or if you prefer a v2 without noinline.
>
> Hi Paolo,
>
> Any update on this patch?
>
Hi Paolo,
Still waiting for your response on this patch :)
Please let me know if you prefer v2 (without noinline) or you will
merge this patch without noinline from your side. If there is any
concern or feedback which I can address please let me know.
Thanks
Vipin Sharma