Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
checks, which are obviously intended to be logical statements. Switching
to a Logical OR is functionally a nop, but allows the compiler to better
optimize the checks.
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7269130ea5e2..72e845709027 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
{
int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
- return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
+ return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
}
--
2.24.1
Hi,
>
>Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit checks, which are obviously intended to be logical statements. Switching to a Logical OR is functionally a nop, but allows the compiler to better optimize the checks.
>
>Signed-off-by: Sean Christopherson <[email protected]>
>---
> arch/x86/kvm/mmu/mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>index 7269130ea5e2..72e845709027 100644
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> {
> int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
>
>- return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
>+ return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
> }
>
>--
>2.24.1
On the call chain walk_shadow_page_get_mmio_spte --> is_shadow_zero_bits_set --> __is_rsvd_bits_set, the
return value is used as:
reserved |= is_shadow_zero_bits_set(vcpu->arch.mmu, spte,
iterator.level);
But this seems OK, because the variable 'reserved' is of type bool.
Thanks.
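
For reference, a minimal user-space sketch of the point about the bool
type (this is not kernel code). The struct rsvd_check_sketch type and all
function names are made up for illustration, and the per-level mask array
is collapsed to a single mask. Under those assumptions it shows that the
bitwise and logical forms return the same bool, and that accumulating the
result with |= in a loop behaves as expected:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rsvd_bits_validate. */
struct rsvd_check_sketch {
	uint64_t rsvd_bits_mask;	/* one mask instead of [bit7][level-1] */
	uint64_t bad_mt_xwr;
};

/* Pre-patch form: bitwise OR, result converted to bool on return. */
static inline bool rsvd_bitwise(const struct rsvd_check_sketch *c, uint64_t pte)
{
	int low6 = pte & 0x3f;

	return (pte & c->rsvd_bits_mask) |
	       ((c->bad_mt_xwr & (1ull << low6)) != 0);
}

/* Patched form: logical OR. */
static inline bool rsvd_logical(const struct rsvd_check_sketch *c, uint64_t pte)
{
	int low6 = pte & 0x3f;

	return (pte & c->rsvd_bits_mask) ||
	       ((c->bad_mt_xwr & (1ull << low6)) != 0);
}

/* Mimics the "reserved |= ..." accumulation done by the caller. */
static inline bool any_reserved(const struct rsvd_check_sketch *c,
				const uint64_t *ptes, int n)
{
	bool reserved = false;
	int i;

	for (i = 0; i < n; i++)
		reserved |= rsvd_logical(c, ptes[i]);
	return reserved;
}

/* Both forms agree, including when only a high bit (bit 51) is set. */
static inline void rsvd_sketch_self_check(void)
{
	struct rsvd_check_sketch c = {
		.rsvd_bits_mask = 1ull << 51,
		.bad_mt_xwr = 1ull << 2,
	};

	assert(rsvd_bitwise(&c, 1ull << 51) == rsvd_logical(&c, 1ull << 51));
	assert(rsvd_bitwise(&c, 1) == rsvd_logical(&c, 1));
}
```

The conversion to bool on return is what makes the two forms agree: any
nonzero u64 becomes true, no matter which bit is set.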
Sean Christopherson <[email protected]> writes:
> Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> checks, which are obviously intended to be logical statements. Switching
> to a Logical OR is functionally a nop, but allows the compiler to better
> optimize the checks.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/mmu/mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7269130ea5e2..72e845709027 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> {
> int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
>
> - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> + return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
Redundant parentheses detected!
> }
Reviewed-by: Vitaly Kuznetsov <[email protected]>
--
Vitaly
On Wed, Jan 8, 2020 at 2:13 AM Vitaly Kuznetsov <[email protected]> wrote:
>
> Sean Christopherson <[email protected]> writes:
>
> > Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> > checks, which are obviously intended to be logical statements. Switching
> > to a Logical OR is functionally a nop, but allows the compiler to better
> > optimize the checks.
> >
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7269130ea5e2..72e845709027 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> > {
> > int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
> >
> > - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> > + return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> > ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
>
> Redundant parentheses detected!
I think you mean superfluous rather than redundant.
> > }
>
> Reviewed-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
From: Sean Christopherson
> Sent: 08 January 2020 00:19
>
> Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> checks, which are obviously intended to be logical statements. Switching
> to a Logical OR is functionally a nop, but allows the compiler to better
> optimize the checks.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/mmu/mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7269130ea5e2..72e845709027 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> {
> int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
>
> - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> + return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
Are you sure this isn't deliberate?
The best code almost certainly comes from also removing the '!= 0'.
You also don't want to convert the expression result to zero.
So:
return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) | (rsvd_check->bad_mt_xwr & (1ull << low6));
The code then doesn't have any branches to get mispredicted.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
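
A small user-space sketch of why the branchless form without '!= 0' is
only safe when the result is converted to bool. The function names are
hypothetical and the mask lookup is reduced to a plain parameter. With a
bool return type any nonzero bit yields true, whereas a 32-bit return type
would silently drop reserved bits above bit 31:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Branchless combination, result converted to bool on return. */
static inline bool rsvd_as_bool(uint64_t pte, uint64_t mask, uint64_t bad_mt_xwr)
{
	int low6 = pte & 0x3f;

	/* u64 -> bool: any nonzero value becomes true. */
	return (pte & mask) | (bad_mt_xwr & (1ull << low6));
}

/* Same expression, but truncated to 32 bits (shown only for contrast). */
static inline uint32_t rsvd_as_u32(uint64_t pte, uint64_t mask, uint64_t bad_mt_xwr)
{
	int low6 = pte & 0x3f;

	/* u64 -> u32: only the low 32 bits survive. */
	return (uint32_t)((pte & mask) | (bad_mt_xwr & (1ull << low6)));
}

static inline void rsvd_truncation_self_check(void)
{
	/* Reserved bit 51 is set: the bool form sees it, the u32 form does not. */
	assert(rsvd_as_bool(1ull << 51, 1ull << 51, 0));
	assert(rsvd_as_u32(1ull << 51, 1ull << 51, 0) == 0);
}
```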
On Thu, Jan 09, 2020 at 02:13:48PM +0000, David Laight wrote:
> From: Sean Christopherson
> > Sent: 08 January 2020 00:19
> >
> > Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> > checks, which are obviously intended to be logical statements. Switching
> > to a Logical OR is functionally a nop, but allows the compiler to better
> > optimize the checks.
> >
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7269130ea5e2..72e845709027 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> > {
> > int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
> >
> > - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> > + return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> > ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
>
> Are you sure this isn't deliberate?
> The best code almost certainly comes from also removing the '!= 0'.
> You also don't want to convert the expression result to zero.
The function is static inline bool, so it's almost certainly a mistake
originally. The != 0 is superfluous, but this will get inlined anyway.
>
> So:
> return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) | (rsvd_check->bad_mt_xwr & (1ull << low6));
> The code then doesn't have any branches to get mispredicted.
>
> David
>
>
On Thu, Jan 09, 2020 at 10:26:30AM -0500, Arvind Sankar wrote:
> On Thu, Jan 09, 2020 at 02:13:48PM +0000, David Laight wrote:
> > From: Sean Christopherson
> > > Sent: 08 January 2020 00:19
> > >
> > > Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> > > checks, which are obviously intended to be logical statements. Switching
> > > to a Logical OR is functionally a nop, but allows the compiler to better
> > > optimize the checks.
> > >
> > > Signed-off-by: Sean Christopherson <[email protected]>
> > > ---
> > > arch/x86/kvm/mmu/mmu.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 7269130ea5e2..72e845709027 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> > > {
> > > int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
> > >
> > > - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> > > + return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> > > ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
> >
> > Are you sure this isn't deliberate?
> > The best code almost certainly comes from also removing the '!= 0'.
The '!= 0' is truly superfluous, removing it doesn't affect code
generation.
> > You also don't want to convert the expression result to zero.
>
> The function is static inline bool, so it's almost certainly a mistake
> originally. The != 0 is superfluous, but this will get inlined anyway.
Ya, the bitwise-OR was added in commit 25d92081ae2f ("nEPT: Add nEPT
violation/misconfigration support"), and AFAICT it's unintentional.
That being said, I was a bit hasty in stating that a logical-OR allows for
better optimization; that is only true for some of the callers.
For FNAME(prefetch_invalid_gpte) and FNAME(walk_addr_generic), which
branch on the result of is_rsvd_bits_set(), the logical-OR is marginally
better. FNAME(prefetch_invalid_gpte) is what I initially looked at when
saying "yep, that's better!".
But for walk_shadow_page_get_mmio_spte(), because it aggregates the result
in a loop, the bitwise-OR is better in that it eliminates a Jcc.
And all that being said, there are two vastly superior optimizations that
can be made:
- Reorder the checks in FNAME(prefetch_invalid_gpte) to perform the
!PRESENT and !ACCESSED checks before checking the reserved bits, as
they are both more likely to fail and do not require additional memory
accesses.
- Rewrite __is_rsvd_bits_set() to make it templated. The reserved MT
check is EPT only, i.e. bad_mt_xwr is always 0 for legacy 32/64-bit
paging.
So, I'll scrap this patch and send a mini series to effect the above
optimizations.
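
The two optimizations listed above could look roughly like the following
user-space sketch. It is not the follow-up series: every name ending in
_sketch or starting with SKETCH_ is hypothetical, the number of levels is
an assumption, and the "template" is approximated by a bool parameter that
callers pass as a compile-time constant so the compiler can drop the
bad_mt_xwr check after inlining. The Present (bit 0) and Accessed (bit 5)
positions are those of legacy x86 paging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PRESENT_MASK	(1ull << 0)
#define SKETCH_ACCESSED_MASK	(1ull << 5)
#define SKETCH_MAX_LEVEL	4	/* assumption for this sketch */

struct rsvd_sketch {
	uint64_t rsvd_bits_mask[2][SKETCH_MAX_LEVEL];
	uint64_t bad_mt_xwr;
};

/*
 * "Templated" reserved-bit check: when ept is a constant false and the
 * function is inlined, the bad_mt_xwr load and test can be eliminated.
 */
static inline bool rsvd_bits_set_sketch(const struct rsvd_sketch *c,
					uint64_t pte, int level, bool ept)
{
	int bit7 = (pte >> 7) & 1;

	if (pte & c->rsvd_bits_mask[bit7][level - 1])
		return true;
	return ept && (c->bad_mt_xwr & (1ull << (pte & 0x3f)));
}

/*
 * Reordered check for legacy paging: test the cheap, register-only
 * conditions first and touch the mask table only if they pass.
 */
static inline bool gpte_invalid_sketch(const struct rsvd_sketch *c,
				       uint64_t gpte, int level)
{
	if (!(gpte & SKETCH_PRESENT_MASK))
		return true;
	if (!(gpte & SKETCH_ACCESSED_MASK))
		return true;
	return rsvd_bits_set_sketch(c, gpte, level, false);
}

static inline void gpte_sketch_self_check(void)
{
	struct rsvd_sketch c = { .bad_mt_xwr = 1ull << 2 };

	c.rsvd_bits_mask[0][0] = 1ull << 51;
	/* Not present: rejected before any mask lookup. */
	assert(gpte_invalid_sketch(&c, 0, 1));
	/* Present and accessed, no reserved bits: valid. */
	assert(!gpte_invalid_sketch(&c, 0x21, 1));
}
```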
> >
> > So:
> > return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) | (rsvd_check->bad_mt_xwr & (1ull << low6));
> > The code then doesn't have any branches to get mispredicted.
> >
> > David
> >
> >
On Thu, Jan 09, 2020 at 08:36:24AM -0800, Sean Christopherson wrote:
> On Thu, Jan 09, 2020 at 10:26:30AM -0500, Arvind Sankar wrote:
> > On Thu, Jan 09, 2020 at 02:13:48PM +0000, David Laight wrote:
> > > From: Sean Christopherson
> > > > Sent: 08 January 2020 00:19
> > > >
> > > > Use a Logical OR in __is_rsvd_bits_set() to combine the two reserved bit
> > > > checks, which are obviously intended to be logical statements. Switching
> > > > to a Logical OR is functionally a nop, but allows the compiler to better
> > > > optimize the checks.
> > > >
> > > > Signed-off-by: Sean Christopherson <[email protected]>
> > > > ---
> > > > arch/x86/kvm/mmu/mmu.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index 7269130ea5e2..72e845709027 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -3970,7 +3970,7 @@ __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
> > > > {
> > > > int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
> > > >
> > > > - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> > > > + return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) ||
> > > > ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
> > >
> > > Are you sure this isn't deliberate?
> > > The best code almost certainly comes from also removing the '!= 0'.
>
> The '!= 0' is truly superfluous, removing it doesn't affect code
> generation.
Actually, it's not completely superfluous. With the '!= 0' removed, the
generated code is functionally identical but ordered slightly differently,
for whatever reason.
On 09/01/20 17:36, Sean Christopherson wrote:
>>> You also don't want to convert the expression result to zero.
>> The function is static inline bool, so it's almost certainly a mistake
>> originally. The != 0 is superfluous, but this will get inlined anyway.
> Ya, the bitwise-OR was added in commit 25d92081ae2f ("nEPT: Add nEPT
> violation/misconfigration support"), and AFAICT it's unintentional.
It may not be intentional in this case, but it's certainly the kind of
code that I would have fun writing. :)
Paolo