2023-03-06 12:38:06

by Kautuk Consul

Subject: [PATCH v2 0/2] Improving calls to kvmppc_hv_entry

- remove .global scope of kvmppc_hv_entry
- remove r4 argument to kvmppc_hv_entry as it is not required

Changes since v1:
- replaced .global by SYM_INNER_LABEL for kvmppc_hv_entry

Kautuk Consul (2):
arch/powerpc/kvm: kvmppc_hv_entry: remove .global scope
arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

--
2.36.1



2023-03-06 12:38:10

by Kautuk Consul

Subject: [PATCH v2 1/2] arch/powerpc/kvm: kvmppc_hv_entry: remove .global scope

kvmppc_hv_entry isn't called from anywhere other than
book3s_hv_rmhandlers.S itself. Remove the .global scope of
this function and annotate it with SYM_INNER_LABEL.

Signed-off-by: Kautuk Consul <[email protected]>
---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index acf80915f406..b81ba4ee0521 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -502,8 +502,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
* *
*****************************************************************************/

-.global kvmppc_hv_entry
-kvmppc_hv_entry:
+SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)

/* Required state:
*
--
2.36.1


2023-03-06 12:38:14

by Kautuk Consul

Subject: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

kvmppc_hv_entry is called from only 2 locations within
book3s_hv_rmhandlers.S. Both of those locations set r4
as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
So, shift the r4 load instruction to kvmppc_hv_entry and
thus modify the calling convention of this function.

Signed-off-by: Kautuk Consul <[email protected]>
---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b81ba4ee0521..da9a15db12fe 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
RFI_TO_KERNEL

kvmppc_call_hv_entry:
- ld r4, HSTATE_KVM_VCPU(r13)
+ /* Enter guest. */
bl kvmppc_hv_entry

/* Back from guest - restore host state and return to caller */
@@ -352,9 +352,7 @@ kvm_secondary_got_guest:
mtspr SPRN_LDBAR, r0
isync
63:
- /* Order load of vcpu after load of vcore */
- lwsync
- ld r4, HSTATE_KVM_VCPU(r13)
+ /* Enter guest. */
bl kvmppc_hv_entry

/* Back from the guest, go back to nap */
@@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)

/* Required state:
*
- * R4 = vcpu pointer (or NULL)
* MSR = ~IR|DR
* R13 = PACA
* R1 = host R1
@@ -524,6 +521,8 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
li r6, KVM_GUEST_MODE_HOST_HV
stb r6, HSTATE_IN_GUEST(r13)

+ ld r4, HSTATE_KVM_VCPU(r13)
+
#ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING
/* Store initial timestamp */
cmpdi r4, 0
--
2.36.1


2023-03-13 05:40:17

by Kautuk Consul

Subject: Re: [PATCH v2 0/2] Improving calls to kvmppc_hv_entry

Hi everyone,

Is anyone interested in reviewing this small patch set?
I tested it on P8 and it works fine.

Thanks.

On 2023-03-06 07:37:38, Kautuk Consul wrote:
> - remove .global scope of kvmppc_hv_entry
> - remove r4 argument to kvmppc_hv_entry as it is not required
>
> Changes since v1:
> - replaced .global by SYM_INNER_LABEL for kvmppc_hv_entry
>
> Kautuk Consul (2):
> arch/powerpc/kvm: kvmppc_hv_entry: remove .global scope
> arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument
>
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> --
> 2.36.1
>

2023-03-15 04:53:14

by Michael Ellerman

Subject: Re: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

Kautuk Consul <[email protected]> writes:
> kvmppc_hv_entry is called from only 2 locations within
> book3s_hv_rmhandlers.S. Both of those locations set r4
> as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
> So, shift the r4 load instruction to kvmppc_hv_entry and
> thus modify the calling convention of this function.
>
> Signed-off-by: Kautuk Consul <[email protected]>
> ---
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b81ba4ee0521..da9a15db12fe 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
> RFI_TO_KERNEL
>
> kvmppc_call_hv_entry:
> - ld r4, HSTATE_KVM_VCPU(r13)
> + /* Enter guest. */
> bl kvmppc_hv_entry
>
> /* Back from guest - restore host state and return to caller */
> @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
> mtspr SPRN_LDBAR, r0
> isync
> 63:
> - /* Order load of vcpu after load of vcore */
> - lwsync

Where did this barrier go?

I don't see that it's covered by any existing barriers in
kvmppc_hv_entry, and you don't add it back anywhere.

> - ld r4, HSTATE_KVM_VCPU(r13)
> + /* Enter guest. */
> bl kvmppc_hv_entry
>
> /* Back from the guest, go back to nap */
> @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
>
> /* Required state:
> *
> - * R4 = vcpu pointer (or NULL)
> * MSR = ~IR|DR
> * R13 = PACA
> * R1 = host R1
> @@ -524,6 +521,8 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
> li r6, KVM_GUEST_MODE_HOST_HV
> stb r6, HSTATE_IN_GUEST(r13)
>
> + ld r4, HSTATE_KVM_VCPU(r13)
> +
> #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING
> /* Store initial timestamp */
> cmpdi r4, 0

cheers

2023-03-15 05:23:19

by Kautuk Consul

Subject: Re: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

On 2023-03-15 15:48:53, Michael Ellerman wrote:
> Kautuk Consul <[email protected]> writes:
> > kvmppc_hv_entry is called from only 2 locations within
> > book3s_hv_rmhandlers.S. Both of those locations set r4
> > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
> > So, shift the r4 load instruction to kvmppc_hv_entry and
> > thus modify the calling convention of this function.
> >
> > Signed-off-by: Kautuk Consul <[email protected]>
> > ---
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
> > 1 file changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > index b81ba4ee0521..da9a15db12fe 100644
> > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
> > RFI_TO_KERNEL
> >
> > kvmppc_call_hv_entry:
> > - ld r4, HSTATE_KVM_VCPU(r13)
> > + /* Enter guest. */
> > bl kvmppc_hv_entry
> >
> > /* Back from guest - restore host state and return to caller */
> > @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
> > mtspr SPRN_LDBAR, r0
> > isync
> > 63:
> > - /* Order load of vcpu after load of vcore */
> > - lwsync
>
> Where did this barrier go?
>
> I don't see that it's covered by any existing barriers in
> kvmppc_hv_entry, and you don't add it back anywhere.

My concept about this is that since now the call to kvmppc_hv_entry
is first taken before the load to r4 shouldn't the pending load in the
pipeline of the HSTATE_KVM_VCORE as per the earlier comment be ordered anyway
before-hand? Or do you mean to say that pending loads may not be
cleared/flushed across the "bl <funcname>" boundary ?
>
> > - ld r4, HSTATE_KVM_VCPU(r13)
> > + /* Enter guest. */
> > bl kvmppc_hv_entry
> >
> > /* Back from the guest, go back to nap */
> > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
> >
> > /* Required state:
> > *
> > - * R4 = vcpu pointer (or NULL)
> > * MSR = ~IR|DR
> > * R13 = PACA
> > * R1 = host R1
> > @@ -524,6 +521,8 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
> > li r6, KVM_GUEST_MODE_HOST_HV
> > stb r6, HSTATE_IN_GUEST(r13)
> >
> > + ld r4, HSTATE_KVM_VCPU(r13)
> > +
> > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING
> > /* Store initial timestamp */
> > cmpdi r4, 0
>
> cheers

2023-03-15 05:37:39

by Kautuk Consul

Subject: Re: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

On 2023-03-15 10:48:01, Kautuk Consul wrote:
> On 2023-03-15 15:48:53, Michael Ellerman wrote:
> > Kautuk Consul <[email protected]> writes:
> > > kvmppc_hv_entry is called from only 2 locations within
> > > book3s_hv_rmhandlers.S. Both of those locations set r4
> > > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
> > > So, shift the r4 load instruction to kvmppc_hv_entry and
> > > thus modify the calling convention of this function.
> > >
> > > Signed-off-by: Kautuk Consul <[email protected]>
> > > ---
> > > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
> > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > index b81ba4ee0521..da9a15db12fe 100644
> > > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
> > > RFI_TO_KERNEL
> > >
> > > kvmppc_call_hv_entry:
> > > - ld r4, HSTATE_KVM_VCPU(r13)
> > > + /* Enter guest. */
> > > bl kvmppc_hv_entry
> > >
> > > /* Back from guest - restore host state and return to caller */
> > > @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
> > > mtspr SPRN_LDBAR, r0
> > > isync
> > > 63:
> > > - /* Order load of vcpu after load of vcore */
> > > - lwsync
> >
> > Where did this barrier go?
> >
> > I don't see that it's covered by any existing barriers in
> > kvmppc_hv_entry, and you don't add it back anywhere.
>
> My concept about this is that since now the call to kvmppc_hv_entry
> is first taken before the load to r4 shouldn't the pending load in the
> pipeline of the HSTATE_KVM_VCORE as per the earlier comment be ordered anyway
> before-hand ? Or do you mean to say that pending loads may not be
> cleared/flushed across the "bl <funcname>" boundary ?
Sorry, I mean: " shouldn't the pending load in the pipeline (of the
HSTATE_KVM_VCORE) as per the earlier comment be ordered anyway
before-hand?"

Forgot the parentheses.
> >
> > > - ld r4, HSTATE_KVM_VCPU(r13)
> > > + /* Enter guest. */
> > > bl kvmppc_hv_entry
> > >
> > > /* Back from the guest, go back to nap */
> > > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
> > >
> > > /* Required state:
> > > *
> > > - * R4 = vcpu pointer (or NULL)
> > > * MSR = ~IR|DR
> > > * R13 = PACA
> > > * R1 = host R1
> > > @@ -524,6 +521,8 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
> > > li r6, KVM_GUEST_MODE_HOST_HV
> > > stb r6, HSTATE_IN_GUEST(r13)
> > >
> > > + ld r4, HSTATE_KVM_VCPU(r13)
> > > +
> > > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING
> > > /* Store initial timestamp */
> > > cmpdi r4, 0
> >
> > cheers

2023-03-16 03:39:23

by Michael Ellerman

Subject: Re: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

Kautuk Consul <[email protected]> writes:
> On 2023-03-15 15:48:53, Michael Ellerman wrote:
>> Kautuk Consul <[email protected]> writes:
>> > kvmppc_hv_entry is called from only 2 locations within
>> > book3s_hv_rmhandlers.S. Both of those locations set r4
>> > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
>> > So, shift the r4 load instruction to kvmppc_hv_entry and
>> > thus modify the calling convention of this function.
>> >
>> > Signed-off-by: Kautuk Consul <[email protected]>
>> > ---
>> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
>> > 1 file changed, 4 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> > index b81ba4ee0521..da9a15db12fe 100644
>> > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
>> > RFI_TO_KERNEL
>> >
>> > kvmppc_call_hv_entry:
>> > - ld r4, HSTATE_KVM_VCPU(r13)
>> > + /* Enter guest. */
>> > bl kvmppc_hv_entry
>> >
>> > /* Back from guest - restore host state and return to caller */
>> > @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
>> > mtspr SPRN_LDBAR, r0
>> > isync
>> > 63:
>> > - /* Order load of vcpu after load of vcore */
>> > - lwsync
>>
>> Where did this barrier go?
>>
>> I don't see that it's covered by any existing barriers in
>> kvmppc_hv_entry, and you don't add it back anywhere.
>
> My concept about this is that since now the call to kvmppc_hv_entry
> is first taken before the load to r4 shouldn't the pending load in the
> pipeline of the HSTATE_KVM_VCORE as per the earlier comment be ordered anyway
> before-hand ?

No.

> Or do you mean to say that pending loads may not be
> cleared/flushed across the "bl <funcname>" boundary ?

Right.

The "bl" imposes no ordering on loads before or after it.

In general nothing orders two independent loads, other than a barrier.

cheers

2023-03-16 03:40:51

by Michael Ellerman

Subject: Re: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

Michael Ellerman <[email protected]> writes:
> Kautuk Consul <[email protected]> writes:
>> On 2023-03-15 15:48:53, Michael Ellerman wrote:
>>> Kautuk Consul <[email protected]> writes:
>>> > kvmppc_hv_entry is called from only 2 locations within
>>> > book3s_hv_rmhandlers.S. Both of those locations set r4
>>> > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
>>> > So, shift the r4 load instruction to kvmppc_hv_entry and
>>> > thus modify the calling convention of this function.
>>> >
>>> > Signed-off-by: Kautuk Consul <[email protected]>
>>> > ---
>>> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
>>> > 1 file changed, 4 insertions(+), 5 deletions(-)
>>> >
>>> > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>>> > index b81ba4ee0521..da9a15db12fe 100644
>>> > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>>> > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>>> > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
>>> > RFI_TO_KERNEL
>>> >
>>> > kvmppc_call_hv_entry:
>>> > - ld r4, HSTATE_KVM_VCPU(r13)
>>> > + /* Enter guest. */
>>> > bl kvmppc_hv_entry
>>> >
>>> > /* Back from guest - restore host state and return to caller */
>>> > @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
>>> > mtspr SPRN_LDBAR, r0
>>> > isync
>>> > 63:
>>> > - /* Order load of vcpu after load of vcore */
>>> > - lwsync
>>>
>>> Where did this barrier go?
>>>
>>> I don't see that it's covered by any existing barriers in
>>> kvmppc_hv_entry, and you don't add it back anywhere.
>>
>> My concept about this is that since now the call to kvmppc_hv_entry
>> is first taken before the load to r4 shouldn't the pending load in the
>> pipeline of the HSTATE_KVM_VCORE as per the earlier comment be ordered anyway
>> before-hand ?
>
> No.
>
>> Or do you mean to say that pending loads may not be
>> cleared/flushed across the "bl <funcname>" boundary ?
>
> Right.
>
> The "bl" imposes no ordering on loads before or after it.
>
> In general nothing orders two independent loads, other than a barrier.

There's some docs on barriers here:

https://www.kernel.org/doc/Documentation/memory-barriers.txt

Though admittedly it is pretty dense.

cheers

2023-03-16 03:51:11

by Kautuk Consul

Subject: Re: [PATCH v2 2/2] arch/powerpc/kvm: kvmppc_hv_entry: remove r4 argument

Hi,

On 2023-03-16 14:39:08, Michael Ellerman wrote:
> Kautuk Consul <[email protected]> writes:
> > On 2023-03-15 15:48:53, Michael Ellerman wrote:
> >> Kautuk Consul <[email protected]> writes:
> >> > kvmppc_hv_entry is called from only 2 locations within
> >> > book3s_hv_rmhandlers.S. Both of those locations set r4
> >> > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
> >> > So, shift the r4 load instruction to kvmppc_hv_entry and
> >> > thus modify the calling convention of this function.
> >> >
> >> > Signed-off-by: Kautuk Consul <[email protected]>
> >> > ---
> >> > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
> >> > 1 file changed, 4 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> >> > index b81ba4ee0521..da9a15db12fe 100644
> >> > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> >> > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> >> > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
> >> > RFI_TO_KERNEL
> >> >
> >> > kvmppc_call_hv_entry:
> >> > - ld r4, HSTATE_KVM_VCPU(r13)
> >> > + /* Enter guest. */
> >> > bl kvmppc_hv_entry
> >> >
> >> > /* Back from guest - restore host state and return to caller */
> >> > @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
> >> > mtspr SPRN_LDBAR, r0
> >> > isync
> >> > 63:
> >> > - /* Order load of vcpu after load of vcore */
> >> > - lwsync
> >>
> >> Where did this barrier go?
> >>
> >> I don't see that it's covered by any existing barriers in
> >> kvmppc_hv_entry, and you don't add it back anywhere.
> >
> > My concept about this is that since now the call to kvmppc_hv_entry
> > is first taken before the load to r4 shouldn't the pending load in the
> > pipeline of the HSTATE_KVM_VCORE as per the earlier comment be ordered anyway
> > before-hand ?
>
> No.
>
> > Or do you mean to say that pending loads may not be
> > cleared/flushed across the "bl <funcname>" boundary ?
>
> Right.
>
> The "bl" imposes no ordering on loads before or after it.
>
> In general nothing orders two independent loads, other than a barrier.
>
> cheers

Okay, I will post a patch v3 with lwsync before the load to r4 in
kvmppc_hv_entry.

Thanks.
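
For reference, a rough sketch of how the change proposed for v3 might look inside kvmppc_hv_entry. This is an assumption based only on the description above (lwsync placed directly before the r4 load), not the actual v3 patch. The surrounding instructions are copied from the v2 hunk, and the comment text is reused from the barrier that v2 removed.

```asm
	li	r6, KVM_GUEST_MODE_HOST_HV
	stb	r6, HSTATE_IN_GUEST(r13)

	/*
	 * Assumed v3 placement: order load of vcpu after load of vcore.
	 * The "bl" used to reach this point imposes no ordering on loads,
	 * so the barrier has to sit next to the load it protects.
	 */
	lwsync
	ld	r4, HSTATE_KVM_VCPU(r13)
```

With this placement both callers execute the lwsync. Judging by the hunks quoted in this thread, the kvmppc_call_hv_entry path did not have one before, so it would pay for a barrier it previously skipped.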