From: Yin Fengwei
> Sent: 23 October 2019 08:50
> In acpi_idle_do_entry(), an I/O port access is used as a dummy wait
> to guarantee hardware behavior. However, it can trigger an unnecessary
> VM exit if the kernel is running as a guest in a virtualization
> environment.
>
> In a virtualization environment, the operation that enters the deeper
> C-state (inb()) traps to the hypervisor, so no dummy wait is needed
> after the inb() call. Remove the dummy I/O port access in that case
> to avoid the unnecessary VM exit.
>
> Keep the dummy I/O port access to maintain timing in the native
> environment.
>
> Signed-off-by: Yin Fengwei <[email protected]>
> ---
> ChangeLog:
> v2 -> v3:
> - Remove dummy io port access totally for virtualization env.
>
> v1 -> v2:
> - Use ndelay instead of dead loop for dummy delay.
>
> drivers/acpi/processor_idle.c | 36 ++++++++++++++++++++++++++++++++---
> 1 file changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index ed56c6d20b08..0c4a97dd6917 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -58,6 +58,17 @@ struct cpuidle_driver acpi_idle_driver = {
> static
> DEFINE_PER_CPU(struct acpi_processor_cx * [CPUIDLE_STATE_MAX], acpi_cstate);
>
> +static void (*dummy_wait)(u64 address);
> +
> +static void default_dummy_wait(u64 address)
> +{
> + inl(address);
> +}
> +
> +static void default_noop_wait(u64 address)
> +{
> +}
> +
Overengineered...
Just add:
static void wait_for_freeze(void)
{
#ifdef CONFIG_X86
	/* No delay is needed if we are a guest */
	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
		return;
#endif
	/*
	 * Dummy wait op - must do something useless after P_LVL2 read
	 * because chipsets cannot guarantee that STPCLK# signal
	 * gets asserted in time to freeze execution properly.
	 */
	inl(acpi_gbl_FADT.xpm_timer_block.address);
}
and use it to replace the inl().
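
To make the intended effect concrete, here is a rough userspace sketch of
the call sequence, not kernel code. Everything in it is a hypothetical
stand-in: running_as_guest models boot_cpu_has(X86_FEATURE_HYPERVISOR),
fake_port_read() models inb()/inl(), and the two port numbers are made up.
It only illustrates that the C-state entry read always happens while the
dummy read is skipped in a guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for boot_cpu_has(X86_FEATURE_HYPERVISOR). */
static bool running_as_guest;

/* Counts simulated port reads so the behavior can be observed. */
static unsigned int port_reads;

/* Hypothetical port numbers, for illustration only. */
enum {
	FAKE_P_LVL2_PORT = 0x414,	/* models cx->address */
	FAKE_PM_TIMER_PORT = 0x408	/* models the PM timer block address */
};

/* Hypothetical stand-in for inb()/inl(): records the access, returns 0. */
static uint32_t fake_port_read(uint16_t port)
{
	(void)port;
	port_reads++;
	return 0;
}

/* Mirrors the suggested helper: skip the dummy read when running as a guest. */
static void wait_for_freeze(void)
{
	if (running_as_guest)
		return;

	/* Dummy read that gives the chipset time to assert STPCLK#. */
	fake_port_read(FAKE_PM_TIMER_PORT);
}

/* Models the I/O-port path of the idle entry: C-state read, then the wait. */
static void idle_do_entry_io(void)
{
	fake_port_read(FAKE_P_LVL2_PORT);
	wait_for_freeze();
}
```

With this model, a native run performs two port reads per idle entry and a
guest run performs one, which is the saved VM exit the patch is after.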
...
> +#ifdef CONFIG_X86
> + /* For x86, if we are running in guest, we don't need extra
> + * access ioport as dummy wait.
> + */
> + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> + pr_err("We are in virtual env");
> + dummy_wait = default_noop_wait;
> + } else {
> + pr_err("We are not in virtual env");
> + }
> +#endif
WTF are the pr_err() for???
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Wed, Oct 23, 2019 at 10:45 AM David Laight <[email protected]> wrote:
> Overengineered...
> Just add:
>
> static void wait_for_freeze(void)
> {
> #ifdef CONFIG_X86
> /* No delay is needed if we are a guest */
> if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> return;
> #endif
>
> /* Dummy wait op - must do something useless after P_LVL2 read
> because chipsets cannot guarantee that STPCLK# signal
> gets asserted in time to freeze execution properly. */
> inl(acpi_gbl_FADT.xpm_timer_block.address);
> }
>
> and use it to replace the inl().
I was about to make a similar comment.
On 2019/10/23 4:45 PM, David Laight wrote:
> Overengineered...
> Just add:
>
> static void wait_for_freeze(void)
> {
> #ifdef CONFIG_X86
> /* No delay is needed if we are a guest */
> if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> return;
> #endif
> /* Dummy wait op - must do something useless after P_LVL2 read
> because chipsets cannot guarantee that STPCLK# signal
> gets asserted in time to freeze execution properly. */
> inl(acpi_gbl_FADT.xpm_timer_block.address);
> }
>
> and use it to replace the inl().
OK. I was trying to avoid any impact on the native case.
>
> ...
>> +#ifdef CONFIG_X86
>> + /* For x86, if we are running in guest, we don't need extra
>> + * access ioport as dummy wait.
>> + */
>> + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
>> + pr_err("We are in virtual env");
>> + dummy_wait = default_noop_wait;
>> + } else {
>> + pr_err("We are not in virtual env");
>> + }
>> +#endif
>
> WTF are the pr_err() for???
Sorry, I forgot to remove my debug code.
Regards
Yin, Fengwei
On 2019/10/23 5:03 PM, Rafael J. Wysocki wrote:
> I was about to make a similar comment.
OK. Will send v4 soon.
Regards
Yin, Fengwei