Disabling preemption in xen_irq_enable() is not needed. There is no
risk of missing events due to preemption, as preemption can happen
only if an event is being received, which is just the opposite
of missing an event.
Signed-off-by: Juergen Gross <[email protected]>
---
arch/x86/xen/irq.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index dfa091d79c2e..ba9b14a97109 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
{
struct vcpu_info *vcpu;
- /*
- * We may be preempted as soon as vcpu->evtchn_upcall_mask is
- * cleared, so disable preemption to ensure we check for
- * events on the VCPU we are still running on.
- */
- preempt_disable();
-
vcpu = this_cpu_read(xen_vcpu);
vcpu->evtchn_upcall_mask = 0;
- /* Doesn't matter if we get preempted here, because any
- pending event will get dealt with anyway. */
+ /*
+ * Now preemption could happen, but this is only possible if an event
+ * was handled, so missing an event due to preemption is not
+ * possible at all.
+ * The worst possible case is to be preempted and then check events
+ * pending on the old vcpu, but this is not problematic.
+ */
barrier(); /* unmask then check (avoid races) */
if (unlikely(vcpu->evtchn_upcall_pending))
xen_force_evtchn_callback();
-
- preempt_enable();
}
PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
--
2.26.2
On 21.09.2021 09:02, Juergen Gross wrote:
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
> {
> struct vcpu_info *vcpu;
>
> - /*
> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
> - * cleared, so disable preemption to ensure we check for
> - * events on the VCPU we are still running on.
> - */
> - preempt_disable();
> -
> vcpu = this_cpu_read(xen_vcpu);
> vcpu->evtchn_upcall_mask = 0;
>
> - /* Doesn't matter if we get preempted here, because any
> - pending event will get dealt with anyway. */
> + /*
> + * Now preemption could happen, but this is only possible if an event
> + * was handled, so missing an event due to preemption is not
> + * possible at all.
> + * The worst possible case is to be preempted and then check events
> + * pending on the old vcpu, but this is not problematic.
> + */
I agree this isn't problematic from a functional perspective, but ...
> barrier(); /* unmask then check (avoid races) */
> if (unlikely(vcpu->evtchn_upcall_pending))
> xen_force_evtchn_callback();
... is a stray call here cheaper than ...
> -
> - preempt_enable();
... the preempt_{dis,en}able() pair?
Jan
On 21.09.21 09:53, Jan Beulich wrote:
> On 21.09.2021 09:02, Juergen Gross wrote:
>> --- a/arch/x86/xen/irq.c
>> +++ b/arch/x86/xen/irq.c
>> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
>> {
>> struct vcpu_info *vcpu;
>>
>> - /*
>> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
>> - * cleared, so disable preemption to ensure we check for
>> - * events on the VCPU we are still running on.
>> - */
>> - preempt_disable();
>> -
>> vcpu = this_cpu_read(xen_vcpu);
>> vcpu->evtchn_upcall_mask = 0;
>>
>> - /* Doesn't matter if we get preempted here, because any
>> - pending event will get dealt with anyway. */
>> + /*
>> + * Now preemption could happen, but this is only possible if an event
>> + * was handled, so missing an event due to preemption is not
>> + * possible at all.
>> + * The worst possible case is to be preempted and then check events
>> + * pending on the old vcpu, but this is not problematic.
>> + */
>
> I agree this isn't problematic from a functional perspective, but ...
>
>> barrier(); /* unmask then check (avoid races) */
>> if (unlikely(vcpu->evtchn_upcall_pending))
>> xen_force_evtchn_callback();
>
> ... is a stray call here cheaper than ...
>
>> -
>> - preempt_enable();
>
> ... the preempt_{dis,en}able() pair?
The question is whether a stray call in case of preemption (very unlikely)
is cheaper than the preempt_{dis|en}able() pair on each IRQ enabling.
I'm quite sure removing the preempt_*() calls will be a net benefit.
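For a rough idea of the cost: with CONFIG_PREEMPT the pair boils down
to something like this (a simplified sketch, not the literal
include/linux/preempt.h macros):

	preempt_disable():
		preempt_count_inc();		/* percpu counter increment */
		barrier();

	preempt_enable():
		barrier();
		if (unlikely(preempt_count_dec_and_test()))
			__preempt_schedule();	/* may call into the scheduler */

So every single IRQ enable pays for two percpu counter updates plus a
conditional, while the stray xen_force_evtchn_callback() is only paid
in the rare case of actually being preempted right there.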
Juergen
On 21.09.2021 09:58, Juergen Gross wrote:
> On 21.09.21 09:53, Jan Beulich wrote:
>> On 21.09.2021 09:02, Juergen Gross wrote:
>>> --- a/arch/x86/xen/irq.c
>>> +++ b/arch/x86/xen/irq.c
>>> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
>>> {
>>> struct vcpu_info *vcpu;
>>>
>>> - /*
>>> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
>>> - * cleared, so disable preemption to ensure we check for
>>> - * events on the VCPU we are still running on.
>>> - */
>>> - preempt_disable();
>>> -
>>> vcpu = this_cpu_read(xen_vcpu);
>>> vcpu->evtchn_upcall_mask = 0;
>>>
>>> - /* Doesn't matter if we get preempted here, because any
>>> - pending event will get dealt with anyway. */
>>> + /*
>>> + * Now preemption could happen, but this is only possible if an event
>>> + * was handled, so missing an event due to preemption is not
>>> + * possible at all.
>>> + * The worst possible case is to be preempted and then check events
>>> + * pending on the old vcpu, but this is not problematic.
>>> + */
>>
>> I agree this isn't problematic from a functional perspective, but ...
>>
>>> barrier(); /* unmask then check (avoid races) */
>>> if (unlikely(vcpu->evtchn_upcall_pending))
>>> xen_force_evtchn_callback();
>>
>> ... is a stray call here cheaper than ...
>>
>>> -
>>> - preempt_enable();
>>
>> ... the preempt_{dis,en}able() pair?
>
> The question is whether a stray call in case of preemption (very unlikely)
> is cheaper than the preempt_{dis|en}able() pair on each IRQ enabling.
>
> I'm quite sure removing the preempt_*() calls will be a net benefit.
Well, yes, I agree. It would have been nice if the description pointed
out the fact that preemption kicking in precisely here is very unlikely.
But perhaps that's considered rather obvious ... The issue I'm having
is with the prior comments: They indicated that preemption happening
before the "pending" check would be okay, _despite_ the
preempt_{dis,en}able() pair. One could view this as an indication that
this pair was put there for another reason (e.g. to avoid the stray
calls). But it may of course also be that the comment simply was stale.
Reviewed-by: Jan Beulich <[email protected]>
Jan
On 21.09.21 10:11, Jan Beulich wrote:
> On 21.09.2021 09:58, Juergen Gross wrote:
>> On 21.09.21 09:53, Jan Beulich wrote:
>>> On 21.09.2021 09:02, Juergen Gross wrote:
>>>> --- a/arch/x86/xen/irq.c
>>>> +++ b/arch/x86/xen/irq.c
>>>> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
>>>> {
>>>> struct vcpu_info *vcpu;
>>>>
>>>> - /*
>>>> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
>>>> - * cleared, so disable preemption to ensure we check for
>>>> - * events on the VCPU we are still running on.
>>>> - */
>>>> - preempt_disable();
>>>> -
>>>> vcpu = this_cpu_read(xen_vcpu);
>>>> vcpu->evtchn_upcall_mask = 0;
>>>>
>>>> - /* Doesn't matter if we get preempted here, because any
>>>> - pending event will get dealt with anyway. */
>>>> + /*
>>>> + * Now preemption could happen, but this is only possible if an event
>>>> + * was handled, so missing an event due to preemption is not
>>>> + * possible at all.
>>>> + * The worst possible case is to be preempted and then check events
>>>> + * pending on the old vcpu, but this is not problematic.
>>>> + */
>>>
>>> I agree this isn't problematic from a functional perspective, but ...
>>>
>>>> barrier(); /* unmask then check (avoid races) */
>>>> if (unlikely(vcpu->evtchn_upcall_pending))
>>>> xen_force_evtchn_callback();
>>>
>>> ... is a stray call here cheaper than ...
>>>
>>>> -
>>>> - preempt_enable();
>>>
>>> ... the preempt_{dis,en}able() pair?
>>
>> The question is whether a stray call in case of preemption (very unlikely)
>> is cheaper than the preempt_{dis|en}able() pair on each IRQ enabling.
>>
>> I'm quite sure removing the preempt_*() calls will be a net benefit.
>
> Well, yes, I agree. It would have been nice if the description pointed
> out the fact that preemption kicking in precisely here is very unlikely.
> But perhaps that's considered rather obvious ... The issue I'm having
> is with the prior comments: They indicated that preemption happening
> before the "pending" check would be okay, _despite_ the
> preempt_{dis,en}able() pair. One could view this as an indication that
> this pair was put there for another reason (e.g. to avoid the stray
> calls). But it may of course also be that the comment simply was stale.
The comment is older than the preempt_*() calls.
Those were added 8 years ago claiming they'd prevent lost events, but
at the same time at least one other patch was added which really
prevented lost events, so adding the preempt_*() calls might just have
been a guess at that time.
> Reviewed-by: Jan Beulich <[email protected]>
Thanks,
Juergen
On Tue, Sep 21, 2021 at 09:02:26AM +0200, Juergen Gross wrote:
> Disabling preemption in xen_irq_enable() is not needed. There is no
> risk of missing events due to preemption, as preemption can happen
> only if an event is being received, which is just the opposite
> of missing an event.
>
> Signed-off-by: Juergen Gross <[email protected]>
> ---
> arch/x86/xen/irq.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index dfa091d79c2e..ba9b14a97109 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
> {
> struct vcpu_info *vcpu;
>
> - /*
> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
> - * cleared, so disable preemption to ensure we check for
> - * events on the VCPU we are still running on.
> - */
> - preempt_disable();
> -
> vcpu = this_cpu_read(xen_vcpu);
> vcpu->evtchn_upcall_mask = 0;
>
> - /* Doesn't matter if we get preempted here, because any
> - pending event will get dealt with anyway. */
> + /*
> + * Now preemption could happen, but this is only possible if an event
> + * was handled, so missing an event due to preemption is not
> + * possible at all.
> + * The worst possible case is to be preempted and then check events
> + * pending on the old vcpu, but this is not problematic.
> + */
>
> barrier(); /* unmask then check (avoid races) */
> if (unlikely(vcpu->evtchn_upcall_pending))
> xen_force_evtchn_callback();
> -
> - preempt_enable();
> }
> PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
>
> --
> 2.26.2
>
So the reason I asked about this is:
vmlinux.o: warning: objtool: xen_irq_disable()+0xa: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: xen_irq_enable()+0xb: call to preempt_count_add() leaves .noinstr.text section
as reported by sfr here:
https://lkml.kernel.org/r/[email protected]
(I'm still not entirely sure why I didn't see them in my build, or why
0day didn't either)
Anyway, I can 'fix' xen_irq_disable(), see below, but I'm worried about
that still having a hole vs the preempt model. Consider:
xen_irq_disable()
preempt_disable();
<IRQ>
set_tif_need_resched()
</IRQ no preemption because preempt_count!=0>
this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1; // IRQs are actually disabled
preempt_enable_no_resched(); // can't resched because IRQs are disabled
...
xen_irq_enable()
preempt_disable();
vcpu->evtch_upcall_mask = 0; // IRQs are on
preempt_enable() // catches the resched from above
Now your patch removes that preempt_enable() and we'll have a missing
preemption.
Trouble is, because this is noinstr, we can't do schedule().. catch-22
---
Subject: x86/xen: Fixup noinstr in xen_irq_{en,dis}able()
From: Peter Zijlstra <[email protected]>
Date: Mon Sep 20 13:46:19 CEST 2021
vmlinux.o: warning: objtool: xen_irq_disable()+0xa: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: xen_irq_enable()+0xb: call to preempt_count_add() leaves .noinstr.text section
XXX, trades it for:
vmlinux.o: warning: objtool: xen_irq_enable()+0x5c: call to __SCT__preempt_schedule_notrace() leaves .noinstr.text section
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
arch/x86/xen/irq.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -44,12 +44,18 @@ __PV_CALLEE_SAVE_REGS_THUNK(xen_save_fl,
asmlinkage __visible noinstr void xen_irq_disable(void)
{
- /* There's a one instruction preempt window here. We need to
- make sure we're don't switch CPUs between getting the vcpu
- pointer and updating the mask. */
- preempt_disable();
+ /*
+ * There's a one instruction preempt window here. We need to
+ * make sure we don't switch CPUs between getting the vcpu
+ * pointer and updating the mask.
+ */
+ preempt_disable_notrace();
this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1;
- preempt_enable_no_resched();
+ /*
+ * We have IRQs disabled at this point, rescheduling isn't going to
+ * happen, so no point calling into the scheduler for it.
+ */
+ preempt_enable_no_resched_notrace();
}
__PV_CALLEE_SAVE_REGS_THUNK(xen_irq_disable, ".noinstr.text");
@@ -62,7 +68,7 @@ asmlinkage __visible noinstr void xen_ir
* cleared, so disable preemption to ensure we check for
* events on the VCPU we are still running on.
*/
- preempt_disable();
+ preempt_disable_notrace();
vcpu = this_cpu_read(xen_vcpu);
vcpu->evtchn_upcall_mask = 0;
@@ -74,7 +80,11 @@ asmlinkage __visible noinstr void xen_ir
if (unlikely(vcpu->evtchn_upcall_pending))
xen_force_evtchn_callback();
- preempt_enable();
+ /*
+ * XXX if we noinstr we shouldn't be calling schedule(), OTOH we also
+ * cannot not schedule() as that would violate PREEMPT.
+ */
+ preempt_enable_notrace();
}
__PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable, ".noinstr.text");
On 21.09.21 10:27, Peter Zijlstra wrote:
> On Tue, Sep 21, 2021 at 09:02:26AM +0200, Juergen Gross wrote:
>> Disabling preemption in xen_irq_enable() is not needed. There is no
>> risk of missing events due to preemption, as preemption can happen
>> only if an event is being received, which is just the opposite
>> of missing an event.
>>
>> Signed-off-by: Juergen Gross <[email protected]>
>> ---
>> arch/x86/xen/irq.c | 18 +++++++-----------
>> 1 file changed, 7 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
>> index dfa091d79c2e..ba9b14a97109 100644
>> --- a/arch/x86/xen/irq.c
>> +++ b/arch/x86/xen/irq.c
>> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
>> {
>> struct vcpu_info *vcpu;
>>
>> - /*
>> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
>> - * cleared, so disable preemption to ensure we check for
>> - * events on the VCPU we are still running on.
>> - */
>> - preempt_disable();
>> -
>> vcpu = this_cpu_read(xen_vcpu);
>> vcpu->evtchn_upcall_mask = 0;
>>
>> - /* Doesn't matter if we get preempted here, because any
>> - pending event will get dealt with anyway. */
>> + /*
>> + * Now preemption could happen, but this is only possible if an event
>> + * was handled, so missing an event due to preemption is not
>> + * possible at all.
>> + * The worst possible case is to be preempted and then check events
>> + * pending on the old vcpu, but this is not problematic.
>> + */
>>
>> barrier(); /* unmask then check (avoid races) */
>> if (unlikely(vcpu->evtchn_upcall_pending))
>> xen_force_evtchn_callback();
>> -
>> - preempt_enable();
>> }
>> PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
>>
>> --
>> 2.26.2
>>
>
> So the reason I asked about this is:
>
> vmlinux.o: warning: objtool: xen_irq_disable()+0xa: call to preempt_count_add() leaves .noinstr.text section
> vmlinux.o: warning: objtool: xen_irq_enable()+0xb: call to preempt_count_add() leaves .noinstr.text section
>
> as reported by sfr here:
>
> https://lkml.kernel.org/r/[email protected]
>
> (I'm still not entirely sure why I didn't see them in my build, or why
> 0day didn't either)
>
> Anyway, I can 'fix' xen_irq_disable(), see below, but I'm worried about
> that still having a hole vs the preempt model. Consider:
>
> xen_irq_disable()
> preempt_disable();
> <IRQ>
> set_tif_need_resched()
> </IRQ no preemption because preempt_count!=0>
> this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1; // IRQs are actually disabled
> preempt_enable_no_resched(); // can't resched because IRQs are disabled
>
> ...
>
> xen_irq_enable()
> preempt_disable();
> vcpu->evtch_upcall_mask = 0; // IRQs are on
> preempt_enable() // catches the resched from above
>
>
> Now your patch removes that preempt_enable() and we'll have a missing
> preemption.
>
> Trouble is, because this is noinstr, we can't do schedule().. catch-22
I think it is even worse. Looking at xen_save_fl() there is clearly
a missing preempt_disable().
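For reference, a from-memory sketch of the current xen_save_fl() (the
exact code may differ): the vcpu pointer is read and then dereferenced
with nothing preventing a migration in between:

	asmlinkage __visible unsigned long xen_save_fl(void)
	{
		struct vcpu_info *vcpu;
		unsigned long flags;

		vcpu = this_cpu_read(xen_vcpu);		/* can be migrated here ... */

		/* flag has opposite sense of mask */
		flags = !vcpu->evtchn_upcall_mask;	/* ... reading the old cpu's mask */

		return (-flags) & X86_EFLAGS_IF;
	}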
But I think this can all be resolved by avoiding the need to disable
preemption in those calls (xen_save_fl(), xen_irq_disable() and
xen_irq_enable()).
Right now disabling preemption is needed because the flag to be tested
or modified is reached via a pointer (xen_vcpu) stored in the percpu
area. Looking at where it can point reveals that the target address is
either an array indexed by smp_processor_id() or a percpu variable of
the local cpu (xen_vcpu_info).
Nowadays (since Xen 3.4, which is older than our minimum supported Xen
version) the array indexed by smp_processor_id() is used only during
early boot (interrupts are always off, only the boot cpu is running)
and just after coming back from suspending the system (e.g. when being
live migrated). Early boot should be no problem, and the suspend case
isn't either, as that happens under control of stop_machine()
(interrupts off on all cpus).
So I think I can switch the whole mess to work only on the local percpu
xen_vcpu_info instance, which will always access the "correct" area via
%gs, along the lines of the sketch below.
Let me have a try ...
Juergen
On 21.09.21 09:02, Juergen Gross wrote:
> Disabling preemption in xen_irq_enable() is not needed. There is no
> risk of missing events due to preemption, as preemption can happen
> only if an event is being received, which is just the opposite
> of missing an event.
>
> Signed-off-by: Juergen Gross <[email protected]>
Please ignore this patch; it is now superseded by
"[PATCH v2 0/2] x86/xen: simplify irq pvops"
Juergen
> ---
> arch/x86/xen/irq.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index dfa091d79c2e..ba9b14a97109 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
> {
> struct vcpu_info *vcpu;
>
> - /*
> - * We may be preempted as soon as vcpu->evtchn_upcall_mask is
> - * cleared, so disable preemption to ensure we check for
> - * events on the VCPU we are still running on.
> - */
> - preempt_disable();
> -
> vcpu = this_cpu_read(xen_vcpu);
> vcpu->evtchn_upcall_mask = 0;
>
> - /* Doesn't matter if we get preempted here, because any
> - pending event will get dealt with anyway. */
> + /*
> + * Now preemption could happen, but this is only possible if an event
> + * was handled, so missing an event due to preemption is not
> + * possible at all.
> + * The worst possible case is to be preempted and then check events
> + * pending on the old vcpu, but this is not problematic.
> + */
>
> barrier(); /* unmask then check (avoid races) */
> if (unlikely(vcpu->evtchn_upcall_pending))
> xen_force_evtchn_callback();
> -
> - preempt_enable();
> }
> PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
>
>