2019-02-19 17:32:03

by Julien Grall

[permalink] [raw]
Subject: xen/evtchn and forced threaded irq

Hi all,

I have been looking at using Linux RT in Dom0. Once the guest is started,
the console is ending to have a lot of warning (see trace below).

After some investigation, this is because the irq handler will now be threaded.
I can reproduce the same error with the vanilla Linux when passing the option
'threadirqs' on the command line (the trace below is from 5.0.0-rc7 that has
not RT support).

FWIW, the interrupt for port 6 is used to for the guest to communicate with
xenstore.

From my understanding, this is happening because the interrupt handler is now
run in a thread. So we can have the following happening.

Interrupt context | Interrupt thread
|
receive interrupt port 6 |
clear the evtchn port |
set IRQF_RUNTHREAD |
kick interrupt thread |
| clear IRQF_RUNTHREAD
| call evtchn_interrupt
receive interrupt port 6 |
clear the evtchn port |
set IRQF_RUNTHREAD |
kick interrupt thread |
| disable interrupt port 6
| evtchn->enabled = false
| [....]
|
| *** Handling the second interrupt ***
| clear IRQF_RUNTHREAD
| call evtchn_interrupt
| WARN(...)

I am not entirely sure how to fix this. I have two solutions in mind:

1) Prevent the interrupt handler to be threaded. We would also need to
switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.

2) Remove the warning

None of them are ideals. Do you have an opionion/better suggestion?

[ 127.192087] Interrupt for port 6, but apparently not enabled; per-user 0000000078d39c7f
[ 127.200333] WARNING: CPU: 0 PID: 2553 at drivers/xen/evtchn.c:167 evtchn_interrupt+0xfc/0x120
[ 127.208799] Modules linked in:
[ 127.211939] CPU: 0 PID: 2553 Comm: irq/52-evtchn:x Tainted: G W
5.0.0-rc7-00023-g2a3d41623699 #1257
[ 127.222374] Hardware name: ARM Juno development board (r2) (DT)
[ 127.228381] pstate: 40000005 (nZcv daif -PAN -UAO)
[ 127.233256] pc : evtchn_interrupt+0xfc/0x120
[ 127.237607] lr : evtchn_interrupt+0xfc/0x120
[ 127.241952] sp : ffff000012d2bd60
[ 127.245347] x29: ffff000012d2bd60 x28: ffff00001015d608
[ 127.250741] x27: ffff8008b39de400 x26: ffff00001015d330
[ 127.256137] x25: ffff00001015d2d4 x24: 0000000000000001
[ 127.261532] x23: ffff00001015d570 x22: 0000000000000034
[ 127.266926] x21: 0000000000000000 x20: ffff8008b7f02400
[ 127.272322] x19: ffff8008b3aba000 x18: 0000000000000037
[ 127.277717] x17: 0000000000000000 x16: 0000000000000000
[ 127.283112] x15: 00000000fffffff0 x14: 0000000000000000
[ 127.288507] x13: 3030303030207265 x12: 000000000000000c
[ 127.293902] x11: ffff000010ea0a88 x10: 0000000000000000
[ 127.299297] x9 : 00000000fffb9fff x8 : 0000000000000000
[ 127.304701] x7 : 0000000000000001 x6 : ffff8008bac5c240
[ 127.310087] x5 : ffff8008bac5c240 x4 : 0000000000000000
[ 127.315483] x3 : ffff8008bac64708 x2 : ffff8008b39de400
[ 127.320877] x1 : 893ac9b38837b800 x0 : 0000000000000000
[ 127.326272] Call trace:
[ 127.328802] evtchn_interrupt+0xfc/0x120
[ 127.332804] irq_forced_thread_fn+0x38/0x98
[ 127.337066] irq_thread+0x190/0x238
[ 127.340636] kthread+0x134/0x138
[ 127.343942] ret_from_fork+0x10/0x1c
[ 127.347593] ---[ end trace 1d3fa385877cc18b ]---


Cheers,

--
Julien Grall


2019-02-20 00:04:17

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
> Hi all,
>
> I have been looking at using Linux RT in Dom0. Once the guest is started,
> the console is ending to have a lot of warning (see trace below).
>
> After some investigation, this is because the irq handler will now be threaded.
> I can reproduce the same error with the vanilla Linux when passing the option
> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7 that has
> not RT support).
>
> FWIW, the interrupt for port 6 is used to for the guest to communicate with
> xenstore.
>
> From my understanding, this is happening because the interrupt handler is now
> run in a thread. So we can have the following happening.
>
> Interrupt context | Interrupt thread
> |
> receive interrupt port 6 |
> clear the evtchn port |
> set IRQF_RUNTHREAD |
> kick interrupt thread |
> | clear IRQF_RUNTHREAD
> | call evtchn_interrupt
> receive interrupt port 6 |
> clear the evtchn port |
> set IRQF_RUNTHREAD |
> kick interrupt thread |
> | disable interrupt port 6
> | evtchn->enabled = false
> | [....]
> |
> | *** Handling the second interrupt ***
> | clear IRQF_RUNTHREAD
> | call evtchn_interrupt
> | WARN(...)
>
> I am not entirely sure how to fix this. I have two solutions in mind:
>
> 1) Prevent the interrupt handler to be threaded. We would also need to
> switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.
>
> 2) Remove the warning

I think access to evtchn->enabled is racy so (with or without the warning) we can't use it reliably.

Another alternative could be to queue the irq if !evtchn->enabled and handle it in evtchn_write() (which is where irq is supposed to be re-enabled).


-boris


>
> None of them are ideals. Do you have an opionion/better suggestion?
>
> [ 127.192087] Interrupt for port 6, but apparently not enabled; per-user 0000000078d39c7f
> [ 127.200333] WARNING: CPU: 0 PID: 2553 at drivers/xen/evtchn.c:167 evtchn_interrupt+0xfc/0x120
> [ 127.208799] Modules linked in:
> [ 127.211939] CPU: 0 PID: 2553 Comm: irq/52-evtchn:x Tainted: G W
> 5.0.0-rc7-00023-g2a3d41623699 #1257
> [ 127.222374] Hardware name: ARM Juno development board (r2) (DT)
> [ 127.228381] pstate: 40000005 (nZcv daif -PAN -UAO)
> [ 127.233256] pc : evtchn_interrupt+0xfc/0x120
> [ 127.237607] lr : evtchn_interrupt+0xfc/0x120
> [ 127.241952] sp : ffff000012d2bd60
> [ 127.245347] x29: ffff000012d2bd60 x28: ffff00001015d608
> [ 127.250741] x27: ffff8008b39de400 x26: ffff00001015d330
> [ 127.256137] x25: ffff00001015d2d4 x24: 0000000000000001
> [ 127.261532] x23: ffff00001015d570 x22: 0000000000000034
> [ 127.266926] x21: 0000000000000000 x20: ffff8008b7f02400
> [ 127.272322] x19: ffff8008b3aba000 x18: 0000000000000037
> [ 127.277717] x17: 0000000000000000 x16: 0000000000000000
> [ 127.283112] x15: 00000000fffffff0 x14: 0000000000000000
> [ 127.288507] x13: 3030303030207265 x12: 000000000000000c
> [ 127.293902] x11: ffff000010ea0a88 x10: 0000000000000000
> [ 127.299297] x9 : 00000000fffb9fff x8 : 0000000000000000
> [ 127.304701] x7 : 0000000000000001 x6 : ffff8008bac5c240
> [ 127.310087] x5 : ffff8008bac5c240 x4 : 0000000000000000
> [ 127.315483] x3 : ffff8008bac64708 x2 : ffff8008b39de400
> [ 127.320877] x1 : 893ac9b38837b800 x0 : 0000000000000000
> [ 127.326272] Call trace:
> [ 127.328802] evtchn_interrupt+0xfc/0x120
> [ 127.332804] irq_forced_thread_fn+0x38/0x98
> [ 127.337066] irq_thread+0x190/0x238
> [ 127.340636] kthread+0x134/0x138
> [ 127.343942] ret_from_fork+0x10/0x1c
> [ 127.347593] ---[ end trace 1d3fa385877cc18b ]---
>
>
> Cheers,
>
> --
> Julien Grall

2019-02-20 14:16:24

by Julien Grall

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

Hi Boris,

Thank you for your answer.

On 20/02/2019 00:02, Boris Ostrovsky wrote:
> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>> Hi all,
>>
>> I have been looking at using Linux RT in Dom0. Once the guest is started,
>> the console is ending to have a lot of warning (see trace below).
>>
>> After some investigation, this is because the irq handler will now be threaded.
>> I can reproduce the same error with the vanilla Linux when passing the option
>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7 that has
>> not RT support).
>>
>> FWIW, the interrupt for port 6 is used to for the guest to communicate with
>> xenstore.
>>
>> From my understanding, this is happening because the interrupt handler is now
>> run in a thread. So we can have the following happening.
>>
>> Interrupt context | Interrupt thread
>> |
>> receive interrupt port 6 |
>> clear the evtchn port |
>> set IRQF_RUNTHREAD |
>> kick interrupt thread |
>> | clear IRQF_RUNTHREAD
>> | call evtchn_interrupt
>> receive interrupt port 6 |
>> clear the evtchn port |
>> set IRQF_RUNTHREAD |
>> kick interrupt thread |
>> | disable interrupt port 6
>> | evtchn->enabled = false
>> | [....]
>> |
>> | *** Handling the second interrupt ***
>> | clear IRQF_RUNTHREAD
>> | call evtchn_interrupt
>> | WARN(...)
>>
>> I am not entirely sure how to fix this. I have two solutions in mind:
>>
>> 1) Prevent the interrupt handler to be threaded. We would also need to
>> switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.
>>
>> 2) Remove the warning
>
> I think access to evtchn->enabled is racy so (with or without the warning) we can't use it reliably.

Thinking about it, it would not be the only issue. The ring is sized to contain
only one instance of the same event. So if you receive twice the event, you may
overflow the ring.

>
> Another alternative could be to queue the irq if !evtchn->enabled and handle it in evtchn_write() (which is where irq is supposed to be re-enabled).
What do you mean by queue? Is it queueing in the ring? If so, wouldn't it be
racy as described above?

Cheers,

--
Julien Grall

2019-02-20 17:10:04

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

On 2/20/19 9:15 AM, Julien Grall wrote:
> Hi Boris,
>
> Thank you for your answer.
>
> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>> Hi all,
>>>
>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>> started,
>>> the console is ending to have a lot of warning (see trace below).
>>>
>>> After some investigation, this is because the irq handler will now
>>> be threaded.
>>> I can reproduce the same error with the vanilla Linux when passing
>>> the option
>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>> that has
>>> not RT support).
>>>
>>> FWIW, the interrupt for port 6 is used to for the guest to
>>> communicate with
>>> xenstore.
>>>
>>>  From my understanding, this is happening because the interrupt
>>> handler is now
>>> run in a thread. So we can have the following happening.
>>>
>>>     Interrupt context            |     Interrupt thread
>>>                                  |
>>>     receive interrupt port 6     |
>>>     clear the evtchn port        |
>>>     set IRQF_RUNTHREAD            |
>>>     kick interrupt thread        |
>>>                                  |    clear IRQF_RUNTHREAD
>>>                                  |    call evtchn_interrupt
>>>     receive interrupt port 6     |
>>>     clear the evtchn port        |
>>>     set IRQF_RUNTHREAD           |
>>>     kick interrupt thread        |
>>>                                  |    disable interrupt port 6
>>>                                  |    evtchn->enabled = false
>>>                                  |    [....]
>>>                                  |
>>>                                  |    *** Handling the second
>>> interrupt ***
>>>                                  |    clear IRQF_RUNTHREAD
>>>                                  |    call evtchn_interrupt
>>>                                  |    WARN(...)
>>>
>>> I am not entirely sure how to fix this. I have two solutions in mind:
>>>
>>> 1) Prevent the interrupt handler to be threaded. We would also need to
>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>> RT-Linux.
>>>
>>> 2) Remove the warning
>>
>> I think access to evtchn->enabled is racy so (with or without the
>> warning) we can't use it reliably.
>
> Thinking about it, it would not be the only issue. The ring is sized
> to contain only one instance of the same event. So if you receive
> twice the event, you may overflow the ring.

Hm... That's another argument in favor of "unthreading" the handler.

>
>>
>> Another alternative could be to queue the irq if !evtchn->enabled and
>> handle it in evtchn_write() (which is where irq is supposed to be
>> re-enabled).
> What do you mean by queue? Is it queueing in the ring?


No, I was thinking about having a new structure for deferred interrupts.

-boris

> If so, wouldn't it be racy as described above?




2019-02-20 18:06:05

by Julien Grall

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

Hi,

On 20/02/2019 17:07, Boris Ostrovsky wrote:
> On 2/20/19 9:15 AM, Julien Grall wrote:
>> Hi Boris,
>>
>> Thank you for your answer.
>>
>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>> Hi all,
>>>>
>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>> started,
>>>> the console is ending to have a lot of warning (see trace below).
>>>>
>>>> After some investigation, this is because the irq handler will now
>>>> be threaded.
>>>> I can reproduce the same error with the vanilla Linux when passing
>>>> the option
>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>> that has
>>>> not RT support).
>>>>
>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>> communicate with
>>>> xenstore.
>>>>
>>>>  From my understanding, this is happening because the interrupt
>>>> handler is now
>>>> run in a thread. So we can have the following happening.
>>>>
>>>>     Interrupt context            |     Interrupt thread
>>>>                                  |
>>>>     receive interrupt port 6     |
>>>>     clear the evtchn port        |
>>>>     set IRQF_RUNTHREAD            |
>>>>     kick interrupt thread        |
>>>>                                  |    clear IRQF_RUNTHREAD
>>>>                                  |    call evtchn_interrupt
>>>>     receive interrupt port 6     |
>>>>     clear the evtchn port        |
>>>>     set IRQF_RUNTHREAD           |
>>>>     kick interrupt thread        |
>>>>                                  |    disable interrupt port 6
>>>>                                  |    evtchn->enabled = false
>>>>                                  |    [....]
>>>>                                  |
>>>>                                  |    *** Handling the second
>>>> interrupt ***
>>>>                                  |    clear IRQF_RUNTHREAD
>>>>                                  |    call evtchn_interrupt
>>>>                                  |    WARN(...)
>>>>
>>>> I am not entirely sure how to fix this. I have two solutions in mind:
>>>>
>>>> 1) Prevent the interrupt handler to be threaded. We would also need to
>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>> RT-Linux.
>>>>
>>>> 2) Remove the warning
>>>
>>> I think access to evtchn->enabled is racy so (with or without the
>>> warning) we can't use it reliably.
>>
>> Thinking about it, it would not be the only issue. The ring is sized
>> to contain only one instance of the same event. So if you receive
>> twice the event, you may overflow the ring.
>
> Hm... That's another argument in favor of "unthreading" the handler.

I first thought it would be possible to unthread it. However,
wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep, so this
cannot be used in an interrupt context.

So I think "unthreading" the handler is not an option here.

>
>>
>>>
>>> Another alternative could be to queue the irq if !evtchn->enabled and
>>> handle it in evtchn_write() (which is where irq is supposed to be
>>> re-enabled).
>> What do you mean by queue? Is it queueing in the ring?
>
>
> No, I was thinking about having a new structure for deferred interrupts.

Hmmm, I am not entirely sure what would be the structure here. Could you expand
your thinking?

Cheers,

--
Julien Grall

2019-02-20 20:06:44

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

On 2/20/19 1:05 PM, Julien Grall wrote:
> Hi,
>
> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>> Hi Boris,
>>>
>>> Thank you for your answer.
>>>
>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>>> Hi all,
>>>>>
>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>> started,
>>>>> the console is ending to have a lot of warning (see trace below).
>>>>>
>>>>> After some investigation, this is because the irq handler will now
>>>>> be threaded.
>>>>> I can reproduce the same error with the vanilla Linux when passing
>>>>> the option
>>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>>> that has
>>>>> not RT support).
>>>>>
>>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>>> communicate with
>>>>> xenstore.
>>>>>
>>>>>   From my understanding, this is happening because the interrupt
>>>>> handler is now
>>>>> run in a thread. So we can have the following happening.
>>>>>
>>>>>      Interrupt context            |     Interrupt thread
>>>>>                                   |
>>>>>      receive interrupt port 6     |
>>>>>      clear the evtchn port        |
>>>>>      set IRQF_RUNTHREAD            |
>>>>>      kick interrupt thread        |
>>>>>                                   |    clear IRQF_RUNTHREAD
>>>>>                                   |    call evtchn_interrupt
>>>>>      receive interrupt port 6     |
>>>>>      clear the evtchn port        |
>>>>>      set IRQF_RUNTHREAD           |
>>>>>      kick interrupt thread        |
>>>>>                                   |    disable interrupt port 6
>>>>>                                   |    evtchn->enabled = false
>>>>>                                   |    [....]
>>>>>                                   |
>>>>>                                   |    *** Handling the second
>>>>> interrupt ***
>>>>>                                   |    clear IRQF_RUNTHREAD
>>>>>                                   |    call evtchn_interrupt
>>>>>                                   |    WARN(...)
>>>>>
>>>>> I am not entirely sure how to fix this. I have two solutions in mind:
>>>>>
>>>>> 1) Prevent the interrupt handler to be threaded. We would also
>>>>> need to
>>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>>> RT-Linux.
>>>>>
>>>>> 2) Remove the warning
>>>>
>>>> I think access to evtchn->enabled is racy so (with or without the
>>>> warning) we can't use it reliably.
>>>
>>> Thinking about it, it would not be the only issue. The ring is sized
>>> to contain only one instance of the same event. So if you receive
>>> twice the event, you may overflow the ring.
>>
>> Hm... That's another argument in favor of "unthreading" the handler.
>
> I first thought it would be possible to unthread it. However,
> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
> so this cannot be used in an interrupt context.
>
> So I think "unthreading" the handler is not an option here.

That sounds like a different problem. I.e. there are two issues:
* threaded interrupts don't work properly (races, ring overflow)
* evtchn_interrupt() (threaded or not) has spin_lock(), which is not
going to work for RT

The first can be fixed by using non-threaded handlers.

>
>>
>>>
>>>>
>>>> Another alternative could be to queue the irq if !evtchn->enabled and
>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>> re-enabled).
>>> What do you mean by queue? Is it queueing in the ring?
>>
>>
>> No, I was thinking about having a new structure for deferred interrupts.
>
> Hmmm, I am not entirely sure what would be the structure here. Could
> you expand your thinking?

Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
implemented as a ring but not necessarily as Xen shared ring (if that's
what you were referring to).

-boris



2019-02-20 20:47:35

by Julien Grall

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

(+ Andrew and Jan for feedback on the event channel interrupt)

Hi Boris,

Thank you for the your feedback.

On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
> On 2/20/19 1:05 PM, Julien Grall wrote:
>> Hi,
>>
>> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>>> Hi Boris,
>>>>
>>>> Thank you for your answer.
>>>>
>>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>>> started,
>>>>>> the console is ending to have a lot of warning (see trace below).
>>>>>>
>>>>>> After some investigation, this is because the irq handler will now
>>>>>> be threaded.
>>>>>> I can reproduce the same error with the vanilla Linux when passing
>>>>>> the option
>>>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>>>> that has
>>>>>> not RT support).
>>>>>>
>>>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>>>> communicate with
>>>>>> xenstore.
>>>>>>
>>>>>>   From my understanding, this is happening because the interrupt
>>>>>> handler is now
>>>>>> run in a thread. So we can have the following happening.
>>>>>>
>>>>>>      Interrupt context            |     Interrupt thread
>>>>>>                                   |
>>>>>>      receive interrupt port 6     |
>>>>>>      clear the evtchn port        |
>>>>>>      set IRQF_RUNTHREAD            |
>>>>>>      kick interrupt thread        |
>>>>>>                                   |    clear IRQF_RUNTHREAD
>>>>>>                                   |    call evtchn_interrupt
>>>>>>      receive interrupt port 6     |
>>>>>>      clear the evtchn port        |
>>>>>>      set IRQF_RUNTHREAD           |
>>>>>>      kick interrupt thread        |
>>>>>>                                   |    disable interrupt port 6
>>>>>>                                   |    evtchn->enabled = false
>>>>>>                                   |    [....]
>>>>>>                                   |
>>>>>>                                   |    *** Handling the second
>>>>>> interrupt ***
>>>>>>                                   |    clear IRQF_RUNTHREAD
>>>>>>                                   |    call evtchn_interrupt
>>>>>>                                   |    WARN(...)
>>>>>>
>>>>>> I am not entirely sure how to fix this. I have two solutions in mind:
>>>>>>
>>>>>> 1) Prevent the interrupt handler to be threaded. We would also
>>>>>> need to
>>>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>>>> RT-Linux.
>>>>>>
>>>>>> 2) Remove the warning
>>>>>
>>>>> I think access to evtchn->enabled is racy so (with or without the
>>>>> warning) we can't use it reliably.
>>>>
>>>> Thinking about it, it would not be the only issue. The ring is sized
>>>> to contain only one instance of the same event. So if you receive
>>>> twice the event, you may overflow the ring.
>>>
>>> Hm... That's another argument in favor of "unthreading" the handler.
>>
>> I first thought it would be possible to unthread it. However,
>> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
>> so this cannot be used in an interrupt context.
>>
>> So I think "unthreading" the handler is not an option here.
>
> That sounds like a different problem. I.e. there are two issues:
> * threaded interrupts don't work properly (races, ring overflow)
> * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
> going to work for RT

I am afraid that's not correct, you can use spin_lock() in threaded
interrupt handler.

> The first can be fixed by using non-threaded handlers.

The two are somewhat related, if you use a non-threaded handler then you
are not going to help the RT case.

In general, the unthreaded solution should be used in the last resort.

>>
>>>
>>>>
>>>>>
>>>>> Another alternative could be to queue the irq if !evtchn->enabled and
>>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>>> re-enabled).
>>>> What do you mean by queue? Is it queueing in the ring?
>>>
>>>
>>> No, I was thinking about having a new structure for deferred interrupts.
>>
>> Hmmm, I am not entirely sure what would be the structure here. Could
>> you expand your thinking?
>
> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
> implemented as a ring but not necessarily as Xen shared ring (if that's
> what you were referring to).

The underlying question is what happen if you miss an interrupt. Is it
going to be ok? If no, then we have to record everything and can't
kill/send an error to the user app because it is not its fault.

This means a FIFO would not be a viable. How do you size it? Static (i.e
pre-defined) size is not going to be possible because you don't know how
many interrupt you are going to receive before the thread handler runs.
You can't make it grow dynamically as it make become quite big for the
same reason.

Discussing with my team, a solution that came up would be to introduce
one atomic field per event to record the number of event received. I
will explore that solution tomorrow.

Cheers,

--
Julien Grall

2019-02-20 21:49:47

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

On 2/20/19 3:46 PM, Julien Grall wrote:
> (+ Andrew and Jan for feedback on the event channel interrupt)
>
> Hi Boris,
>
> Thank you for the your feedback.
>
> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>>>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>>>> Hi Boris,
>>>>>
>>>>> Thank you for your answer.
>>>>>
>>>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>>>> started,
>>>>>>> the console is ending to have a lot of warning (see trace below).
>>>>>>>
>>>>>>> After some investigation, this is because the irq handler will now
>>>>>>> be threaded.
>>>>>>> I can reproduce the same error with the vanilla Linux when passing
>>>>>>> the option
>>>>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>>>>> that has
>>>>>>> not RT support).
>>>>>>>
>>>>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>>>>> communicate with
>>>>>>> xenstore.
>>>>>>>
>>>>>>>    From my understanding, this is happening because the interrupt
>>>>>>> handler is now
>>>>>>> run in a thread. So we can have the following happening.
>>>>>>>
>>>>>>>       Interrupt context            |     Interrupt thread
>>>>>>>                                    |
>>>>>>>       receive interrupt port 6     |
>>>>>>>       clear the evtchn port        |
>>>>>>>       set IRQF_RUNTHREAD            |
>>>>>>>       kick interrupt thread        |
>>>>>>>                                    |    clear IRQF_RUNTHREAD
>>>>>>>                                    |    call evtchn_interrupt
>>>>>>>       receive interrupt port 6     |
>>>>>>>       clear the evtchn port        |
>>>>>>>       set IRQF_RUNTHREAD           |
>>>>>>>       kick interrupt thread        |
>>>>>>>                                    |    disable interrupt port 6
>>>>>>>                                    |    evtchn->enabled = false
>>>>>>>                                    |    [....]
>>>>>>>                                    |
>>>>>>>                                    |    *** Handling the second
>>>>>>> interrupt ***
>>>>>>>                                    |    clear IRQF_RUNTHREAD
>>>>>>>                                    |    call evtchn_interrupt
>>>>>>>                                    |    WARN(...)
>>>>>>>
>>>>>>> I am not entirely sure how to fix this. I have two solutions in
>>>>>>> mind:
>>>>>>>
>>>>>>> 1) Prevent the interrupt handler to be threaded. We would also
>>>>>>> need to
>>>>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>>>>> RT-Linux.
>>>>>>>
>>>>>>> 2) Remove the warning
>>>>>>
>>>>>> I think access to evtchn->enabled is racy so (with or without the
>>>>>> warning) we can't use it reliably.
>>>>>
>>>>> Thinking about it, it would not be the only issue. The ring is sized
>>>>> to contain only one instance of the same event. So if you receive
>>>>> twice the event, you may overflow the ring.
>>>>
>>>> Hm... That's another argument in favor of "unthreading" the handler.
>>>
>>> I first thought it would be possible to unthread it. However,
>>> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
>>> so this cannot be used in an interrupt context.
>>>
>>> So I think "unthreading" the handler is not an option here.
>>
>> That sounds like a different problem. I.e. there are two issues:
>> * threaded interrupts don't work properly (races, ring overflow)
>> * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
>> going to work for RT
>
> I am afraid that's not correct, you can use spin_lock() in threaded
> interrupt handler.

In non-RT handler -- yes, but not in an RT one (in fact, isn't this what
you yourself said above?)

>
>> The first can be fixed by using non-threaded handlers.
>
> The two are somewhat related, if you use a non-threaded handler then
> you are not going to help the RT case.
>
> In general, the unthreaded solution should be used in the last resort.
>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Another alternative could be to queue the irq if !evtchn->enabled
>>>>>> and
>>>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>>>> re-enabled).
>>>>> What do you mean by queue? Is it queueing in the ring?
>>>>
>>>>
>>>> No, I was thinking about having a new structure for deferred
>>>> interrupts.
>>>
>>> Hmmm, I am not entirely sure what would be the structure here. Could
>>> you expand your thinking?
>>
>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>> implemented as a ring but not necessarily as Xen shared ring (if that's
>> what you were referring to).
>
> The underlying question is what happen if you miss an interrupt. Is it
> going to be ok?

This I am not sure about. I thought yes since we are signaling the
process only once.

-boris

> If no, then we have to record everything and can't kill/send an error
> to the user app because it is not its fault.
>
> This means a FIFO would not be a viable. How do you size it? Static
> (i.e pre-defined) size is not going to be possible because you don't
> know how many interrupt you are going to receive before the thread
> handler runs. You can't make it grow dynamically as it make become
> quite big for the same reason.
>
> Discussing with my team, a solution that came up would be to introduce
> one atomic field per event to record the number of event received. I
> will explore that solution tomorrow.
>




2019-02-20 22:05:08

by Julien Grall

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

Hi Boris,

On 2/20/19 9:46 PM, Boris Ostrovsky wrote:
> On 2/20/19 3:46 PM, Julien Grall wrote:
>> (+ Andrew and Jan for feedback on the event channel interrupt)
>>
>> Hi Boris,
>>
>> Thank you for the your feedback.
>>
>> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>>>>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>>>>> Hi Boris,
>>>>>>
>>>>>> Thank you for your answer.
>>>>>>
>>>>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>>>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>>>>> started,
>>>>>>>> the console is ending to have a lot of warning (see trace below).
>>>>>>>>
>>>>>>>> After some investigation, this is because the irq handler will now
>>>>>>>> be threaded.
>>>>>>>> I can reproduce the same error with the vanilla Linux when passing
>>>>>>>> the option
>>>>>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>>>>>> that has
>>>>>>>> not RT support).
>>>>>>>>
>>>>>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>>>>>> communicate with
>>>>>>>> xenstore.
>>>>>>>>
>>>>>>>>    From my understanding, this is happening because the interrupt
>>>>>>>> handler is now
>>>>>>>> run in a thread. So we can have the following happening.
>>>>>>>>
>>>>>>>>       Interrupt context            |     Interrupt thread
>>>>>>>>                                    |
>>>>>>>>       receive interrupt port 6     |
>>>>>>>>       clear the evtchn port        |
>>>>>>>>       set IRQF_RUNTHREAD            |
>>>>>>>>       kick interrupt thread        |
>>>>>>>>                                    |    clear IRQF_RUNTHREAD
>>>>>>>>                                    |    call evtchn_interrupt
>>>>>>>>       receive interrupt port 6     |
>>>>>>>>       clear the evtchn port        |
>>>>>>>>       set IRQF_RUNTHREAD           |
>>>>>>>>       kick interrupt thread        |
>>>>>>>>                                    |    disable interrupt port 6
>>>>>>>>                                    |    evtchn->enabled = false
>>>>>>>>                                    |    [....]
>>>>>>>>                                    |
>>>>>>>>                                    |    *** Handling the second
>>>>>>>> interrupt ***
>>>>>>>>                                    |    clear IRQF_RUNTHREAD
>>>>>>>>                                    |    call evtchn_interrupt
>>>>>>>>                                    |    WARN(...)
>>>>>>>>
>>>>>>>> I am not entirely sure how to fix this. I have two solutions in
>>>>>>>> mind:
>>>>>>>>
>>>>>>>> 1) Prevent the interrupt handler to be threaded. We would also
>>>>>>>> need to
>>>>>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>>>>>> RT-Linux.
>>>>>>>>
>>>>>>>> 2) Remove the warning
>>>>>>>
>>>>>>> I think access to evtchn->enabled is racy so (with or without the
>>>>>>> warning) we can't use it reliably.
>>>>>>
>>>>>> Thinking about it, it would not be the only issue. The ring is sized
>>>>>> to contain only one instance of the same event. So if you receive
>>>>>> twice the event, you may overflow the ring.
>>>>>
>>>>> Hm... That's another argument in favor of "unthreading" the handler.
>>>>
>>>> I first thought it would be possible to unthread it. However,
>>>> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
>>>> so this cannot be used in an interrupt context.
>>>>
>>>> So I think "unthreading" the handler is not an option here.
>>>
>>> That sounds like a different problem. I.e. there are two issues:
>>> * threaded interrupts don't work properly (races, ring overflow)
>>> * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
>>> going to work for RT
>>
>> I am afraid that's not correct, you can use spin_lock() in threaded
>> interrupt handler.
>
> In non-RT handler -- yes, but not in an RT one (in fact, isn't this what
> you yourself said above?)

In RT-linux, interrupt handlers are threaded by default. So the handler
will not run in the interrupt context. Hence, it will be safe to call
spin_lock.

However, if you force the handler to not be threaded (IRQF_NO_THREAD),
it will run in interrupt context.

>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Another alternative could be to queue the irq if !evtchn->enabled
>>>>>>> and
>>>>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>>>>> re-enabled).
>>>>>> What do you mean by queue? Is it queueing in the ring?
>>>>>
>>>>>
>>>>> No, I was thinking about having a new structure for deferred
>>>>> interrupts.
>>>>
>>>> Hmmm, I am not entirely sure what would be the structure here. Could
>>>> you expand your thinking?
>>>
>>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>>> implemented as a ring but not necessarily as Xen shared ring (if that's
>>> what you were referring to).
>>
>> The underlying question is what happen if you miss an interrupt. Is it
>> going to be ok?
>
> This I am not sure about. I thought yes since we are signaling the
> process only once.

I have CCed Andrew and Jan to see if they can help here.

Cheers,

--
Julien Grall

2019-02-21 08:08:45

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On Wed, Feb 20, 2019 at 10:03:57PM +0000, Julien Grall wrote:
> Hi Boris,
>
> On 2/20/19 9:46 PM, Boris Ostrovsky wrote:
> > On 2/20/19 3:46 PM, Julien Grall wrote:
> > > (+ Andrew and Jan for feedback on the event channel interrupt)
> > >
> > > Hi Boris,
> > >
> > > Thank you for the your feedback.
> > >
> > > On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
> > > > On 2/20/19 1:05 PM, Julien Grall wrote:
> > > > > Hi,
> > > > >
> > > > > On 20/02/2019 17:07, Boris Ostrovsky wrote:
> > > > > > On 2/20/19 9:15 AM, Julien Grall wrote:
> > > > > > > Hi Boris,
> > > > > > >
> > > > > > > Thank you for your answer.
> > > > > > >
> > > > > > > On 20/02/2019 00:02, Boris Ostrovsky wrote:
> > > > > > > > On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > I have been looking at using Linux RT in Dom0. Once the guest is
> > > > > > > > > started,
> > > > > > > > > the console is ending to have a lot of warning (see trace below).
> > > > > > > > >
> > > > > > > > > After some investigation, this is because the irq handler will now
> > > > > > > > > be threaded.
> > > > > > > > > I can reproduce the same error with the vanilla Linux when passing
> > > > > > > > > the option
> > > > > > > > > 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
> > > > > > > > > that has
> > > > > > > > > not RT support).
> > > > > > > > >
> > > > > > > > > FWIW, the interrupt for port 6 is used to for the guest to
> > > > > > > > > communicate with
> > > > > > > > > xenstore.
> > > > > > > > >
> > > > > > > > > ???From my understanding, this is happening because the interrupt
> > > > > > > > > handler is now
> > > > > > > > > run in a thread. So we can have the following happening.
> > > > > > > > >
> > > > > > > > > ????? Interrupt context??????????? |???? Interrupt thread
> > > > > > > > > ?????????????????????????????????? |
> > > > > > > > > ????? receive interrupt port 6???? |
> > > > > > > > > ????? clear the evtchn port??????? |
> > > > > > > > > ????? set IRQF_RUNTHREAD??????????? |
> > > > > > > > > ????? kick interrupt thread??????? |
> > > > > > > > > ?????????????????????????????????? |??? clear IRQF_RUNTHREAD
> > > > > > > > > ?????????????????????????????????? |??? call evtchn_interrupt
> > > > > > > > > ????? receive interrupt port 6???? |
> > > > > > > > > ????? clear the evtchn port??????? |
> > > > > > > > > ????? set IRQF_RUNTHREAD?????????? |
> > > > > > > > > ????? kick interrupt thread??????? |
> > > > > > > > > ?????????????????????????????????? |??? disable interrupt port 6
> > > > > > > > > ?????????????????????????????????? |??? evtchn->enabled = false
> > > > > > > > > ?????????????????????????????????? |??? [....]
> > > > > > > > > ?????????????????????????????????? |
> > > > > > > > > ?????????????????????????????????? |??? *** Handling the second
> > > > > > > > > interrupt ***
> > > > > > > > > ?????????????????????????????????? |??? clear IRQF_RUNTHREAD
> > > > > > > > > ?????????????????????????????????? |??? call evtchn_interrupt
> > > > > > > > > ?????????????????????????????????? |??? WARN(...)
> > > > > > > > >
> > > > > > > > > I am not entirely sure how to fix this. I have two solutions in
> > > > > > > > > mind:
> > > > > > > > >
> > > > > > > > > 1) Prevent the interrupt handler to be threaded. We would also
> > > > > > > > > need to
> > > > > > > > > switch from spin_lock to raw_spin_lock as the former may sleep on
> > > > > > > > > RT-Linux.
> > > > > > > > >
> > > > > > > > > 2) Remove the warning
> > > > > > > >
> > > > > > > > I think access to evtchn->enabled is racy so (with or without the
> > > > > > > > warning) we can't use it reliably.
> > > > > > >
> > > > > > > Thinking about it, it would not be the only issue. The ring is sized
> > > > > > > to contain only one instance of the same event. So if you receive
> > > > > > > twice the event, you may overflow the ring.
> > > > > >
> > > > > > Hm... That's another argument in favor of "unthreading" the handler.
> > > > >
> > > > > I first thought it would be possible to unthread it. However,
> > > > > wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
> > > > > so this cannot be used in an interrupt context.
> > > > >
> > > > > So I think "unthreading" the handler is not an option here.
> > > >
> > > > That sounds like a different problem. I.e. there are two issues:
> > > > * threaded interrupts don't work properly (races, ring overflow)
> > > > * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
> > > > going to work for RT
> > >
> > > I am afraid that's not correct, you can use spin_lock() in threaded
> > > interrupt handler.
> >
> > In non-RT handler -- yes, but not in an RT one (in fact, isn't this what
> > you yourself said above?)
>
> In RT-linux, interrupt handlers are threaded by default. So the handler will
> not run in the interrupt context. Hence, it will be safe to call spin_lock.
>
> However, if you force the handler to not be threaded (IRQF_NO_THREAD), it
> will run in interrupt context.
>
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Another alternative could be to queue the irq if !evtchn->enabled
> > > > > > > > and
> > > > > > > > handle it in evtchn_write() (which is where irq is supposed to be
> > > > > > > > re-enabled).
> > > > > > > What do you mean by queue? Is it queueing in the ring?
> > > > > >
> > > > > >
> > > > > > No, I was thinking about having a new structure for deferred
> > > > > > interrupts.
> > > > >
> > > > > Hmmm, I am not entirely sure what would be the structure here. Could
> > > > > you expand your thinking?
> > > >
> > > > Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
> > > > implemented as a ring but not necessarily as Xen shared ring (if that's
> > > > what you were referring to).
> > >
> > > The underlying question is what happen if you miss an interrupt. Is it
> > > going to be ok?
> >
> > This I am not sure about. I thought yes since we are signaling the
> > process only once.
>
> I have CCed Andrew and Jan to see if they can help here.

FWIW, you can also mask the interrupt while waiting for the thread to
execute the interrupt handler. Ie:

1. Interrupt injected
2. Execute guest event channel callback
3. Scan for pending interrupts
4. Mask interrupt
5. Clear pending field
6. Queue threaded handler
7. Go to 3 until all interrupts are drained
[...]
8. Execute interrupt handler in thread
9. Unmask interrupt

That should prevent you from stacking interrupts?

Roger.

2019-02-21 08:18:01

by Juergen Gross

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

On 19/02/2019 18:31, Julien Grall wrote:
> Hi all,
>
> I have been looking at using Linux RT in Dom0. Once the guest is started,
> the console is ending to have a lot of warning (see trace below).
>
> After some investigation, this is because the irq handler will now be threaded.
> I can reproduce the same error with the vanilla Linux when passing the option
> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7 that has
> not RT support).
>
> FWIW, the interrupt for port 6 is used to for the guest to communicate with
> xenstore.
>
> From my understanding, this is happening because the interrupt handler is now
> run in a thread. So we can have the following happening.
>
> Interrupt context | Interrupt thread
> |
> receive interrupt port 6 |
> clear the evtchn port |
> set IRQF_RUNTHREAD |
> kick interrupt thread |
> | clear IRQF_RUNTHREAD
> | call evtchn_interrupt
> receive interrupt port 6 |
> clear the evtchn port |
> set IRQF_RUNTHREAD |
> kick interrupt thread |
> | disable interrupt port 6
> | evtchn->enabled = false
> | [....]
> |
> | *** Handling the second interrupt ***
> | clear IRQF_RUNTHREAD
> | call evtchn_interrupt
> | WARN(...)
>
> I am not entirely sure how to fix this. I have two solutions in mind:
>
> 1) Prevent the interrupt handler to be threaded. We would also need to
> switch from spin_lock to raw_spin_lock as the former may sleep on RT-Linux.
>
> 2) Remove the warning

3) Split the handler (RT-only?) to a non-threaded part containing
everything until the "evtchn->enabled = false" and only then
kick the thread doing the rest, including the spin_lock().

Juergen

2019-02-22 11:44:56

by Jan Beulich

[permalink] [raw]
Subject: Re: xen/evtchn and forced threaded irq

>>> On 20.02.19 at 23:03, <[email protected]> wrote:
> On 2/20/19 9:46 PM, Boris Ostrovsky wrote:
>> On 2/20/19 3:46 PM, Julien Grall wrote:
>>> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>>>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>>>> implemented as a ring but not necessarily as Xen shared ring (if that's
>>>> what you were referring to).

I'm sort of lost here with the mixed references to "interrupts" and
"events". Interrupts can be shared (and have a per-instance
(irq,data) tuple), but there should be at most one interrupt
underlying the event channel systems, shouldn't there? Event
channels otoh can't be shared, and hence there's only one
"data" item to be associated with each channel, at which point a
plain counter ought to do.

>>> The underlying question is what happen if you miss an interrupt. Is it
>>> going to be ok?
>>
>> This I am not sure about. I thought yes since we are signaling the
>> process only once.
>
> I have CCed Andrew and Jan to see if they can help here.

What meaning of "miss" do you use here? As long as is only means
delay (i.e. miss an instance, but get notified as soon as feasible
even if there is not further event coming from the source) - that
would be okay, I think. But if you truly mean "miss" (i.e. event lost
altogether, with silence resulting if there's no further event), then
no, this would not be tolerable.

Jan



2019-02-22 12:39:00

by Oleksandr Andrushchenko

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On 2/20/19 10:46 PM, Julien Grall wrote:
> (+ Andrew and Jan for feedback on the event channel interrupt)
>
> Hi Boris,
>
> Thank you for the your feedback.
>
> On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
>> On 2/20/19 1:05 PM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 20/02/2019 17:07, Boris Ostrovsky wrote:
>>>> On 2/20/19 9:15 AM, Julien Grall wrote:
>>>>> Hi Boris,
>>>>>
>>>>> Thank you for your answer.
>>>>>
>>>>> On 20/02/2019 00:02, Boris Ostrovsky wrote:
>>>>>> On Tue, Feb 19, 2019 at 05:31:10PM +0000, Julien Grall wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have been looking at using Linux RT in Dom0. Once the guest is
>>>>>>> started,
>>>>>>> the console is ending to have a lot of warning (see trace below).
>>>>>>>
>>>>>>> After some investigation, this is because the irq handler will now
>>>>>>> be threaded.
>>>>>>> I can reproduce the same error with the vanilla Linux when passing
>>>>>>> the option
>>>>>>> 'threadirqs' on the command line (the trace below is from 5.0.0-rc7
>>>>>>> that has
>>>>>>> not RT support).
>>>>>>>
>>>>>>> FWIW, the interrupt for port 6 is used to for the guest to
>>>>>>> communicate with
>>>>>>> xenstore.
>>>>>>>
>>>>>>>    From my understanding, this is happening because the interrupt
>>>>>>> handler is now
>>>>>>> run in a thread. So we can have the following happening.
>>>>>>>
>>>>>>>       Interrupt context            |     Interrupt thread
>>>>>>>                                    |
>>>>>>>       receive interrupt port 6     |
>>>>>>>       clear the evtchn port        |
>>>>>>>       set IRQF_RUNTHREAD            |
>>>>>>>       kick interrupt thread        |
>>>>>>>                                    |    clear IRQF_RUNTHREAD
>>>>>>>                                    |    call evtchn_interrupt
>>>>>>>       receive interrupt port 6     |
>>>>>>>       clear the evtchn port        |
>>>>>>>       set IRQF_RUNTHREAD           |
>>>>>>>       kick interrupt thread        |
>>>>>>>                                    |    disable interrupt port 6
>>>>>>>                                    | evtchn->enabled = false
>>>>>>>                                    |    [....]
>>>>>>>                                    |
>>>>>>>                                    |    *** Handling the second
>>>>>>> interrupt ***
>>>>>>>                                    |    clear IRQF_RUNTHREAD
>>>>>>>                                    |    call evtchn_interrupt
>>>>>>>                                    |    WARN(...)
>>>>>>>
>>>>>>> I am not entirely sure how to fix this. I have two solutions in
>>>>>>> mind:
>>>>>>>
>>>>>>> 1) Prevent the interrupt handler to be threaded. We would also
>>>>>>> need to
>>>>>>> switch from spin_lock to raw_spin_lock as the former may sleep on
>>>>>>> RT-Linux.
>>>>>>>
>>>>>>> 2) Remove the warning
>>>>>>
>>>>>> I think access to evtchn->enabled is racy so (with or without the
>>>>>> warning) we can't use it reliably.
>>>>>
>>>>> Thinking about it, it would not be the only issue. The ring is sized
>>>>> to contain only one instance of the same event. So if you receive
>>>>> twice the event, you may overflow the ring.
>>>>
>>>> Hm... That's another argument in favor of "unthreading" the handler.
>>>
>>> I first thought it would be possible to unthread it. However,
>>> wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
>>> so this cannot be used in an interrupt context.
>>>
>>> So I think "unthreading" the handler is not an option here.
>>
>> That sounds like a different problem. I.e. there are two issues:
>> * threaded interrupts don't work properly (races, ring overflow)
>> * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
>> going to work for RT
>
> I am afraid that's not correct, you can use spin_lock() in threaded
> interrupt handler.
>
>> The first can be fixed by using non-threaded handlers.
>
> The two are somewhat related, if you use a non-threaded handler then
> you are not going to help the RT case.
>
> In general, the unthreaded solution should be used in the last resort.
>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Another alternative could be to queue the irq if !evtchn->enabled
>>>>>> and
>>>>>> handle it in evtchn_write() (which is where irq is supposed to be
>>>>>> re-enabled).
>>>>> What do you mean by queue? Is it queueing in the ring?
>>>>
>>>>
>>>> No, I was thinking about having a new structure for deferred
>>>> interrupts.
>>>
>>> Hmmm, I am not entirely sure what would be the structure here. Could
>>> you expand your thinking?
>>
>> Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
>> implemented as a ring but not necessarily as Xen shared ring (if that's
>> what you were referring to).
>
> The underlying question is what happen if you miss an interrupt. Is it
> going to be ok? If no, then we have to record everything and can't
> kill/send an error to the user app because it is not its fault.
>
> This means a FIFO would not be a viable. How do you size it? Static
> (i.e pre-defined) size is not going to be possible because you don't
> know how many interrupt you are going to receive before the thread
> handler runs. You can't make it grow dynamically as it make become
> quite big for the same reason.
>
> Discussing with my team, a solution that came up would be to introduce
> one atomic field per event to record the number of event received. I
> will explore that solution tomorrow.
How will this help if events have some payload?
>
> Cheers,
>


2019-02-22 13:34:43

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> On 2/20/19 10:46 PM, Julien Grall wrote:
>> Discussing with my team, a solution that came up would be to introduce one
>> atomic field per event to record the number of event received. I will explore
>> that solution tomorrow.
> How will this help if events have some payload?

What payload? The event channel does not carry any payload. It only notify you
that something happen. Then this is up to the user to decide what to you with it.

Cheers,

--
Julien Grall

2019-02-25 13:25:23

by Oleksandr Andrushchenko

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On 2/22/19 3:33 PM, Julien Grall wrote:
> Hi,
>
> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>> Discussing with my team, a solution that came up would be to
>>> introduce one atomic field per event to record the number of event
>>> received. I will explore that solution tomorrow.
>> How will this help if events have some payload?
>
> What payload? The event channel does not carry any payload. It only
> notify you that something happen. Then this is up to the user to
> decide what to you with it.
Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost
>
> Cheers,
>
[1]
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/displif.h;h=cc5de9cb1f35dedc99c866d73d086b19e496852a;hb=HEAD#l756

2019-02-25 13:56:27

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> On 2/22/19 3:33 PM, Julien Grall wrote:
>> Hi,
>>
>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>> Discussing with my team, a solution that came up would be to introduce one
>>>> atomic field per event to record the number of event received. I will
>>>> explore that solution tomorrow.
>>> How will this help if events have some payload?
>>
>> What payload? The event channel does not carry any payload. It only notify you
>> that something happen. Then this is up to the user to decide what to you with it.
> Sorry, I was probably not precise enough. I mean that an event might have
> associated payload in the ring buffer, for example [1]. So, counting events
> may help somehow, but the ring's data may still be lost

From my understanding of event channels are edge interrupts. By definition,
they can be merged so you can get a signal notification to the guest for
multiple "events". So if you rely on the event to have an associated payload,
then you probably have done something wrong in your driver.

I haven't implemented PV drivers myself, but I would expect either side to block
if there were no space in the ring.

What do you do in the displif driver when the ring is full?

Cheers,

> [1]
> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/displif.h;h=cc5de9cb1f35dedc99c866d73d086b19e496852a;hb=HEAD#l756
>

--
Julien Grall

2019-02-25 14:10:12

by Oleksandr Andrushchenko

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On 2/25/19 3:55 PM, Julien Grall wrote:
> Hi Oleksandr,
>
> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>> Discussing with my team, a solution that came up would be to
>>>>> introduce one atomic field per event to record the number of event
>>>>> received. I will explore that solution tomorrow.
>>>> How will this help if events have some payload?
>>>
>>> What payload? The event channel does not carry any payload. It only
>>> notify you that something happen. Then this is up to the user to
>>> decide what to you with it.
>> Sorry, I was probably not precise enough. I mean that an event might
>> have
>> associated payload in the ring buffer, for example [1]. So, counting
>> events
>> may help somehow, but the ring's data may still be lost
>
> From my understanding of event channels are edge interrupts. By
> definition, they can be merged so you can get a signal notification to
> the guest for multiple "events". So if you rely on the event to have
> an associated payload, then you probably have done something wrong in
> your driver.
>
> I haven't implemented PV drivers myself, but I would expect either
> side to block if there were no space in the ring.
>
> What do you do in the displif driver when the ring is full?
>
It is handled by the originator, the display backend in our case: it
doesn't send
events if it sees that the ring will overflow. But I was worried about
such a generic change with counting number of events received and if this
really helps to recover in general case
> Cheers,
>
>> [1]
>> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/displif.h;h=cc5de9cb1f35dedc99c866d73d086b19e496852a;hb=HEAD#l756
>>
>


2019-02-25 15:26:57

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

Hi Oleksandr,

On 25/02/2019 14:08, Oleksandr Andrushchenko wrote:
> On 2/25/19 3:55 PM, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>>> Discussing with my team, a solution that came up would be to introduce one
>>>>>> atomic field per event to record the number of event received. I will
>>>>>> explore that solution tomorrow.
>>>>> How will this help if events have some payload?
>>>>
>>>> What payload? The event channel does not carry any payload. It only notify
>>>> you that something happen. Then this is up to the user to decide what to you
>>>> with it.
>>> Sorry, I was probably not precise enough. I mean that an event might have
>>> associated payload in the ring buffer, for example [1]. So, counting events
>>> may help somehow, but the ring's data may still be lost
>>
>> From my understanding of event channels are edge interrupts. By definition,
>> they can be merged so you can get a signal notification to the guest for
>> multiple "events". So if you rely on the event to have an associated payload,
>> then you probably have done something wrong in your driver.
>>
>> I haven't implemented PV drivers myself, but I would expect either side to
>> block if there were no space in the ring.
>>
>> What do you do in the displif driver when the ring is full?
>>
> It is handled by the originator, the display backend in our case: it doesn't send
> events if it sees that the ring will overflow. But I was worried about
> such a generic change with counting number of events received and if this
> really helps to recover in general case

Well, I was originally looking at modifying only the /dev/evtchn driver but it
turns out the event flow for Xen irqchip is not entirely correct.

A lot of Xen PV drivers will thread the handler and set IRQF_ONESHOT expecting
the event to be masked until the event handler has ran. However, the flow we use
does not deal with it and actually warn you may receive another event before
executing the handler for the first event.

I have discussed with Marc Z. (one of the irqchip maintainers) about the issue.
He suggested a different interrupt flow that I need to try out.

Cheers,

--
Julien Grall

2019-02-26 09:16:41

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
> Hi Oleksandr,
>
> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> > On 2/22/19 3:33 PM, Julien Grall wrote:
> > > Hi,
> > >
> > > On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > > > On 2/20/19 10:46 PM, Julien Grall wrote:
> > > > > Discussing with my team, a solution that came up would be to
> > > > > introduce one atomic field per event to record the number of
> > > > > event received. I will explore that solution tomorrow.
> > > > How will this help if events have some payload?
> > >
> > > What payload? The event channel does not carry any payload. It only
> > > notify you that something happen. Then this is up to the user to
> > > decide what to you with it.
> > Sorry, I was probably not precise enough. I mean that an event might have
> > associated payload in the ring buffer, for example [1]. So, counting events
> > may help somehow, but the ring's data may still be lost
>
> From my understanding of event channels are edge interrupts. By definition,

IMO event channels are active high level interrupts.

Let's take into account the following situation: you have an event
channel masked and the event channel pending bit (akin to the line on
bare metal) goes from low to high (0 -> 1), then you unmask the
interrupt and you get an event injected. If it was an edge interrupt
you wont get an event injected after unmasking, because you would
have lost the edge. I think the problem here is that Linux treats
event channels as edge interrupts, when they are actually level.

Roger.

2019-02-26 09:31:06

by Andrew Cooper

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On 26/02/2019 09:14, Roger Pau Monné wrote:
> On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>>> Discussing with my team, a solution that came up would be to
>>>>>> introduce one atomic field per event to record the number of
>>>>>> event received. I will explore that solution tomorrow.
>>>>> How will this help if events have some payload?
>>>> What payload? The event channel does not carry any payload. It only
>>>> notify you that something happen. Then this is up to the user to
>>>> decide what to you with it.
>>> Sorry, I was probably not precise enough. I mean that an event might have
>>> associated payload in the ring buffer, for example [1]. So, counting events
>>> may help somehow, but the ring's data may still be lost
>> From my understanding of event channels are edge interrupts. By definition,
> IMO event channels are active high level interrupts.
>
> Let's take into account the following situation: you have an event
> channel masked and the event channel pending bit (akin to the line on
> bare metal) goes from low to high (0 -> 1), then you unmask the
> interrupt and you get an event injected. If it was an edge interrupt
> you wont get an event injected after unmasking, because you would
> have lost the edge. I think the problem here is that Linux treats
> event channels as edge interrupts, when they are actually level.

Event channels are edge interrupts.  There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification.  This
was the source of a 4 month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.

~Andrew

2019-02-26 09:47:00

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
> On 26/02/2019 09:14, Roger Pau Monn? wrote:
> > On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
> >> Hi Oleksandr,
> >>
> >> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> >>> On 2/22/19 3:33 PM, Julien Grall wrote:
> >>>> Hi,
> >>>>
> >>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> >>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
> >>>>>> Discussing with my team, a solution that came up would be to
> >>>>>> introduce one atomic field per event to record the number of
> >>>>>> event received. I will explore that solution tomorrow.
> >>>>> How will this help if events have some payload?
> >>>> What payload? The event channel does not carry any payload. It only
> >>>> notify you that something happen. Then this is up to the user to
> >>>> decide what to you with it.
> >>> Sorry, I was probably not precise enough. I mean that an event might have
> >>> associated payload in the ring buffer, for example [1]. So, counting events
> >>> may help somehow, but the ring's data may still be lost
> >> From my understanding of event channels are edge interrupts. By definition,
> > IMO event channels are active high level interrupts.
> >
> > Let's take into account the following situation: you have an event
> > channel masked and the event channel pending bit (akin to the line on
> > bare metal) goes from low to high (0 -> 1), then you unmask the
> > interrupt and you get an event injected. If it was an edge interrupt
> > you wont get an event injected after unmasking, because you would
> > have lost the edge. I think the problem here is that Linux treats
> > event channels as edge interrupts, when they are actually level.
>
> Event channels are edge interrupts.? There are several very subtle bugs
> to be had by software which treats them as line interrupts.
>
> Most critically, if you fail to ack them, rebind them to a new vcpu, and
> reenable interrupts, you don't get a new interrupt notification.? This
> was the source of a 4 month bug when XenServer was moving from
> classic-xen to PVOps where using irqbalance would cause dom0 to
> occasionally lose interrupts.

I would argue that you need to mask them first, rebind to a new vcpu
and unmask, and then you will get an interrupt notification, or this
should be fixed in Xen to work as you expect: trigger an interrupt
notification when moving an asserted event channel between CPUs.

Is there any document that describes how such non trivial things (like
moving between CPUs) work for event/level interrupts?

Maybe I'm being obtuse, but from the example I gave above it's quite
clear to me event channels don't get triggered based on edge changes,
but rather on the line level.

Thanks, Roger.

2019-02-26 09:48:10

by Paul Durrant

[permalink] [raw]
Subject: RE: [Xen-devel] xen/evtchn and forced threaded irq

> -----Original Message-----
> From: Xen-devel [mailto:[email protected]] On Behalf Of Andrew Cooper
> Sent: 26 February 2019 09:30
> To: Roger Pau Monne <[email protected]>; Julien Grall <[email protected]>
> Cc: Juergen Gross <[email protected]>; Stefano Stabellini <[email protected]>; Oleksandr
> Andrushchenko <[email protected]>; [email protected]; Jan Beulich <[email protected]>;
> xen-devel <[email protected]>; Boris Ostrovsky <[email protected]>; Dave P
> Martin <[email protected]>
> Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq
>
> On 26/02/2019 09:14, Roger Pau Monné wrote:
> > On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
> >> Hi Oleksandr,
> >>
> >> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> >>> On 2/22/19 3:33 PM, Julien Grall wrote:
> >>>> Hi,
> >>>>
> >>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> >>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
> >>>>>> Discussing with my team, a solution that came up would be to
> >>>>>> introduce one atomic field per event to record the number of
> >>>>>> event received. I will explore that solution tomorrow.
> >>>>> How will this help if events have some payload?
> >>>> What payload? The event channel does not carry any payload. It only
> >>>> notify you that something happen. Then this is up to the user to
> >>>> decide what to you with it.
> >>> Sorry, I was probably not precise enough. I mean that an event might have
> >>> associated payload in the ring buffer, for example [1]. So, counting events
> >>> may help somehow, but the ring's data may still be lost
> >> From my understanding of event channels are edge interrupts. By definition,
> > IMO event channels are active high level interrupts.
> >
> > Let's take into account the following situation: you have an event
> > channel masked and the event channel pending bit (akin to the line on
> > bare metal) goes from low to high (0 -> 1), then you unmask the
> > interrupt and you get an event injected. If it was an edge interrupt
> > you wont get an event injected after unmasking, because you would
> > have lost the edge. I think the problem here is that Linux treats
> > event channels as edge interrupts, when they are actually level.
>
> Event channels are edge interrupts.  There are several very subtle bugs
> to be had by software which treats them as line interrupts.

They are more subtle than that are they not? There is a single per-vcpu ACK which can cover multiple event channels.

Paul

>
> Most critically, if you fail to ack them, rebind them to a new vcpu, and
> reenable interrupts, you don't get a new interrupt notification.  This
> was the source of a 4 month bug when XenServer was moving from
> classic-xen to PVOps where using irqbalance would cause dom0 to
> occasionally lose interrupts.
>
> ~Andrew
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel

2019-02-26 10:05:41

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

Hi Roger,

On 26/02/2019 09:44, Roger Pau Monné wrote:
> On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
>> On 26/02/2019 09:14, Roger Pau Monné wrote:
>>> On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
>>>> Hi Oleksandr,
>>>>
>>>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>>>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>>>>> Discussing with my team, a solution that came up would be to
>>>>>>>> introduce one atomic field per event to record the number of
>>>>>>>> event received. I will explore that solution tomorrow.
>>>>>>> How will this help if events have some payload?
>>>>>> What payload? The event channel does not carry any payload. It only
>>>>>> notify you that something happen. Then this is up to the user to
>>>>>> decide what to you with it.
>>>>> Sorry, I was probably not precise enough. I mean that an event might have
>>>>> associated payload in the ring buffer, for example [1]. So, counting events
>>>>> may help somehow, but the ring's data may still be lost
>>>> From my understanding of event channels are edge interrupts. By definition,
>>> IMO event channels are active high level interrupts.
>>>
>>> Let's take into account the following situation: you have an event
>>> channel masked and the event channel pending bit (akin to the line on
>>> bare metal) goes from low to high (0 -> 1), then you unmask the
>>> interrupt and you get an event injected. If it was an edge interrupt
>>> you wont get an event injected after unmasking, because you would
>>> have lost the edge. I think the problem here is that Linux treats
>>> event channels as edge interrupts, when they are actually level.
>>
>> Event channels are edge interrupts.  There are several very subtle bugs
>> to be had by software which treats them as line interrupts.
>>
>> Most critically, if you fail to ack them, rebind them to a new vcpu, and
>> reenable interrupts, you don't get a new interrupt notification.  This
>> was the source of a 4 month bug when XenServer was moving from
>> classic-xen to PVOps where using irqbalance would cause dom0 to
>> occasionally lose interrupts.
>
> I would argue that you need to mask them first, rebind to a new vcpu
> and unmask, and then you will get an interrupt notification, or this
> should be fixed in Xen to work as you expect: trigger an interrupt
> notification when moving an asserted event channel between CPUs.
>
> Is there any document that describes how such non trivial things (like
> moving between CPUs) work for event/level interrupts?
>
> Maybe I'm being obtuse, but from the example I gave above it's quite
> clear to me event channels don't get triggered based on edge changes,
> but rather on the line level.

Your example above is not enough to give the semantics of level. You would only
use the MASK bit if your interrupt handler is threaded to avoid the interrupt
coming up again.

So if you remove the mask from the equation, then the interrupt flow should be:

1) handle interrupt
2) EOI

The EOI in our case would be clearing the PENDING state. In a proper level
interrupt, the state would stay PENDING if there were more to come. This is not
the case with the events and you will lose the interrupt.

So I don't think they are proper level interrupts. They have more a semantics of
edge interrupts with some property of level (i.e for the mask/unmask).

Cheers,

--
Julien Grall

2019-02-26 10:18:25

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On Tue, Feb 26, 2019 at 10:03:38AM +0000, Julien Grall wrote:
> Hi Roger,
>
> On 26/02/2019 09:44, Roger Pau Monn? wrote:
> > On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
> > > On 26/02/2019 09:14, Roger Pau Monn? wrote:
> > > > On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
> > > > > Hi Oleksandr,
> > > > >
> > > > > On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> > > > > > On 2/22/19 3:33 PM, Julien Grall wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > > > > > > > On 2/20/19 10:46 PM, Julien Grall wrote:
> > > > > > > > > Discussing with my team, a solution that came up would be to
> > > > > > > > > introduce one atomic field per event to record the number of
> > > > > > > > > event received. I will explore that solution tomorrow.
> > > > > > > > How will this help if events have some payload?
> > > > > > > What payload? The event channel does not carry any payload. It only
> > > > > > > notify you that something happen. Then this is up to the user to
> > > > > > > decide what to you with it.
> > > > > > Sorry, I was probably not precise enough. I mean that an event might have
> > > > > > associated payload in the ring buffer, for example [1]. So, counting events
> > > > > > may help somehow, but the ring's data may still be lost
> > > > > From my understanding of event channels are edge interrupts. By definition,
> > > > IMO event channels are active high level interrupts.
> > > >
> > > > Let's take into account the following situation: you have an event
> > > > channel masked and the event channel pending bit (akin to the line on
> > > > bare metal) goes from low to high (0 -> 1), then you unmask the
> > > > interrupt and you get an event injected. If it was an edge interrupt
> > > > you wont get an event injected after unmasking, because you would
> > > > have lost the edge. I think the problem here is that Linux treats
> > > > event channels as edge interrupts, when they are actually level.
> > >
> > > Event channels are edge interrupts.? There are several very subtle bugs
> > > to be had by software which treats them as line interrupts.
> > >
> > > Most critically, if you fail to ack them, rebind them to a new vcpu, and
> > > reenable interrupts, you don't get a new interrupt notification.? This
> > > was the source of a 4 month bug when XenServer was moving from
> > > classic-xen to PVOps where using irqbalance would cause dom0 to
> > > occasionally lose interrupts.
> >
> > I would argue that you need to mask them first, rebind to a new vcpu
> > and unmask, and then you will get an interrupt notification, or this
> > should be fixed in Xen to work as you expect: trigger an interrupt
> > notification when moving an asserted event channel between CPUs.
> >
> > Is there any document that describes how such non trivial things (like
> > moving between CPUs) work for event/level interrupts?
> >
> > Maybe I'm being obtuse, but from the example I gave above it's quite
> > clear to me event channels don't get triggered based on edge changes,
> > but rather on the line level.
>
> Your example above is not enough to give the semantics of level. You would
> only use the MASK bit if your interrupt handler is threaded to avoid the
> interrupt coming up again.
>
> So if you remove the mask from the equation, then the interrupt flow should be:
>
> 1) handle interrupt
> 2) EOI

This is bogus if you don't mask the interrupt source. You should
instead do

1) EOI
2) Handle interrupt

And loop over this.

> The EOI in our case would be clearing the PENDING state. In a proper level
> interrupt, the state would stay PENDING if there were more to come. This is
> not the case with the events and you will lose the interrupt.
>
> So I don't think they are proper level interrupts. They have more a
> semantics of edge interrupts with some property of level (i.e for the
> mask/unmask).

OK, I guess it depends on how you look at it, to me event channels are
maybe quirky level interrupts, but are definitely closer to level than
edge interrupts, specially taking into account the interrupt injection
that happens on unmask of a pending line, there's no such thing at all
with edge interrupts.

Thanks, Roger.

2019-02-26 10:27:24

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

Hi,

On 26/02/2019 10:17, Roger Pau Monné wrote:
> On Tue, Feb 26, 2019 at 10:03:38AM +0000, Julien Grall wrote:
>> Hi Roger,
>>
>> On 26/02/2019 09:44, Roger Pau Monné wrote:
>>> On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
>>>> On 26/02/2019 09:14, Roger Pau Monné wrote:
>>>>> On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
>>>>>> Hi Oleksandr,
>>>>>>
>>>>>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>>>>>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>>>>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>>>>>>> Discussing with my team, a solution that came up would be to
>>>>>>>>>> introduce one atomic field per event to record the number of
>>>>>>>>>> event received. I will explore that solution tomorrow.
>>>>>>>>> How will this help if events have some payload?
>>>>>>>> What payload? The event channel does not carry any payload. It only
>>>>>>>> notify you that something happen. Then this is up to the user to
>>>>>>>> decide what to you with it.
>>>>>>> Sorry, I was probably not precise enough. I mean that an event might have
>>>>>>> associated payload in the ring buffer, for example [1]. So, counting events
>>>>>>> may help somehow, but the ring's data may still be lost
>>>>>> From my understanding of event channels are edge interrupts. By definition,
>>>>> IMO event channels are active high level interrupts.
>>>>>
>>>>> Let's take into account the following situation: you have an event
>>>>> channel masked and the event channel pending bit (akin to the line on
>>>>> bare metal) goes from low to high (0 -> 1), then you unmask the
>>>>> interrupt and you get an event injected. If it was an edge interrupt
>>>>> you wont get an event injected after unmasking, because you would
>>>>> have lost the edge. I think the problem here is that Linux treats
>>>>> event channels as edge interrupts, when they are actually level.
>>>>
>>>> Event channels are edge interrupts.  There are several very subtle bugs
>>>> to be had by software which treats them as line interrupts.
>>>>
>>>> Most critically, if you fail to ack them, rebind them to a new vcpu, and
>>>> reenable interrupts, you don't get a new interrupt notification.  This
>>>> was the source of a 4 month bug when XenServer was moving from
>>>> classic-xen to PVOps where using irqbalance would cause dom0 to
>>>> occasionally lose interrupts.
>>>
>>> I would argue that you need to mask them first, rebind to a new vcpu
>>> and unmask, and then you will get an interrupt notification, or this
>>> should be fixed in Xen to work as you expect: trigger an interrupt
>>> notification when moving an asserted event channel between CPUs.
>>>
>>> Is there any document that describes how such non trivial things (like
>>> moving between CPUs) work for event/level interrupts?
>>>
>>> Maybe I'm being obtuse, but from the example I gave above it's quite
>>> clear to me event channels don't get triggered based on edge changes,
>>> but rather on the line level.
>>
>> Your example above is not enough to give the semantics of level. You would
>> only use the MASK bit if your interrupt handler is threaded to avoid the
>> interrupt coming up again.
>>
>> So if you remove the mask from the equation, then the interrupt flow should be:
>>
>> 1) handle interrupt
>> 2) EOI
>
> This is bogus if you don't mask the interrupt source. You should
> instead do
>
> 1) EOI
> 2) Handle interrupt
>
> And loop over this.
So that's not a level semantics. It is a edge one :). In the level case, you
would clear the state once you are done with the interrupt.

Also, it would be ACK and not EOI.

>
>> The EOI in our case would be clearing the PENDING state. In a proper level
>> interrupt, the state would stay PENDING if there were more to come. This is
>> not the case with the events and you will lose the interrupt.
>>
>> So I don't think they are proper level interrupts. They have more a
>> semantics of edge interrupts with some property of level (i.e for the
>> mask/unmask).
>
> OK, I guess it depends on how you look at it, to me event channels are
> maybe quirky level interrupts, but are definitely closer to level than
> edge interrupts, specially taking into account the interrupt injection
> that happens on unmask of a pending line, there's no such thing at all
> with edge interrupts.

Cheers,

--
Julien Grall

2019-02-26 11:05:26

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

On Tue, Feb 26, 2019 at 10:26:21AM +0000, Julien Grall wrote:
> Hi,
>
> On 26/02/2019 10:17, Roger Pau Monn? wrote:
> > On Tue, Feb 26, 2019 at 10:03:38AM +0000, Julien Grall wrote:
> > > Hi Roger,
> > >
> > > On 26/02/2019 09:44, Roger Pau Monn? wrote:
> > > > On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
> > > > > On 26/02/2019 09:14, Roger Pau Monn? wrote:
> > > > > > On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
> > > > > > > Hi Oleksandr,
> > > > > > >
> > > > > > > On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> > > > > > > > On 2/22/19 3:33 PM, Julien Grall wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > > > > > > > > > On 2/20/19 10:46 PM, Julien Grall wrote:
> > > > > > > > > > > Discussing with my team, a solution that came up would be to
> > > > > > > > > > > introduce one atomic field per event to record the number of
> > > > > > > > > > > event received. I will explore that solution tomorrow.
> > > > > > > > > > How will this help if events have some payload?
> > > > > > > > > What payload? The event channel does not carry any payload. It only
> > > > > > > > > notify you that something happen. Then this is up to the user to
> > > > > > > > > decide what to you with it.
> > > > > > > > Sorry, I was probably not precise enough. I mean that an event might have
> > > > > > > > associated payload in the ring buffer, for example [1]. So, counting events
> > > > > > > > may help somehow, but the ring's data may still be lost
> > > > > > > From my understanding of event channels are edge interrupts. By definition,
> > > > > > IMO event channels are active high level interrupts.
> > > > > >
> > > > > > Let's take into account the following situation: you have an event
> > > > > > channel masked and the event channel pending bit (akin to the line on
> > > > > > bare metal) goes from low to high (0 -> 1), then you unmask the
> > > > > > interrupt and you get an event injected. If it was an edge interrupt
> > > > > > you wont get an event injected after unmasking, because you would
> > > > > > have lost the edge. I think the problem here is that Linux treats
> > > > > > event channels as edge interrupts, when they are actually level.
> > > > >
> > > > > Event channels are edge interrupts.? There are several very subtle bugs
> > > > > to be had by software which treats them as line interrupts.
> > > > >
> > > > > Most critically, if you fail to ack them, rebind them to a new vcpu, and
> > > > > reenable interrupts, you don't get a new interrupt notification.? This
> > > > > was the source of a 4 month bug when XenServer was moving from
> > > > > classic-xen to PVOps where using irqbalance would cause dom0 to
> > > > > occasionally lose interrupts.
> > > >
> > > > I would argue that you need to mask them first, rebind to a new vcpu
> > > > and unmask, and then you will get an interrupt notification, or this
> > > > should be fixed in Xen to work as you expect: trigger an interrupt
> > > > notification when moving an asserted event channel between CPUs.
> > > >
> > > > Is there any document that describes how such non trivial things (like
> > > > moving between CPUs) work for event/level interrupts?
> > > >
> > > > Maybe I'm being obtuse, but from the example I gave above it's quite
> > > > clear to me event channels don't get triggered based on edge changes,
> > > > but rather on the line level.
> > >
> > > Your example above is not enough to give the semantics of level. You would
> > > only use the MASK bit if your interrupt handler is threaded to avoid the
> > > interrupt coming up again.
> > >
> > > So if you remove the mask from the equation, then the interrupt flow should be:
> > >
> > > 1) handle interrupt
> > > 2) EOI
> >
> > This is bogus if you don't mask the interrupt source. You should
> > instead do
> >
> > 1) EOI
> > 2) Handle interrupt
> >
> > And loop over this.
> So that's not a level semantics. It is a edge one :). In the level case, you
> would clear the state once you are done with the interrupt.
>
> Also, it would be ACK and not EOI.

For level triggered interrupts you have to somehow signal the device
to stop asserting the line, which doesn't happen for Xen devices
because they just signal interrupts to Xen, but don't have a way to
keep event channels asserted, so I agree that this is different from
traditional level interrupts because devices using event channels
don't have a way to keep lines asserted.

I guess the most similar native interrupt is MSI with masking
support?

Roger.

2019-02-27 11:31:08

by Julien Grall

[permalink] [raw]
Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq

Hi,

On 2/26/19 11:02 AM, Roger Pau Monné wrote:
> On Tue, Feb 26, 2019 at 10:26:21AM +0000, Julien Grall wrote:
>> On 26/02/2019 10:17, Roger Pau Monné wrote:
>>> On Tue, Feb 26, 2019 at 10:03:38AM +0000, Julien Grall wrote:
>>>> Hi Roger,
>>>>
>>>> On 26/02/2019 09:44, Roger Pau Monné wrote:
>>>>> On Tue, Feb 26, 2019 at 09:30:07AM +0000, Andrew Cooper wrote:
>>>>>> On 26/02/2019 09:14, Roger Pau Monné wrote:
>>>>>>> On Mon, Feb 25, 2019 at 01:55:42PM +0000, Julien Grall wrote:
>>>>>>>> Hi Oleksandr,
>>>>>>>>
>>>>>>>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>>>>>>>> On 2/22/19 3:33 PM, Julien Grall wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
>>>>>>>>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
>>>>>>>>>>>> Discussing with my team, a solution that came up would be to
>>>>>>>>>>>> introduce one atomic field per event to record the number of
>>>>>>>>>>>> event received. I will explore that solution tomorrow.
>>>>>>>>>>> How will this help if events have some payload?
>>>>>>>>>> What payload? The event channel does not carry any payload. It only
>>>>>>>>>> notify you that something happen. Then this is up to the user to
>>>>>>>>>> decide what to you with it.
>>>>>>>>> Sorry, I was probably not precise enough. I mean that an event might have
>>>>>>>>> associated payload in the ring buffer, for example [1]. So, counting events
>>>>>>>>> may help somehow, but the ring's data may still be lost
>>>>>>>> From my understanding of event channels are edge interrupts. By definition,
>>>>>>> IMO event channels are active high level interrupts.
>>>>>>>
>>>>>>> Let's take into account the following situation: you have an event
>>>>>>> channel masked and the event channel pending bit (akin to the line on
>>>>>>> bare metal) goes from low to high (0 -> 1), then you unmask the
>>>>>>> interrupt and you get an event injected. If it was an edge interrupt
>>>>>>> you wont get an event injected after unmasking, because you would
>>>>>>> have lost the edge. I think the problem here is that Linux treats
>>>>>>> event channels as edge interrupts, when they are actually level.
>>>>>>
>>>>>> Event channels are edge interrupts.  There are several very subtle bugs
>>>>>> to be had by software which treats them as line interrupts.
>>>>>>
>>>>>> Most critically, if you fail to ack them, rebind them to a new vcpu, and
>>>>>> reenable interrupts, you don't get a new interrupt notification.  This
>>>>>> was the source of a 4 month bug when XenServer was moving from
>>>>>> classic-xen to PVOps where using irqbalance would cause dom0 to
>>>>>> occasionally lose interrupts.
>>>>>
>>>>> I would argue that you need to mask them first, rebind to a new vcpu
>>>>> and unmask, and then you will get an interrupt notification, or this
>>>>> should be fixed in Xen to work as you expect: trigger an interrupt
>>>>> notification when moving an asserted event channel between CPUs.
>>>>>
>>>>> Is there any document that describes how such non trivial things (like
>>>>> moving between CPUs) work for event/level interrupts?
>>>>>
>>>>> Maybe I'm being obtuse, but from the example I gave above it's quite
>>>>> clear to me event channels don't get triggered based on edge changes,
>>>>> but rather on the line level.
>>>>
>>>> Your example above is not enough to give the semantics of level. You would
>>>> only use the MASK bit if your interrupt handler is threaded to avoid the
>>>> interrupt coming up again.
>>>>
>>>> So if you remove the mask from the equation, then the interrupt flow should be:
>>>>
>>>> 1) handle interrupt
>>>> 2) EOI
>>>
>>> This is bogus if you don't mask the interrupt source. You should
>>> instead do
>>>
>>> 1) EOI
>>> 2) Handle interrupt
>>>
>>> And loop over this.
>> So that's not a level semantics. It is a edge one :). In the level case, you
>> would clear the state once you are done with the interrupt.
>>
>> Also, it would be ACK and not EOI.
>
> For level triggered interrupts you have to somehow signal the device
> to stop asserting the line, which doesn't happen for Xen devices
> because they just signal interrupts to Xen, but don't have a way to
> keep event channels asserted, so I agree that this is different from
> traditional level interrupts because devices using event channels
> don't have a way to keep lines asserted.
>
> I guess the most similar native interrupt is MSI with masking
> support?

I don't know enough about MSI with masking support to be able to draw a
comparison :).

The flow I have been suggested to re-use in Linux is
handle_fasteoi_ack_irq. I haven't yet had time to have a try at it.

Cheers,

--
Julien Grall