2021-05-19 20:19:45

by Matthew Wilcox

Subject: Re: [RFC] trace: Add option for polling ring buffers

On Wed, May 19, 2021 at 07:57:55PM +0200, Nicolas Saenz Julienne wrote:
> To minimize trace's effect on isolated CPUs, that is, CPUs where only a
> handful of processes, or a single one, are allowed to run, introduce a new
> trace option: 'poll-rb'.

maybe this should take a parameter in ms (us?) saying how frequently
to poll? it seems like a reasonable assumption that somebody running in
this kind of RT environment would be able to judge how often their
monitoring task needs to collect data.
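
For illustration only, a reader with a configurable period might look roughly like the userspace sketch below. This is not the patch under discussion: poll_ring_buffer(), period_ms and rounds are hypothetical names, and the descriptor is assumed to be opened non-blocking (for instance on a per-CPU trace_pipe_raw file).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: drain whatever is currently readable from fd,
 * sleep for period_ms, and repeat for `rounds` iterations.
 * Returns the total number of bytes read, or -1 on a read error.
 */
static long poll_ring_buffer(int fd, unsigned int period_ms, int rounds)
{
    char page[4096];
    long total = 0;
    struct timespec period = {
        .tv_sec = period_ms / 1000,
        .tv_nsec = (long)(period_ms % 1000) * 1000000L,
    };

    for (int i = 0; i < rounds; i++) {
        for (;;) {
            ssize_t n = read(fd, page, sizeof(page));

            if (n > 0) {
                total += n;      /* got data, keep draining */
                continue;
            }
            if (n == 0)
                break;           /* end of file: nothing more to read */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;           /* buffer empty for now */
            if (errno == EINTR)
                continue;        /* interrupted, retry the read */
            return -1;           /* unexpected error */
        }
        /* Nobody wakes us up; we simply come back after the period. */
        nanosleep(&period, NULL);
    }
    return total;
}
```

The point of the parameter is that the reader, not the traced CPU, pays for the wake-ups: a shorter period means fresher data and more work on the housekeeping CPU, while the isolated CPU is never interrupted either way.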

> [1] The IPI, in this case an irq_work, is needed since trace might run
> in NMI context, which is not suitable for wake-ups.

could we also consider a try-wakeup which would not succeed if in NMI
context? or are there situations where we only gather data in NMI
context, and so would never succeed in waking up? if so, maybe
schedule the irq_work every 1000 failures to wake up.
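
To make the suggestion concrete, here is a small userspace model of that policy. It is only a sketch of the counting logic, not kernel code: struct waker_model, model_try_wakeup() and WAKEUP_FAILURE_LIMIT are hypothetical names, and the "wakeups" are just counters.

```c
#include <stdbool.h>

/* Assumed threshold, taken from the "every 1000 failures" suggestion. */
#define WAKEUP_FAILURE_LIMIT 1000

/* Hypothetical bookkeeping for one ring buffer's wakeup attempts. */
struct waker_model {
    unsigned int failures;          /* consecutive attempts made from NMI */
    unsigned long direct_wakeups;   /* wakeups done on the spot */
    unsigned long deferred_wakeups; /* wakeups that would need an irq_work */
};

/*
 * Model one wakeup attempt. Outside NMI context the wakeup succeeds
 * immediately. In NMI context it fails, and only every
 * WAKEUP_FAILURE_LIMIT-th consecutive failure falls back to the
 * deferred (irq_work-like) path.
 */
static void model_try_wakeup(struct waker_model *w, bool in_nmi)
{
    if (!in_nmi) {
        w->direct_wakeups++;
        w->failures = 0;
        return;
    }

    if (++w->failures >= WAKEUP_FAILURE_LIMIT) {
        w->deferred_wakeups++;
        w->failures = 0;
    }
}
```

Under this model a workload that only traces from NMI context still gets its reader woken, but at one deferred wakeup per 1000 events rather than one per event.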



2021-05-19 20:22:16

by Marcelo Tosatti

Subject: Re: [RFC] trace: Add option for polling ring buffers


Hi Willy,

On Wed, May 19, 2021 at 07:07:54PM +0100, Matthew Wilcox wrote:
> On Wed, May 19, 2021 at 07:57:55PM +0200, Nicolas Saenz Julienne wrote:
> > To minimize trace's effect on isolated CPUs, that is, CPUs where only a
> > handful of processes, or a single one, are allowed to run, introduce a new
> > trace option: 'poll-rb'.
>
> maybe this should take a parameter in ms (us?) saying how frequently
> to poll? it seems like a reasonable assumption that somebody running in
> this kind of RT environment would be able to judge how often their
> monitoring task needs to collect data.

+1 (yes please).

> > [1] The IPI, in this case an irq_work, is needed since trace might run
> > in NMI context, which is not suitable for wake-ups.
>
> could we also consider a try-wakeup which would not succeed if in NMI
> context? or are there situations where we only gather data in NMI
> context, and so would never succeed in waking up? if so, maybe
> schedule the irq_work every 1000 failures to wake up.

We'd like to reduce overhead on the isolated (as in isolcpus=) CPUs as
much as possible (but yes this option was suggested).


2021-05-20 14:40:51

by Nicolas Saenz Julienne

Subject: Re: [RFC] trace: Add option for polling ring buffers

Hi Matthew, thanks for your comments.

On Wed, 2021-05-19 at 19:07 +0100, Matthew Wilcox wrote:
> On Wed, May 19, 2021 at 07:57:55PM +0200, Nicolas Saenz Julienne wrote:
> > To minimize trace's effect on isolated CPUs, that is, CPUs where only a
> > handful of processes, or a single one, are allowed to run, introduce a new
> > trace option: 'poll-rb'.
>
> maybe this should take a parameter in ms (us?) saying how frequently
> to poll? it seems like a reasonable assumption that somebody running in
> this kind of RT environment would be able to judge how often their
> monitoring task needs to collect data.

I'll look into it.

> > [1] The IPI, in this case an irq_work, is needed since trace might run
> > in NMI context, which is not suitable for wake-ups.
>
> could we also consider a try-wakeup which would not succeed if in NMI
> context?

Yes, in a similar vein, my original idea was to defer the wakeup to a
non-isolated CPU using irq_work_on(). But that irq_work flavor is not NMI safe
(nor is any other IPI mechanism that targets other CPUs).

> or are there situations where we only gather data in NMI
> context, and so would never succeed in waking up?

Yes, that's a use case. For example, 'trace-cmd record -e nmi'.

> if so, maybe schedule the irq_work every 1000 failures to wake up.

You'd be generating latency spikes nonetheless, which might eventually break
the isolated application's latency requirements.

As Marcelo said, the less code we run on the isolated CPU, the better.

--
Nicolás Sáenz