2023-09-26 03:25:42

by Ivaylo Dimitrov

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic pwm-ir-tx



On 1.09.23 г. 17:18 ч., Sean Young wrote:
> The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
> handled entirely by the generic pwm-ir-tx driver, and in fact the
> pwm-ir-tx driver has been compatible with ir-rx51 from the start.
>

Unfortunately, pwm-ir-tx does not work on n900. My investigation shows
that for some reason usleep_range() sleeps for at least 300-400 us more
than what interval it is requested to sleep. I played with cyclictest
from rt-tests package and it gives similar results - increasing the
priority helps, but I was not able to make it sleep for less that 300 us
in average. I tried cpu_latency_qos_add_request() in pwm-ir-tx, but it
made no difference.

I get similar results on motorola droid4 (OMAP4), albeit there average
sleep is in 200-300 us range, which makes me believe that either OMAPs
have issues with hrtimers or the config we use has some issue which
leads to scheduler latency. Or, something else...

In either case help is appreciated to dig further trying to find the
reason for such a big delay.

Regards,
Ivo


2023-09-26 13:43:29

by Sean Young

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic pwm-ir-tx

On Mon, Sep 25, 2023 at 07:06:44PM +0300, Ivaylo Dimitrov wrote:
> On 1.09.23 г. 17:18 ч., Sean Young wrote:
> > The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
> > handled entirely by the generic pwm-ir-tx driver, and in fact the
> > pwm-ir-tx driver has been compatible with ir-rx51 from the start.
> >
>
> Unfortunately, pwm-ir-tx does not work on n900. My investigation shows that
> for some reason usleep_range() sleeps for at least 300-400 us more than what
> interval it is requested to sleep. I played with cyclictest from rt-tests
> package and it gives similar results - increasing the priority helps, but I
> was not able to make it sleep for less that 300 us in average. I tried
> cpu_latency_qos_add_request() in pwm-ir-tx, but it made no difference.
>
> I get similar results on motorola droid4 (OMAP4), albeit there average sleep
> is in 200-300 us range, which makes me believe that either OMAPs have issues
> with hrtimers or the config we use has some issue which leads to scheduler
> latency. Or, something else...

The pwm-ir-tx driver does suffer from this problem, but I was under the
impression that the ir-rx51 has the same problem.

> In either case help is appreciated to dig further trying to find the reason
> for such a big delay.

pwm-ir-tx uses usleep_range() and ir-rx51 uses hrtimers. I thought that
usleep_range() uses hrtimers; however if you're not seeing the same delay
on ir-rx51 then maybe it's time to switch pwm-ir-tx to hrtimers.

I don't have a n900 to test on, unfortunately.

Thanks
Sean

2023-09-26 13:51:23

by Ivaylo Dimitrov

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic pwm-ir-tx



On 26.09.23 г. 10:16 ч., Sean Young wrote:
> On Mon, Sep 25, 2023 at 07:06:44PM +0300, Ivaylo Dimitrov wrote:
>> On 1.09.23 г. 17:18 ч., Sean Young wrote:
>>> The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
>>> handled entirely by the generic pwm-ir-tx driver, and in fact the
>>> pwm-ir-tx driver has been compatible with ir-rx51 from the start.
>>>
>>
>> Unfortunately, pwm-ir-tx does not work on n900. My investigation shows that
>> for some reason usleep_range() sleeps for at least 300-400 us more than what
>> interval it is requested to sleep. I played with cyclictest from rt-tests
>> package and it gives similar results - increasing the priority helps, but I
>> was not able to make it sleep for less that 300 us in average. I tried
>> cpu_latency_qos_add_request() in pwm-ir-tx, but it made no difference.
>>
>> I get similar results on motorola droid4 (OMAP4), albeit there average sleep
>> is in 200-300 us range, which makes me believe that either OMAPs have issues
>> with hrtimers or the config we use has some issue which leads to scheduler
>> latency. Or, something else...
>
> The pwm-ir-tx driver does suffer from this problem, but I was under the
> impression that the ir-rx51 has the same problem.
>

Could you elaborate on the "pwm-ir-tx driver does suffer from this
problem"? Where do you see that?

ir-rx51 does not suffer from the same problem (albeit it has its own
one, see bellow)

>> In either case help is appreciated to dig further trying to find the reason
>> for such a big delay.
>
> pwm-ir-tx uses usleep_range() and ir-rx51 uses hrtimers. I thought that
> usleep_range() uses hrtimers; however if you're not seeing the same delay
> on ir-rx51 then maybe it's time to switch pwm-ir-tx to hrtimers.
>

usleep_range() is backed by hrtimers already, however the difference
comes from how hrtimer is used in ir-rx51: it uses timer callback
function that gets called in softirq context, while usleep_range() puts
the task in TASK_UNINTERRUPTIBLE state and then calls
schedule_hrtimeout_range(). For some reason it takes at least 200-400 us
(on average) even on OMAP4 to switch back to TASK_RUNNING state.

The issue with ir-rx51 and the way it uses hrtimers is that it calls
pwm_apply_state() from hrtimer function, which is not ok, per the
comment here
https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/pwm/core.c#L502

I can make pwm-ir-tx switch to hrtimers, that's not an issue, but I am
afraid that there is some general scheduler or timers (or something
else) issue that manifests itself with usleep_range() misbehaving.

> I don't have a n900 to test on, unfortunately.
>

I have and once I have an idea what's going on will port pwm-ir-tx to
hrtimers, if needed. Don't want to do it now as I am afraid the
completion I will have to use will have the same latency problems as
usleep_range()

Thanks,
Ivo

> Thanks
> Sean
>

2023-09-27 09:44:27

by Sean Young

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic pwm-ir-tx

On Tue, Sep 26, 2023 at 03:43:18PM +0300, Ivaylo Dimitrov wrote:
> On 26.09.23 г. 10:16 ч., Sean Young wrote:
> > On Mon, Sep 25, 2023 at 07:06:44PM +0300, Ivaylo Dimitrov wrote:
> > > On 1.09.23 г. 17:18 ч., Sean Young wrote:
> > > > The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
> > > > handled entirely by the generic pwm-ir-tx driver, and in fact the
> > > > pwm-ir-tx driver has been compatible with ir-rx51 from the start.
> > > >
> > >
> > > Unfortunately, pwm-ir-tx does not work on n900. My investigation shows that
> > > for some reason usleep_range() sleeps for at least 300-400 us more than what
> > > interval it is requested to sleep. I played with cyclictest from rt-tests
> > > package and it gives similar results - increasing the priority helps, but I
> > > was not able to make it sleep for less that 300 us in average. I tried
> > > cpu_latency_qos_add_request() in pwm-ir-tx, but it made no difference.
> > >
> > > I get similar results on motorola droid4 (OMAP4), albeit there average sleep
> > > is in 200-300 us range, which makes me believe that either OMAPs have issues
> > > with hrtimers or the config we use has some issue which leads to scheduler
> > > latency. Or, something else...
> >
> > The pwm-ir-tx driver does suffer from this problem, but I was under the
> > impression that the ir-rx51 has the same problem.
> >
>
> Could you elaborate on the "pwm-ir-tx driver does suffer from this problem"?
> Where do you see that?

So on a raspberry pi (model 3b), if I use the pwm-ir-tx driver, I get random
delays of up to 100us. It's a bit random and certainly depends on the load.

I'm measuring using a logic analyzer.

There have been reports by others on different machines with random delays
and/or transmit failures (as in the receiver occassionally fails to decode
the IR). I usually suggest they use the gpio-ir-tx driver, which does work
as far as I know (the signal looks perfect with a logic analyzer).

So far I've taken the view that the driver works ok for most situations,
since IR is usually fine with upto 100us missing here or there.

The gpio-ir-tx driver works much better because it does the entire send
under spinlock - obviously that has its own problems, because an IR transmit
can be 10s or even 100s of milliseconds.

I've never known of a solution to the pwm-ir-tx driver. If using hrtimers
directly improves the situation even a bit, then that would be great.

> ir-rx51 does not suffer from the same problem (albeit it has its own one,
> see bellow)
>
> > > In either case help is appreciated to dig further trying to find the reason
> > > for such a big delay.
> >
> > pwm-ir-tx uses usleep_range() and ir-rx51 uses hrtimers. I thought that
> > usleep_range() uses hrtimers; however if you're not seeing the same delay
> > on ir-rx51 then maybe it's time to switch pwm-ir-tx to hrtimers.
> >
>
> usleep_range() is backed by hrtimers already, however the difference comes
> from how hrtimer is used in ir-rx51: it uses timer callback function that
> gets called in softirq context, while usleep_range() puts the task in
> TASK_UNINTERRUPTIBLE state and then calls schedule_hrtimeout_range(). For
> some reason it takes at least 200-400 us (on average) even on OMAP4 to
> switch back to TASK_RUNNING state.
>
> The issue with ir-rx51 and the way it uses hrtimers is that it calls
> pwm_apply_state() from hrtimer function, which is not ok, per the comment
> here
> https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/pwm/core.c#L502
>
> I can make pwm-ir-tx switch to hrtimers, that's not an issue, but I am
> afraid that there is some general scheduler or timers (or something else)
> issue that manifests itself with usleep_range() misbehaving.

If we can switch pwm-ir-tx to hrtimers, that would be great.

The ir-rx51 removal patches have already been queued to media_staging;
we may have to remove them from there if we can't solve this problem.

> > I don't have a n900 to test on, unfortunately.
> >
>
> I have and once I have an idea what's going on will port pwm-ir-tx to
> hrtimers, if needed. Don't want to do it now as I am afraid the completion I
> will have to use will have the same latency problems as usleep_range()

That would be fantastic. Please do keep us up to date with how you are
getting on. Like I said, it would be nice to this resolved before the next
merge window.

Thanks,
Sean

2023-09-30 07:05:37

by Ivaylo Dimitrov

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic pwm-ir-tx

Hi,

On 26.09.23 г. 23:18 ч., Sean Young wrote:
> On Tue, Sep 26, 2023 at 03:43:18PM +0300, Ivaylo Dimitrov wrote:
>> On 26.09.23 г. 10:16 ч., Sean Young wrote:
>>> On Mon, Sep 25, 2023 at 07:06:44PM +0300, Ivaylo Dimitrov wrote:
>>>> On 1.09.23 г. 17:18 ч., Sean Young wrote:
>>>>> The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
>>>>> handled entirely by the generic pwm-ir-tx driver, and in fact the
>>>>> pwm-ir-tx driver has been compatible with ir-rx51 from the start.
>>>>>
>>>>
>>>> Unfortunately, pwm-ir-tx does not work on n900. My investigation shows that
>>>> for some reason usleep_range() sleeps for at least 300-400 us more than what
>>>> interval it is requested to sleep. I played with cyclictest from rt-tests
>>>> package and it gives similar results - increasing the priority helps, but I
>>>> was not able to make it sleep for less that 300 us in average. I tried
>>>> cpu_latency_qos_add_request() in pwm-ir-tx, but it made no difference.
>>>>
>>>> I get similar results on motorola droid4 (OMAP4), albeit there average sleep
>>>> is in 200-300 us range, which makes me believe that either OMAPs have issues
>>>> with hrtimers or the config we use has some issue which leads to scheduler
>>>> latency. Or, something else...
>>>
>>> The pwm-ir-tx driver does suffer from this problem, but I was under the
>>> impression that the ir-rx51 has the same problem.
>>>
>>
>> Could you elaborate on the "pwm-ir-tx driver does suffer from this problem"?
>> Where do you see that?
>
> So on a raspberry pi (model 3b), if I use the pwm-ir-tx driver, I get random
> delays of up to 100us. It's a bit random and certainly depends on the load.
>
> I'm measuring using a logic analyzer.
>
> There have been reports by others on different machines with random delays
> and/or transmit failures (as in the receiver occassionally fails to decode
> the IR). I usually suggest they use the gpio-ir-tx driver, which does work
> as far as I know (the signal looks perfect with a logic analyzer).
>
> So far I've taken the view that the driver works ok for most situations,
> since IR is usually fine with upto 100us missing here or there.
>
> The gpio-ir-tx driver works much better because it does the entire send
> under spinlock - obviously that has its own problems, because an IR transmit
> can be 10s or even 100s of milliseconds.
>
> I've never known of a solution to the pwm-ir-tx driver. If using hrtimers
> directly improves the situation even a bit, then that would be great.
>

The issue with hrtimers is that we cannot use them directly, as
pwm_apply_state() may sleep, but hrtimer function is called in atomic
context.

>> ir-rx51 does not suffer from the same problem (albeit it has its own one,
>> see bellow)
>>
>>>> In either case help is appreciated to dig further trying to find the reason
>>>> for such a big delay.
>>>
>>> pwm-ir-tx uses usleep_range() and ir-rx51 uses hrtimers. I thought that
>>> usleep_range() uses hrtimers; however if you're not seeing the same delay
>>> on ir-rx51 then maybe it's time to switch pwm-ir-tx to hrtimers.
>>>
>>
>> usleep_range() is backed by hrtimers already, however the difference comes
>> from how hrtimer is used in ir-rx51: it uses timer callback function that
>> gets called in softirq context, while usleep_range() puts the task in
>> TASK_UNINTERRUPTIBLE state and then calls schedule_hrtimeout_range(). For
>> some reason it takes at least 200-400 us (on average) even on OMAP4 to
>> switch back to TASK_RUNNING state.
>>
>> The issue with ir-rx51 and the way it uses hrtimers is that it calls
>> pwm_apply_state() from hrtimer function, which is not ok, per the comment
>> here
>> https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/pwm/core.c#L502
>>
>> I can make pwm-ir-tx switch to hrtimers, that's not an issue, but I am
>> afraid that there is some general scheduler or timers (or something else)
>> issue that manifests itself with usleep_range() misbehaving.
>
> If we can switch pwm-ir-tx to hrtimers, that would be great.
>

I made some POC here, but unfortunately it failed more or less. The idea
of POC is: setup hrtimer, start it in pwm_ir_tx() and do
wait_for_completion() in a loop while calling complete() for the timer
function. While it improves things a bit, I wouldn't say it makes the
driver working ok on n900 - my TV registers one of let's say 5-10 pulse
packages.

We have couple of issues:

- scheduler seems to use 32kHz timer, which means that we can never have
precise pulse width, with error up to ~30 us, no matter what we do, IIUC.

- wait_for_completion() suffers from the same latency issue that
usleep_range() has - it exits after 300-400 us after complete() has been
called in the timer function.

- turning pwm off needs ~300us, because of either omap_dm_timer_stop()
calling clk_get_rate() or __omap_dm_timer_stop() waiting for fclk period
* 3.5 (see
https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/clocksource/timer-ti-dm.c#L269)

- in order to achieve some sane latency distribution, I have to
set_user_nice(current, MIN_NICE); in pwm_ir_tx()


> The ir-rx51 removal patches have already been queued to media_staging;
> we may have to remove them from there if we can't solve this problem.
>

ir-rx51 has conceptual problem of calling function that might sleep from
atomic context, however, we can fix omap_dm_timer_stop() to not call
clk_get_rate() and that would make it working. So yeah, if we can't fix
pwm-ir-tx then patches removal along with fixing dmtimer and fixing a
couple of code issues ir-rx51 has, seems the only option to have working
IRTX on n900. Maybe we can rename it to pwm-ir-tx-hrtimer as there is
nothing n900 specific in it.

>>> I don't have a n900 to test on, unfortunately.
>>>
>>
>> I have and once I have an idea what's going on will port pwm-ir-tx to
>> hrtimers, if needed. Don't want to do it now as I am afraid the completion I
>> will have to use will have the same latency problems as usleep_range()
>
> That would be fantastic. Please do keep us up to date with how you are
> getting on. Like I said, it would be nice to this resolved before the next
> merge window.
>

The only thing I didn't try yet is to start another thread and to set
that thread to use FIFO scheduler. I will report once I have tried that.

Regards,
Ivo

> Thanks,
> Sean
>