On Tuesday 22 February 2011, Peppe CAVALLARO wrote:
> From: Stuart Menefy <[email protected]>
>
> Many devices targeted at the embedded market provide a number of
> generic timers which are capable of generating interrupts at a
> requested rate. These can then be used in the implementation of drivers
> for other peripherals which require a timer interrupt, without having
> to provide an additional timer as part of that peripheral.
>
> A code provides a simple abstraction layer which allows a timer to be
> registered, and for a driver to request a timer.
>
> Currently this doesn't provide any of the additional information, such
> as precision or position in clock framework which might be required
> for a fully featured driver.
This code should probably be discussed on a more broader
platform than the netdev and linux-sh mailing lists,
as the scope is neither sh nor network specific.
You should at least add [email protected], possibly
also [email protected].
Further, John and Thomas are responsible for the timekeeping
infrastructure, and they are probably interested in this
as well.
Why is this code useful to you? In the scenarios I've seen, the
board can always assign a timer to a specific device in a fixed
way that can be describe in a board file or device tree.
Also, what is the difference between this and clkdev?
> Signed-off-by: Stuart Menefy <[email protected]>
> Hacked-by: Giuseppe Cavallaro <[email protected]>
> ---
> drivers/clocksource/Makefile | 1 +
> drivers/clocksource/generictimer.c | 60 ++++++++++++++++++++++++++++++++++++
> include/linux/generictimer.h | 41 ++++++++++++++++++++++++
> 3 files changed, 102 insertions(+), 0 deletions(-)
> create mode 100644 drivers/clocksource/generictimer.c
> create mode 100644 include/linux/generictimer.h
I don't think it fits that well into the drivers/clocksource directory,
because you don't actually register a struct clock_event_device or
struct clocksource.
I don't know if this could also be merged with the clocksource infrastructure,
but your code currently doesn't do that.
> +struct generic_timer *generic_timer_claim(void (*handler) (void *), void *data)
> +{
> + struct generic_timer *gt = NULL;
> +
> + if (!handler) {
> + pr_err("%s: invalid handler\n", __func__);
> + return NULL;
> + }
> +
> + mutex_lock(>_mutex);
> + if (!list_empty(>_list)) {
> + struct list_head *list = gt_list.next;
> + list_del(list);
> + gt = container_of(list, struct generic_timer, list);
> + }
> + mutex_unlock(>_mutex);
> +
> + if (!gt)
> + return NULL;
> +
> + /* Prepare the new handler */
> + gt->priv_handler = handler;
> + gt->data = data;
> +
> + return gt;
> +}
This does not seem very generic. You put timers into the list and take
them out again, but don't have any way to deal with timers that match
specific purposes. It obviously works for your specific use case where
you register exactly one timer, and use that in exactly one driver.
If more drivers were converted to generic_timer, which is obviously
the intention, then you might have a situation with very different
timers (fixed/variable tick, high/low frequencies, accurate/inaccurate),
or you might have fewer timers than users.
> +static inline void generic_timer_set_rate(struct generic_timer *gt,
> + unsigned long rate)
> +{
> + gt->set_rate(gt, rate);
> +}
> +
> +#endif /* __STM_GENERIC_TIMER_H */
Shouldn't this one allow a return code?
Arnd
Hi Arnd
Thanks for the comments.
On 24/02/11 17:20, Arnd Bergmann wrote:
> On Tuesday 22 February 2011, Peppe CAVALLARO wrote:
>> From: Stuart Menefy <[email protected]>
>>
>> Many devices targeted at the embedded market provide a number of
>> generic timers which are capable of generating interrupts at a
>> requested rate. These can then be used in the implementation of drivers
>> for other peripherals which require a timer interrupt, without having
>> to provide an additional timer as part of that peripheral.
>>
>> A code provides a simple abstraction layer which allows a timer to be
>> registered, and for a driver to request a timer.
>>
>> Currently this doesn't provide any of the additional information, such
>> as precision or position in clock framework which might be required
>> for a fully featured driver.
>
> This code should probably be discussed on a more broader
> platform than the netdev and linux-sh mailing lists,
> as the scope is neither sh nor network specific.
>
> You should at least add [email protected], possibly
> also [email protected].
>
> Further, John and Thomas are responsible for the timekeeping
> infrastructure, and they are probably interested in this
> as well.
>
> Why is this code useful to you? In the scenarios I've seen, the
> board can always assign a timer to a specific device in a fixed
> way that can be describe in a board file or device tree.
What we were trying to do was separate the code which actually manipulates
the timer hardware from the code which wants that timer service. In this
case it was a network device driver which is used on multiple SoC devices,
while the timer hardware tends to differ from device to device.
The other user of this code which we have is an OProfile driver, which
with this change can now be independent of the hardware it is running on,
while the previous version manipulated the timer hardware directly.
> Also, what is the difference between this and clkdev?
clkdev can be used to find a struct clk, which is fine if you just want to
read the time. In this instance we want to get interrupts from the timer
hardware, which isn't supported by the clk infrastructure.
If anything this duplicates clockevents. The main reason for not using
clockevents was that nobody else does! Currently clockevents are
used strictly for time keeping within the kernel, and most architectures
only register those which are intended to be used for this purpose.
We felt a bit nervous about adding code to register all the device's timers
as clockevents, and having the network device driver pick up one of those
for its own use.
>> Signed-off-by: Stuart Menefy <[email protected]>
>> Hacked-by: Giuseppe Cavallaro <[email protected]>
>> ---
>> drivers/clocksource/Makefile | 1 +
>> drivers/clocksource/generictimer.c | 60 ++++++++++++++++++++++++++++++++++++
>> include/linux/generictimer.h | 41 ++++++++++++++++++++++++
>> 3 files changed, 102 insertions(+), 0 deletions(-)
>> create mode 100644 drivers/clocksource/generictimer.c
>> create mode 100644 include/linux/generictimer.h
>
> I don't think it fits that well into the drivers/clocksource directory,
> because you don't actually register a struct clock_event_device or
> struct clocksource.
True. The intent was that this would be a third interface onto timer
hardware, alongside clocksources and clockevents.
> I don't know if this could also be merged with the clocksource infrastructure,
> but your code currently doesn't do that.
It would probably be clockevent rather than clocksource, but it could be if
people felt that was a better way to go.
>> +struct generic_timer *generic_timer_claim(void (*handler) (void *), void *data)
>> +{
>> + struct generic_timer *gt = NULL;
>> +
>> + if (!handler) {
>> + pr_err("%s: invalid handler\n", __func__);
>> + return NULL;
>> + }
>> +
>> + mutex_lock(>_mutex);
>> + if (!list_empty(>_list)) {
>> + struct list_head *list = gt_list.next;
>> + list_del(list);
>> + gt = container_of(list, struct generic_timer, list);
>> + }
>> + mutex_unlock(>_mutex);
>> +
>> + if (!gt)
>> + return NULL;
>> +
>> + /* Prepare the new handler */
>> + gt->priv_handler = handler;
>> + gt->data = data;
>> +
>> + return gt;
>> +}
>
> This does not seem very generic. You put timers into the list and take
> them out again, but don't have any way to deal with timers that match
> specific purposes. It obviously works for your specific use case where
> you register exactly one timer, and use that in exactly one driver.
>
> If more drivers were converted to generic_timer, which is obviously
> the intention, then you might have a situation with very different
> timers (fixed/variable tick, high/low frequencies, accurate/inaccurate),
> or you might have fewer timers than users.
Agreed, this was a first 'keep it simple' pass, maybe its too simple.
On Tuesday 01 March 2011, Stuart Menefy wrote:
> On 24/02/11 17:20, Arnd Bergmann wrote:
>
> > Why is this code useful to you? In the scenarios I've seen, the
> > board can always assign a timer to a specific device in a fixed
> > way that can be describe in a board file or device tree.
>
> What we were trying to do was separate the code which actually manipulates
> the timer hardware from the code which wants that timer service. In this
> case it was a network device driver which is used on multiple SoC devices,
> while the timer hardware tends to differ from device to device.
Right. It certainly makes sense to have an well-defined interface between
the user and the provider of a timer interrupt.
> The other user of this code which we have is an OProfile driver, which
> with this change can now be independent of the hardware it is running on,
> while the previous version manipulated the timer hardware directly.
Ok.
> > Also, what is the difference between this and clkdev?
>
> clkdev can be used to find a struct clk, which is fine if you just want to
> read the time. In this instance we want to get interrupts from the timer
> hardware, which isn't supported by the clk infrastructure.
(adding Russell to Cc)
Is this something that could sensibly be added to clk/clkdev?
> If anything this duplicates clockevents. The main reason for not using
> clockevents was that nobody else does! Currently clockevents are
> used strictly for time keeping within the kernel, and most architectures
> only register those which are intended to be used for this purpose.
> We felt a bit nervous about adding code to register all the device's timers
> as clockevents, and having the network device driver pick up one of those
> for its own use.
I see. Using a clock_event_device for anything but the system timer tick
is currently not supported, so it certainly would not be straightforward.
I think you need a bit of both, clkdev and clockevent. I think the
options you have are:
1. copy the clkdev code to make it possible to associate a device with
a periodic timer.
2. extend the clkdev/clk code to handle periodic interrupts and reuse
the infrastructure there.
3. extend the clockevent code to make it possible for regular device
drivers to use a clockevent source.
I have no idea which makes the most sense (or if there are other ideas).
Maybe Russell, Thomas or John can comment.
Arnd
On Tue, 1 Mar 2011, Stuart Menefy wrote:
> On 24/02/11 17:20, Arnd Bergmann wrote:
> > On Tuesday 22 February 2011, Peppe CAVALLARO wrote:
> >> From: Stuart Menefy <[email protected]>
> >>
> >> Many devices targeted at the embedded market provide a number of
> >> generic timers which are capable of generating interrupts at a
> >> requested rate. These can then be used in the implementation of drivers
> >> for other peripherals which require a timer interrupt, without having
> >> to provide an additional timer as part of that peripheral.
Why can't you just use an hrtimer and be done with it? Just because
there is some extra hardware in the chip?
> If anything this duplicates clockevents. The main reason for not using
> clockevents was that nobody else does! Currently clockevents are
> used strictly for time keeping within the kernel, and most architectures
> only register those which are intended to be used for this purpose.
> We felt a bit nervous about adding code to register all the device's timers
> as clockevents, and having the network device driver pick up one of those
> for its own use.
We had this discussion before and there was never an real outcome as
it turned out that hrtimers provide enough abstraction for this kind
of problems.
> True. The intent was that this would be a third interface onto timer
> hardware, alongside clocksources and clockevents.
>
> > I don't know if this could also be merged with the clocksource infrastructure,
> > but your code currently doesn't do that.
>
> It would probably be clockevent rather than clocksource, but it could be if
> people felt that was a better way to go.
If we get some reasonable explanation why an extra timer interrupt is
solving a specific problem better than hrtimers we can do that, but
that needs more thought than picking the first available clockevent
from a list. If we come to the conclusion, that we want/need this kind
of functionality then I really prefer not to create yet another piece
of infrastructure which deals with timer devices.
Thanks,
tglx
On Tue, Mar 01, 2011 at 05:43:19PM +0100, Arnd Bergmann wrote:
> On Tuesday 01 March 2011, Stuart Menefy wrote:
> > On 24/02/11 17:20, Arnd Bergmann wrote:
> > > Also, what is the difference between this and clkdev?
> >
> > clkdev can be used to find a struct clk, which is fine if you just want to
> > read the time. In this instance we want to get interrupts from the timer
> > hardware, which isn't supported by the clk infrastructure.
>
> (adding Russell to Cc)
>
> Is this something that could sensibly be added to clk/clkdev?
I don't understand - why would anyone want to use clk/clkdev for timers.
clk/clkdev is all about those signals on the SoC which wiggle at regular
intervals between logic 0 and logic 1. It's not about things which count,
which seems to be an entirely separate problem, and hence why there's
nothing to deal with interrupts or setting timeouts etc.
On Tuesday 01 March 2011 21:26:11 Russell King - ARM Linux wrote:
> On Tue, Mar 01, 2011 at 05:43:19PM +0100, Arnd Bergmann wrote:
> > On Tuesday 01 March 2011, Stuart Menefy wrote:
> > > On 24/02/11 17:20, Arnd Bergmann wrote:
> > > > Also, what is the difference between this and clkdev?
> > >
> > > clkdev can be used to find a struct clk, which is fine if you just want to
> > > read the time. In this instance we want to get interrupts from the timer
> > > hardware, which isn't supported by the clk infrastructure.
> >
> > (adding Russell to Cc)
> >
> > Is this something that could sensibly be added to clk/clkdev?
>
> I don't understand - why would anyone want to use clk/clkdev for timers.
> clk/clkdev is all about those signals on the SoC which wiggle at regular
> intervals between logic 0 and logic 1. It's not about things which count,
> which seems to be an entirely separate problem, and hence why there's
> nothing to deal with interrupts or setting timeouts etc.
Ok, I see. I had mixed the two concepts. The clkdev infrastructure
seemed like a nice way to connect a source and a consumer of a timer,
but you're right that it's an entirely separate thing, and as Thomas
said, it's probably not needed either.
Arnd
Hello and many thanks to All for your feedback.
On 3/1/2011 5:48 PM, Thomas Gleixner wrote:
>
> On Tue, 1 Mar 2011, Stuart Menefy wrote:
> > On 24/02/11 17:20, Arnd Bergmann wrote:
> > > On Tuesday 22 February 2011, Peppe CAVALLARO wrote:
> > >> From: Stuart Menefy <[email protected]>
> > >>
> > >> Many devices targeted at the embedded market provide a number of
> > >> generic timers which are capable of generating interrupts at a
> > >> requested rate. These can then be used in the implementation of
> drivers
> > >> for other peripherals which require a timer interrupt, without having
> > >> to provide an additional timer as part of that peripheral.
>
> Why can't you just use an hrtimer and be done with it? Just because
> there is some extra hardware in the chip?
>
> > If anything this duplicates clockevents. The main reason for not using
> > clockevents was that nobody else does! Currently clockevents are
> > used strictly for time keeping within the kernel, and most architectures
> > only register those which are intended to be used for this purpose.
> > We felt a bit nervous about adding code to register all the device's
> timers
> > as clockevents, and having the network device driver pick up one of
> those
> > for its own use.
>
> We had this discussion before and there was never an real outcome as
> it turned out that hrtimers provide enough abstraction for this kind
> of problems.
>
> > True. The intent was that this would be a third interface onto timer
> > hardware, alongside clocksources and clockevents.
> >
> > > I don't know if this could also be merged with the clocksource
> infrastructure,
> > > but your code currently doesn't do that.
> >
> > It would probably be clockevent rather than clocksource, but it
> could be if
> > people felt that was a better way to go.
>
> If we get some reasonable explanation why an extra timer interrupt is
> solving a specific problem better than hrtimers we can do that, but
> that needs more thought than picking the first available clockevent
> from a list. If we come to the conclusion, that we want/need this kind
> of functionality then I really prefer not to create yet another piece
> of infrastructure which deals with timer devices.
>
The initial/main usage of this code was to provide
an interrupt timer to handle the rx/tx processes in
the stmmac Ethernet d.d.
Some Ethernet devices have embedded timer
used for mitigating the number of their DMA interrupts.
So, to be frankly, my first idea was to use an
extra HW (SH4 TMU channel 2), unused on our platforms,
because our MAC hadn't something like that.
I'm not sure if other Ethernet drivers use hrtimer for
handling the whole rx/tx processes (on CC the net-2.6 ML:
please correct me if I'm wrong) and if this could have
impact on the performances or reliability of the network
activity (pkt loss%).
At any rate, I am happy to use the stmmac as experimental
driver to do this kind tests.
Indeed, in the past, on old Kernel (IIRC 2.6.23), I tried to use
the kernel timers but I removed the code from it because
I had noticed packets loss and a strange phenomenon with cyclesoak
(that showed broken sysload % during the heavy network activities).
Let me know how to proceed:
1) experiment with stmmac and hrtimer for handling rx/tx?
2) rework the patches for the Generic Timer Infra?
Best Regards,
Peppe
> Thanks,
>
> tglx
>
On Wednesday 02 March 2011, Peppe CAVALLARO wrote:
> At any rate, I am happy to use the stmmac as experimental
> driver to do this kind tests.
> Indeed, in the past, on old Kernel (IIRC 2.6.23), I tried to use
> the kernel timers but I removed the code from it because
> I had noticed packets loss and a strange phenomenon with cyclesoak
> (that showed broken sysload % during the heavy network activities).
>
> Let me know how to proceed:
>
> 1) experiment with stmmac and hrtimer for handling rx/tx?
> 2) rework the patches for the Generic Timer Infra?
I'd suggest doing the first. I'm surprised that using an unrelated
timer for processing interrupts even helps you on stmmac.
The timers that you'd normally use for rx interrupt mitigation
are not periodic timers but are started when a packet arrives
from the outside.
Doing periodic wakeups for RX instead of just waiting for
packets to come in should have a significant impact on power
management on an otherwise idle system.
For tx resource reclaim, a relatively slow oneshot timer (not
even hrtimer) should be good enough, since it only needs to be
active when there is no other way to clean up. E.g. when you
are in napi polling mode (interrupt disabled), you know that
stmmac_poll gets called soon, and you can also do the reclaim
from stmmac_xmit() in order to prevent the timer from triggering
when you are constantly transmitting.
Arnd
Hi Arnd,
On 3/3/2011 9:45 AM, Arnd Bergmann wrote:
>
> On Wednesday 02 March 2011, Peppe CAVALLARO wrote:
> > At any rate, I am happy to use the stmmac as experimental
> > driver to do this kind tests.
> > Indeed, in the past, on old Kernel (IIRC 2.6.23), I tried to use
> > the kernel timers but I removed the code from it because
> > I had noticed packets loss and a strange phenomenon with cyclesoak
> > (that showed broken sysload % during the heavy network activities).
> >
> > Let me know how to proceed:
> >
> > 1) experiment with stmmac and hrtimer for handling rx/tx?
> > 2) rework the patches for the Generic Timer Infra?
>
> I'd suggest doing the first. I'm surprised that using an unrelated
> timer for processing interrupts even helps you on stmmac.
>
Indeed, this helped especially to save the cpu usage
on heavy IP traffic (but with Max Throughput and no pkt
loss).
> The timers that you'd normally use for rx interrupt mitigation
> are not periodic timers but are started when a packet arrives
> from the outside.
>
Yes you are right but unfortunately our mac devices have
not this kind of HW.
>
> Doing periodic wakeups for RX instead of just waiting for
> packets to come in should have a significant impact on power
> management on an otherwise idle system.
>
To "mitigate" this issue, the driver does a fast (and extra)
check in the rings and it does not start any rx processes
in case there are no incoming frames.
> For tx resource reclaim, a relatively slow oneshot timer (not
> even hrtimer) should be good enough, since it only needs to be
> active when there is no other way to clean up. E.g. when you
> are in napi polling mode (interrupt disabled), you know that
> stmmac_poll gets called soon, and you can also do the reclaim
> from stmmac_xmit() in order to prevent the timer from triggering
> when you are constantly transmitting.
>
This logic is already in the driver, indeed.
What I've seen on our embedded systems is that the
cost of RX interrupts is very hight and NAPI partially helps.
Typically, in an IP-STB, I receive a burst of UDP pkt
and this means that many interrupts occur (~99% of CPU
usage on slow platforms).
With the ext timer I was able to reduce the CPU usage in
these kind of scenarios to ~50%.
When there is no net traffic, indeed, the timer periodically
"disturbs" the system but the impact on the CPU usage
actually is low.
Thanks
Peppe
>
> Arnd
>
On Thursday 03 March 2011, Peppe CAVALLARO wrote:
> This logic is already in the driver, indeed.
> What I've seen on our embedded systems is that the
> cost of RX interrupts is very hight and NAPI partially helps.
> Typically, in an IP-STB, I receive a burst of UDP pkt
> and this means that many interrupts occur (~99% of CPU
> usage on slow platforms).
> With the ext timer I was able to reduce the CPU usage in
> these kind of scenarios to ~50%.
I don't understand. Shouldn't the interrupts be stopped as long
as the system is busy? This sounds like a bug in your NAPI
handling, or maybe you just need to use a lower NAPI_WEIGHT
so you stay in polling mode longer.
Arnd
On 3/3/2011 2:55 PM, Arnd Bergmann wrote:
>
> On Thursday 03 March 2011, Peppe CAVALLARO wrote:
> > This logic is already in the driver, indeed.
> > What I've seen on our embedded systems is that the
> > cost of RX interrupts is very hight and NAPI partially helps.
> > Typically, in an IP-STB, I receive a burst of UDP pkt
> > and this means that many interrupts occur (~99% of CPU
> > usage on slow platforms).
> > With the ext timer I was able to reduce the CPU usage in
> > these kind of scenarios to ~50%.
>
> I don't understand. Shouldn't the interrupts be stopped as long
> as the system is busy? This sounds like a bug in your NAPI
> handling, or maybe you just need to use a lower NAPI_WEIGHT
> so you stay in polling mode longer.
>
Hi Arnd,
yes you are right, in fact, I do not receive an interrupt
for each frame received but the average of the frames
I found in the ring is very low.
At any rate, let me do some tests so i'll come back with
some numbers. In the meantime, I'm looking at the napi
support inside the driver and experiment with lower
NAPI_WEIGHT (default is 64).
Peppe
>
>
> Arnd
>
On Tue, Mar 01, 2011 at 05:48:33PM +0100, Thomas Gleixner wrote:
> On Tue, 1 Mar 2011, Stuart Menefy wrote:
> > On 24/02/11 17:20, Arnd Bergmann wrote:
> > > On Tuesday 22 February 2011, Peppe CAVALLARO wrote:
> > >> From: Stuart Menefy <[email protected]>
> > >>
> > >> Many devices targeted at the embedded market provide a number of
> > >> generic timers which are capable of generating interrupts at a
> > >> requested rate. These can then be used in the implementation of drivers
> > >> for other peripherals which require a timer interrupt, without having
> > >> to provide an additional timer as part of that peripheral.
>
> Why can't you just use an hrtimer and be done with it? Just because
> there is some extra hardware in the chip?
>
..
> If we get some reasonable explanation why an extra timer interrupt is
> solving a specific problem better than hrtimers we can do that, but
> that needs more thought than picking the first available clockevent
> from a list. If we come to the conclusion, that we want/need this kind
> of functionality then I really prefer not to create yet another piece
> of infrastructure which deals with timer devices.
>
I've run in to this issue again while working on local timer support on
SH, as we're presently dependent on broadcast and dummy tick devices in
the SMP case, while quite a few timer channels remain available for use.
(Ironically, while the aforementioned driver was able to solve the
problem with hrtimers, we have the same need to _provide_ hrtimers).
In the sh_tmu/cmt/mtu2 case all timer channels are handed off as platform
devices, and the block itself is global for all CPUs, though we can tweak
the IRQ affinity relative to the cpumask. My tentative plan is to deal
with the clockevent device as a percpu thing in the local timer case
which will require some additional side infrastructure, but some
additional work will need to be done in the clockevents code regardless.
The ARM approach is a bit backwards from what we're interested in
solving, as it uses its own local timer infrastructure and some
TWD-specific API for registering per-cpu timers and only then inserting
them in to the clockevents list. Whereas in our case we've got all of the
timer channels available at platform probe time (earlier for the early
timer case), and simply need a method for tracking unused channels as
well as having a facility for picking them up and reconfiguring them
later on when the secondary CPUs come up. I'm not at all interested in
registering the same timer multiple times with slightly different APIs,
having two different drivers fighting over the same register space is not
a thought that fills me with joy.
In any event, I'll hack on it a bit and see what falls out, patches
later.