2005-09-01 06:30:37

by Thomas Schlichter

[permalink] [raw]
Subject: Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick

Hi Srivatsa,

on LKML I did see your patch trying to increase the accuracy of tme pmtmr by
directly converting the PM-timer-ticks to jiffies. I think this is a good
idea but as you already recognized, it is not completely correct...

There are at least these issues:
1. "offset_last" corresponds to the time when the last recognized
jiffyoccoured. So "delta" always corresponds to the time from the last
recognized jiffy to _now_. "monotonic_base" is increased by delta, so after
the first run it will correspond to _now_. But the next time the
offset-time between the last recognized jiffy and the last _now_ is added
_again_. So the monotonic clock ist too fast...
2. "offset_last is modified outside the "monotonic_lock", what is not allowed.

I fixed these issues by using most of the old code, but simply changed "delta"
and "offset_delay" to always contain PM-timer-ticks and compute the lost
jiffies directly using PMTMR_TICKS_PER_JIFFY.

I tested the attached patch during the last night and it sems to work...

Best regards
Thomas Schlichter


Attachments:
(No filename) (0.00 B)
signed data
(No filename) (189.00 B)
signature
Download all attachments

2005-09-01 07:23:55

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick

On Thu, Sep 01, 2005 at 08:29:32AM +0200, Thomas Schlichter wrote:
> I tested the attached patch during the last night and it sems to work...

A quick feedback on your patch:

A litmus test that I use is if "zero" lost ticks are being hit,
which we should not w/o a patch like dynamic tick.

I still see zero lost ticks being reported with your patch (during
bootup atleast) which means all is still not well?


--


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

2005-09-01 07:43:05

by Thomas Schlichter

[permalink] [raw]
Subject: Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick

Am Donnerstag, 1. September 2005 09:23 schrieb Srivatsa Vaddagiri:
> On Thu, Sep 01, 2005 at 08:29:32AM +0200, Thomas Schlichter wrote:
> > I tested the attached patch during the last night and it sems to work...
>
> A quick feedback on your patch:
>
> A litmus test that I use is if "zero" lost ticks are being hit,
> which we should not w/o a patch like dynamic tick.
>
> I still see zero lost ticks being reported with your patch (during
> bootup atleast) which means all is still not well?

I think this can happen due to this:
http://bugzilla.kernel.org/show_bug.cgi?id=5127

Think about two adjacent regular timer interrupts. Now consider the first one
is handled very late (indeed even after the second interrupt already
occoured). Then will see two "lost" ticks.

Now directly the second timer interrupt handler is executed and, well it sees
there has _nearly_ no time passed, so no "lost" ticks are reported.

I think this could explain it, right?!

Best regards
Thomas Schlichter

2005-09-01 10:29:23

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick

On Thu, Sep 01, 2005 at 09:42:23AM +0200, Thomas Schlichter wrote:
> Think about two adjacent regular timer interrupts. Now consider the first one
> is handled very late (indeed even after the second interrupt already
> occoured). Then will see two "lost" ticks.
>
> Now directly the second timer interrupt handler is executed and, well it sees
> there has _nearly_ no time passed, so no "lost" ticks are reported.

Thomas,
In such a case, we should not increment jiffie during second interrupt
(mainline code increments jiffie always on a timer interrupt).

Also in such a case, it is probably not a good idea to drop offset_delay.
I think we should retain it - otherwise next tick will show up as zero ticks
lost?). There is another complication that the offset_tick recorded during
init_pmtmr may not be aligned on jiffy boundary. Ideally we want offset_tick
to be aligned as closely as possible to jiffy boundary. This is one of the
reason why I added setup_pit_timer in my patch during init_pmtmr.

After all this, I think the patch you sent out and what I had
sent are fundamentally same - they rely on some constant ticks per jiffy to be
seen for lost tick calculation. I have seen this works on some machines and
not on others. I am trying to come up with another way of calculating
lost ticks, which partially depends on some known PMTMR_TICKS_PER_JIFFY
_and_ also reads the PIT to find out how late we are in processing
the interrupt (refer delay_at_last_interrupt calculation in timer_tsc.c).

Note that if you are testing mainline kernel, probably it wont test
the lost tick calculation code as much as dynamic tick does, since
not many ticks may be lost in practice. This could be one of the reasons
why no one has probably complained about time drift bitterly in mainline!
I would be much happy to accept a solution which works with dynamic tick.
For this you will need to apply the set of patches I mailed out.


P.S : I think PMTMR_TICKS_PER_JIFFY has to be calculated differently,
since a tick is not exactly 1/HZ of a second (see definition
of LATCH and pm_ticks_per_jiffy in my patch).

--


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

2005-09-01 11:05:45

by Thomas Schlichter

[permalink] [raw]
Subject: Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick

Hi Srivatsa,

Am Donnerstag, 1. September 2005 12:28 schrieb Srivatsa Vaddagiri:
> On Thu, Sep 01, 2005 at 09:42:23AM +0200, Thomas Schlichter wrote:
> > Think about two adjacent regular timer interrupts. Now consider the first
> > one is handled very late (indeed even after the second interrupt already
> > occoured). Then will see two "lost" ticks.
> >
> > Now directly the second timer interrupt handler is executed and, well it
> > sees there has _nearly_ no time passed, so no "lost" ticks are reported.
>
> Thomas,
> In such a case, we should not increment jiffie during second interrupt
> (mainline code increments jiffie always on a timer interrupt).

I know, that's a problem...

> Also in such a case, it is probably not a good idea to drop offset_delay.
> I think we should retain it - otherwise next tick will show up as zero
> ticks lost?).

Well, if lost < 1, than we have a problem. And maybe the better idea really is
to keep offset_delay. I simply wanted to stay as close as possible to the
original code.

> There is another complication that the offset_tick recorded
> during init_pmtmr may not be aligned on jiffy boundary. Ideally we want
> offset_tick to be aligned as closely as possible to jiffy boundary. This is
> one of the reason why I added setup_pit_timer in my patch during
> init_pmtmr.

OK, now I see...
Yes, so of course, using setup_pit_timer in init_pmtmr makes a lot of sense...
(But my mistake should not influence more than the first two jiffies, I think)

> After all this, I think the patch you sent out and what I had
> sent are fundamentally same - they rely on some constant ticks per jiffy to
> be seen for lost tick calculation.

Yes, the only real differences are the two points mentioned in my first
mail... I only wanted to help you fixing these.

> I have seen this works on some machines
> and not on others. I am trying to come up with another way of calculating
> lost ticks, which partially depends on some known PMTMR_TICKS_PER_JIFFY
> _and_ also reads the PIT to find out how late we are in processing the
> interrupt (refer delay_at_last_interrupt calculation in timer_tsc.c).

Well, that seems to be fine. If you want somebody to have a look over your
final patch, feel free to mail me...

> Note that if you are testing mainline kernel, probably it wont test
> the lost tick calculation code as much as dynamic tick does, since
> not many ticks may be lost in practice.

Yes, I am testing the mainline kernel, and I don't have any problems with time
drift...

> This could be one of the reasons
> why no one has probably complained about time drift bitterly in mainline!
> I would be much happy to accept a solution which works with dynamic tick.
> For this you will need to apply the set of patches I mailed out.

Well, I don't think I'll have enough spare time to test dynamic tick
extensively, but maybe I'll have a look at them this evening...

> P.S : I think PMTMR_TICKS_PER_JIFFY has to be calculated differently,
> since a tick is not exactly 1/HZ of a second (see definition
> of LATCH and pm_ticks_per_jiffy in my patch).

You are surely right, I used my simple macro because I am not _that_ familiar
with LATCH and CALIBRATE_LATCH and thus used the most intuitive way. So maybe
I should have defined PMTMR_TICKS_PER_JIFFY like you assigned
pm_ticks_per_jiffy (why did you use a static and constant variable and not a
macro?):

#define PMTMR_TICKS_PER_JIFFY (PMTMR_EXPECTED_RATE / (CALIBRATE_LATCH/LATCH))

Thomas

2005-09-01 11:35:13

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick

On Thu, Sep 01, 2005 at 01:05:23PM +0200, Thomas Schlichter wrote:
> Yes, the only real differences are the two points mentioned in my first
> mail... I only wanted to help you fixing these.

Thanks for pointing them out. I have fixed it in the experimental version
that I have now.

> Well, that seems to be fine. If you want somebody to have a look over your
> final patch, feel free to mail me...

sure!

> I should have defined PMTMR_TICKS_PER_JIFFY like you assigned
> pm_ticks_per_jiffy (why did you use a static and constant variable and not a
> macro?):

I actually started with using the calibrated value of PMTMR_TICKS_PER_JIFFY
rather than compile time constant (as found in verify_pmtmr), but later dropped
the idea since I didnt get good results with it. After that didnt revert
back pm_ticks_per_jiffy to be a macro. But good point, will include
this in the next patch that I am trying out. Will post it if I happen
to have success :)



--


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017