2009-12-19 02:20:29

by Steven King

Subject: 2.6.33-rc1: hrtimers and tickless broken on m68knommu.

Attached is the .config; it works on v2.6.32 but fails to boot on .33-rc1.
If I deselect hrtimers && tickless, then it works.

--
Steven King -- sfking at fdwdc dot com


Attachments:
(No filename) (172.00 B)
.config (16.89 kB)

2009-12-19 02:44:47

by john stultz

Subject: Re: 2.6.33-rc1: hrtimers and tickless broken on m68knommu.

On Fri, Dec 18, 2009 at 6:13 PM, Steven King <[email protected]> wrote:
> Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1; but
> if I deselect hrtimers && tickless then it works.

Sorry for the dup, forgot to cc lkml on my reply.

Fails to boot altogether? Or does it hang at some point in the dmesg
that you can point out?

Could you run the following so we can narrow down which clocksource you're using?
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

Then with the kernel that doesn't boot, go through the clocksources
listed in available_clocksource and try booting w/
"clocksource=<clock name>" and see if the behavior changes.
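
For anyone repeating this test, the step above (turn each name listed in
available_clocksource into a boot parameter) can be sketched in C. This is
only an illustration: clocksource_boot_param() is a hypothetical helper, not
a kernel or libc function, and it assumes the sysfs file holds
whitespace-separated names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the boot parameter for the index-th name in
 * 'available' (the whitespace-separated contents of available_clocksource)
 * into 'out'. Returns 1 on success, 0 if there is no such entry or 'out'
 * is too small to hold the result.
 */
int clocksource_boot_param(const char *available, size_t index,
			   char *out, size_t out_len)
{
	const char *p = available;

	assert(available != NULL && out != NULL && out_len > 0);

	for (;;) {
		p += strspn(p, " \t\n");		/* skip separators */
		size_t len = strcspn(p, " \t\n");	/* length of this name */
		if (len == 0)
			return 0;			/* ran out of names */
		if (index-- == 0) {
			int n = snprintf(out, out_len, "clocksource=%.*s",
					 (int)len, p);
			return n > 0 && (size_t)n < out_len;
		}
		p += len;				/* move past this name */
	}
}
```

Calling it with index 0, 1, 2, ... until it returns 0 yields one parameter
per available clocksource, each of which can be appended to the kernel
command line for a separate boot attempt.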

thanks
-john

2009-12-19 03:20:21

by Steven King

Subject: Re: 2.6.33-rc1: hrtimers and tickless broken on m68knommu.

On Friday 18 December 2009 06:44:44 john stultz wrote:
> On Fri, Dec 18, 2009 at 6:13 PM, Steven King <[email protected]> wrote:
> > Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1;
> > but if I deselect hrtimers && tickless then it works.
>
> Sorry for the dup, forgot to cc lkml on my reply.
>
> Fails to boot all together? Or does it hang at some point in the dmesg
> that you can point out?

Fails to boot altogether; nothing on the serial console.
>
> Could you run the following so we can narrow down which clocksource your
> using? cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>
> Then with the kernel that doesn't boot, go through the clocksources
> listed in available_clocksources and try booting w/
> "clocksource=<clock name>" and see if the behavior changes.

on the working .32 kernel:

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
pit
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
pit

Just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didn't make
any difference.

--
Steven King -- sfking at fdwdc dot com

2009-12-19 04:04:33

by john stultz

Subject: Re: 2.6.33-rc1: hrtimers and tickless broken on m68knommu.

On Fri, 2009-12-18 at 19:20 -0800, Steven King wrote:
> On Friday 18 December 2009 06:44:44 john stultz wrote:
> > On Fri, Dec 18, 2009 at 6:13 PM, Steven King <[email protected]> wrote:
> > > Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1;
> > > but if I deselect hrtimers && tickless then it works.
> >
> > Sorry for the dup, forgot to cc lkml on my reply.
> >
> > Fails to boot all together? Or does it hang at some point in the dmesg
> > that you can point out?
>
> fails to boot all together; nothing on the serial console.
> >
> > Could you run the following so we can narrow down which clocksource your
> > using? cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> >
> > Then with the kernel that doesn't boot, go through the clocksources
> > listed in available_clocksources and try booting w/
> > "clocksource=<clock name>" and see if the behavior changes.
>
> on the working .32 kernel:
>
> # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> pit
> # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> pit
>
> just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didnt make
> any difference.


Hrmm.. So looking at the code in arch/m68knommu/platform/coldfire/pit.c,
I'm a little confused on how this got marked as a continuous clocksource
(CLOCK_SOURCE_IS_CONTINUOUS), especially as it seems it couldn't handle
skipping an interrupt.
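
To illustrate the concern about a skipped interrupt, here is a rough
userspace model in C. It is not the code from pit.c; it assumes a
clocksource whose read value is a software count bumped once per tick
interrupt plus the progress inside the current timer period. All names
(model_pit, MODEL_CYCLES_PER_TICK, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reload value: hardware cycles per periodic tick. */
#define MODEL_CYCLES_PER_TICK 1000u

/* Toy model of an auto-reloading timer used as a clocksource. */
struct model_pit {
	uint64_t true_cycles;	/* cycles that really elapsed (ground truth) */
	uint32_t accumulated;	/* software count, bumped once per handled tick */
};

/* Let 'cycles' hardware cycles pass without running any interrupt handler. */
void model_advance(struct model_pit *p, uint32_t cycles)
{
	p->true_cycles += cycles;
}

/* Tick handler: credits exactly one period, however many really expired. */
void model_tick_irq(struct model_pit *p)
{
	p->accumulated += MODEL_CYCLES_PER_TICK;
}

/* Clocksource read: software count plus progress inside the current period. */
uint32_t model_read(const struct model_pit *p)
{
	uint32_t in_period = (uint32_t)(p->true_cycles % MODEL_CYCLES_PER_TICK);

	return p->accumulated + in_period;
}

/* Scenario: one tick handled on time, then three periods with one interrupt. */
uint64_t model_lost_cycles_demo(void)
{
	struct model_pit p = { 0, 0 };

	model_advance(&p, MODEL_CYCLES_PER_TICK);
	model_tick_irq(&p);
	assert(model_read(&p) == p.true_cycles);	/* periodic mode: in sync */

	model_advance(&p, 3 * MODEL_CYCLES_PER_TICK);	/* tick stopped, nohz-like */
	model_tick_irq(&p);				/* only one interrupt delivered */

	return p.true_cycles - model_read(&p);		/* cycles the clock never saw */
}
```

In this model the clock stays correct only while every period produces
exactly one interrupt. Once periods pass without one, the read value falls
behind by whole periods and never catches up.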

That said, I'm not sure how it worked in 2.6.32, as it's been that way
for a while it seems. Maybe my assumptions on how the PIT works are wrong
(or just biased by how it works on x86)?

Greg, could you clarify how the PIT can be used as a clocksource if it's
also being used in oneshot mode?

Steven, I assume the patch below avoids the issue (by disabling highres
timers and nohz)?

thanks
-john



The m68knommu coldfire pit clocksource looks like it was incorrectly
marked as a continuous clocksource. From the looks of it, running with
it marked as a continuous clocksource could cause hangs when the system
switches to highres mode or enables nohz. I have no idea why it worked
in prior kernels, and I'm not 100% sure the following fix is really the
right solution.

This patch removes the CLOCK_SOURCE_IS_CONTINUOUS flag on the coldfire
pit clocksource. This will disallow systems using this clocksource from
entering oneshot mode (disabling highres timers and nohz).

Signed-off-by: John Stultz <[email protected]>

---

diff --git a/arch/m68knommu/platform/coldfire/pit.c b/arch/m68knommu/platform/coldfire/pit.c
index d8720ee..aebea19 100644
--- a/arch/m68knommu/platform/coldfire/pit.c
+++ b/arch/m68knommu/platform/coldfire/pit.c
@@ -146,7 +146,6 @@ static struct clocksource pit_clk = {
 	.read = pit_read_clk,
 	.shift = 20,
 	.mask = CLOCKSOURCE_MASK(32),
-	.flags = CLOCK_SOURCE_IS_CONTINUOUS,
 };

/***************************************************************************/
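
The effect described in the patch text above (dropping the flag keeps the
system out of oneshot mode) can be sketched as a small userspace model in
C. This is a simplification, not the kernel's implementation: the flag
names mirror the kernel's, but the numeric values, struct model_clocksource,
model_register() and model_may_enter_oneshot() are illustrative
assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits for this model; names mirror the kernel's, values are illustrative. */
#define CLOCK_SOURCE_IS_CONTINUOUS	0x01u
#define CLOCK_SOURCE_VALID_FOR_HRES	0x20u

/* Hypothetical stand-in for the fields of struct clocksource that matter here. */
struct model_clocksource {
	const char *name;
	unsigned int flags;
};

/* Registration step: only a continuous source is marked usable for highres. */
void model_register(struct model_clocksource *cs)
{
	assert(cs);
	if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS)
		cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
}

/* Gate for switching the tick to oneshot mode (highres timers / nohz). */
bool model_may_enter_oneshot(const struct model_clocksource *cs)
{
	return (cs->flags & CLOCK_SOURCE_VALID_FOR_HRES) != 0;
}
```

With .flags = CLOCK_SOURCE_IS_CONTINUOUS the modeled pit source passes the
gate after registration. With the flag removed, as in the patch, it does
not, so the system stays in periodic tick mode.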

2009-12-19 05:06:23

by Steven King

Subject: Re: 2.6.33-rc1: hrtimers and tickless broken on m68knommu.

On Friday 18 December 2009 08:04:23 john stultz wrote:
> On Fri, 2009-12-18 at 19:20 -0800, Steven King wrote:
> > On Friday 18 December 2009 06:44:44 john stultz wrote:
> > > On Fri, Dec 18, 2009 at 6:13 PM, Steven King <[email protected]> wrote:
> > > > Attach is the .config; it works on v2.6.32 but fails to boot on
> > > > .33-rc1; but if I deselect hrtimers && tickless then it works.
> > >
> > > Sorry for the dup, forgot to cc lkml on my reply.
> > >
> > > Fails to boot all together? Or does it hang at some point in the dmesg
> > > that you can point out?
> >
> > fails to boot all together; nothing on the serial console.
> >
> > > Could you run the following so we can narrow down which clocksource
> > > your using? cat
> > > /sys/devices/system/clocksource/clocksource0/current_clocksource cat
> > > /sys/devices/system/clocksource/clocksource0/available_clocksource
> > >
> > > Then with the kernel that doesn't boot, go through the clocksources
> > > listed in available_clocksources and try booting w/
> > > "clocksource=<clock name>" and see if the behavior changes.
> >
> > on the working .32 kernel:
> >
> > # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > pit
> > # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > pit
> >
> > just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didnt
> > make any difference.
>
> Hrmm.. So looking at the code in arch/m68knommu/platform/coldfire/pit.c,
> I'm a little confused on how this got marked as a continuous clocksource
> (CLOCK_SOURCE_IS_CONTINUOUS), especially as it seems it couldn't handle
> skipping an interrupt.
>
> That said, I'm not sure how it worked in 2.6.32, as its been that way
> for awhile it seems. Maybe my assumptions on how the PIT works is wrong
> (or just biased in how it works on x86)?
>
> Greg, could you clarify how the PIT can be used as a clocksource if its
> also being used in oneshot mode?
>
> Steven, I assume the patch below avoids the issue (by disabling highres
> timers and nohz)?

Yes.

I suspect it wasn't working correctly on earlier kernels; we just got away with
it. I had recently added ntpclient to this target, but the time reported by
date was always off by some odd amount. I had assumed that it was a busybox or
ntpclient issue but hadn't gotten around to tracking it down. With your patch
(or, as I just now verified, on .32 without no_hz and hrtimers) the system
time is now correct. I probably never would have made the connection.

Thank you John!

--
Steven King -- sfking at fdwdc dot com

2009-12-22 00:39:30

by john stultz

Subject: [PATCH] m68knommu: Fix invalid flags on coldfire pit clocksource

Just re-sending this in case it was missed. Steven tested this and it
seems to be the right fix. Should be 2.6.33 material.

thanks
-john



The m68knommu coldfire pit clocksource looks like it was incorrectly
marked as a continuous clocksource. Running with it marked as a
continuous clocksource could cause hangs when the system switches to
highres mode or enables nohz.

This patch removes the CLOCK_SOURCE_IS_CONTINUOUS flag on the coldfire
pit clocksource. This will disallow systems using this clocksource from
entering oneshot mode (disabling highres timers and nohz).

Signed-off-by: John Stultz <[email protected]>

---

diff --git a/arch/m68knommu/platform/coldfire/pit.c b/arch/m68knommu/platform/coldfire/pit.c
index d8720ee..aebea19 100644
--- a/arch/m68knommu/platform/coldfire/pit.c
+++ b/arch/m68knommu/platform/coldfire/pit.c
@@ -146,7 +146,6 @@ static struct clocksource pit_clk = {
 	.read = pit_read_clk,
 	.shift = 20,
 	.mask = CLOCKSOURCE_MASK(32),
-	.flags = CLOCK_SOURCE_IS_CONTINUOUS,
 };

/***************************************************************************/


2010-01-11 03:38:17

by Greg Ungerer

Subject: Re: 2.6.33-rc1: hrtimers and tickless broken on m68knommu.

Hi John,

Sorry for no response on this, I have been away.


john stultz wrote:
> On Fri, 2009-12-18 at 19:20 -0800, Steven King wrote:
>> On Friday 18 December 2009 06:44:44 john stultz wrote:
>>> On Fri, Dec 18, 2009 at 6:13 PM, Steven King <[email protected]> wrote:
>>>> Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1;
>>>> but if I deselect hrtimers && tickless then it works.
>>> Sorry for the dup, forgot to cc lkml on my reply.
>>>
>>> Fails to boot all together? Or does it hang at some point in the dmesg
>>> that you can point out?
>> fails to boot all together; nothing on the serial console.
>>> Could you run the following so we can narrow down which clocksource your
>>> using? cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>>> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>>>
>>> Then with the kernel that doesn't boot, go through the clocksources
>>> listed in available_clocksources and try booting w/
>>> "clocksource=<clock name>" and see if the behavior changes.
>> on the working .32 kernel:
>>
>> # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>> pit
>> # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> pit
>>
>> just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didnt make
>> any difference.
>
>
> Hrmm.. So looking at the code in arch/m68knommu/platform/coldfire/pit.c,
> I'm a little confused on how this got marked as a continuous clocksource
> (CLOCK_SOURCE_IS_CONTINUOUS), especially as it seems it couldn't handle
> skipping an interrupt.
>
> That said, I'm not sure how it worked in 2.6.32, as its been that way
> for awhile it seems. Maybe my assumptions on how the PIT works is wrong
> (or just biased in how it works on x86)?
>
> Greg, could you clarify how the PIT can be used as a clocksource if its
> also being used in oneshot mode?

Looks broken. As Steven noted we just seem to have gotten away
with it, with no obvious breakage so far.

I see from follow-ons to this that Andrew has picked it up.
So I'll just ack that and let him push it.

Thanks
Greg


> Steven, I assume the patch below avoids the issue (by disabling highres
> timers and nohz)?
>
> thanks
> -john
>
>
>
> The m68knommu coldfire pit clocksource looks like it was incorrectly
> marked as a continuous clocksource. From the looks of it, running with
> it marked as a continuous clocksource could cause hangs when the system
> switches to highres mode or enables nohz. I have no idea why it worked
> in prior kernels, and I'm not 100% sure the following fix is really the
> right solution.
>
> This patch removes the CLOCK_SOURCE_IS_CONTINUOUS flag on the coldfire
> pit clocksource. This will disallow systems using this clocksource from
> entering oneshot mode (disabling highres timers and nohz).
>
> Signed-off-by: John Stultz <[email protected]>
>
> ---
>
> diff --git a/arch/m68knommu/platform/coldfire/pit.c b/arch/m68knommu/platform/coldfire/pit.c
> index d8720ee..aebea19 100644
> --- a/arch/m68knommu/platform/coldfire/pit.c
> +++ b/arch/m68knommu/platform/coldfire/pit.c
> @@ -146,7 +146,6 @@ static struct clocksource pit_clk = {
>  	.read = pit_read_clk,
>  	.shift = 20,
>  	.mask = CLOCKSOURCE_MASK(32),
> -	.flags = CLOCK_SOURCE_IS_CONTINUOUS,
>  };
>
> /***************************************************************************/
>
>
>


--
------------------------------------------------------------------------
Greg Ungerer -- Principal Engineer EMAIL: [email protected]
SnapGear Group, McAfee PHONE: +61 7 3435 2888
8 Gardner Close FAX: +61 7 3217 5323
Milton, QLD, 4064, Australia WEB: http://www.SnapGear.com