2009-07-24 14:07:53

by Victor Mataré

Subject: clock freezes??

Hello,

I have a dual Xeon server (old Xeon HT) with an Intel E7505 chipset,
with hrtimer and dynticks enabled. On bootup, the kernel
(2.6.29-gentoo-r5) tells me it's using the PM-Timer bug workaround, but
then it uses tsc as clocksource. Now the clock was running slow by
about 15 sec per 12 hrs, which is quite a lot. So in a careless moment, I just
tried "echo jiffies > clocksource0/current_clocksource". This froze the
system time. Now I couldn't switch back to tsc or acpi_pm, echoing those
was just ignored. Subsequently, the entire system locked up and I needed
to reboot.

Now what does that mean? Is this supposed to happen? Should I disable
dynticks and/or hrtimer?

thanks...

--
Victor Mataré
Server- & Netzwerk-Administration

Lehrstuhl für Ingenieurgeologie und Hydrogeologie der RWTH-Aachen
Lochnerstraße 4-20, 52064 Aachen

Tel: 0241 80 96778
Fax: 0241 80 92280
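
Note on the clocksource files mentioned above: on Linux they normally live under /sys/devices/system/clocksource/clocksource0/. The sketch below is illustrative only and is not part of the original message. It reads current_clocksource and available_clocksource so a name can be checked before it is echoed into current_clocksource. The function names and the `base` parameter are assumptions made for this example.

```python
from pathlib import Path

# Usual sysfs location on Linux. It is a parameter below so the functions
# can be pointed at any directory that has the same two files.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksource names read from `base`."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def is_selectable(name, base=SYSFS_CLOCKSOURCE):
    """True if `name` appears in available_clocksource."""
    _, available = read_clocksources(base)
    return name in available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", " ".join(available))
```

Being listed as available only means the kernel will accept the name. It says nothing about whether that clocksource keeps good time on a given machine.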


2009-07-28 21:46:42

by john stultz

Subject: Re: clock freezes??

On Fri, Jul 24, 2009 at 7:07 AM, Victor Mataré <[email protected]> wrote:
> I have a dual Xeon server (old Xeon HT) with an Intel E7505 chipset,
> with hrtimer and dynticks enabled. On bootup, the kernel
> (2.6.29-gentoo-r5) tells me it's using the PM-Timer bug workaround, but
> then it uses tsc as clocksource. Now the clock was running slow by
> about 15 sec per 12 hrs, which is quite a lot. So in a careless moment, I just
> tried "echo jiffies > clocksource0/current_clocksource". This froze the
> system time. Now I couldn't switch back to tsc or acpi_pm, echoing those
> was just ignored. Subsequently, the entire system locked up and I needed
> to reboot.
>
> Now what does that mean? Is this supposed to happen? Should I disable
> dynticks and/or hrtimer?

The system lockup is a known issue and should be resolved with the
following commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3f68535adad8dd89499505a65fb25d0e02d118cc

I'd be curious if you could expand a bit more on the clock skew
(15sec per 12 hours) you're seeing. Are you running NTP? Do you have
the output of ntpdc -c kerninfo and ntpdc -c peers? Do you see lots of
ntp messages in /var/log/messages or /var/log/syslog?

thanks
-john
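
For scale, the reported skew of 15 seconds per 12 hours can be converted to parts per million, the unit NTP uses for frequency error. The sketch below is illustrative only and is not part of the original thread. The 500 ppm figure is the frequency correction limit commonly cited for ntpd and is treated here as an assumption. The function and constant names are made up for this example.

```python
def drift_ppm(drift_seconds, interval_seconds):
    """Frequency error in parts per million for a given drift over an interval."""
    return drift_seconds / interval_seconds * 1e6


# Assumption: ntpd can steer the clock frequency by at most about 500 ppm.
NTP_MAX_FREQ_PPM = 500.0


def within_ntp_range(ppm, limit=NTP_MAX_FREQ_PPM):
    """True if the magnitude of the frequency error is inside the assumed limit."""
    return abs(ppm) <= limit


if __name__ == "__main__":
    ppm = drift_ppm(15, 12 * 3600)  # 15 s lost over 12 h
    print(f"drift: {ppm:.1f} ppm")
    print("within assumed ntpd range:", within_ntp_range(ppm))
```

Under that assumption, a skew of this size is large but still inside what ntpd could correct, which is why the ntpdc output asked for above would be informative.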