2007-08-23 20:22:22

by Paolo Ornati

[permalink] [raw]
Subject: "double" hpet clocksource && hard freeze [bisected]

Hi,

since commit:

commit 0aa366f351d044703e25c8425e508170e80d83b1
Author: Tony Luck <[email protected]>
Date: Fri Jul 20 11:22:30 2007 -0700

[IA64] Convert to generic timekeeping/clocksource

This is a merge of Peter Keilty's initial patch (which was
revived by Bob Picco) for this with Hidetoshi Seto's fixes
and scaling improvements.


I have a double "hpet" entry in "available_clocksource":
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet hpet acpi_pm jiffies


And, more interesting, if I select hpet as clocksource (tsc is used by
default here) I get an hard freeze some time later (few minutes)...
SysRQ don't work ;)


config & dmesg attached


Motherboard is a Intel DG965SS

CPU:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz
stepping : 6
cpu MHz : 1864.804
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 3731.75
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz
stepping : 6
cpu MHz : 1864.804
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 3729.63
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

--
Paolo Ornati
Linux 2.6.22-g5bae7ac9 on x86_64


2007-08-23 20:23:32

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Thu, 23 Aug 2007 22:21:15 +0200
Paolo Ornati <[email protected]> wrote:

> config & dmesg attached

ehmm..

--
Paolo Ornati
Linux 2.6.22-g5bae7ac9 on x86_64


Attachments:
(No filename) (166.00 B)
config.gz (8.27 kB)
dmesg.gz (7.75 kB)
Download all attachments

2007-08-23 20:41:26

by Tony Luck

[permalink] [raw]
Subject: RE: "double" hpet clocksource && hard freeze [bisected]

> I have a double "hpet" entry in "available_clocksource":
> $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> tsc hpet hpet acpi_pm jiffies

Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
both register a clocksource named "hpet". Probably a result of bringing
back to life a long lost patch, and having someone else (John Stultz, according
to git blame) make a similar change to a different file in the intervening
time.

Presumably the thing to do would be merge the x86_64 specific version
into the drivers/char/hpet.c version?

-Tony

2007-08-23 21:06:17

by john stultz

[permalink] [raw]
Subject: RE: "double" hpet clocksource && hard freeze [bisected]

On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > I have a double "hpet" entry in "available_clocksource":
> > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > tsc hpet hpet acpi_pm jiffies
>
> Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> both register a clocksource named "hpet". Probably a result of bringing
> back to life a long lost patch, and having someone else (John Stultz, according
> to git blame) make a similar change to a different file in the intervening
> time.
>
> Presumably the thing to do would be merge the x86_64 specific version
> into the drivers/char/hpet.c version?

Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
duplication, but at the moment I'm not comfortable that the
driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
why the shift value is only 10?).


I'm a little surprised by this, as the clocksource code use to prevent
duplicate named clocksources from being registered, so I'm not sure how
that check got dropped. Also I'm not quite sure I see where the hard
freeze is coming from.

My initial reaction would be to either ifdef ia64 implementation in
drivers/char/hpet.c or move the code under the ia64 arch dir until it is
really usable by all arches.

Bob, your thoughts?

2007-08-23 21:42:09

by Bob Picco

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

john stultz wrote: [Thu Aug 23 2007, 05:05:35PM EDT]
> On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > I have a double "hpet" entry in "available_clocksource":
> > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > > tsc hpet hpet acpi_pm jiffies
> >
> > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> > both register a clocksource named "hpet". Probably a result of bringing
> > back to life a long lost patch, and having someone else (John Stultz, according
> > to git blame) make a similar change to a different file in the intervening
> > time.
> >
> > Presumably the thing to do would be merge the x86_64 specific version
> > into the drivers/char/hpet.c version?
>
> Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> duplication, but at the moment I'm not comfortable that the
> driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> why the shift value is only 10?).
No I don't have a clue why Pete chose this value.
>
>
> I'm a little surprised by this, as the clocksource code use to prevent
> duplicate named clocksources from being registered, so I'm not sure how
> that check got dropped. Also I'm not quite sure I see where the hard
> freeze is coming from.
>
> My initial reaction would be to either ifdef ia64 implementation in
> drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> really usable by all arches.
>
> Bob, your thoughts?
It appears the ACPI for this platform might work. We don't know because
of a hpet driver probe error discussed below. I assume you're suggesting
the driver is only required by ia64? I think that might not be true.

Well I'm slightly confused. The fs_initcall was first into hpet_alloc.
It appears ACPI discovery failed during driver initialization because
of:
hpet_resources: 0xfed00000 is busy
from dmesg. So why do we have a second hpet registered? Also hpet_alloc
is suspose to check for redundant registration. I need to look more
tomorrow.
>
bob

2007-08-23 21:43:07

by john stultz

[permalink] [raw]
Subject: RE: "double" hpet clocksource && hard freeze [bisected]

On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
> On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > I have a double "hpet" entry in "available_clocksource":
> > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > > tsc hpet hpet acpi_pm jiffies
> >
> > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> > both register a clocksource named "hpet". Probably a result of bringing
> > back to life a long lost patch, and having someone else (John Stultz, according
> > to git blame) make a similar change to a different file in the intervening
> > time.
> >
> > Presumably the thing to do would be merge the x86_64 specific version
> > into the drivers/char/hpet.c version?
>
> Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> duplication, but at the moment I'm not comfortable that the
> driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> why the shift value is only 10?).
>
>
> I'm a little surprised by this, as the clocksource code use to prevent
> duplicate named clocksources from being registered, so I'm not sure how
> that check got dropped. Also I'm not quite sure I see where the hard
> freeze is coming from.
>
> My initial reaction would be to either ifdef ia64 implementation in
> drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> really usable by all arches.

Here is a possible quick fix. I'm open to other approaches, but I also
want to avoid too much churn before 2.6.23 goes out.

Paolo, could you verify this fixes the issue for you?

thanks
-john

diff --git a/drivers/char/hpet.c b/drivers/char/hpet.c
index 77bf4aa..c782d8c 100644
--- a/drivers/char/hpet.c
+++ b/drivers/char/hpet.c
@@ -78,7 +78,17 @@ static struct clocksource clocksource_hpet = {
.shift = 10,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};
+
+/* XXX - FIXME: i386, x86_64 and ia64 all have separate
+ * hpet clocksource implementations. They should be merged
+ * and this would be a good place for it.
+ * Right now this is ia64 only.
+ */
+#ifdef CONFIG_IA64
static struct clocksource *hpet_clocksource;
+#else /* this isn't generic enough to use for everyone yet */
+static struct clocksource *hpet_clocksource = (struct clocksource*)0xdead;
+#endif

/* A lock for concurrent access by app and isr hpet activity. */
static DEFINE_SPINLOCK(hpet_lock);


2007-08-24 07:03:57

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Thu, 23 Aug 2007 17:38:49 -0400
"Bob Picco" <[email protected]> wrote:

> It appears ACPI discovery failed during driver initialization because
> of:
> hpet_resources: 0xfed00000 is busy

Note: this was always there as far as I can remember.

--
Paolo Ornati
Linux 2.6.23-rc3-g1a8f4610-dirty on x86_64

2007-08-24 07:05:34

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Thu, 23 Aug 2007 14:41:45 -0700
john stultz <[email protected]> wrote:

> > My initial reaction would be to either ifdef ia64 implementation in
> > drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> > really usable by all arches.
>
> Here is a possible quick fix. I'm open to other approaches, but I also
> want to avoid too much churn before 2.6.23 goes out.
>
> Paolo, could you verify this fixes the issue for you?

It works: there's only one "hpet" in "available_clocksource" and also
the hard freeze using it seems gone.

:)

--
Paolo Ornati
Linux 2.6.23-rc3-g1a8f4610-dirty on x86_64

2007-08-24 09:13:27

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Thu, 23 Aug 2007 14:05:35 -0700
john stultz <[email protected]> wrote:

> Also I'm not quite sure I see where the hard
> freeze is coming from.

Using plain 2.6.23-rc3-g1a8f4610 and "nmi_watchdog=1" I've managed to
capture a Kernel Panic (about 40 seconds after switching to hpet as
clocksource):

http://img170.imageshack.us/img170/1552/2hpetoopshw9.jpg

it's stuck in "update_wall_time()".

--
Paolo Ornati
Linux 2.6.23-rc3-g1a8f4610-dirty on x86_64

2007-08-24 12:46:46

by Bob Picco

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

john stultz wrote: [Thu Aug 23 2007, 05:41:45PM EDT]
> On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
> > On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > > I have a double "hpet" entry in "available_clocksource":
> > > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > > > tsc hpet hpet acpi_pm jiffies
> > >
> > > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> > > both register a clocksource named "hpet". Probably a result of bringing
> > > back to life a long lost patch, and having someone else (John Stultz, according
> > > to git blame) make a similar change to a different file in the intervening
> > > time.
> > >
> > > Presumably the thing to do would be merge the x86_64 specific version
> > > into the drivers/char/hpet.c version?
> >
> > Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> > duplication, but at the moment I'm not comfortable that the
> > driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> > why the shift value is only 10?).
> >
> >
> > I'm a little surprised by this, as the clocksource code use to prevent
> > duplicate named clocksources from being registered, so I'm not sure how
> > that check got dropped. Also I'm not quite sure I see where the hard
> > freeze is coming from.
> >
> > My initial reaction would be to either ifdef ia64 implementation in
> > drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> > really usable by all arches.
>
> Here is a possible quick fix. I'm open to other approaches, but I also
> want to avoid too much churn before 2.6.23 goes out.
>
> Paolo, could you verify this fixes the issue for you?
>
> thanks
> -john
>
[snip]

I saw what was missed by me in my brief examination of this last night.
The platform registers the hpet clocksource too.

Instead of adding the config flag to hpet driver, how about the patch
below? Since you already check for duplication by address then adding
a check for by name too seems okay to me.

bob


Prevent duplicate names being registered with clocksource. This also
eliminates the duplication of hpet clock registration when the arch
uses the hpet timer and the hpet driver does too. The patch was
compile and link tested.


Signed-off-by: Bob Picco <[email protected]>

kernel/time/clocksource.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc3/kernel/time/clocksource.c
===================================================================
--- linux-2.6.23-rc3.orig/kernel/time/clocksource.c 2007-08-23 16:44:03.000000000 -0400
+++ linux-2.6.23-rc3/kernel/time/clocksource.c 2007-08-24 08:36:41.000000000 -0400
@@ -281,7 +281,7 @@ static int clocksource_enqueue(struct cl
struct clocksource *cs;

cs = list_entry(tmp, struct clocksource, list);
- if (cs == c)
+ if (cs == c || !strcmp(cs->name, c->name))
return -EBUSY;
/* Keep track of the place, where to insert */
if (cs->rating >= c->rating)

2007-08-24 13:28:18

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Fri, 24 Aug 2007 08:46:31 -0400
"Bob Picco" <[email protected]> wrote:

> Prevent duplicate names being registered with clocksource. This also
> eliminates the duplication of hpet clock registration when the arch
> uses the hpet timer and the hpet driver does too. The patch was
> compile and link tested.

This one works too.

--
Paolo Ornati
Linux 2.6.23-rc3-g1a8f4610-dirty on x86_64

2007-08-24 16:04:33

by Tony Luck

[permalink] [raw]
Subject: RE: "double" hpet clocksource && hard freeze [bisected]

> > Prevent duplicate names being registered with clocksource. This also
> > eliminates the duplication of hpet clock registration when the arch
> > uses the hpet timer and the hpet driver does too. The patch was
> > compile and link tested.
>
> This one works too.

It is good to avoid registering two clocksources with the same name, but
the fix might be a bit more fragile than the eariler one that temporarily
marked the drivers/char/hpet.c one as CONFIG_IA64 only. Given that the
hang went away when you applied the earlier patch, I conclude that the
drivers/char/hpet.c code is the one that got selected when you had two
"hpet" entries ... and that there is something wrong with that code that
doesn't work right on x86_64. The fix to prevent registering a duplicate
name is presumably working for you simply because arch/x86_64/kernel/hpet.c
happens to get there first with its "hpet", so the drivers/char/hpet.c one
is dropped. If something changed that reversed the order of these registrations,
then you'd get the "hpet" clocksource that results in a hang.

-Tony

2007-08-24 16:13:44

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Fri, 24 Aug 2007 09:04:22 -0700
"Luck, Tony" <[email protected]> wrote:

> It is good to avoid registering two clocksources with the same name, but
> the fix might be a bit more fragile than the eariler one that temporarily
> marked the drivers/char/hpet.c one as CONFIG_IA64 only. Given that the
> hang went away when you applied the earlier patch, I conclude that the
> drivers/char/hpet.c code is the one that got selected when you had two
> "hpet" entries ... and that there is something wrong with that code that
> doesn't work right on x86_64. The fix to prevent registering a duplicate
> name is presumably working for you simply because arch/x86_64/kernel/hpet.c
> happens to get there first with its "hpet", so the drivers/char/hpet.c one
> is dropped. If something changed that reversed the order of these registrations,
> then you'd get the "hpet" clocksource that results in a hang.

100% agree

--
Paolo Ornati
Linux 2.6.23-rc3-g1a8f4610-dirty on x86_64

2007-08-24 18:18:19

by john stultz

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Fri, 2007-08-24 at 08:46 -0400, Bob Picco wrote:
> john stultz wrote: [Thu Aug 23 2007, 05:41:45PM EDT]
> > On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
> > > On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > > > I have a double "hpet" entry in "available_clocksource":
> > > > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > > > > tsc hpet hpet acpi_pm jiffies
> > > >
> > > > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> > > > both register a clocksource named "hpet". Probably a result of bringing
> > > > back to life a long lost patch, and having someone else (John Stultz, according
> > > > to git blame) make a similar change to a different file in the intervening
> > > > time.
> > > >
> > > > Presumably the thing to do would be merge the x86_64 specific version
> > > > into the drivers/char/hpet.c version?
> > >
> > > Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> > > duplication, but at the moment I'm not comfortable that the
> > > driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> > > why the shift value is only 10?).
> > >
> > >
> > > I'm a little surprised by this, as the clocksource code use to prevent
> > > duplicate named clocksources from being registered, so I'm not sure how
> > > that check got dropped. Also I'm not quite sure I see where the hard
> > > freeze is coming from.
> > >
> > > My initial reaction would be to either ifdef ia64 implementation in
> > > drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> > > really usable by all arches.
> >
> > Here is a possible quick fix. I'm open to other approaches, but I also
> > want to avoid too much churn before 2.6.23 goes out.
> >
> > Paolo, could you verify this fixes the issue for you?
> >
> > thanks
> > -john
> >
> [snip]
>
> I saw what was missed by me in my brief examination of this last night.
> The platform registers the hpet clocksource too.
>
> Instead of adding the config flag to hpet driver, how about the patch
> below? Since you already check for duplication by address then adding
> a check for by name too seems okay to me.
>
> bob
>
>
> Prevent duplicate names being registered with clocksource. This also
> eliminates the duplication of hpet clock registration when the arch
> uses the hpet timer and the hpet driver does too. The patch was
> compile and link tested.

Yea. While I'm still not completely comfortable leaving this up to boot
order alone (the ia64 hpet clocksource is clearly causing issues on
x86_64), I think this patch is something we need as well.

>
> Signed-off-by: Bob Picco <[email protected]>

Acked-by: John Stultz <[email protected]>

> kernel/time/clocksource.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6.23-rc3/kernel/time/clocksource.c
> ===================================================================
> --- linux-2.6.23-rc3.orig/kernel/time/clocksource.c 2007-08-23 16:44:03.000000000 -0400
> +++ linux-2.6.23-rc3/kernel/time/clocksource.c 2007-08-24 08:36:41.000000000 -0400
> @@ -281,7 +281,7 @@ static int clocksource_enqueue(struct cl
> struct clocksource *cs;
>
> cs = list_entry(tmp, struct clocksource, list);
> - if (cs == c)
> + if (cs == c || !strcmp(cs->name, c->name))
> return -EBUSY;
> /* Keep track of the place, where to insert */
> if (cs->rating >= c->rating)

2007-08-24 18:43:57

by john stultz

[permalink] [raw]
Subject: RE: "double" hpet clocksource && hard freeze [bisected]

On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
> On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > I have a double "hpet" entry in "available_clocksource":
> > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > > tsc hpet hpet acpi_pm jiffies
> >
> > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> > both register a clocksource named "hpet". Probably a result of bringing
> > back to life a long lost patch, and having someone else (John Stultz, according
> > to git blame) make a similar change to a different file in the intervening
> > time.
> >
> > Presumably the thing to do would be merge the x86_64 specific version
> > into the drivers/char/hpet.c version?
>
> Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> duplication, but at the moment I'm not comfortable that the
> driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> why the shift value is only 10?).
>
>
> I'm a little surprised by this, as the clocksource code use to prevent
> duplicate named clocksources from being registered, so I'm not sure how
> that check got dropped. Also I'm not quite sure I see where the hard
> freeze is coming from.
>
> My initial reaction would be to either ifdef ia64 implementation in
> drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> really usable by all arches.

Ok, since no one screamed too badly about this one, and it does fix this
issue, I'm sending it out for reals.

I think Bob's patch which addresses duplicate clocksources should go in,
but this one is a little more forceful in keeping the wrong hpet
clocksource from possibly being selected.

Andrew, would you mind picking this up?

thanks
-john



The ia64 hpet clocksource was implemented in generic code, and is not
yet ready to replace the i386 and x86_64 implementations. This results
in multiple hpet clocksources being registered, and if the ia64 one is
chosen on x86_64 some users have experienced hangs.

This patch only allows the generic implementation to be used on ia64,
until the duplicate i386 and x86_64 versions can be merged in and
tested.

Signed-off-by: John Stultz <[email protected]>


diff --git a/drivers/char/hpet.c b/drivers/char/hpet.c
index 77bf4aa..c782d8c 100644
--- a/drivers/char/hpet.c
+++ b/drivers/char/hpet.c
@@ -78,7 +78,17 @@ static struct clocksource clocksource_hpet = {
.shift = 10,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};
+
+/* XXX - FIXME: i386, x86_64 and ia64 all have separate
+ * hpet clocksource implementations. They should be merged
+ * and this would be a good place for it.
+ * Right now this is ia64 only.
+ */
+#ifdef CONFIG_IA64
static struct clocksource *hpet_clocksource;
+#else /* this isn't generic enough to use for everyone yet */
+static struct clocksource *hpet_clocksource = (struct clocksource*)0xdead;
+#endif

/* A lock for concurrent access by app and isr hpet activity. */
static DEFINE_SPINLOCK(hpet_lock);


Subject: Re: "double" hpet clocksource && hard freeze [bisected]

Em Fri, 24 Aug 2007 11:17:34 -0700
john stultz <[email protected]> escreveu:

| On Fri, 2007-08-24 at 08:46 -0400, Bob Picco wrote:
| > john stultz wrote: [Thu Aug 23 2007, 05:41:45PM EDT]
| > > On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
| > > > On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
| > > > > > I have a double "hpet" entry in "available_clocksource":
| > > > > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
| > > > > > tsc hpet hpet acpi_pm jiffies
| > > > >
| > > > > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
| > > > > both register a clocksource named "hpet". Probably a result of bringing
| > > > > back to life a long lost patch, and having someone else (John Stultz, according
| > > > > to git blame) make a similar change to a different file in the intervening
| > > > > time.
| > > > >
| > > > > Presumably the thing to do would be merge the x86_64 specific version
| > > > > into the drivers/char/hpet.c version?
| > > >
| > > > Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
| > > > duplication, but at the moment I'm not comfortable that the
| > > > driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
| > > > why the shift value is only 10?).
| > > >
| > > >
| > > > I'm a little surprised by this, as the clocksource code use to prevent
| > > > duplicate named clocksources from being registered, so I'm not sure how
| > > > that check got dropped. Also I'm not quite sure I see where the hard
| > > > freeze is coming from.
| > > >
| > > > My initial reaction would be to either ifdef ia64 implementation in
| > > > drivers/char/hpet.c or move the code under the ia64 arch dir until it is
| > > > really usable by all arches.
| > >
| > > Here is a possible quick fix. I'm open to other approaches, but I also
| > > want to avoid too much churn before 2.6.23 goes out.
| > >
| > > Paolo, could you verify this fixes the issue for you?
| > >
| > > thanks
| > > -john
| > >
| > [snip]
| >
| > I saw what was missed by me in my brief examination of this last night.
| > The platform registers the hpet clocksource too.
| >
| > Instead of adding the config flag to hpet driver, how about the patch
| > below? Since you already check for duplication by address then adding
| > a check for by name too seems okay to me.
| >
| > bob
| >
| >
| > Prevent duplicate names being registered with clocksource. This also
| > eliminates the duplication of hpet clock registration when the arch
| > uses the hpet timer and the hpet driver does too. The patch was
| > compile and link tested.
|
| Yea. While I'm still not completely comfortable leaving this up to boot
| order alone (the ia64 hpet clocksource is clearly causing issues on
| x86_64), I think this patch is something we need as well.

Does -stable need this too?

--
Luiz Fernando N. Capitulino

2007-08-27 21:52:29

by john stultz

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Mon, 2007-08-27 at 17:34 -0300, Luiz Fernando N. Capitulino wrote:
> Em Fri, 24 Aug 2007 11:17:34 -0700
> john stultz <[email protected]> escreveu:
>
> | On Fri, 2007-08-24 at 08:46 -0400, Bob Picco wrote:
> | > john stultz wrote: [Thu Aug 23 2007, 05:41:45PM EDT]
> | > > On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
> | > > > On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> | > > > > > I have a double "hpet" entry in "available_clocksource":
> | > > > > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> | > > > > > tsc hpet hpet acpi_pm jiffies
> | > > > >
> | > > > > Oops. If seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> | > > > > both register a clocksource named "hpet". Probably a result of bringing
> | > > > > back to life a long lost patch, and having someone else (John Stultz, according
> | > > > > to git blame) make a similar change to a different file in the intervening
> | > > > > time.
> | > > > >
> | > > > > Presumably the thing to do would be merge the x86_64 specific version
> | > > > > into the drivers/char/hpet.c version?
> | > > >
> | > > > Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> | > > > duplication, but at the moment I'm not comfortable that the
> | > > > driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> | > > > why the shift value is only 10?).
> | > > >
> | > > >
> | > > > I'm a little surprised by this, as the clocksource code use to prevent
> | > > > duplicate named clocksources from being registered, so I'm not sure how
> | > > > that check got dropped. Also I'm not quite sure I see where the hard
> | > > > freeze is coming from.
> | > > >
> | > > > My initial reaction would be to either ifdef ia64 implementation in
> | > > > drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> | > > > really usable by all arches.
> | > >
> | > > Here is a possible quick fix. I'm open to other approaches, but I also
> | > > want to avoid too much churn before 2.6.23 goes out.
> | > >
> | > > Paolo, could you verify this fixes the issue for you?
> | > >
> | > > thanks
> | > > -john
> | > >
> | > [snip]
> | >
> | > I saw what was missed by me in my brief examination of this last night.
> | > The platform registers the hpet clocksource too.
> | >
> | > Instead of adding the config flag to hpet driver, how about the patch
> | > below? Since you already check for duplication by address then adding
> | > a check for by name too seems okay to me.
> | >
> | > bob
> | >
> | >
> | > Prevent duplicate names being registered with clocksource. This also
> | > eliminates the duplication of hpet clock registration when the arch
> | > uses the hpet timer and the hpet driver does too. The patch was
> | > compile and link tested.
> |
> | Yea. While I'm still not completely comfortable leaving this up to boot
> | order alone (the ia64 hpet clocksource is clearly causing issues on
> | x86_64), I think this patch is something we need as well.
>
> Does -stable need this too?

Looking at the git log, the ia64 clocksource code didn't land until
7/20, so I don't think so.

thanks
-john


2007-08-28 06:12:47

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Mon, 27 Aug 2007 17:34:58 -0300
"Luiz Fernando N. Capitulino" <[email protected]> wrote:

> | Yea. While I'm still not completely comfortable leaving this up to boot
> | order alone (the ia64 hpet clocksource is clearly causing issues on
> | x86_64), I think this patch is something we need as well.
>
> Does -stable need this too?

No :)

--
Paolo Ornati
Linux 2.6.22.5 on x86_64

2007-08-28 08:30:50

by Clemens Ladisch

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

Luck, Tony wrote:
> [...] Given that the hang went away when you applied the earlier patch, I
> conclude that the drivers/char/hpet.c code is the one that got selected when
> you had two "hpet" entries ... and that there is something wrong with that
> code that doesn't work right on x86_64.

Apparently, the 'generic' code was just copied from ia64 and assumes that the
timer is 64 bits. This is not true with hardware from VIA (even on x86_64).

This patch should make it work (although I'd prefer to set the mask
dynamically
according to the hardware caps).

--- a/drivers/char/hpet.c Tue Aug 28 09:42:22 2007
+++ b/drivers/char/hpet.c Tue Aug 28 10:16:54 2007
@@ -73,7 +73,7 @@
.name = "hpet",
.rating = 250,
.read = read_hpet,
- .mask = CLOCKSOURCE_MASK(64),
+ .mask = CLOCKSOURCE_MASK(32),
.mult = 0, /*to be caluclated*/
.shift = 10,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,


2007-08-28 20:22:07

by Paolo Ornati

[permalink] [raw]
Subject: Re: "double" hpet clocksource && hard freeze [bisected]

On Tue, 28 Aug 2007 10:27:09 +0200
Clemens Ladisch <[email protected]> wrote:

> Luck, Tony wrote:
> > [...] Given that the hang went away when you applied the earlier patch, I
> > conclude that the drivers/char/hpet.c code is the one that got selected when
> > you had two "hpet" entries ... and that there is something wrong with that
> > code that doesn't work right on x86_64.
>
> Apparently, the 'generic' code was just copied from ia64 and assumes that the
> timer is 64 bits. This is not true with hardware from VIA (even on x86_64).

The hardware of this PC is Intel, not VIA.

>
> This patch should make it work (although I'd prefer to set the mask
> dynamically
> according to the hardware caps).
>
> --- a/drivers/char/hpet.c Tue Aug 28 09:42:22 2007
> +++ b/drivers/char/hpet.c Tue Aug 28 10:16:54 2007
> @@ -73,7 +73,7 @@
> .name = "hpet",
> .rating = 250,
> .read = read_hpet,
> - .mask = CLOCKSOURCE_MASK(64),
> + .mask = CLOCKSOURCE_MASK(32),


Anyway, I've applied it (manually... whitespace/mime damage) to -rc4 and
it seems to work, no crash so far (I'm sure I'm testing it because
plain -rc4 doesn't have the "2hpet" fix and still manifest the problem).

--
Paolo Ornati
Linux 2.6.23-rc4-dirty on x86_64