2009-06-05 03:29:53

by Feng Tang

[permalink] [raw]
Subject: [PATCH] tick: add check for the existence of broadcast clock event device

>From 2f076e1867c8bbb145b74d289358174644d9fed8 Mon Sep 17 00:00:00 2001
From: Feng Tang <[email protected]>
Date: Fri, 5 Jun 2009 10:36:15 +0800
Subject: [PATCH] tick: add check for the existence of broadcast clock event device

Some platform may have no broadcast clock event device, as it use always-on
external timers for per-cpu timer and has no extra one for broadcast device.
This check will secure the access to bc device when system get some boradcast
on/off and enter/exit message

Signed-off-by: Feng Tang <[email protected]>
---
kernel/time/tick-broadcast.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 118a3b3..110e0bc 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -214,10 +214,13 @@ static void tick_do_broadcast_on_off(void *why)

spin_lock_irqsave(&tick_broadcast_lock, flags);

+ bc = tick_broadcast_device.evtdev;
+ if (!bc)
+ goto out;
+
cpu = smp_processor_id();
td = &per_cpu(tick_cpu_device, cpu);
dev = td->evtdev;
- bc = tick_broadcast_device.evtdev;

/*
* Is the device not affected by the powerstate ?
@@ -468,6 +471,9 @@ void tick_broadcast_oneshot_control(unsigned long reason)
goto out;

bc = tick_broadcast_device.evtdev;
+ if (!bc)
+ goto out;
+
cpu = smp_processor_id();
td = &per_cpu(tick_cpu_device, cpu);
dev = td->evtdev;
--
1.5.6.3


2009-06-06 09:25:10

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Fri, 5 Jun 2009, Feng Tang wrote:

> >From 2f076e1867c8bbb145b74d289358174644d9fed8 Mon Sep 17 00:00:00 2001
> From: Feng Tang <[email protected]>
> Date: Fri, 5 Jun 2009 10:36:15 +0800
> Subject: [PATCH] tick: add check for the existence of broadcast clock event device
>
> Some platform may have no broadcast clock event device, as it use always-on
> external timers for per-cpu timer and has no extra one for broadcast device.
> This check will secure the access to bc device when system get some boradcast
> on/off and enter/exit message
>
> Signed-off-by: Feng Tang <[email protected]>
> ---
> kernel/time/tick-broadcast.c | 8 +++++++-
> 1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index 118a3b3..110e0bc 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -214,10 +214,13 @@ static void tick_do_broadcast_on_off(void *why)
>
> spin_lock_irqsave(&tick_broadcast_lock, flags);
>
> + bc = tick_broadcast_device.evtdev;
> + if (!bc)
> + goto out;
> +

This check is not necessary because we check whether the percpu device
is affected by the stops in C3 madness _before_ we touch the broadcast
device.

if (!dev || !(dev->features & CLOCK_EVT_FEAT_C3STOP))
got out;

If your percpu devices are always on (not affected by C3 stop) then
you never dereference bc. So why do we need an extra check for !bc ?

Thanks,

tglx

2009-06-06 12:48:20

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Sat, 6 Jun 2009 17:24:50 +0800
Thomas Gleixner <[email protected]> wrote:

> On Fri, 5 Jun 2009, Feng Tang wrote:
>
> > >From 2f076e1867c8bbb145b74d289358174644d9fed8 Mon Sep 17 00:00:00
> > >2001
> > From: Feng Tang <[email protected]>
> > Date: Fri, 5 Jun 2009 10:36:15 +0800
> > Subject: [PATCH] tick: add check for the existence of broadcast
> > clock event device
> >
> > Some platform may have no broadcast clock event device, as it use
> > always-on external timers for per-cpu timer and has no extra one
> > for broadcast device. This check will secure the access to bc
> > device when system get some boradcast on/off and enter/exit message
> >
> > Signed-off-by: Feng Tang <[email protected]>
> > ---
> > kernel/time/tick-broadcast.c | 8 +++++++-
> > 1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/time/tick-broadcast.c
> > b/kernel/time/tick-broadcast.c index 118a3b3..110e0bc 100644
> > --- a/kernel/time/tick-broadcast.c
> > +++ b/kernel/time/tick-broadcast.c
> > @@ -214,10 +214,13 @@ static void tick_do_broadcast_on_off(void
> > *why)
> > spin_lock_irqsave(&tick_broadcast_lock, flags);
> >
> > + bc = tick_broadcast_device.evtdev;
> > + if (!bc)
> > + goto out;
> > +
>
> This check is not necessary because we check whether the percpu device
> is affected by the stops in C3 madness _before_ we touch the broadcast
> device.
>
> if (!dev || !(dev->features & CLOCK_EVT_FEAT_C3STOP))
> got out;
>
> If your percpu devices are always on (not affected by C3 stop) then
> you never dereference bc. So why do we need an extra check for !bc ?

Hi tglx,

Thanks for the explanation. But we really ran into the NULL pointer case, in our platform, there are 2 X86 CPUs which have lapic, also it has 2 external timers which are pretty similar with HPET timers, those 2 external timers will be used as per-cpu timers (higher rating than lapic timer). In system's power cycle of suspend and resume, disable_nontboot_cpus will be called before goto suspend state,and enable_nonboot_cpus will be called for the resume process, so lapic timer of cpu1 will be first registered as per-cpu timer, and our external timer will be registered later after get a CPU_ONLINE notifier (similar with HPET), right in this time slot that lapic is the per-cpu timer, when system get the CLOCK_EVT_BROADCAST_ENTER/EXIT msg, tick_do_broadcast_on_off() is called and hit the NULL pointer case.

Our external timer driver is very similar with HPET dirver, why HPET doesn't see such an issue? becuase HPET has enough number of timers, and it use "hpet0" as the bc device, while our platform doesn't have a extra one to act as bc.

- Feng




>
> Thanks,
>
> tglx

2009-06-06 12:55:08

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

Feng,

On Sat, 6 Jun 2009, Feng Tang wrote:
> > If your percpu devices are always on (not affected by C3 stop) then
> > you never dereference bc. So why do we need an extra check for !bc ?
>
> Hi tglx,

> Thanks for the explanation. But we really ran into the NULL pointer
> case, in our platform, there are 2 X86 CPUs which have lapic, also
> it has 2 external timers which are pretty similar with HPET timers,
> those 2 external timers will be used as per-cpu timers (higher
> rating than lapic timer). In system's power cycle of suspend and
> resume, disable_nontboot_cpus will be called before goto suspend
> state,and enable_nonboot_cpus will be called for the resume process,
> so lapic timer of cpu1 will be first registered as per-cpu timer,
> and our external timer will be registered later after get a
> CPU_ONLINE notifier (similar with HPET), right in this time slot
> that lapic is the per-cpu timer, when system get the
> CLOCK_EVT_BROADCAST_ENTER/EXIT msg, tick_do_broadcast_on_off() is
> called and hit the NULL pointer case.

Ok, I can understand now why we need it. I'll apply your patch and add
some more info into the commit msg so we do not look at it in a year
and scratch our heads. :)

> Our external timer driver is very similar with HPET dirver, why HPET
> doesn't see such an issue? becuase HPET has enough number of timers,
> and it use "hpet0" as the bc device, while our platform doesn't have
> a extra one to act as bc.

Correct.

Thanks,

tglx

2009-06-06 16:19:17

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Sat, 6 Jun 2009, Thomas Gleixner wrote:
> Feng,
>
> On Sat, 6 Jun 2009, Feng Tang wrote:
> > > If your percpu devices are always on (not affected by C3 stop) then
> > > you never dereference bc. So why do we need an extra check for !bc ?
> >
> > Hi tglx,
>
> > Thanks for the explanation. But we really ran into the NULL pointer
> > case, in our platform, there are 2 X86 CPUs which have lapic, also
> > it has 2 external timers which are pretty similar with HPET timers,
> > those 2 external timers will be used as per-cpu timers (higher
> > rating than lapic timer). In system's power cycle of suspend and
> > resume, disable_nontboot_cpus will be called before goto suspend
> > state,and enable_nonboot_cpus will be called for the resume process,
> > so lapic timer of cpu1 will be first registered as per-cpu timer,
> > and our external timer will be registered later after get a
> > CPU_ONLINE notifier (similar with HPET), right in this time slot
> > that lapic is the per-cpu timer, when system get the
> > CLOCK_EVT_BROADCAST_ENTER/EXIT msg, tick_do_broadcast_on_off() is
> > called and hit the NULL pointer case.
>
> Ok, I can understand now why we need it. I'll apply your patch and add
> some more info into the commit msg so we do not look at it in a year
> and scratch our heads. :)

Hmm, thought more about it.

1) How do you calibrate the local APIC timer if you do not have some
initial timer device ?

2) If you have some initial timer device (PIT/HPET) why isn't it
registered as broadcast device.

3) When the CPU uses the local APIC before the external timer is
initialized what happens if the system goes into a deeper c-state ?

Thanks,

tglx

2009-06-08 01:59:17

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Sun, 7 Jun 2009 00:18:55 +0800
Thomas Gleixner <[email protected]> wrote:

> On Sat, 6 Jun 2009, Thomas Gleixner wrote:
> > Feng,
> >
> > On Sat, 6 Jun 2009, Feng Tang wrote:
> > > > If your percpu devices are always on (not affected by C3 stop)
> > > > then you never dereference bc. So why do we need an extra check
> > > > for !bc ?
> > >
> > > Hi tglx,
> >
> > > Thanks for the explanation. But we really ran into the NULL
> > > pointer case, in our platform, there are 2 X86 CPUs which have
> > > lapic, also it has 2 external timers which are pretty similar
> > > with HPET timers, those 2 external timers will be used as per-cpu
> > > timers (higher rating than lapic timer). In system's power cycle
> > > of suspend and resume, disable_nontboot_cpus will be called
> > > before goto suspend state,and enable_nonboot_cpus will be called
> > > for the resume process, so lapic timer of cpu1 will be first
> > > registered as per-cpu timer, and our external timer will be
> > > registered later after get a CPU_ONLINE notifier (similar with
> > > HPET), right in this time slot that lapic is the per-cpu timer,
> > > when system get the CLOCK_EVT_BROADCAST_ENTER/EXIT msg,
> > > tick_do_broadcast_on_off() is called and hit the NULL pointer
> > > case.
> >
> > Ok, I can understand now why we need it. I'll apply your patch and
> > add some more info into the commit msg so we do not look at it in a
> > year and scratch our heads. :)
>
> Hmm, thought more about it.
>
> 1) How do you calibrate the local APIC timer if you do not have some
> initial timer device ?
Yes, we have external timer device with the name "apbt"
>
> 2) If you have some initial timer device (PIT/HPET) why isn't it
> registered as broadcast device.
We only have 2 available apbt, apbt0 for cpu0 and apbt1 for cpu1. And based on my understanding, there is no explicit API to register a timer as bc, and bc is setup during clockevents_register_device() --> tick_check_new_device() -->tick_check_broadcast_device(), and HPET case can just meet the condition: lapic's rating is 110, timer "hpet"is 50 and set to be the broadcast device, and other hpet(2/3/4/5..)'s are 150 and set to be the per-cpu timer

>
> 3) When the CPU uses the local APIC before the external timer is
> initialized what happens if the system goes into a deeper c-state ?
Good point, for cpu1, this is not an issue, as cpu0 will register apbt1 for cpu1. there is a window cpu1 is in deep c-idle state and no timer wake it up, but window is small as cpu0 will send IPI soon.
>
> Thanks,
>
> tglx

2009-06-08 05:44:00

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Mon, 8 Jun 2009, Feng Tang wrote:
> > 1) How do you calibrate the local APIC timer if you do not have some
> > initial timer device ?
> Yes, we have external timer device with the name "apbt"

So that timer is initialized and registered before the local apic,
right ?

Why is the local APIC timer used at all ? The apbt timer should have a
higher rating as the local APIC timer, so when APIC is registered it
is not selected and the check in the broadcast functions

!(dev->features & CLOCK_EVT_FEAT_C3STOP))

should protect you because that bit is not set on your apbt device.

Thanks,

tglx

2009-06-08 06:14:06

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Mon, 8 Jun 2009 13:43:43 +0800
Thomas Gleixner <[email protected]> wrote:

> On Mon, 8 Jun 2009, Feng Tang wrote:
> > > 1) How do you calibrate the local APIC timer if you do not have
> > > some initial timer device ?
> > Yes, we have external timer device with the name "apbt"
>
> So that timer is initialized and registered before the local apic,
> right ?
Hi tglx,

I would describe a little more about our apbt timer, and I believe the
complete driver will be posted by my colleague after it get into a good shape.

Our apbt driver is pretty similar with HPET's, including its cpu hotplug
notifier. But our platform only has 2 available apbt to use, otherwise we will
configure it just like HPET, using one timer as bc and others for per-cpu ones,
then it won't hit this case

There are 2 situations, one is for the normal boot, apbt0 will be inited first
and registered to OS as cpu0's timer, then tsc/lapic is calculated based on it,
and apbt1 is registered later in a fs_initcall() (just like hpet.c does) after basic
kernel core is up. so the sequence is:
apbt0 --> lapic0 --> lapic1 --> apbt1

The other situation is in a suspend/resume cycle, in which disable/enable_nonboot_cpus()
are called, in this case, the apbt0 is alwasy there and not unregistered for cpu0, but
apbt1 is first dumped from cpu1, and in the resume process, lapic1 is first inited and
registered in the cpu_up() call, and apbt1 initialization is put into a workqueue after
got a CPU_ONLINE notifier. so for the cpu1 after resume, lapic1 first and then apbt1,
if the BROADCAST_ENTER/EXIT comes in between, the NULL pointer case is triggered.

Thanks,
Feng
>
> Why is the local APIC timer used at all ? The apbt timer should have a
> higher rating as the local APIC timer, so when APIC is registered it
> is not selected and the check in the broadcast functions
>
> !(dev->features & CLOCK_EVT_FEAT_C3STOP))
>
> should protect you because that bit is not set on your apbt device.
>
> Thanks,
>
> tglx

2009-06-08 06:33:32

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

Feng,

On Mon, 8 Jun 2009, Feng Tang wrote:
> Our apbt driver is pretty similar with HPET's, including its cpu hotplug
> notifier. But our platform only has 2 available apbt to use, otherwise we will
> configure it just like HPET, using one timer as bc and others for per-cpu ones,
> then it won't hit this case
>
> There are 2 situations, one is for the normal boot, apbt0 will be inited first
> and registered to OS as cpu0's timer, then tsc/lapic is calculated based on it,
> and apbt1 is registered later in a fs_initcall() (just like hpet.c does) after basic
> kernel core is up. so the sequence is:
> apbt0 --> lapic0 --> lapic1 --> apbt1

Hmm, I do not like that at all. That explicitely relies on CPU0 doing
some work which will kick CPU1. That's fragile as hell.

Why do you want to make the CPU1 case special? You setup apbt0 before
you setup local APIC on CPU0, so why can't you do the same for apbt1
on CPU1 ? That will also remove the complete hotplug logic from your
code.

Thanks,

tglx

2009-06-08 06:48:39

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Mon, 8 Jun 2009 14:33:14 +0800
Thomas Gleixner <[email protected]> wrote:

> Feng,
>
> On Mon, 8 Jun 2009, Feng Tang wrote:
> > Our apbt driver is pretty similar with HPET's, including its cpu
> > hotplug notifier. But our platform only has 2 available apbt to
> > use, otherwise we will configure it just like HPET, using one timer
> > as bc and others for per-cpu ones, then it won't hit this case
> >
> > There are 2 situations, one is for the normal boot, apbt0 will be
> > inited first and registered to OS as cpu0's timer, then tsc/lapic
> > is calculated based on it, and apbt1 is registered later in a
> > fs_initcall() (just like hpet.c does) after basic kernel core is
> > up. so the sequence is: apbt0 --> lapic0 --> lapic1 --> apbt1
>
> Hmm, I do not like that at all. That explicitely relies on CPU0 doing
> some work which will kick CPU1. That's fragile as hell.
I understand the concern. apbt0 is inited in a very early boot phase when
the cpu1 is not up yet, and os don't even know wether there is a cpu1, that's
why we registered apbt1 in fs_initcall(). If we explicitly setup apbt1 when
OS brings up cpu1, it is a little brutal and not generic as only our platform
has apbt, and I guess cpu hotplug maintainer won't like it :p

As I said if we have one more apbt, we will make it totally follow the HPET way,
sad thing is we don't have it :(

Thanks,
Feng

>
> Why do you want to make the CPU1 case special? You setup apbt0 before
> you setup local APIC on CPU0, so why can't you do the same for apbt1
> on CPU1 ? That will also remove the complete hotplug logic from your
> code.
>
> Thanks,
>
> tglx

2009-06-08 07:00:36

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event device

On Mon, 8 Jun 2009, Feng Tang wrote:
> On Mon, 8 Jun 2009 14:33:14 +0800 Thomas Gleixner <[email protected]> wrote:
> > On Mon, 8 Jun 2009, Feng Tang wrote:
> > > Our apbt driver is pretty similar with HPET's, including its cpu
> > > hotplug notifier. But our platform only has 2 available apbt to
> > > use, otherwise we will configure it just like HPET, using one timer
> > > as bc and others for per-cpu ones, then it won't hit this case
> > >
> > > There are 2 situations, one is for the normal boot, apbt0 will be
> > > inited first and registered to OS as cpu0's timer, then tsc/lapic
> > > is calculated based on it, and apbt1 is registered later in a
> > > fs_initcall() (just like hpet.c does) after basic kernel core is
> > > up. so the sequence is: apbt0 --> lapic0 --> lapic1 --> apbt1
> >
> > Hmm, I do not like that at all. That explicitely relies on CPU0 doing
> > some work which will kick CPU1. That's fragile as hell.
>
> I understand the concern. apbt0 is inited in a very early boot phase when
> the cpu1 is not up yet, and os don't even know wether there is a cpu1, that's
> why we registered apbt1 in fs_initcall(). If we explicitly setup apbt1 when
> OS brings up cpu1, it is a little brutal and not generic as only our platform
> has apbt, and I guess cpu hotplug maintainer won't like it :p

Why is that a problem ? You already have a special case for apbt0 in
the early setup code. So where is the problem when you have an apbt1
init call on CPU1 _before_ the local APIC is initialized on CPU1.

That's definitely saner than relying on magic IPI wakeups.

Thanks,

tglx

2009-06-08 07:48:44

by Feng Tang

[permalink] [raw]
Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event device



>-----Original Message-----
>From: Thomas Gleixner [mailto:[email protected]]
>Sent: 2009??6??8?? 15:00
>To: Tang, Feng
>Cc: [email protected]; [email protected]; Li, Shaohua; Pan, Jacob jun
>Subject: Re: [PATCH] tick: add check for the existence of broadcast clock event
>device
>
>On Mon, 8 Jun 2009, Feng Tang wrote:
>> On Mon, 8 Jun 2009 14:33:14 +0800 Thomas Gleixner <[email protected]> wrote:
>> > On Mon, 8 Jun 2009, Feng Tang wrote:
>> > > Our apbt driver is pretty similar with HPET's, including its cpu
>> > > hotplug notifier. But our platform only has 2 available apbt to
>> > > use, otherwise we will configure it just like HPET, using one timer
>> > > as bc and others for per-cpu ones, then it won't hit this case
>> > >
>> > > There are 2 situations, one is for the normal boot, apbt0 will be
>> > > inited first and registered to OS as cpu0's timer, then tsc/lapic
>> > > is calculated based on it, and apbt1 is registered later in a
>> > > fs_initcall() (just like hpet.c does) after basic kernel core is
>> > > up. so the sequence is: apbt0 --> lapic0 --> lapic1 --> apbt1
>> >
>> > Hmm, I do not like that at all. That explicitely relies on CPU0 doing
>> > some work which will kick CPU1. That's fragile as hell.
>>
>> I understand the concern. apbt0 is inited in a very early boot phase when
>> the cpu1 is not up yet, and os don't even know wether there is a cpu1, that's
>> why we registered apbt1 in fs_initcall(). If we explicitly setup apbt1 when
>> OS brings up cpu1, it is a little brutal and not generic as only our platform
>> has apbt, and I guess cpu hotplug maintainer won't like it :p
>
>Why is that a problem ? You already have a special case for apbt0 in
>the early setup code. So where is the problem when you have an apbt1
>init call on CPU1 _before_ the local APIC is initialized on CPU1.

Yes, your proposal makes good sense. But our apbt0 initialization is an x86 legacy one like HPET and is put into the same position where hpet_enable() exists, it keeps the impact to x86 arch dependent code be minimal, if we put the apbt1 init code into x86's cpu initialization code, I think it will cause much more complains, that's why we chose to follow hpet's method to use delayed init.

The reason we proposed to add pointer check is there are already such checks in the broadcast code, adding these won't do harm to existing code.

Thanks,
Feng
>
>That's definitely saner than relying on magic IPI wakeups.
>
>Thanks,
>
> tglx
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2009-06-08 13:42:17

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event device

Feng,

can you please fix your mail client to do proper line breaks around 78
characters ?

On Mon, 8 Jun 2009, Tang, Feng wrote:
> >From: Thomas Gleixner [mailto:[email protected]]
> >Why is that a problem ? You already have a special case for apbt0 in
> >the early setup code. So where is the problem when you have an apbt1
> >init call on CPU1 _before_ the local APIC is initialized on CPU1.

> Yes, your proposal makes good sense. But our apbt0 initialization is
> an x86 legacy one like HPET and is put into the same position where
> hpet_enable() exists, it keeps the impact to x86 arch dependent code
> be minimal, if we put the apbt1 init code into x86's cpu
> initialization code, I think it will cause much more complains,
> that's why we chose to follow hpet's method to use delayed init.

I understand that, but HPET does not rely on some magic events
happening. That's what I'm worried about. The boot code is fragile and
I prefer some explicit setup call over a fragile solution which
happens to work.

There is not much to complain about a platform specific function call
to set up special devices if there is a requirement for a call order.

In fact you can avoid setting up the local APIC timer at all. So what
you want is something like the patch below. You can set the
setup_secondary_clock pointer in the quirks structure when you detect
that you are running on such a system.

> The reason we proposed to add pointer check is there are already
> such checks in the broadcast code, adding these won't do harm to
> existing code.

I have no objections against the NULL pointer check in the broadcast
code, but I really wanted to understand why it's necessary. The patch
papers over the real problem and just works because CPU0 sends an
IPI. The correct solution is to use the real timer right away and not
rely on magic wakeups via IPIs. Ok ?

Thanks,

tglx
---
Subject: x86: add quirk for secondary clock setup
From: Thomas Gleixner <[email protected]>
Date: Mon, 08 Jun 2009 13:05:35 +0200

Some newer platforms require to use a separate per cpu clock event
device instead of the local APIC timer. Allow them to override the
setup for the secondary clock with a quirk.

Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/include/asm/setup.h | 1 +
arch/x86/kernel/smpboot.c | 13 ++++++++++++-
2 files changed, 13 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/include/asm/setup.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/setup.h
+++ linux-2.6/arch/x86/include/asm/setup.h
@@ -31,6 +31,7 @@ struct x86_quirks {
void (*smp_read_mpc_oem)(struct mpc_oemtable *oemtable,
unsigned short oemsize);
int (*setup_ioapic_ids)(void);
+ void (*setup_secondary_clock)(void);
};

extern void x86_quirk_pre_intr_init(void);
Index: linux-2.6/arch/x86/kernel/smpboot.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/smpboot.c
+++ linux-2.6/arch/x86/kernel/smpboot.c
@@ -263,6 +263,17 @@ static void __cpuinit smp_callin(void)
}

/*
+ * Setup secondary clock
+ */
+notrace static void __cpuinit __setup_secondary_clock(void)
+{
+ if (x86_quirks->setup_secondary_clock)
+ x86_quirks->setup_secondary_clock();
+ else
+ setup_secondary_clock();
+}
+
+/*
* Activate a secondary processor.
*/
notrace static void __cpuinit start_secondary(void *unused)
@@ -323,7 +334,7 @@ notrace static void __cpuinit start_seco
/* enable local interrupts */
local_irq_enable();

- setup_secondary_clock();
+ __setup_secondary_clock();

wmb();
cpu_idle();

2009-06-09 00:21:46

by Jacob Pan

[permalink] [raw]
Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event device

>I understand that, but HPET does not rely on some magic events
>happening. That's what I'm worried about. The boot code is fragile and
>I prefer some explicit setup call over a fragile solution which
>happens to work.
>
>There is not much to complain about a platform specific function call
>to set up special devices if there is a requirement for a call order.
>
>In fact you can avoid setting up the local APIC timer at all. So what
>you want is something like the patch below. You can set the
>setup_secondary_clock pointer in the quirks structure when you detect
>that you are running on such a system.
>
[[JPAN]] Hi Thomas, I have been developing the APB timer driver as Feng mentioned, thank you for the suggestions.
I agree with you that direct setting up secondary clockevent is a better solution.

But maybe I misunderstand the HPET code, isn't it true that per CPU HPET timer also rely on IPI? In hpet_cpuhp_notify().

Also for using x86_quirks, I think the default quirks are more generic in the sense of x86 platform, but here we are really choosing timer device, so can we switch based on availability of the timer devices instead of quirks? Otherwise, if more platforms share the same setup_secondary_clock() quirk, we would have to check for timer devices anyway.

Jacob

2009-06-09 08:19:28

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event device

On Mon, 8 Jun 2009, Pan, Jacob jun wrote:
> >I understand that, but HPET does not rely on some magic events
> >happening. That's what I'm worried about. The boot code is fragile and
> >I prefer some explicit setup call over a fragile solution which
> >happens to work.
> >
> >There is not much to complain about a platform specific function call
> >to set up special devices if there is a requirement for a call order.
> >
> >In fact you can avoid setting up the local APIC timer at all. So what
> >you want is something like the patch below. You can set the
> >setup_secondary_clock pointer in the quirks structure when you detect
> >that you are running on such a system.
> >

> [[JPAN]] Hi Thomas, I have been developing the APB timer driver as
> Feng mentioned, thank you for the suggestions. I agree with you
> that direct setting up secondary clockevent is a better solution.
> But maybe I misunderstand the HPET code, isn't it true that per CPU
> HPET timer also rely on IPI? In hpet_cpuhp_notify().

Well the difference is, that the HPET setup always has a functional
clock event device (local APIC + the broadcast fallback) and it does
not depend on that IPI to make progress.

> Also for using x86_quirks, I think the default quirks are more
> generic in the sense of x86 platform, but here we are really
> choosing timer device, so can we switch based on availability of the
> timer devices instead of quirks? Otherwise, if more platforms share
> the same setup_secondary_clock() quirk, we would have to check for
> timer devices anyway.

The quirks mechanism is exactly the right thing. It provides platforms
and I consider your system a platform which has neither PIT nor HPET a
mechanism to override the default implementation. And overriding the
default local APIC timer setup is exaclty what you want to do.

Thanks,

tglx

2009-06-09 12:50:07

by Jacob Pan

[permalink] [raw]
Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event device

Thanks for the explanations, I will add the secondary clock code to the
x86_quirks. It should be submitted for review soon.

Jacob

>-----Original Message-----
>From: Thomas Gleixner [mailto:[email protected]]
>Sent: Tuesday, June 09, 2009 1:19 AM
>To: Pan, Jacob jun
>Cc: Tang, Feng; [email protected]; [email protected]; Li, Shaohua
>Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event
>device
>
>On Mon, 8 Jun 2009, Pan, Jacob jun wrote:
>> >I understand that, but HPET does not rely on some magic events
>> >happening. That's what I'm worried about. The boot code is fragile and
>> >I prefer some explicit setup call over a fragile solution which
>> >happens to work.
>> >
>> >There is not much to complain about a platform specific function call
>> >to set up special devices if there is a requirement for a call order.
>> >
>> >In fact you can avoid setting up the local APIC timer at all. So what
>> >you want is something like the patch below. You can set the
>> >setup_secondary_clock pointer in the quirks structure when you detect
>> >that you are running on such a system.
>> >
>
>> [[JPAN]] Hi Thomas, I have been developing the APB timer driver as
>> Feng mentioned, thank you for the suggestions. I agree with you
>> that direct setting up secondary clockevent is a better solution.
>> But maybe I misunderstand the HPET code, isn't it true that per CPU
>> HPET timer also rely on IPI? In hpet_cpuhp_notify().
>
>Well the difference is, that the HPET setup always has a functional
>clock event device (local APIC + the broadcast fallback) and it does
>not depend on that IPI to make progress.
>
>> Also for using x86_quirks, I think the default quirks are more
>> generic in the sense of x86 platform, but here we are really
>> choosing timer device, so can we switch based on availability of the
>> timer devices instead of quirks? Otherwise, if more platforms share
>> the same setup_secondary_clock() quirk, we would have to check for
>> timer devices anyway.
>
>The quirks mechanism is exactly the right thing. It provides platforms
>and I consider your system a platform which has neither PIT nor HPET a
>mechanism to override the default implementation. And overriding the
>default local APIC timer setup is exaclty what you want to do.
>
>Thanks,
>
> tglx

2009-06-09 16:53:30

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [PATCH] tick: add check for the existence of broadcast clock event device

On Tue, 9 Jun 2009, Pan, Jacob jun wrote:

> Thanks for the explanations, I will add the secondary clock code to the
> x86_quirks. It should be submitted for review soon.

Please take the patch I sent in the thread and make use of it in your
platform support. You just need set your function pointer in
x86_quirks->setup_secondary_clock when you detect on which platform
you are running.

Thanks,

tglx