2007-05-06 20:56:15

by Thomas Gleixner

Subject: [PATCH] x86-64 highres/dyntick support

I'm pleased to announce the first cut of the final x86_64
highres/dyntick support, which I did based on Chris Wright's patch set,
which is again based on Arjan van de Ven's initial work:

http://www.tglx.de/projects/hrtimers/2.6.21-git2-x86-64/patches-2.6.21-git2.patch.bz2

Broken out version:

http://www.tglx.de/projects/hrtimers/2.6.21-git2-x86-64/patches-2.6.21-git2.tar.bz2

It applies on top of 2.6.21-git2 and contains the following patches:

# Andi's x86_64 queue (already in -mm and pending mainline merges)
x86_64-2.6.21-git2.patch

# Outstanding fixups to highres/dyntick core and i386
# (-mm and mainline pending)
highres-dyntick-avoid-xtime-lock-contention.patch
acpi-keep-tsc-stable-when-lapic-timer-c2-ok-is-set.patch
clocksource-fix-resume-logic.patch
clockevents-fix-resume-logic.patch

# x86_64 dyntick support
x86-64-untangle-hpet-headers.patch
x86-64-drive-set-rtc-mss.patch
i386-move-pit-setup-to-i8253-h.patch
x86-64-remove-dead-code-tsc-c.patch
x86-64-convert-to-clockevents.patch
x86-64-prepare-idle-for-dyntick.patch
x86-64-enable-highres-dyntick.patch

The x86-64-convert-to-clockevents.patch is rather large, but there is no
way to do this incrementally. The clockevents conversion has to be done in
one go.

The x86-64 clockevents patch set overall summary is:

22 files changed, 631 insertions(+), 1199 deletions(-)

due to sharing the code of PIT and HPET with i386.

I did not dare to tackle sharing apic.c yet, but there is definitely a
chance to get this done some day when I get bored and do a:
# mkdir arch/x86 :)

I'm going to post the x86_64 set for review to LKML once the outstanding
highres/dyntick fixups have hit mainline or -mm, respectively.

To create a highres / dyntick enabled kernel for x86_64:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.21.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.21-git2.bz2
http://www.tglx.de/projects/hrtimers/2.6.21-git2-x86-64/patches-2.6.21-git2.patch.bz2


Comments, bug reports, patches are welcome as usual

Thanks,

tglx



2007-05-07 09:16:40

by Nicolas Mailhot

Subject: Re: [PATCH] x86-64 highres/dyntick support

Thomas Gleixner <tglx <at> linutronix.de> writes:

>
> I'm pleased to announce the first cut of the final x86_64
> highres/dyntick support, which I did based on Chris Wright's patch set,
> which is again based on Arjan van de Ven's initial work:

Do you have a 2.6.21-mm1 patchset?

Regards,

--
Nicolas Mailhot




2007-05-07 13:13:17

by Mats Johannesson

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Sun May 06 2007 - Europe Evening Time Thomas Gleixner wrote:

> I'm pleased to announce the first cut of the final x86_64
> highres/dyntick support, which I did based on Chris Wright's patch
> set, which is again based on Arjan van de Ven's initial work:
[...]
> Comments, bug reports, patches are welcome as usual

Are questions welcome? Then I'd ask: "What are the _minimal_ CPU
requirements to gain anything (eg less power consumption) with
dyntick?"

I ask because of a trial round with Chris Wright's patch set on a
fresh battery, idle system outside X with wifi card shut off and HZ
set to 100 (from my normal 1000):

root@sleipner:~# ls -l battest-the-new-battery/*
battest-the-new-battery/dyn-100hz-2.6.21:
total 4
-rw-r--r-- 1 root root 0 2007-04-27 00:50 start
-rw-r--r-- 1 root root 0 2007-04-27 03:54 stop
-rwxr-xr-x 1 root root 72 2007-04-26 22:16 test-batt.bash

battest-the-new-battery/plain-2.6.21:
total 4
-rw-r--r-- 1 root root 0 2007-04-27 13:16 start
-rw-r--r-- 1 root root 0 2007-04-27 16:22 stop
-rwxr-xr-x 1 root root 72 2007-04-26 22:16 test-batt.bash
root@sleipner:~#

The script just touched the "stop" file at 2-minute intervals until
the machine died. As the results show, within plus/minus 2 minutes there
is absolutely no difference.
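
As a rough cross-check of the numbers above, the run time of each test is the gap between the mtime of "start" and the last mtime of "stop". The C sketch below does that arithmetic for the two listings. The function names are made up for illustration, and it assumes both timestamps fall on the same day.

```c
#include <assert.h>

/* Minutes elapsed between two same-day wall-clock times (HH:MM). */
int minutes_between(int start_h, int start_m, int stop_h, int stop_m)
{
    int start = start_h * 60 + start_m;
    int stop = stop_h * 60 + stop_m;

    /* Assumption: the run does not cross midnight. */
    assert(stop >= start);
    return stop - start;
}

/* Returns 1 if two run times differ by no more than the sampling interval. */
int within_resolution(int run_a_min, int run_b_min, int interval_min)
{
    int diff = run_a_min - run_b_min;

    if (diff < 0)
        diff = -diff;
    return diff <= interval_min;
}
```

With the timestamps above, minutes_between(0, 50, 3, 54) gives 184 minutes for the dyntick run and minutes_between(13, 16, 16, 22) gives 186 minutes for the plain run, which is inside the 2-minute sampling interval.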

This AMD 64 Mobile processor only has a C1 level which isn't used:

root@sleipner:~# cat /proc/acpi/processor/CPU0/power
active state: C1
max_cstate: C8
bus master activity: 00000000
maximum allowed latency: 2000 usec
states:
*C1: type[C1] promotion[--] demotion[--]
latency[000] usage[00000000] duration[00000000000000000000]

But shouldn't the kernel 'hlt' routine, or whatever it's called,
work in conjunction with dyntick to achieve... something...?

CPU markings are:

Mobile AMD Athlon 64
AMA3400BEX5AR 1169004L40404
CAAZC 0451APMW 2001 AMD
Assembled in Malaysia

root@sleipner:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 4
model name : AMD Athlon(tm) 64 Processor 3400+
stepping : 10
cpu MHz : 800.000
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm
3dnowext 3dnow
bogomips : 1601.73
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

Mvh,
Mats

2007-05-07 15:10:42

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Mon, 2007-05-07 at 14:07 +0200, Mats Johannesson wrote:
> This AMD 64 Mobile processor only has a C1 level which isn't used:
>
> root@sleipner:~# cat /proc/acpi/processor/CPU0/power
> active state: C1
> max_cstate: C8
> bus master activity: 00000000
> maximum allowed latency: 2000 usec
> states:
> *C1: type[C1] promotion[--] demotion[--]
> latency[000] usage[00000000] duration[00000000000000000000]
>
> But shouldn't the kernel 'hlt' routine, or whatever it's called,
> work in conjunction with dyntick to achieve... something...?

To make real power savings from dynticks you need deeper power states in
the CPU. Dyntick can give the idle state code an idea how long the sleep
is going to be, so this code can decide to go into deeper power states
in one go rather than stepping down over time. On a CPU which has no
deeper C states the power saving of dynticks is probably not even
measurable.
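
To illustrate the point about going into a deeper state in one go, here is a minimal sketch of an idle-state picker that uses the expected sleep length. This is not the kernel's idle code: the structure, the function name and the idea of a per-state break-even residency are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one idle state; not a kernel structure. */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;   /* time needed to wake up again */
    unsigned int min_residency_us;  /* sleep length at which the state pays off */
};

/*
 * Pick the deepest state whose wake-up latency is tolerable and whose
 * break-even residency fits into the predicted sleep length.
 * Assumption: states are ordered from shallowest to deepest, with
 * latency and residency growing along the way. Index 0 is the fallback.
 */
size_t pick_idle_state(const struct idle_state *states, size_t count,
                       unsigned int predicted_sleep_us,
                       unsigned int max_latency_us)
{
    size_t best = 0;

    assert(states != NULL && count > 0);
    for (size_t i = 1; i < count; i++) {
        if (states[i].exit_latency_us > max_latency_us)
            break;
        if (states[i].min_residency_us > predicted_sleep_us)
            break;
        best = i;
    }
    return best;
}
```

With a periodic tick the predicted sleep never exceeds one tick period, so the picker rarely gets past the shallow states. With dynticks the prediction can be much longer, but on a CPU that only offers C1 the table has a single entry and the answer is always index 0.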

tglx


2007-05-07 15:25:43

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Mon, 2007-05-07 at 09:16 +0000, Nicolas Mailhot wrote:
> Thomas Gleixner <tglx <at> linutronix.de> writes:
>
> >
> > I'm pleased to announce the first cut of the final x86_64
> > highres/dyntick support, which I did based on Chris Wright's patch set,
> > which is again based on Arjan van de Ven's initial work:
>
> Do you have a 2.6.21-mm1 patchset?

Not yet, but it should be halfway easy to do one. Stay tuned.

tglx


2007-05-07 16:32:48

by Chris Wright

Subject: Re: [PATCH] x86-64 highres/dyntick support

* Thomas Gleixner ([email protected]) wrote:
> I'm pleased to announce the first cut of the final x86_64
> highres/dyntick support, which I did based on Chris Wright's patch set,
> which is again based on Arjan van de Ven's initial work:

Cool. I had finished mine as well, just didn't get time to polish
and resubmit. Going through to see where we differ and if there's
any bits of my set that yours needs.

thanks,
-chris

2007-05-07 16:43:22

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Mon, 2007-05-07 at 09:31 -0700, Chris Wright wrote:
> * Thomas Gleixner ([email protected]) wrote:
> > I'm pleased to announce the first cut of the final x86_64
> > highres/dyntick support, which I did based on Chris Wright's patch set,
> > which is again based on Arjan van de Ven's initial work:
>
> Cool. I had finished mine as well, just didn't get time to polish
> and resubmit.

Hrmpf, you could have saved my weekend :)

> Going through to see where we differ and if there's
> any bits of my set that yours needs.

Yes please.

tglx


2007-05-07 17:20:17

by Chris Wright

Subject: Re: [PATCH] x86-64 highres/dyntick support

* Thomas Gleixner ([email protected]) wrote:
> On Mon, 2007-05-07 at 09:31 -0700, Chris Wright wrote:
> > * Thomas Gleixner ([email protected]) wrote:
> > > I'm pleased to announce the first cut of the final x86_64
> > > highres/dyntick support, which I did based on Chris Wright's patch set,
> > > which is again based on Arjan van de Ven's initial work:
> >
> > Cool. I had finished mine as well, just didn't get time to polish
> > and resubmit.
>
> Hrmpf, you could have saved my weekend :)

Yeah, timing was just off, sorry about that. I'm sure it'll improve
the end result ;-)

thanks,
-chris

2007-05-07 22:45:59

by Pallipadi, Venkatesh

Subject: Re: [PATCH] x86-64 highres/dyntick support


Needs this minor fix to build on i386.

Signed-off-by: Venkatesh Pallipadi <[email protected]>

Index: linux-2.6.21-tolkml/include/asm-i386/hpet.h
===================================================================
--- linux-2.6.21-tolkml.orig/include/asm-i386/hpet.h 2007-05-07 14:32:37.000000000 -0700
+++ linux-2.6.21-tolkml/include/asm-i386/hpet.h 2007-05-07 14:43:54.000000000 -0700
@@ -71,17 +71,6 @@
#ifdef CONFIG_X86_64
/* hpet_readl/writel defines */
#include <asm/vsyscall.h>
-
-#else
-static inline unsigned long hpet_readl(unsigned long a)
-{
- return readl(hpet_virt_address + a);
-}
-
-static inline void hpet_writel(unsigned long d, unsigned long a)
-{
- writel(d, hpet_virt_address + a);
-}
#endif

#ifdef CONFIG_HPET_EMULATE_RTC
Index: linux-2.6.21-tolkml/arch/i386/kernel/hpet.c
===================================================================
--- linux-2.6.21-tolkml.orig/arch/i386/kernel/hpet.c 2007-05-07 14:31:34.000000000 -0700
+++ linux-2.6.21-tolkml/arch/i386/kernel/hpet.c 2007-05-07 14:43:36.000000000 -0700
@@ -47,6 +47,16 @@

static void __iomem * hpet_virt_address;

+static inline unsigned long hpet_readl(unsigned long a)
+{
+ return readl(hpet_virt_address + a);
+}
+
+static inline void hpet_writel(unsigned long d, unsigned long a)
+{
+ writel(d, hpet_virt_address + a);
+}
+
static inline void hpet_set_mapping(void)
{
hpet_virt_address = ioremap_nocache(hpet_address, HPET_MMAP_SIZE);

2007-05-07 23:13:02

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Mon, 2007-05-07 at 15:43 -0700, Venki Pallipadi wrote:
> Needs this minor fix to build on i386.

Ouch.

Should have compiled i386 myself once more. There is another fixlet
missing in x86_64, which was caused by my inability to cope with the
intelligence of quilt.

Updated patches uploaded to:

http://www.tglx.de/projects/hrtimers/2.6.21-git2-x86-64/

tglx


2007-05-08 09:40:37

by Chris Wright

Subject: Re: [PATCH] x86-64 highres/dyntick support

* Thomas Gleixner ([email protected]) wrote:
> On Mon, 2007-05-07 at 09:31 -0700, Chris Wright wrote:
> > Going through to see where we differ and if there's
> > any bits of my set that yours needs.
>
> Yes please.

OK, looks very similar all things considered. One thing I didn't do
was fix lapic timer calibration (was hoping you'd do that part, and you
did ;-) I've noticed that something has changed and I'm seeing irq0
handled on cpu3 (4 cpu system), where it used to be on cpu0 as expected.
In addition lapic timer is firing there, and I'm seeing a higher
interrupt load than I used to. This is the same in your set and mine.
Following is a small set of patches that were the more obvious bits.

thanks,
-chris

2007-05-08 09:43:06

by Chris Wright

Subject: [PATCH 1/5] x86_64: tsc compile fix


Signed-off-by: Chris Wright <[email protected]>
---
Just double checked, this is already picked up in -v2 patch.

arch/x86_64/kernel/tsc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- tglx.orig/arch/x86_64/kernel/tsc.c
+++ tglx/arch/x86_64/kernel/tsc.c
@@ -84,7 +84,7 @@ static struct notifier_block time_cpufre
static int __init cpufreq_tsc(void)
{
cpufreq_register_notifier(&time_cpufreq_notifier_block,
- CPUFREQ_TRANSITION_NOTIFIER))
+ CPUFREQ_TRANSITION_NOTIFIER);
return 0;
}

2007-05-08 09:44:13

by Chris Wright

Subject: [PATCH 2/5] x86_64: __setup_APIC_LVTT whitespace fix

Completely trivial one.

Signed-off-by: Chris Wright <[email protected]>
---
arch/x86_64/kernel/apic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- tglx.orig/arch/x86_64/kernel/apic.c
+++ tglx/arch/x86_64/kernel/apic.c
@@ -779,7 +779,7 @@ void __init init_apic_mappings(void)

#define APIC_DIVISOR 16

-static void __setup_APIC_LVTT(unsigned int clocks, int oneshot,int irqen)
+static void __setup_APIC_LVTT(unsigned int clocks, int oneshot, int irqen)
{
unsigned int lvtt_value, tmp_value;

2007-05-08 09:45:15

by Thomas Gleixner

Subject: Re: [PATCH 1/5] x86_64: tsc compile fix

On Tue, 2007-05-08 at 02:42 -0700, Chris Wright wrote:
> Signed-off-by: Chris Wright <[email protected]>
> ---
> Just double checked, this is already picked up in -v2 patch.
>
> arch/x86_64/kernel/tsc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- tglx.orig/arch/x86_64/kernel/tsc.c
> +++ tglx/arch/x86_64/kernel/tsc.c
> @@ -84,7 +84,7 @@ static struct notifier_block time_cpufre
> static int __init cpufreq_tsc(void)
> {
> cpufreq_register_notifier(&time_cpufreq_notifier_block,
> - CPUFREQ_TRANSITION_NOTIFIER))
> + CPUFREQ_TRANSITION_NOTIFIER);
> return 0;
> }

Yep, have this already in my -v2 set along with the i386 compile fix
from Venki.

tglx



2007-05-08 09:45:29

by Chris Wright

Subject: [PATCH 3/5] i386: hpet assumes boot cpu is 0

I fixed this in x86_64. Looks like the kind of thing that will break
voyager on i386.

Signed-off-by: Chris Wright <[email protected]>
---
arch/i386/kernel/hpet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- tglx.orig/arch/i386/kernel/hpet.c
+++ tglx/arch/i386/kernel/hpet.c
@@ -334,7 +334,7 @@ int __init hpet_enable(void)
* Start hpet with the boot cpu mask and make it
* global after the IO_APIC has been initialized.
*/
- hpet_clockevent.cpumask =cpumask_of_cpu(0);
+ hpet_clockevent.cpumask = cpumask_of_cpu(smp_processor_id());
clockevents_register_device(&hpet_clockevent);
global_clock_event = &hpet_clockevent;
return 1;

2007-05-08 09:45:37

by Thomas Gleixner

Subject: Re: [PATCH 2/5] x86_64: __setup_APIC_LVTT whitespace fix

On Tue, 2007-05-08 at 02:43 -0700, Chris Wright wrote:
> Completely trivial one.
>
> Signed-off-by: Chris Wright <[email protected]>
> ---
> arch/x86_64/kernel/apic.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- tglx.orig/arch/x86_64/kernel/apic.c
> +++ tglx/arch/x86_64/kernel/apic.c
> @@ -779,7 +779,7 @@ void __init init_apic_mappings(void)
>
> #define APIC_DIVISOR 16
>
> -static void __setup_APIC_LVTT(unsigned int clocks, int oneshot,int irqen)
> +static void __setup_APIC_LVTT(unsigned int clocks, int oneshot, int irqen)
> {
> unsigned int lvtt_value, tmp_value;
>

Thanks,

tglx


2007-05-08 09:48:12

by Chris Wright

Subject: [PATCH 5/5] x86_64: restore nohpet cmdline

Lost when merged with i386. Happy to drop, but I suspect Andi would
rather not break existing users (I noticed because it was part of my
testing). If dropped, Documentation needs updating.

Signed-off-by: Chris Wright <[email protected]>
---
arch/i386/kernel/hpet.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- tglx.orig/arch/i386/kernel/hpet.c
+++ tglx/arch/i386/kernel/hpet.c
@@ -78,6 +78,14 @@ static int __init hpet_setup(char* str)
return 1;
}
__setup("hpet=", hpet_setup);
+#ifdef CONFIG_X86_64
+static int __init disable_hpet(char *str)
+{
+ boot_hpet_disable = 1;
+ return 1;
+}
+__setup("nohpet", disable_hpet);
+#endif

static inline int is_hpet_capable(void)
{

2007-05-08 09:48:23

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Tue, 2007-05-08 at 02:39 -0700, Chris Wright wrote:

> OK, looks very similar all things considered. One thing I didn't do
> was fix lapic timer calibration (was hoping you'd do that part, and you
> did ;-) I've noticed that something has changed and I'm seeing irq0
> handled on cpu3 (4 cpu system), where it used to be on cpu0 as expected.

Strange, irq balancing ?

> In addition lapic timer is firing there, and I'm seeing a higher
> interrupt load than I used to. This is the same in your set and mine.

Is irq0 _and_ the lapic timer firing ?

tglx


2007-05-08 09:50:29

by Thomas Gleixner

Subject: Re: [PATCH 4/5] i386: i8253 clockevent shutdown and unused using pit

On Tue, 2007-05-08 at 02:46 -0700, Chris Wright wrote:
> Disable by programming pit directly when performing CLOCK_EVT_MODE_UNUSED
> or CLOCK_EVT_MODE_SHUTDOWN transitions. (A variant) tested successfully
> by Joachim Deguara on a Geode that exhibited BZ 8027 behaviour (with
> bad bogomips).

Sigh, we had problems with that one on Ingo's Vaio-of-fun-emulator
laptop. That's why I did the irq disable.

Ingo, can you please retest ?

tglx


2007-05-08 09:51:09

by Thomas Gleixner

Subject: Re: [PATCH 5/5] x86_64: restore nohpet cmdline

On Tue, 2007-05-08 at 02:47 -0700, Chris Wright wrote:
> Lost when merged with i386. Happy to drop, but I suspect Andi would
> rather not break existing users (I noticed because it was part of my
> testing). If dropped, Documentation needs updating.

Fair enough.

> Signed-off-by: Chris Wright <[email protected]>
> ---
> arch/i386/kernel/hpet.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> --- tglx.orig/arch/i386/kernel/hpet.c
> +++ tglx/arch/i386/kernel/hpet.c
> @@ -78,6 +78,14 @@ static int __init hpet_setup(char* str)
> return 1;
> }
> __setup("hpet=", hpet_setup);
> +#ifdef CONFIG_X86_64
> +static int __init disable_hpet(char *str)
> +{
> + boot_hpet_disable = 1;
> + return 1;
> +}
> +__setup("nohpet", disable_hpet);
> +#endif
>
> static inline int is_hpet_capable(void)
> {

2007-05-08 09:51:52

by Thomas Gleixner

Subject: Re: [PATCH 3/5] i386: hpet assumes boot cpu is 0

On Tue, 2007-05-08 at 02:44 -0700, Chris Wright wrote:
> I fixed this in x86_64. Looks like the kind of thing that will break
> voyager on i386.

voyager has hpet ? Anyway good point.

tglx


2007-05-08 09:52:34

by Chris Wright

Subject: [PATCH 4/5] i386: i8253 clockevent shutdown and unused using pit

Disable by programming pit directly when performing CLOCK_EVT_MODE_UNUSED
or CLOCK_EVT_MODE_SHUTDOWN transitions. (A variant) tested successfully
by Joachim Deguara on a Geode that exhibited BZ 8027 behaviour (with
bad bogomips).

Signed-off-by: Chris Wright <[email protected]>
Cc: Joachim Deguara <[email protected]>
---
arch/i386/kernel/i8253.c | 25 ++++---------------------
1 file changed, 4 insertions(+), 21 deletions(-)

--- tglx.orig/arch/i386/kernel/i8253.c
+++ tglx/arch/i386/kernel/i8253.c
@@ -29,24 +29,6 @@ EXPORT_SYMBOL(i8253_lock);
*/
struct clock_event_device *global_clock_event;

-/* Status of the PIT interrupt */
-static int pit_irq_disabled;
-
-/*
- * Control pit interrupt enable / disable
- */
-static void pit_control_irq(int disable)
-{
- if (pit_irq_disabled == disable)
- return;
-
- pit_irq_disabled = disable;
- if (disable)
- disable_irq(0);
- else
- enable_irq(0);
-}
-
/*
* Initialize the PIT timer.
*
@@ -65,17 +47,18 @@ static void init_pit_timer(enum clock_ev
outb_p(0x34, PIT_MODE);
outb_p(LATCH & 0xff , PIT_CH0); /* LSB */
outb(LATCH >> 8 , PIT_CH0); /* MSB */
- pit_control_irq(0);
break;

case CLOCK_EVT_MODE_SHUTDOWN:
case CLOCK_EVT_MODE_UNUSED:
- pit_control_irq(1);
+ outb_p(0x30, PIT_MODE);
+ outb_p(0, PIT_CH0); /* LSB */
+ outb_p(0, PIT_CH0); /* MSB */
break;
+
case CLOCK_EVT_MODE_ONESHOT:
/* One shot setup */
outb_p(0x38, PIT_MODE);
- pit_control_irq(0);
break;

case CLOCK_EVT_MODE_RESUME:
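
For readers decoding the magic numbers in this patch: the value written to PIT_MODE is the 8253/8254 control word. The sketch below packs and unpacks such a word. The struct and function names are hypothetical and not part of the patch.

```c
#include <assert.h>

/* Fields of an 8253/8254 PIT control word (written to the mode port). */
struct pit_control {
    unsigned int channel;   /* bits 7-6: counter select */
    unsigned int access;    /* bits 5-4: 3 = low byte, then high byte */
    unsigned int mode;      /* bits 3-1: operating mode */
    unsigned int bcd;       /* bit 0: 0 = binary counting */
};

/* Build a control word for binary counting from its fields. */
unsigned char pit_encode(unsigned int channel, unsigned int access,
                         unsigned int mode)
{
    assert(channel <= 2 && access <= 3 && mode <= 5);
    return (unsigned char)((channel << 6) | (access << 4) | (mode << 1));
}

/* Split a control word back into its fields. */
struct pit_control pit_decode(unsigned char word)
{
    struct pit_control c;

    c.channel = (word >> 6) & 0x3;
    c.access  = (word >> 4) & 0x3;
    c.mode    = (word >> 1) & 0x7;
    c.bcd     = word & 0x1;
    return c;
}
```

Decoded this way, 0x34 is channel 0 in mode 2 (rate generator, the periodic tick), 0x38 is channel 0 in mode 4 (software triggered strobe, used for one-shot), and the new 0x30 is channel 0 in mode 0 (interrupt on terminal count), which does not reload itself and so should leave the PIT without a periodic interrupt.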

2007-05-08 09:52:42

by Chris Wright

Subject: Re: [PATCH] x86-64 highres/dyntick support

* Thomas Gleixner ([email protected]) wrote:
> On Tue, 2007-05-08 at 02:39 -0700, Chris Wright wrote:
>
> > OK, looks very similar all things considered. One thing I didn't do
> > was fix lapic timer calibration (was hoping you'd do that part, and you
> > did ;-) I've noticed that something has changed and I'm seeing irq0
> > handled on cpu3 (4 cpu system), where it used to be on cpu0 as expected.
>
> Strange, irq balancing ?

That's what I was wondering, although I have the same setup for 32-bit
and it behaves as expected with cpu0 taking hpet or pit on irq0
and lapic timer picked up on the other 3 cpus.

> > In addition lapic timer is firing there, and I'm seeing a higher
> > interrupt load than I used to. This is the same in your set and mine.
>
> Is irq0 _and_ the lapic timer firing ?

Yes.

2007-05-08 09:55:39

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Tue, 2007-05-08 at 02:51 -0700, Chris Wright wrote:
> That's what I was wondering, although I have the same setup for 32-bit
> and it behaves as expected with cpu0 taking hpet or pit on irq0
> and lapic timer picked up on the other 3 cpus.
>
> > > In addition lapic timer is firing there, and I'm seeing a higher
> > > interrupt load than I used to. This is the same in your set and mine.
> >
> > Is irq0 _and_ the lapic timer firing ?
>
> Yes.

Hmm, that's even more strange. Can you please provide the output
of /proc/timer_list ?

tglx


2007-05-08 10:07:36

by Chris Wright

Subject: Re: [PATCH] x86-64 highres/dyntick support

* Thomas Gleixner ([email protected]) wrote:
> Hmm, that's even more strange. Can you please provide the output
> of /proc/timer_list ?

It's quite normal looking, and a printk in clockevents_set_mode looks normal too.

[chrisw@localhost ~]$ cat /proc/timer_list
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 313565930359 nsecs

cpu: 0
clock 0:
.index: 0
.resolution: 4000250 nsecs
.get_time: ktime_get_real
active timers:
clock 1:
.index: 1
.resolution: 4000250 nsecs
.get_time: ktime_get
active timers:
#0: <ffff810150223e38>, hrtimer_wakeup, S:01
# expires at 313700587872 nsecs [in 134657513 nsecs]
#1: <ffff810150223e38>, it_real_fn, S:01
# expires at 332024684987 nsecs [in 18458754628 nsecs]
#2: <ffff810150223e38>, hrtimer_wakeup, S:01
# expires at 337032779067 nsecs [in 23466848708 nsecs]
#3: <ffff810150223e38>, hrtimer_wakeup, S:01
# expires at 1851212334864 nsecs [in 1537646404505 nsecs]
#4: <ffff810150223e38>, it_real_fn, S:01
# expires at 3635317355388 nsecs [in 3321751425029 nsecs]
#5: <ffff810150223e38>, it_real_fn, S:01
# expires at 3635630194993 nsecs [in 3322064264634 nsecs]

cpu: 1
clock 0:
.index: 0
.resolution: 4000250 nsecs
.get_time: ktime_get_real
active timers:
clock 1:
.index: 1
.resolution: 4000250 nsecs
.get_time: ktime_get
active timers:

cpu: 2
clock 0:
.index: 0
.resolution: 4000250 nsecs
.get_time: ktime_get_real
active timers:
clock 1:
.index: 1
.resolution: 4000250 nsecs
.get_time: ktime_get
active timers:

cpu: 3
clock 0:
.index: 0
.resolution: 4000250 nsecs
.get_time: ktime_get_real
active timers:
clock 1:
.index: 1
.resolution: 4000250 nsecs
.get_time: ktime_get
active timers:
#0: <ffff810150223e38>, it_real_fn, S:01
# expires at 313558625026 nsecs [in -7305333 nsecs]
#1: <ffff810150223e38>, hrtimer_wakeup, S:01
# expires at 369334850252 nsecs [in 55768919893 nsecs]
#2: <ffff810150223e38>, it_real_fn, S:01
# expires at 3936979059363 nsecs [in 3623413129004 nsecs]


Tick Device: mode: 0
Clock Event Device: hpet
max_delta_ns: 85899346200
min_delta_ns: 1920
mult: 107374182
shift: 32
mode: 2
next_event: 9223372036854775807 nsecs
set_next_event: hpet_next_event
set_mode: hpet_set_mode
event_handler: tick_handle_periodic_broadcast
tick_broadcast_mask: 00000001


Tick Device: mode: 0
Clock Event Device: lapic
max_delta_ns: 671088687
min_delta_ns: 1200
mult: 53687081
shift: 32
mode: 1
next_event: 0 nsecs
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: tick_handle_periodic

Tick Device: mode: 0
Clock Event Device: lapic
max_delta_ns: 671088687
min_delta_ns: 1200
mult: 53687081
shift: 32
mode: 2
next_event: 0 nsecs
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: tick_handle_periodic

Tick Device: mode: 0
Clock Event Device: lapic
max_delta_ns: 671088687
min_delta_ns: 1200
mult: 53687081
shift: 32
mode: 2
next_event: 0 nsecs
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: tick_handle_periodic

Tick Device: mode: 0
Clock Event Device: lapic
max_delta_ns: 671088687
min_delta_ns: 1200
mult: 53687081
shift: 32
mode: 2
next_event: 0 nsecs
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: tick_handle_periodic
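
The mult and shift values in the listing above encode the device frequency, assuming the usual clockevents conversion cycles = (ns * mult) >> shift. A small user-space sketch of that arithmetic follows; the helper names are made up for this example, and shift is assumed to be between 1 and 32.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Device frequency in Hz implied by mult/shift, i.e. the number of
 * cycles in 1e9 ns, rounded to the nearest integer.
 */
uint64_t clockevent_freq_hz(uint32_t mult, uint32_t shift)
{
    uint64_t scaled = (uint64_t)mult * 1000000000ULL;

    assert(shift >= 1 && shift <= 32);
    return (scaled + (1ULL << (shift - 1))) >> shift;
}

/* Nanoseconds needed for a given number of device cycles (rounded down). */
uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    assert(mult != 0 && shift <= 32);
    /* Assumption: cycles is small enough that the shift does not overflow. */
    return (cycles << shift) / mult;
}
```

Under that assumption the hpet entry (mult 107374182) works out to 25 MHz and the lapic entries (mult 53687081) to just under 12.5 MHz. The reported min_delta_ns values of 1920 and 1200 then correspond to 48 and 15 device cycles.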


[chrisw@localhost ~]$ grep -e 0: -e LOC: /proc/interrupts
0: 72 2 58 36008 IO-APIC-edge timer
LOC: 36655 36772 36725 36683
[chrisw@localhost ~]$ grep -e 0: -e LOC: /proc/interrupts
0: 72 2 58 36165 IO-APIC-edge timer
LOC: 36812 36929 36882 36840
[chrisw@localhost ~]$ grep -e 0: -e LOC: /proc/interrupts
0: 72 2 58 36297 IO-APIC-edge timer
LOC: 36944 37061 37014 36972
[chrisw@localhost ~]$ grep -e 0: -e LOC: /proc/interrupts
0: 72 2 58 36424 IO-APIC-edge timer
LOC: 37071 37188 37141 37099
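
One way to read the four samples above is to compare how far each counter advanced between samples. The sketch below checks whether two counters moved in lockstep. The cpu3 and cpu0 columns are copied from the output above; the array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Counter values taken from the four /proc/interrupts samples above. */
const unsigned long irq0_cpu3[] = { 36008, 36165, 36297, 36424 };
const unsigned long loc_cpu3[]  = { 36683, 36840, 36972, 37099 };
const unsigned long irq0_cpu0[] = { 72, 72, 72, 72 };

/*
 * Returns 1 if both counters advanced by the same amount between
 * every pair of consecutive samples, 0 otherwise.
 */
int same_rate(const unsigned long *a, const unsigned long *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 1; i < n; i++) {
        if (a[i] - a[i - 1] != b[i] - b[i - 1])
            return 0;
    }
    return 1;
}
```

Both irq0 and LOC on cpu3 advance by 157, 132 and 127 between the samples, so irq0 on cpu3 fires at the same rate as the local APIC timer, while the irq0 counts on the other CPUs stay put.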

2007-05-08 10:32:01

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Tue, 2007-05-08 at 03:06 -0700, Chris Wright wrote:

> Tick Device: mode: 0
> Clock Event Device: hpet
> max_delta_ns: 85899346200
> min_delta_ns: 1920
> mult: 107374182
> shift: 32
> mode: 2
> next_event: 9223372036854775807 nsecs
> set_next_event: hpet_next_event
> set_mode: hpet_set_mode
> event_handler: tick_handle_periodic_broadcast
> tick_broadcast_mask: 00000001

Broadcasting is active. The system has C2 state, right ?

If it's not AMD you can add lapic_timer_c2_ok to the commandline.

tglx


2007-05-08 13:15:44

by Pallipadi, Venkatesh

Subject: RE: [PATCH] x86-64 highres/dyntick support



>-----Original Message-----
>From: Chris Wright [mailto:[email protected]]
>Sent: Tuesday, May 08, 2007 2:52 AM
>To: Thomas Gleixner
>Cc: Chris Wright; LKML; Pallipadi, Venkatesh; john stultz;
>Ingo Molnar; Arjan van de Ven; Steven Rostedt; Andi Kleen;
>Andrew Morton
>Subject: Re: [PATCH] x86-64 highres/dyntick support
>
>* Thomas Gleixner ([email protected]) wrote:
>> On Tue, 2007-05-08 at 02:39 -0700, Chris Wright wrote:
>>
>> > OK, looks very similar all things considered. One thing I didn't do
>> > was fix lapic timer calibration (was hoping you'd do that part, and you
>> > did ;-) I've noticed that something has changed and I'm seeing irq0
>> > handled on cpu3 (4 cpu system), where it used to be on cpu0 as expected.
>>
>> Strange, irq balancing ?
>
>That's what I was wondering, although I have the same setup for 32-bit
>and it behaves as expected with cpu0 taking hpet or pit on irq0
>and lapic timer picked up on the other 3 cpus.
>

Yes. Looks like an irq balancing issue. On i386 I see the irq has the
IRQF_NOBALANCING flag, but x86-64 does not have this flag. Can you add that
and check whether it makes any difference?

Thanks,
Venki

2007-05-08 17:08:30

by Nicolas Mailhot

Subject: Re: [PATCH] x86-64 highres/dyntick support

Le lundi 07 mai 2007 à 17:28 +0200, Thomas Gleixner a écrit :
> On Mon, 2007-05-07 at 09:16 +0000, Nicolas Mailhot wrote:
> > Thomas Gleixner <tglx <at> linutronix.de> writes:
> >
> > >
> > > I'm pleased to announce the first cut of the final x86_64
> > > highres/dyntick support, which I did based on Chris Wright's patch set,
> > > which is again based on Arjan van de Ven's initial work:
> >
> > Do you have a 2.6.21-mm1 patchset?
>
> Not yet, but it should be halfway easy to do one. Stay tuned.

Ok, tuning now

Thanks

--
Nicolas Mailhot

2007-05-14 01:17:29

by Alistair John Strachan

Subject: Re: [PATCH] x86-64 highres/dyntick support

On Sunday 06 May 2007 21:58:47 Thomas Gleixner wrote:
> I'm pleased to announce the first cut of the final x86_64
> highres/dyntick support, which I did based on Chris Wright's patch set,
> which is again based on Arjan van de Ven's initial work:

I've noticed a few problems with this patch series, which I manually (without
difficulty) ported to 2.6.22-rc1.

Firstly, the output of /proc/interrupts looks a bit strange. NO_HZ isn't
enabled, just high resolution timers. HZ was set to 1000.

alistair@damocles:~$ cat /proc/interrupts
CPU0 CPU1
0: 195 0 IO-APIC-edge timer
1: 21 13698 IO-APIC-edge i8042
8: 0 0 IO-APIC-edge rtc
9: 0 0 IO-APIC-fasteoi acpi
14: 117 51178 IO-APIC-edge libata
15: 0 0 IO-APIC-edge libata
18: 455 449373 IO-APIC-fasteoi lan-sky, EMU10K1
19: 0 3 IO-APIC-fasteoi yenta, ohci1394
20: 220 86188 IO-APIC-fasteoi ohci_hcd:usb2
21: 0 26 IO-APIC-fasteoi ehci_hcd:usb1
22: 0 273 IO-APIC-fasteoi sata_nv
23: 928 507781 IO-APIC-fasteoi sata_nv, lan
NMI: 0 0
LOC: 4723272 4797930
ERR: 0

Only 195 timer interrupts? I only see this on an AMD Opteron, it doesn't occur
on an Intel Core 2 Duo. Mainline reports this counter regularly increasing
with or without the acpi_pm clocksource loaded.

Secondly, /proc/cpuinfo seems to be broken:

alistair@damocles:~$ cat /proc/cpuinfo | grep MHz
cpu MHz : 210779.550
cpu MHz : 210779.550

Unless my CPU is just under 80 times faster than it used to be, these numbers
are incorrect. I expect 2700.50, or something similar. cpufreq isn't compiled
in.

Finally, and possibly related, the dmesg timestamps seem to be totally broken.
Apparently my machine booted in less than 1 second, with the last messages
as:

[ 0.607862] bridge: topology change detected, propagating
[ 0.607862] bridge: port 2(lan-sky) entering forwarding state
[ 0.830472] Linux agpgart interface v0.102 (c) Dave Jones

Of course, all of these problems might be PEBKAC, but I'm sceptical.

Find the kernel config, updated patch (for 2.6.22-rc1), dmesg, contents
of /proc/interrupts and /proc/cpuinfo here:

http://devzero.co.uk/~alistair/2.6.22-rc1-hrt/

--
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.

2007-05-14 06:29:58

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support

Alistair,

On Mon, 2007-05-14 at 02:17 +0100, Alistair John Strachan wrote:
> I've noticed a few problems with this patch series, which I manually (without
> difficulty) ported to 2.6.22-rc1.

There are a couple of fixups pending. I'm going to push out a new queue
today.

> Only 195 timer interrupts? I only see this on an AMD Opteron, it doesn't occur
> on an Intel Core 2 Duo. Mainline reports this counter regularly increasing
> with or without the acpi_pm clocksource loaded.

We stop the PIT when we switch to the local APICs. On your Intel box you
might have deeper C-states which switch the system back to broadcast
mode via PIT.

> Secondly, /proc/cpuinfo seems to be broken:
>
> alistair@damocles:~$ cat /proc/cpuinfo | grep MHz
> cpu MHz : 210779.550
> cpu MHz : 210779.550
>
> Unless my CPU is just under 80 times faster than it used to be, these numbers
> are incorrect. I expect 2700.50, or something similar. cpufreq isn't compiled
> in.

Hmm. It seems the new TSC calibration routine is hosed.

> Finally, and possibly related, the dmesg timestamps seem to be totally broken.
> Apparently my machine booted in less than 1 second, with the last messages
> as:
>
> [ 0.607862] bridge: topology change detected, propagating
> [ 0.607862] bridge: port 2(lan-sky) entering forwarding state
> [ 0.830472] Linux agpgart interface v0.102 (c) Dave Jones

Hey, you have a really fast box :)

tglx


2007-05-14 10:23:06

by Thomas Gleixner

Subject: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v1

I'm pleased to announce an updated version of the x86_64 highres/dyntick
support patches against 2.6.22-rc1:

To build a highres / dyntick enabled kernel for x86_64:

http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.21.tar.bz2
http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.22-rc1.bz2
http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v1.patch

Broken out version is available here:
http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v1.patches.tar.bz2

Changes since the last version:

- Various fixups from Chris Wright
- TSC calibration fix (pointed out by Alistair John Strachan)

Comments, bug reports, patches are welcome as usual

Thanks,

tglx


2007-05-14 20:13:17

by Valdis Klētnieks

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v1

On Mon, 14 May 2007 12:26:08 +0200, Thomas Gleixner said:

> Broken out version is available here:
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v1.patches.tar.bz2

How unhappy am I likely to be if I try to apply this to a 21-mm2 kernel? It
doesn't *look* like any of the patches are in -mm2 (at least not under the
same name).



2007-05-14 20:20:06

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v1

On Mon, 2007-05-14 at 16:10 -0400, [email protected] wrote:
> On Mon, 14 May 2007 12:26:08 +0200, Thomas Gleixner said:
>
> > Broken out version is available here:
> > http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v1.patches.tar.bz2
>
> How unhappy am I likely to be if I try to apply this to a 21-mm2 kernel? It
> doesn't *look* like any of the patches are in -mm2 (at least not under the
> same name).

There are probably some conflicts, but the wreckage should not be that
big.

tglx


2007-05-14 21:15:42

by Alistair John Strachan

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v1

On Monday 14 May 2007 11:26:08 Thomas Gleixner wrote:
> I'm pleased to announce an updated version of the x86_64 highres/dyntick
> support patches against 2.6.22-rc1:
[snip]
> - Various fixups from Chris Wright
> - TSC calibration fix (pointed out by Alistair John Strachan)
>
> Comments, bug reports, patches are welcome as usual

Neither of the bugs I reported appear to be fixed.

I took a clean git tree from the 2.6.22-rc1 tag and patched with this version;
my CPU MHz and dmesg counter still appear to be broken (v3 was used).

--
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.

2007-05-14 22:01:55

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v1

On Mon, 2007-05-14 at 22:15 +0100, Alistair John Strachan wrote:
> On Monday 14 May 2007 11:26:08 Thomas Gleixner wrote:
> > I'm pleased to announce an updated version of the x86_64 highres/dyntick
> > support patches against 2.6.22-rc1:
> [snip]
> > - Various fixups from Chris Wright
> > - TSC calibration fix (pointed out by Alistair John Strachan)
> >
> > Comments, bug reports, patches are welcome as usual
>
> Neither of the bugs I reported appear to be fixed.
>
> I took a clean git tree from the 2.6.22-rc1 tag and patched with this version;
> my CPU MHz and dmesg counter still appear to be broken (v3 was used).

Sigh. /me feels stupid.

Can you please apply the following patch on top and check whether it
fixes the problem? Please provide the debug output if it fails.

Thanks,

tglx

Index: linux-2.6.21/arch/x86_64/kernel/tsc.c
===================================================================
--- linux-2.6.21.orig/arch/x86_64/kernel/tsc.c
+++ linux-2.6.21/arch/x86_64/kernel/tsc.c
@@ -135,7 +135,7 @@ static unsigned long __init tsc_read_ref
 	for (i = 0; i < MAX_RETRIES; i++) {
 		t1 = get_cycles_sync();
 		if (hpet)
-			*hpet = hpet_readl(HPET_COUNTER);
+			*hpet = hpet_readl(HPET_COUNTER) & 0xFFFFFFFF;
 		else
 			*pm = acpi_pm_read_early();
 		t2 = get_cycles_sync();
@@ -177,8 +177,10 @@ void __init tsc_calibrate(void)
 	tsc_khz = (tr2 - tr1) / 50;

 	/* hpet or pmtimer available ? */
-	if (!hpet && !pm1 && !pm2)
+	if (!hpet && !pm1 && !pm2) {
+		printk(KERN_INFO "TSC calibrated against tick interrupt\n");
 		return;
+	}

 	/* Check, whether the sampling was disturbed by an SMI */
 	if (tsc1 == ULONG_MAX || tsc2 == ULONG_MAX) {
@@ -190,14 +192,25 @@ void __init tsc_calibrate(void)
 	tsc2 = (tsc2 - tsc1) * 1000000000L;

 	if (hpet) {
-		hpet2 = (hpet2 - hpet1) & 0xFFFFFFFF;
+		printk(KERN_INFO "TSC calibrated against hpet %ld %ld",
+		       hpet1, hpet2);
+		if (hpet2 < hpet1)
+			hpet2 += 0x100000000;
+		hpet2 -= hpet1;
 		tsc1 = (hpet2 * hpet_readl(HPET_PERIOD)) / 1000;
+		printk(" %lu %lu\n", hpet2, tsc1);
 	} else {
-		pm2 = (pm2 -pm1) & ACPI_PM_MASK;
+		printk(KERN_INFO "TSC calibrated against pm_timer %ld %ld",
+		       pm1, pm2);
+		if (pm2 < pm1)
+			pm2 += ACPI_PM_OVRRUN;
+		pm2 -= pm1;
 		tsc1 = (pm2 * PMTMR_TICKS_PER_SEC) / 1000;
+		printk(" %lu %lu\n", pm2, tsc1);
 	}

 	tsc_khz = tsc2 / tsc1;
+	printk(KERN_INFO "tsckhz: %lu %lu %u\n", tsc2, tsc1, tsc_khz);
 }

 /*
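
The wrap handling that the debug patch above introduces can be compared with the masked subtraction it replaces. What follows is a minimal user-space sketch, not kernel code: the names PM_MASK, PM_OVRRUN, pm_delta_masked() and pm_delta_wrapped() are made up for illustration, and a 24-bit counter width is assumed for the ACPI PM timer. As long as less than one counter period elapses between the two reads, both forms should give the same delta.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for a 24-bit ACPI PM timer; names are illustrative. */
#define PM_MASK   UINT64_C(0xFFFFFF)   /* counter width mask */
#define PM_OVRRUN (UINT64_C(1) << 24)  /* one full counter period */

/* Old style: subtract, then mask the result to the counter width. */
static uint64_t pm_delta_masked(uint64_t start, uint64_t end)
{
	assert(start <= PM_MASK && end <= PM_MASK);
	return (end - start) & PM_MASK;
}

/* Style used in the debug patch: add one period if the counter wrapped. */
static uint64_t pm_delta_wrapped(uint64_t start, uint64_t end)
{
	assert(start <= PM_MASK && end <= PM_MASK);
	if (end < start)
		end += PM_OVRRUN;
	return end - start;
}
```

For a start value of 8927439 and an end value of 9106459, both functions return 179020. For a pair that wraps, such as 0xFFFFF0 followed by 0x10, both return 0x20.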


2007-05-14 22:42:43

by Alistair John Strachan

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v1

On Monday 14 May 2007 23:05:14 Thomas Gleixner wrote:
> On Mon, 2007-05-14 at 22:15 +0100, Alistair John Strachan wrote:
> > On Monday 14 May 2007 11:26:08 Thomas Gleixner wrote:
> > > I'm pleased to announce an updated version of the x86_64
> > > highres/dyntick support patches against 2.6.22-rc1:
> >
> > [snip]
> >
> > > - Various fixups from Chris Wright
> > > - TSC calibration fix (pointed out by Alistair John Strachan)
> > >
> > > Comments, bug reports, patches are welcome as usual
> >
> > Neither of the bugs I reported appear to be fixed.
> >
> > I took a clean git tree from the 2.6.22-rc1 tag and patched with this
> > version; my CPU MHz and dmesg counter still appear to be broken (v3 was
> > used).
>
> Sigh. /me feels stupid.
>
> Can you please apply the following patch on top and check whether it
> fixes the problem? Please provide the debug output if it fails.

Doesn't fix the problem, and here is the debug:

TSC calibrated against pm_timer 8927439 9106459 179020 640810145
tsckhz: 135069550000000000 640810145 210779356
Marking TSC unstable due to TSCs unsynchronized
time.c: Detected 210779.356 MHz processor.

(Perhaps if this debug effort has to continue, we could remove some of these
gentlemen from CC?)

--
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.
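
The debug numbers in the message above can be reproduced by hand. The sketch below is a user-space illustration under stated assumptions, not the fix that went into any later version of the patch set: it assumes PMTMR_TICKS_PER_SEC is 3579545 (the ACPI PM timer frequency), and all function names are hypothetical. ref_as_in_patch() mirrors the expression (pm2 * PMTMR_TICKS_PER_SEC) / 1000 from the debug patch, while pm_ticks_to_ns() converts the same tick delta to nanoseconds instead.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of PMTMR_TICKS_PER_SEC (ACPI PM timer frequency in Hz). */
#define PMTMR_HZ UINT64_C(3579545)

/* Reference interval as computed by the expression in the debug patch. */
static uint64_t ref_as_in_patch(uint64_t pm_ticks)
{
	return pm_ticks * PMTMR_HZ / 1000;
}

/* The same PM timer tick delta converted to nanoseconds. */
static uint64_t pm_ticks_to_ns(uint64_t pm_ticks)
{
	return pm_ticks * UINT64_C(1000000000) / PMTMR_HZ;
}

/* TSC frequency in kHz from a cycle count and the elapsed nanoseconds. */
static uint64_t tsc_khz_from(uint64_t cycles, uint64_t elapsed_ns)
{
	assert(elapsed_ns != 0);
	return cycles * UINT64_C(1000000) / elapsed_ns;
}
```

With the tick delta 179020, ref_as_in_patch() gives 640810145, which matches the debug line, and dividing 135069550000000000 by that value gives the reported 210779356. Converting the ticks to nanoseconds gives about 50 ms, and 135069550 cycles over that interval works out to roughly 2700.7 MHz, close to the figure Alistair expected.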

2007-05-15 08:14:40

by Thomas Gleixner

Subject: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

I've uploaded a new version of the x86_64 highres/dyntick patches:

http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v4.patch

Broken out version:

http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v4.patches.tar.bz2

Changes since last version:

- TSC calibration against PM-Timer

tglx


2007-05-15 14:55:50

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Gleixner wrote:
> I've uploaded a new version of the x86_64 highres/dyntick patches:
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v4.patch

Hangs at boot here:
Kernel alive
Kernel direct mapping tables up to 100000000 @ 8000-d000
(and that's it)

This is a Dell Inspiron E1705 with a Core 2 Duo 2.16GHz

highres-v3 also hung at the same point, but 2.6.21-git2-v2 worked
2.6.22-rc1 boots without problem

Is there any information I can provide to help track down the problem?

Thanks,

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGSb5JaI0dwg4A47wRAjkgAJ9Urvpo+cTAbRvblYovBYy3PD76jACfbtjj
EZSrOZDFMnOYjc02nHSAxaM=
=/ffY
-----END PGP SIGNATURE-----

2007-05-15 21:49:44

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

Frank,

On Tue, 2007-05-15 at 09:06 -0500, Frank Sorenson wrote:
> Hangs at boot here:
> Kernel alive
> Kernel direct mapping tables up to 100000000 @ 8000-d000
> (and that's it)
>
> This is a Dell Inspiron E1705 with a Core 2 Duo 2.16GHz
>
> highres-v3 also hung at the same point, but 2.6.21-git2-v2 worked
> 2.6.22-rc1 boots without problem

Can you please try each of the following three kernel command line additions?

1: highres=off nohz=off
2: highres=off
3: nohz=off

Thanks,

tglx


2007-05-15 23:24:17

by Alistair John Strachan

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

On Tuesday 15 May 2007 09:18:02 Thomas Gleixner wrote:
> I've uploaded a new version of the x86_64 highres/dyntick patches:
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v4.patch
>
> Broken out version:
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v4.patches.tar.bz2
>
> Changes since last version:
>
> - TSC calibration against PM-Timer

Working fine now, thanks a lot. Great latencies on usleep() now too, just what
I was looking for.

(BTW, with HRT (but not NO_HZ), does the HZ value have any effect on usleep()
(and friends) latencies?)

--
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.

2007-05-16 00:06:47

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Gleixner wrote:
> Frank,
>
> On Tue, 2007-05-15 at 09:06 -0500, Frank Sorenson wrote:
>> Hangs at boot here:
>> Kernel alive
>> Kernel direct mapping tables up to 100000000 @ 8000-d000
>> (and that's it)
>>
>> This is a Dell Inspiron E1705 with a Core 2 Duo 2.16GHz
>>
>> highres-v3 also hung at the same point, but 2.6.21-git2-v2 worked
>> 2.6.22-rc1 boots without problem
>
> Can you please try the following three command line option addons ?
>
> 1: highres=off nohz=off
> 2: highres=off
> 3: nohz=off
>
> Thanks,
>
> tglx

All 3 hang at the same point.

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGSkA6aI0dwg4A47wRAuQzAKDcmAMR2e2Ce/3ytR+39XxgG76XjwCfVtbj
vWlY59M1gHz2Z8dIJWRYqTk=
=algX
-----END PGP SIGNATURE-----

2007-05-16 05:05:56

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank Sorenson wrote:
> Thomas Gleixner wrote:
>> Frank,
>
>> On Tue, 2007-05-15 at 09:06 -0500, Frank Sorenson wrote:
>>> Hangs at boot here:
>>> Kernel alive
>>> Kernel direct mapping tables up to 100000000 @ 8000-d000
>>> (and that's it)
>>>
>>> This is a Dell Inspiron E1705 with a Core 2 Duo 2.16GHz
>>>
>>> highres-v3 also hung at the same point, but 2.6.21-git2-v2 worked
>>> 2.6.22-rc1 boots without problem
>> Can you please try the following three command line option addons ?
>
>> 1: highres=off nohz=off
>> 2: highres=off
>> 3: nohz=off
>
>> Thanks,
>
>> tglx
>
> All 3 hang at the same point.

I have tracked down the offending patch in the series to
x86-64-convert-to-clockevents.patch

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGSpB2aI0dwg4A47wRAglRAJ4mJgbJClPd0hkXKp+YHq7G5VxQvgCgkVkv
TtOjrSrjrwiHQPkNCqlq314=
=lu8g
-----END PGP SIGNATURE-----

2007-05-16 05:50:26

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

On Wed, 2007-05-16 at 00:23 +0100, Alistair John Strachan wrote:
> > - TSC calibration against PM-Timer
>
> Working fine now, thanks a lot. Great latencies on usleep() now too, just what
> I was looking for.
>
> (BTW, with HRT (but not NO_HZ), does the HZ value have any effect on usleep()
> (and friends) latencies?)

Not directly.

tglx
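
Sleep latencies of the kind discussed above can be checked from user space. The following is a minimal sketch, assuming a POSIX system with clock_gettime() and nanosleep(); it uses nanosleep() as a stand-in for usleep(), and the helper names ts_diff_ns() and sleep_overshoot_ns() are made up for illustration. It measures how much longer than requested a single short sleep takes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from timestamp a to timestamp b. */
static int64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
	return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
	       + (b->tv_nsec - a->tv_nsec);
}

/*
 * Sleep for request_ns and return how much longer than requested the
 * sleep took, in nanoseconds. Returns -1 if any of the calls fails.
 */
static int64_t sleep_overshoot_ns(long request_ns)
{
	struct timespec req, t0, t1;

	assert(request_ns >= 0 && request_ns < 1000000000L);
	req.tv_sec = 0;
	req.tv_nsec = request_ns;

	if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
		return -1;
	if (nanosleep(&req, NULL) != 0)
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
		return -1;

	return ts_diff_ns(&t0, &t1) - request_ns;
}
```

Calling sleep_overshoot_ns(1000000) in a loop on kernels built with different HZ values, with and without high resolution timers, would be one way to compare the overshoot empirically. A single sample is noisy, so a real comparison would want many iterations.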


2007-05-16 06:16:34

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

On Wed, 2007-05-16 at 00:02 -0500, Frank Sorenson wrote:
> >>> highres-v3 also hung at the same point, but 2.6.21-git2-v2 worked
> >>> 2.6.22-rc1 boots without problem
> >> Can you please try the following three command line option addons ?
> >
> >> 1: highres=off nohz=off
> >> 2: highres=off
> >> 3: nohz=off
> >
> > All 3 hang at the same point.

Ok.

> I have tracked down the offending patch in the series to
> x86-64-convert-to-clockevents.patch

Not surprising. :)

I'm going to add some early_printks to the next version, so we can get
an idea of where it gets stuck.

tglx


2007-05-16 08:57:30

by Thomas Gleixner

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v4

Frank,

On Wed, 2007-05-16 at 08:20 +0200, Thomas Gleixner wrote:
> > I have tracked down the offending patch in the series to
> > x86-64-convert-to-clockevents.patch
>
> Not surprising. :)
>
> I'm going to add some early_printks for the next version, so we can get
> an idea where it gets stuck

I went through the relevant changes since the git2-v2 version, and the
only thing that could affect the early boot process is the patch
snippet below.

Can you either apply this to the git2-v2 version and check whether it
fails as well, or reverse-apply it to rc1-v4 and check whether the
problem goes away?

Thanks,

tglx


diff -uprN --exclude-from=/home/tglx/bin/diffit.exclude linux-2.6.21-git-x86-64/arch/i386/kernel/i8253.c linux-2.6.21/arch/i386/kernel/i8253.c
--- linux-2.6.21-git-x86-64/arch/i386/kernel/i8253.c 2007-05-16 09:58:01.000000000 +0200
+++ linux-2.6.21/arch/i386/kernel/i8253.c 2007-05-16 09:10:34.000000000 +0200
@@ -29,24 +29,6 @@ EXPORT_SYMBOL(i8253_lock);
  */
 struct clock_event_device *global_clock_event;

-/* Status of the PIT interrupt */
-static int pit_irq_disabled;
-
-/*
- * Control pit interrupt enable / disable
- */
-static void pit_control_irq(int disable)
-{
-	if (pit_irq_disabled == disable)
-		return;
-
-	pit_irq_disabled = disable;
-	if (disable)
-		disable_irq(0);
-	else
-		enable_irq(0);
-}
-
 /*
  * Initialize the PIT timer.
  *
@@ -65,17 +47,18 @@ static void init_pit_timer(enum clock_ev
 		outb_p(0x34, PIT_MODE);
 		outb_p(LATCH & 0xff , PIT_CH0);	/* LSB */
 		outb(LATCH >> 8 , PIT_CH0);	/* MSB */
-		pit_control_irq(0);
 		break;

 	case CLOCK_EVT_MODE_SHUTDOWN:
 	case CLOCK_EVT_MODE_UNUSED:
-		pit_control_irq(1);
+		outb_p(0x30, PIT_MODE);
+		outb_p(0, PIT_CH0);	/* LSB */
+		outb_p(0, PIT_CH0);	/* MSB */
 		break;
+
 	case CLOCK_EVT_MODE_ONESHOT:
 		/* One shot setup */
 		outb_p(0x38, PIT_MODE);
-		pit_control_irq(0);
 		break;

 	case CLOCK_EVT_MODE_RESUME:
@@ -129,7 +112,7 @@ void __init setup_pit_timer(void)
 	 * Start pit with the boot cpu mask and make it global after the
 	 * IO_APIC has been initialized.
 	 */
-	pit_clockevent.cpumask = cpumask_of_cpu(0);
+	pit_clockevent.cpumask = cpumask_of_cpu(smp_processor_id());
 	pit_clockevent.mult = div_sc(CLOCK_TICK_RATE, NSEC_PER_SEC, 32);
 	pit_clockevent.max_delta_ns =
 		clockevent_delta2ns(0x7FFF, &pit_clockevent);
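
The PIT_MODE values in the snippet above (0x34, 0x30 and 0x38) are 8253/8254 control words. The sketch below decodes them in user space and splits a 16-bit latch value into the two bytes that the snippet writes to PIT_CH0. The struct and function names are hypothetical, and nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of an 8253/8254 control word (names are illustrative). */
struct pit_cmd {
	unsigned channel;  /* bits 7-6: counter select */
	unsigned access;   /* bits 5-4: 3 = low byte, then high byte */
	unsigned mode;     /* bits 3-1: operating mode */
	unsigned bcd;      /* bit 0: 0 = binary, 1 = BCD counting */
};

/* Split a control word into its bit fields. */
static struct pit_cmd pit_decode(uint8_t cmd)
{
	struct pit_cmd c;

	c.channel = (cmd >> 6) & 0x3;
	c.access = (cmd >> 4) & 0x3;
	c.mode = (cmd >> 1) & 0x7;
	c.bcd = cmd & 0x1;
	return c;
}

/* Split a 16-bit reload value into the LSB and MSB written to the counter. */
static void pit_split_latch(unsigned latch, uint8_t *lsb, uint8_t *msb)
{
	assert(latch <= 0xFFFF);  /* the counter is 16 bits wide */
	*lsb = latch & 0xff;
	*msb = latch >> 8;
}
```

Decoded this way, 0x34 selects channel 0 in mode 2 (rate generator), 0x38 selects mode 4 (software triggered strobe), and 0x30 selects mode 0 (interrupt on terminal count), each with low-byte-then-high-byte access and binary counting.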


2007-05-16 09:59:10

by Thomas Gleixner

Subject: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

I'm pleased to announce an updated version of the x86_64 highres/dyntick
support patches against 2.6.22-rc1:

http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v5.patch

Broken out version is available here:
http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v5.patches.tar.bz2

Changes since the last version:

- Enforced enabling of HPET (Venki Pallipadi)
Detects and enables HPET on Intel ICHx chipsets, when the BIOS
hides it.

Venki, great work, thanks !

- Another variant of PIT wreckage fixup
Frank, does this one work for you ?

- Various fixlets (me)

Comments, bugreports, patches are welcome as usual

Thanks,

tglx


2007-05-16 19:54:38

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Gleixner wrote:
> I'm pleased to announce an updated version of the x86_64 highres/dyntick
> support patches against 2.6.22-rc1:
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v5.patch
>
> Broken out version is available here:
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v5.patches.tar.bz2
>
> Changes since the last version:
>
> - Enforced enabling of HPET (Venki Pallipadi)
> Detects and enables HPET on Intel ICHx chipsets, when the BIOS
> hides it.
>
> Venki, great work, thanks !
>
> - Another variant of PIT wreckage fixup
> Frank, does this one work for you ?

Unfortunately, no.

After adding *lots* of early_printks, I see that it hangs in
hpet_is_known(hdp) called from hpet_alloc(&hd), so something in the hpet
code is still buggy. Adding nohpet to the kernel command line allows it
to boot correctly.

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGS1/3aI0dwg4A47wRAuaIAKD1MG5x4cvWXGRCwkHEGBzhUHyJlQCfez+S
ODHcJ4FEJ4W4LB7NzW32X6Q=
=r1q5
-----END PGP SIGNATURE-----

2007-05-17 04:31:36

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank Sorenson wrote:

> After adding *lots* of early_printks, I see that it hangs in
> hpet_is_known(hdp) called from hpet_alloc(&hd), so something in the hpet
> code is still buggy. Adding nohpet to the kernel command line allows it
> to boot correctly.

Hrm. Looks like it gets past the hpet_is_known. There's still something
in the hpet detection code, but I didn't get to the bottom of it yet.
I'll do some more debugging to track down where it's really hanging.
Sorry for the noise.

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGS9lFaI0dwg4A47wRApdSAJoDsFphRHZq/tu3d4nJaqMvt+tLGQCghf1L
OCuPEpCRr9tBSnBdVNiShRE=
=NDZn
-----END PGP SIGNATURE-----

2007-05-17 19:15:35

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank Sorenson wrote:

> Hrm. Looks like it gets past the hpet_is_known. There's still something
> in the hpet detection code, but I didn't get to the bottom of it yet.
> I'll do some more debugging to track down where it's really hanging.
> Sorry for the noise.

I've tracked down this hang to a kzalloc in the hpet code that never
returns. But only when using SLUB. Using SLAB, the highres/dyntick
patch boots without problem.

...adding Christoph to the CC list...

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGTKkLaI0dwg4A47wRArYeAJwLs4fJDfj8CuYmUpCaifou6DBsHgCg9nvP
ilEqTd1DdOD13LSl7xVeKls=
=RT2W
-----END PGP SIGNATURE-----

2007-05-17 19:19:42

by Christoph Lameter

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

On Thu, 17 May 2007, Frank Sorenson wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Frank Sorenson wrote:
>
> > Hrm. Looks like it gets past the hpet_is_known. There's still something
> > in the hpet detection code, but I didn't get to the bottom of it yet.
> > I'll do some more debugging to track down where it's really hanging.
> > Sorry for the noise.
>
> I've tracked down this hang to a kzalloc in the hpet code that never
> returns. But only when using SLUB. Using SLAB, the highres/dyntick
> patch boots without problem.
>
> ...adding Christoph to the CC list...

Please boot with slub_debug.

2007-05-17 22:05:07

by Thomas Gleixner

Subject: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v7

I'm pleased to announce an updated version of the x86_64 highres/dyntick
support patches against 2.6.22-rc1:

http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v7.patch

Broken out version is available here:
http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v7.patches.tar.bz2

Changes since the last version:

- suspend / resume fix
- pc speaker preliminary locking fix
- networking nohz problem workaround (Mikulas Patocka)

Comments, bug reports, patches are welcome as usual

Thanks,

tglx


2007-05-18 03:59:37

by Frank Sorenson

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christoph Lameter wrote:
> On Thu, 17 May 2007, Frank Sorenson wrote:
>> Frank Sorenson wrote:
>>
>>> Hrm. Looks like it gets past the hpet_is_known. There's still something
>>> in the hpet detection code, but I didn't get to the bottom of it yet.
>>> I'll do some more debugging to track down where it's really hanging.
>>> Sorry for the noise.
>> I've tracked down this hang to a kzalloc in the hpet code that never
>> returns. But only when using SLUB. Using SLAB, the highres/dyntick
>> patch boots without problem.
>>
>> ...adding Christoph to the CC list...
>
> Please boot with slub_debug.

No debugging output at all. Still hangs with only:
Kernel alive
Kernel direct mapping tables up to 100000000 @ 8000-d000

Frank
- --
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGTSPMaI0dwg4A47wRAkiaAJ0X80qq/ASp6zDEDAibOebZ1awBLACgh0OM
mHK5zxK+rwSNoiDlVRv6p8g=
=vegA
-----END PGP SIGNATURE-----

2007-05-18 16:58:39

by Christoph Lameter

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

On Thu, 17 May 2007, Frank Sorenson wrote:

> >> I've tracked down this hang to a kzalloc in the hpet code that never
> >> returns. But only when using SLUB. Using SLAB, the highres/dyntick
> >> patch boots without problem.
> >>
> >> ...adding Christoph to the CC list...
> >
> > Please boot with slub_debug.
>
> No debugging output at all. Still hangs with only:
> Kernel alive
> Kernel direct mapping tables up to 100000000 @ 8000-d000

Is there some way you can get a stack trace?

2007-05-18 19:44:38

by Christoph Lameter

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5

On Thu, 17 May 2007, Frank Sorenson wrote:

> > Please boot with slub_debug.
>
> No debugging output at all. Still hangs with only:
> Kernel alive
> Kernel direct mapping tables up to 100000000 @ 8000-d000

Hmmmm..... No other output? Could it be that early console output is not
available? Try earlyprintk=xx? Try another platform that has working early
printk support (x86_64 seems broken to me)?

2007-05-21 01:17:31

by Valdis Klētnieks

Subject: Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v7

On Fri, 18 May 2007 00:09:53 +0200, Thomas Gleixner said:
> Broken out version is available here:
> http://www.tglx.de/projects/hrtimers/2.6.22-rc1/linux-2.6.22-rc1-x86_64-highres-v7.patches.tar.bz2

By the time I got there, you'd put the -v8 version out there. It applied
to a 2.6.22-rc1-mm1 tree with only minor bashing (basically, the patches
clocksource-fix-mismerge.diff and clocksource-watchdog-resumed-lockless.patch
appear to already be in -rc1-mm1, and hpet-check-counter.diff in an
ever-so-slightly different form). I also threw on the 22-rc1 patch from linuxpowertop
to tweak some timeouts in the kernel...

Boots and runs on a Dell Latitude D820 laptop with an Intel Core2 T7200.
Intel's 'powertop' was reporting as low as 25 wakeups/sec in single-user
mode, even with CONFIG_HZ=1000. This is looking like a good -mm candidate...

Some 'powertop' results (I'm not at all clear why it's reporting 3000+
wakeups/sec on the non-NOHZ kernels with HZ=1000. I can see 1000, or 2000
because it's a dual-core, but 3000?). It doesn't look like NOHZ gains me a
*lot* of extra battery time, but a bit (I believe the wattages to be
reasonably accurate, the 'hours left' to be a crock...).

-rc1-mm1 with X desktop running:
Cn Avg residency (5s) Long term residency avg
C0 (cpu running) (18.8%)
C1 0.0ms ( 0.0%) 0.0ms
C2 0.3ms ( 2.6%) 0.3ms
C3 0.5ms (78.6%) 0.5ms

Wakeups-from-idle per second : 3183.2
Power usage (ACPI estimate) : 27.4 W (2.7 hours left)

Top causes for wakeups:
16.4% (62.4) <interrupt> : nvidia
13.2% (50.0) S06cpuspeed : queue_delayed_work_on (delayed_work_timer_
12.6% (48.0) esd : schedule_timeout (process_timeout)
8.4% (31.8) gkrellm : schedule_timeout (process_timeout)
7.7% (29.2) <interrupt> : uhci_hcd:usb4
6.6% (25.2) Xorg : do_setitimer (it_real_fn)
6.6% (25.0) <interrupt> : uhci_hcd:usb3, HDA Intel
5.3% (20.0) gyachi : schedule_timeout (process_timeout)
3.3% (12.6) e16 : schedule_timeout (process_timeout)
2.6% (10.0) <kernel core> : ehci_work (ehci_watchdog)

Suggestion: Enable the CONFIG_NO_HZ kernel configuration option.

-rc1-mm1-hrt8 with X desktop running:
Cn Avg residency (5s) Long term residency avg
C0 (cpu running) ( 8.5%)
C1 0.0ms ( 0.0%) 0.0ms
C2 0.8ms ( 8.3%) 1.0ms
C3 1.5ms (83.2%) 1.5ms

Wakeups-from-idle per second : 1290.0
Power usage (ACPI estimate) : 26.8 W (2.7 hours left)

Top causes for wakeups:
23.8% (62.0) <interrupt> : nvidia
15.8% (41.2) S06cpuspeed : queue_delayed_work_on (delayed_work_timer_
14.6% (38.0) gkrellm : schedule_timeout (process_timeout)
8.7% (22.6) Xorg : do_setitimer (it_real_fn)
7.6% (19.8) gyachi : schedule_timeout (process_timeout)
3.8% (10.0) <kernel core> : ehci_work (ehci_watchdog)
3.4% ( 8.8) e16 : schedule_timeout (process_timeout)
3.2% ( 8.4) firefox-bin : futex_wait (hrtimer_wakeup)
3.1% ( 8.0) <kernel core> : usb_hcd_poll_rh_status (rh_timer_func)
2.5% ( 6.6) pcscd : schedule_timeout (process_timeout)

-rc1-mm1 single-user:
C0 (cpu running) (10.0%)
C1 0.0ms ( 0.0%) 0.0ms
C2 0.5ms ( 0.0%) 0.6ms
C3 0.6ms (90.0%) 0.6ms

Wakeups-from-idle per second : 3003.0
Power usage (ACPI estimate) : 24.5 W (3.1 hours left)

Top causes for wakeups:
42.1% ( 8.0) <kernel core> : usb_hcd_poll_rh_status (rh_timer_func)
26.3% ( 5.0) <kernel core> : fbcon_add_cursor_timer (cursor_timer_handl
14.7% ( 2.8) <kernel core> : queue_delayed_work_on (delayed_work_timer_
6.3% ( 1.2) <interrupt> : libata
5.3% ( 1.0) kedac : schedule_timeout (process_timeout)
1.1% ( 0.2) init : schedule_timeout (process_timeout)
1.1% ( 0.2) <kernel core> : page_writeback_init (wb_timer_fn)
1.1% ( 0.2) pdflush : blk_plug_device (blk_unplug_timeout)
1.1% ( 0.2) <kernel core> : neigh_table_init_no_netlink (neigh_periodi
1.1% ( 0.2) rc.sysinit : start_this_handle (commit_timeout)

Suggestion: Enable the CONFIG_NO_HZ kernel configuration option.

-rc1-mm1-hrt8 single-user:
Cn Avg residency (5s) Long term residency avg
C0 (cpu running) ( 0.1%)
C1 0.0ms ( 0.0%) 0.0ms
C2 1.9ms ( 0.2%) 1.9ms
C3 58.6ms (99.7%) 58.6ms

Wakeups-from-idle per second : 36.0
Power usage (ACPI estimate) : 23.2 W (2.9 hours left)

Top causes for wakeups:
48.2% ( 8.0) <kernel core> : usb_hcd_poll_rh_status (rh_timer_func)
30.1% ( 5.0) <kernel core> : fbcon_add_cursor_timer (cursor_timer_handl
12.0% ( 2.0) <kernel core> : queue_delayed_work_on (delayed_work_timer_
6.0% ( 1.0) kedac : schedule_timeout (process_timeout)
1.2% ( 0.2) init : schedule_timeout (process_timeout)
1.2% ( 0.2) <kernel core> : neigh_table_init_no_netlink (neigh_periodi
1.2% ( 0.2) rc.sysinit : start_this_handle (commit_timeout)
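
The four powertop runs above can be summarized as relative reductions. This is a small arithmetic sketch only; the helper name reduction_pct() is made up for illustration, and the inputs are the wakeup and power figures quoted in the listings.

```c
#include <assert.h>

/* Relative reduction from `before` to `after`, in percent. */
static double reduction_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

Applied to the figures above, this gives roughly 59.5% fewer wakeups with the X desktop running (3183.2 to 1290.0) and roughly 98.8% fewer in single-user mode (3003.0 to 36.0), while the ACPI power estimate drops by only about 2.2% (27.4 W to 26.8 W) and 5.3% (24.5 W to 23.2 W) respectively.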

