2008-07-24 10:57:47

by Martin Wilck

[permalink] [raw]
Subject: [PATCH] x86 (64): make calibrate_APIC_clock() smp-safe

Hi Thomas and Peter, hi everyone,

Asynchrounous events (e.g.SMIs) which occur during the APIC timer
calibration can cause timer miscalibrations, sometimes by large amounts.

This patch fixes this by two separate measures:
a) make sure that no significant interruption occurs between APIC and
TSC reads
b) make sure that the measurement loop isn't significantly longer
than originally intended.

I am sorry, due to a misconfiguration of our SMTP server I need to send
the patch as attachment.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html


Attachments:
calibrate_APIC_clock-1.diff (2.33 kB)

2008-07-24 11:17:00

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() smp-safe

[Martin Wilck - Thu, Jul 24, 2008 at 12:47:56PM +0200]
> Hi Thomas and Peter, hi everyone,
>
> Asynchrounous events (e.g.SMIs) which occur during the APIC timer
> calibration can cause timer miscalibrations, sometimes by large amounts.
>
> This patch fixes this by two separate measures:
> a) make sure that no significant interruption occurs between APIC and
> TSC reads
> b) make sure that the measurement loop isn't significantly longer
> than originally intended.
>
> I am sorry, due to a misconfiguration of our SMTP server I need to send
> the patch as attachment.
>
> Martin
>
> --
> Martin Wilck
> PRIMERGY System Software Engineer
> FSC IP ESP DEV 6
>
> Fujitsu Siemens Computers GmbH
> Heinz-Nixdorf-Ring 1
> 33106 Paderborn
> Germany
>
> Tel: ++49 5251 8 15113
> Fax: ++49 5251 8 20209
> Email: mailto:[email protected]
> Internet: http://www.fujitsu-siemens.com
> Company Details: http://www.fujitsu-siemens.com/imprint.html

| Patch: make calibrate_APIC_clock() SMI-safe
|
| Asynchrounous events (e.g.SMIs) which occur during the APIC timer calibration
| can cause timer miscalibrations, sometimes by large amounts. This patch fixes
| this by two separate measures:
| a) make sure that no significant interruption occurs between APIC and
| TSC reads
| b) make sure that the measurement loop isn't significantly longer
| than originally intended.
|
| Signed-off-by: Martin Wilck <[email protected]>
| Signed-off-by: Gerhard Wichert <[email protected]>
|
| --- linux-2.6.26/arch/x86/kernel/apic_64.c 2008-07-13 23:51:29.000000000 +0200
| +++ linux-2.6.26/arch/x86/kernel/apic_64.c.new 2008-07-24 11:41:24.000000000 +0200
| @@ -314,6 +314,19 @@ static void setup_APIC_timer(void)
|
| #define TICK_COUNT 100000000
|
| +#define MAX_DIFFERENCE 1000UL
| +static inline void __read_tsc_and_apic(unsigned long *tsc, unsigned *apic)
| +{
| + unsigned long tsc0, tsc1, diff;
| + do {
| + rdtscll(tsc0);
| + *apic = apic_read(APIC_TMCCT);
| + rdtscll(tsc1);
| + diff = tsc1 - tsc0;
| + } while (diff > MAX_DIFFERENCE);
| + *tsc = tsc0 + (diff >> 1);
| +}
| +
| static void __init calibrate_APIC_clock(void)
| {
| unsigned apic, apic_start;
| @@ -329,25 +342,37 @@ static void __init calibrate_APIC_clock(
| *
| * No interrupt enable !
| */
| +smi_occured:
| __setup_APIC_LVTT(250000000, 0, 0);
|
| - apic_start = apic_read(APIC_TMCCT);
| #ifdef CONFIG_X86_PM_TIMER
| if (apic_calibrate_pmtmr && pmtmr_ioport) {
| + apic_start = apic_read(APIC_TMCCT);
| pmtimer_wait(5000); /* 5ms wait */
| apic = apic_read(APIC_TMCCT);
| result = (apic_start - apic) * 1000L / 5;
| } else
| #endif
| {
| - rdtscll(tsc_start);
| + __read_tsc_and_apic(&tsc_start, &apic_start);
|
| do {
| - apic = apic_read(APIC_TMCCT);
| - rdtscll(tsc);
| + __read_tsc_and_apic(&tsc, &apic);
| } while ((tsc - tsc_start) < TICK_COUNT &&
| (apic_start - apic) < TICK_COUNT);
|
| + /*
| + * If this takes significantly longer than TICK_COUNT,
| + * some interruption must have occured - retry.
| + */
| + if ((tsc - tsc_start) > (TICK_COUNT + TICK_COUNT/1000) ||
| + (apic_start - apic) > (TICK_COUNT + TICK_COUNT/1000)) {
| + printk(KERN_ERR
| + "calibrate_APIC_clock: SMI occured? %lx %x",
| + tsc - tsc_start, apic_start - apic);
| + goto smi_occured;
| + }
| +
| result = (apic_start - apic) * 1000L * tsc_khz /
| (tsc - tsc_start);
| }

Hi, but what if SMI flood happens? We could stuck here forever, don't we?

- Cyrill -

2008-07-24 11:58:44

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() smp-safe

Cyrill Gorcunov wrote:

> Hi, but what if SMI flood happens? We could stuck here forever, don't we?
>
> - Cyrill -


Yes, if the SMI flood never ends. But you wouldn't want to work on such
a system anyway (with the current kernel code, your APIC timer would
most probably be miscalibrated, which can have the weirdest effects).

Anyway, we can add a maximum iteration count to the patch so that there
is no risk of getting stuck.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html

2008-07-24 12:05:30

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() smp-safe

[Martin Wilck - Thu, Jul 24, 2008 at 01:58:37PM +0200]
> Cyrill Gorcunov wrote:
>
>> Hi, but what if SMI flood happens? We could stuck here forever, don't we?
>>
>> - Cyrill -
>
>
> Yes, if the SMI flood never ends. But you wouldn't want to work on such
> a system anyway (with the current kernel code, your APIC timer would
> most probably be miscalibrated, which can have the weirdest effects).
>
> Anyway, we can add a maximum iteration count to the patch so that there
> is no risk of getting stuck.
>
> Martin
>
> --
> Martin Wilck
> PRIMERGY System Software Engineer
> FSC IP ESP DEV 6
>
> Fujitsu Siemens Computers GmbH
> Heinz-Nixdorf-Ring 1
> 33106 Paderborn
> Germany
>
> Tel: ++49 5251 8 15113
> Fax: ++49 5251 8 20209
> Email: mailto:[email protected]
> Internet: http://www.fujitsu-siemens.com
> Company Details: http://www.fujitsu-siemens.com/imprint.html
>

yes, it will issue some effects but it's better then stuck there.
More over in 'case of SMI flood with current patch you don't get
error message printed i think so you better add max iteration
counter so user will see on console (or whatever) that he is got
problems.
- Cyrill -

2008-07-24 13:32:26

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() smp-safe

Martin Wilck wrote:
> Hi Thomas and Peter, hi everyone,
>
> Asynchrounous events (e.g.SMIs) which occur during the APIC timer
> calibration can cause timer miscalibrations, sometimes by large amounts.
>
> This patch fixes this by two separate measures:
> a) make sure that no significant interruption occurs between APIC and
> TSC reads
> b) make sure that the measurement loop isn't significantly longer
> than originally intended.
>
> I am sorry, due to a misconfiguration of our SMTP server I need to send
> the patch as attachment.
>

The other thing we may want to consider is doing multiple runs of the
calibration loop. I suspect we'd get a more accurate result running,
say, 9 loops of the 1/9 the current duration and looking either for the
best result or the median (NOT the mean.)

-hpa

2008-07-24 13:42:39

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe

H. Peter Anvin wrote:

> The other thing we may want to consider is doing multiple runs of the
> calibration loop. I suspect we'd get a more accurate result running,
> say, 9 loops of the 1/9 the current duration and looking either for the
> best result or the median (NOT the mean.)
>
> -hpa

We considered that. However I think if you know what result to expect
(here: TICK_COUNT + a few cycles) it is both easier and safer to compare
against the expected value than to do statistics. SMI protection is all
about outliers (thus, the median is a good suggestion but I'd say it
would be over-enigineered).

Martin

PS: I am sorry for the stupid, misleading typo (SMi -> smp) in the mail
subject.

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html

2008-07-24 13:55:35

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe

Cyrill Gorcunov wrote:

> yes, it will issue some effects but it's better then stuck there.
> More over in 'case of SMI flood with current patch you don't get
> error message printed i think so you better add max iteration
> counter so user will see on console (or whatever) that he is got
> problems.
> - Cyrill -

I disagree. If you have a system that generates SMIs in this extreme
frequency, you're better off stuck than running on such an unstable
system. The user _will_ see messages on the console if this happens.
Note that apparently there are few people who have trouble with this. We
did see problems, but never had more than 1 SMI disturbing the
calibration procedure.

Anyway, here is another patch that defines max iteration counts. I
haven't added a "Signed-off:" line, because I prefer the original version.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html


Attachments:
calibrate_APIC_clock-2.diff (2.29 kB)

2008-07-24 14:32:10

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe

[Martin Wilck - Thu, Jul 24, 2008 at 03:55:02PM +0200]
> Cyrill Gorcunov wrote:
>
>> yes, it will issue some effects but it's better then stuck there.
>> More over in 'case of SMI flood with current patch you don't get
>> error message printed i think so you better add max iteration
>> counter so user will see on console (or whatever) that he is got
>> problems.
>> - Cyrill -
>
> I disagree. If you have a system that generates SMIs in this extreme
> frequency, you're better off stuck than running on such an unstable
> system. The user _will_ see messages on the console if this happens.
> Note that apparently there are few people who have trouble with this. We
> did see problems, but never had more than 1 SMI disturbing the
> calibration procedure.
>
> Anyway, here is another patch that defines max iteration counts. I
> haven't added a "Signed-off:" line, because I prefer the original
> version.
>
> Martin
>

yes, Martin, it'll be written on console (just forgot it's not interrupt
driven). I've Cc'ed Maciej in previous message so we should better wait
for his opinion I think. For me the almost ideal solution could be like -
lets user to choose what he wants. I mean you even could add some boot
param to specify behaviour on a such case like panic on SMI flood during
calibration. yes - if we got smi flood we have serious troubles anyway but
i don't think that being just stuck is good choise. And that is why I do like
much more _this_ patch. Anyway - thanks!

- Cyrill -

2008-07-24 15:01:47

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe

[Cyrill Gorcunov - Thu, Jul 24, 2008 at 06:31:51PM +0400]
| [Martin Wilck - Thu, Jul 24, 2008 at 03:55:02PM +0200]
| > Cyrill Gorcunov wrote:
| >
| >> yes, it will issue some effects but it's better then stuck there.
| >> More over in 'case of SMI flood with current patch you don't get
| >> error message printed i think so you better add max iteration
| >> counter so user will see on console (or whatever) that he is got
| >> problems.
| >> - Cyrill -
| >
| > I disagree. If you have a system that generates SMIs in this extreme
| > frequency, you're better off stuck than running on such an unstable
| > system. The user _will_ see messages on the console if this happens.
| > Note that apparently there are few people who have trouble with this. We
| > did see problems, but never had more than 1 SMI disturbing the
| > calibration procedure.
| >
| > Anyway, here is another patch that defines max iteration counts. I
| > haven't added a "Signed-off:" line, because I prefer the original
| > version.
| >
| > Martin
| >
|
| yes, Martin, it'll be written on console (just forgot it's not interrupt
| driven). I've Cc'ed Maciej in previous message so we should better wait
| for his opinion I think. For me the almost ideal solution could be like -
| lets user to choose what he wants. I mean you even could add some boot
| param to specify behaviour on a such case like panic on SMI flood during
| calibration. yes - if we got smi flood we have serious troubles anyway but
| i don't think that being just stuck is good choise. And that is why I do like
| much more _this_ patch. Anyway - thanks!
|
| - Cyrill -

btw, Martin, don't get me wrong please - i'm not just complaining :)
The changes you propose is important enough _but_ it could introduce
regression. Look, with situation of miscalibrated apic timer kernel
was working before but with the patch it could stop to work. So if
user has a such screwed motherboard he could be shocked if it stop
booting with message about SMI happened. we defenitely have to provide
some workaround for this. And your max iteration counter solution
would be fine I think.

- Cyrill -

2008-07-24 15:13:37

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe

Hi Cyrill,

> btw, Martin, don't get me wrong please - i'm not just complaining :)
> The changes you propose is important enough _but_ it could introduce
> regression. Look, with situation of miscalibrated apic timer kernel
> was working before but with the patch it could stop to work. So if
> user has a such screwed motherboard he could be shocked if it stop
> booting with message about SMI happened. we defenitely have to provide
> some workaround for this. And your max iteration counter solution
> would be fine I think.

Let's see what other people think about this. I am fine with both
solutions. Currently only the first one has been tested though (testing
these patches thoroughly needs long-time reboot tests).

One more remark: There are similar calibration routines around in the
kernel which suffer from similar problems as calibrate_APIC_clock().
AFAIK, only calibrate_delay() was made SMI-safe by Venkatesh Pallipadi
years ago.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html

2008-07-25 09:02:18

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

I wrote:

> This patch fixes this by two separate measures:
> a) make sure that no significant interruption occurs between APIC and
> TSC reads
> b) make sure that the measurement loop isn't significantly longer
> than originally intended.

Here is a new, simplified version of our patch that only uses measure a).
We verified that this is sufficient for accurate calibration.

If we fail to determine the start or end time of the calibration
correctly 10 times in a row, we will print a critical error message and
go on. One might as well argue that this should cause a kernel panic (it
is impossible to run on the CPU for only a few cycles without being
interrupted by an SMI!), but Cyrill probably won't agree.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html


Attachments:
calibrate_APIC_clock-4.diff (2.80 kB)

2008-07-25 10:08:59

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

[Martin Wilck - Fri, Jul 25, 2008 at 11:02:06AM +0200]
> I wrote:
>
>> This patch fixes this by two separate measures:
>> a) make sure that no significant interruption occurs between APIC and
>> TSC reads
>> b) make sure that the measurement loop isn't significantly longer
>> than originally intended.
>
> Here is a new, simplified version of our patch that only uses measure a).
> We verified that this is sufficient for accurate calibration.
>
> If we fail to determine the start or end time of the calibration
> correctly 10 times in a row, we will print a critical error message and
> go on. One might as well argue that this should cause a kernel panic (it
> is impossible to run on the CPU for only a few cycles without being
> interrupted by an SMI!), but Cyrill probably won't agree.
>
> Martin
>
> --
> Martin Wilck
> PRIMERGY System Software Engineer
> FSC IP ESP DEV 6
>
> Fujitsu Siemens Computers GmbH
> Heinz-Nixdorf-Ring 1
> 33106 Paderborn
> Germany
>
> Tel: ++49 5251 8 15113
> Fax: ++49 5251 8 20209
> Email: mailto:[email protected]
> Internet: http://www.fujitsu-siemens.com
> Company Details: http://www.fujitsu-siemens.com/imprint.html

| [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)
|
| Non-maskable asynchronous events (e.g. SMIs) which occur during the APIC
| timer calibration can cause timer miscalibrations, sometimes by large amounts.
| This patch fixes this by making sure that no significant interruption occurs
| between APIC and TSC reads. SMIs may still occur at some stage in the
| calibration loop, causing the loop to last longer than intended. This
| doesn't matter though, as long as the start and end values are both
| taken simultaneously.
|
| Signed-off-by: Martin Wilck <[email protected]>
| Signed-off-by: Gerhard Wichert <[email protected]>
|
| --- arch/x86/kernel/apic_64.c 2008-07-25 10:45:09.000000000 +0200
| +++ arch/x86/kernel/apic_64.c.new 2008-07-25 10:45:19.000000000 +0200
| @@ -300,6 +300,31 @@ static void setup_APIC_timer(void)
| }
|
| /*
| + * Helper function for calibrate_APIC_clock(): Make sure that
| + * APIC TMCTT and TSC are read at the same time, to reasonable
| + * accuracy. On any sane system, the retry loop won't need more
| + * than a single retry, given that the rdtsc/apic_read/rdtsc
| + * sequence won't take more than a few cycles.
| + */
| +
| +#define MAX_DIFFERENCE 1000UL
| +#define MAX_ITER 10
| +static inline int
| +__read_tsc_and_apic(unsigned long *tsc, unsigned *apic)
| +{
| + unsigned long tsc0, tsc1, diff;
| + int i = 0;
| + do {
| + rdtscll(tsc0);
| + *apic = apic_read(APIC_TMCCT);
| + rdtscll(tsc1);
| + diff = tsc1 - tsc0;
| + } while (diff > MAX_DIFFERENCE && ++i < MAX_ITER);
| + *tsc = tsc0 + (diff >> 1);
| + return diff > MAX_DIFFERENCE ? -EIO : 0;
| +}
| +
| +/*
| * In this function we calibrate APIC bus clocks to the external
| * timer. Unfortunately we cannot use jiffies and the timer irq
| * to calibrate, since some later bootup code depends on getting
| @@ -318,7 +343,7 @@ static void __init calibrate_APIC_clock(
| {
| unsigned apic, apic_start;
| unsigned long tsc, tsc_start;
| - int result;
| + int result, err_start, err;
|
| local_irq_disable();
|
| @@ -331,23 +356,25 @@ static void __init calibrate_APIC_clock(
| */
| __setup_APIC_LVTT(250000000, 0, 0);
|
| - apic_start = apic_read(APIC_TMCCT);
| #ifdef CONFIG_X86_PM_TIMER
| if (apic_calibrate_pmtmr && pmtmr_ioport) {
| + apic_start = apic_read(APIC_TMCCT);
| pmtimer_wait(5000); /* 5ms wait */
| apic = apic_read(APIC_TMCCT);
| result = (apic_start - apic) * 1000L / 5;
| } else
| #endif
| {
| - rdtscll(tsc_start);
| + err_start = __read_tsc_and_apic(&tsc_start, &apic_start);
|
| do {
| - apic = apic_read(APIC_TMCCT);
| - rdtscll(tsc);
| + err = __read_tsc_and_apic(&tsc, &apic);
| } while ((tsc - tsc_start) < TICK_COUNT &&
| (apic_start - apic) < TICK_COUNT);
|
| + if (err_start || err)
| + printk(KERN_CRIT "calibrate_APIC_clock: SMI flood - "
| + "the APIC timer calibration may be wrong!\n");
| result = (apic_start - apic) * 1000L * tsc_khz /
| (tsc - tsc_start);
| }

Hi Martin, what about the patch below - I simplified it a bit.
Actually we have to handle 32bit mode as well I think. Anyway,
take a look. I don't really mind against your patch but we better
should wait until Maciej could take a look (he will be able in
a week or maybe a bit later).

- Cyrill -

---

Index: linux-2.6.git/arch/x86/kernel/apic_64.c
===================================================================
--- linux-2.6.git.orig/arch/x86/kernel/apic_64.c 2008-07-25 13:38:11.000000000 +0400
+++ linux-2.6.git/arch/x86/kernel/apic_64.c 2008-07-25 14:01:43.000000000 +0400
@@ -378,6 +378,35 @@ static void setup_APIC_timer(void)
}

/*
+ * Helper function for calibrate_APIC_clock(): Make sure that
+ * APIC TMCTT and TSC are read at the same time, to reasonable
+ * accuracy. On any sane system, the retry loop won't need more
+ * than a single retry, given that the rdtsc/apic_read/rdtsc
+ * sequence won't take more than a few cycles.
+ */
+
+#define MAX_DIFFERENCE 1000UL
+#define MAX_ITER 10
+static inline int __read_tsc_and_apic(unsigned long *tsc, unsigned *apic)
+{
+ unsigned long tsc0, tsc1, diff;
+ int i = 0;
+
+ for (i = 0; i < MAX_ITER; i++) {
+ rdtscll(tsc0);
+ *apic = apic_read(APIC_TMCCT);
+ rdtscll(tsc1);
+ diff = tsc1 - tsc0;
+ if (diff < MAX_DIFFERENCE) {
+ *tsc = tsc0 + diff / 2;
+ return 0;
+ }
+ }
+
+ return -EIO ;
+}
+
+/*
* In this function we calibrate APIC bus clocks to the external
* timer. Unfortunately we cannot use jiffies and the timer irq
* to calibrate, since some later bootup code depends on getting
@@ -396,7 +425,7 @@ static int __init calibrate_APIC_clock(v
{
unsigned apic, apic_start;
unsigned long tsc, tsc_start;
- int result;
+ int result, err_start, err;

local_irq_disable();

@@ -409,23 +438,25 @@ static int __init calibrate_APIC_clock(v
*/
__setup_APIC_LVTT(250000000, 0, 0);

- apic_start = apic_read(APIC_TMCCT);
#ifdef CONFIG_X86_PM_TIMER
if (apic_calibrate_pmtmr && pmtmr_ioport) {
+ apic_start = apic_read(APIC_TMCCT);
pmtimer_wait(5000); /* 5ms wait */
apic = apic_read(APIC_TMCCT);
result = (apic_start - apic) * 1000L / 5;
} else
#endif
{
- rdtscll(tsc_start);
+ err_start = __read_tsc_and_apic(&tsc_start, &apic_start);

do {
- apic = apic_read(APIC_TMCCT);
- rdtscll(tsc);
+ err = __read_tsc_and_apic(&tsc, &apic);
} while ((tsc - tsc_start) < TICK_COUNT &&
(apic_start - apic) < TICK_COUNT);

+ if (err_start || err)
+ printk(KERN_CRIT "calibrate_APIC_clock: SMI flood - "
+ "the APIC timer calibration may be wrong!\n");
result = (apic_start - apic) * 1000L * tsc_khz /
(tsc - tsc_start);
}

2008-07-25 12:29:34

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

Cyrill Gorcunov wrote:

> Hi Martin, what about the patch below - I simplified it a bit.
> Actually we have to handle 32bit mode as well I think.

Yes.

> Anyway,
> take a look. I don't really mind against your patch but we better
> should wait until Maciej could take a look (he will be able in
> a week or maybe a bit later).
>

> + for (i = 0; i < MAX_ITER; i++) {
> + rdtscll(tsc0);
> + *apic = apic_read(APIC_TMCCT);
> + rdtscll(tsc1);
> + diff = tsc1 - tsc0;
> + if (diff < MAX_DIFFERENCE) {
> + *tsc = tsc0 + diff / 2;
> + return 0;
> + }
> + }
> +
> + return -EIO ;

This is wrong - you need to set *tsc also in the -EIO case, otherwise
the function can return total bogus.

I have to say that my simplified patch failed to do the calibration
correctly on our test system (the original patch worked well). Please
stay tuned, we are investigating this currently.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html

2008-07-25 13:00:29

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

[Martin Wilck - Fri, Jul 25, 2008 at 02:29:23PM +0200]
> Cyrill Gorcunov wrote:
>
>> Hi Martin, what about the patch below - I simplified it a bit.
>> Actually we have to handle 32bit mode as well I think.
>
> Yes.
>
>> Anyway,
>> take a look. I don't really mind against your patch but we better
>> should wait until Maciej could take a look (he will be able in
>> a week or maybe a bit later).
>>
>
>> + for (i = 0; i < MAX_ITER; i++) {
>> + rdtscll(tsc0);
>> + *apic = apic_read(APIC_TMCCT);
>> + rdtscll(tsc1);
>> + diff = tsc1 - tsc0;
>> + if (diff < MAX_DIFFERENCE) {
>> + *tsc = tsc0 + diff / 2;
>> + return 0;
>> + }
>> + }
> > +
> > + return -EIO ;
>
> This is wrong - you need to set *tsc also in the -EIO case, otherwise
> the function can return total bogus.

indeed, thanks! that is not the only problem - I also initialized 'i'
twice :)

>
> I have to say that my simplified patch failed to do the calibration
> correctly on our test system (the original patch worked well). Please
> stay tuned, we are investigating this currently.
>
> Martin
>

ok, Martin, I'm leaving for vacation soon - hope someone else could
take a look :)

- Cyrill -

2008-07-25 13:40:37

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

I wrote:

> I have to say that my simplified patch failed to do the calibration
> correctly on our test system (the original patch worked well). Please
> stay tuned, we are investigating this currently.

Please forget that. It was observed on an old "enterprise" kernel with
which we are currently testing the backported patch with, and it was due
to the fact that the initial counter value on that kernel was divided by
APIC_DIVISOR (=16). The resulting initial counter value was too low in a
"SMI flood" case, the counter could overlap. APIC_DIVISOR is no longer
used in the current kernel.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html

2008-07-25 13:48:44

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

[Martin Wilck - Fri, Jul 25, 2008 at 03:38:47PM +0200]
> I wrote:
>
>> I have to say that my simplified patch failed to do the calibration
>> correctly on our test system (the original patch worked well). Please
>> stay tuned, we are investigating this currently.
>
> Please forget that. It was observed on an old "enterprise" kernel with
> which we are currently testing the backported patch with, and it was due
> to the fact that the initial counter value on that kernel was divided by
> APIC_DIVISOR (=16). The resulting initial counter value was too low in a
> "SMI flood" case, the counter could overlap. APIC_DIVISOR is no longer
> used in the current kernel.
>
> Martin
>
> --

Martin, if I understood you right - this means your patch is not
needed? Actually on 64bit mode APIC_DIVISOR is a bit hidden in
__setup_APIC_LVTT - you may see it as APIC_TDR_DIV_16 while setting
up divisor register. I was proposing patch for that but it leaded
to potetntial overflow (thanks Ingo for catching) so we leave it as
is. Maybe I miss something?

- Cyrill -

2008-07-25 14:01:18

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

Cyrill Gorcunov wrote:

> Martin, if I understood you right - this means your patch is not
> needed?

The patch would still be needed. Just the reported failure of my
simplified patch on the old kernel would not have occurred in the
current kernel. IOW, the patch is fine for the current kernel, but not
for the old one.

> Actually on 64bit mode APIC_DIVISOR is a bit hidden in
> __setup_APIC_LVTT - you may see it as APIC_TDR_DIV_16 while setting
> up divisor register. I was proposing patch for that but it leaded
> to potetntial overflow (thanks Ingo for catching) so we leave it as
> is. Maybe I miss something?

The problem was not that the divisor 16 was used for the counter speed
(APIC_TDR_DIV_16), but that the old code set the counter start value to
(250000000/16) rather than just 250000000. That means the counter will
underflow earlier.

I am attaching a "take 3" patch which minimizes the risk of an underflow
by using the maximum possible initial value for the APIC timer.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html

2008-07-25 14:15:22

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

[Martin Wilck - Fri, Jul 25, 2008 at 04:01:08PM +0200]
> Cyrill Gorcunov wrote:
>
>> Martin, if I understood you right - this means your patch is not
>> needed?
>
> The patch would still be needed. Just the reported failure of my
> simplified patch on the old kernel would not have occurred in the
> current kernel. IOW, the patch is fine for the current kernel, but not
> for the old one.
>
>> Actually on 64bit mode APIC_DIVISOR is a bit hidden in
>> __setup_APIC_LVTT - you may see it as APIC_TDR_DIV_16 while setting
>> up divisor register. I was proposing patch for that but it leaded
>> to potetntial overflow (thanks Ingo for catching) so we leave it as
>> is. Maybe I miss something?
>
> The problem was not that the divisor 16 was used for the counter speed
> (APIC_TDR_DIV_16), but that the old code set the counter start value to
> (250000000/16) rather than just 250000000. That means the counter will
> underflow earlier.
>
> I am attaching a "take 3" patch which minimizes the risk of an underflow
> by using the maximum possible initial value for the APIC timer.
>
> Martin
>
> --
> Martin Wilck
> PRIMERGY System Software Engineer
> FSC IP ESP DEV 6
>
> Fujitsu Siemens Computers GmbH
> Heinz-Nixdorf-Ring 1
> 33106 Paderborn
> Germany
>
> Tel: ++49 5251 8 15113
> Fax: ++49 5251 8 20209
> Email: mailto:[email protected]
> Internet: http://www.fujitsu-siemens.com
> Company Details: http://www.fujitsu-siemens.com/imprint.html
>

ah, ok, thanks

- Cyrill -

2008-07-25 15:01:20

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

[Martin Wilck - Fri, Jul 25, 2008 at 04:01:08PM +0200]
> Cyrill Gorcunov wrote:
>
>> Martin, if I understood you right - this means your patch is not
>> needed?
>
> The patch would still be needed. Just the reported failure of my
> simplified patch on the old kernel would not have occurred in the
> current kernel. IOW, the patch is fine for the current kernel, but not
> for the old one.
>
>> Actually on 64bit mode APIC_DIVISOR is a bit hidden in
>> __setup_APIC_LVTT - you may see it as APIC_TDR_DIV_16 while setting
>> up divisor register. I was proposing patch for that but it leaded
>> to potetntial overflow (thanks Ingo for catching) so we leave it as
>> is. Maybe I miss something?
>
> The problem was not that the divisor 16 was used for the counter speed
> (APIC_TDR_DIV_16), but that the old code set the counter start value to
> (250000000/16) rather than just 250000000. That means the counter will
> underflow earlier.
>
> I am attaching a "take 3" patch which minimizes the risk of an underflow
> by using the maximum possible initial value for the APIC timer.
>
> Martin
>

Martin, it seems you forgot to attach the patch :) (since it was
continious tense I thought I would see patch in minute but...
it was not eventually apeeared :)

- Cyrill -

2008-07-25 15:13:27

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

Cyrill Gorcunov wrote:
> Martin, it seems you forgot to attach the patch :)

I am sorry, here it is. Thanks :)

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 8 15113
Fax: ++49 5251 8 20209
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html


Attachments:
calibrate_APIC_clock-4a.diff (2.96 kB)

2008-07-25 15:39:48

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

[Martin Wilck - Fri, Jul 25, 2008 at 05:13:21PM +0200]
> Cyrill Gorcunov wrote:
>> Martin, it seems you forgot to attach the patch :)
>
> I am sorry, here it is. Thanks :)
>
> Martin
>
> --
> Martin Wilck
> PRIMERGY System Software Engineer
> FSC IP ESP DEV 6
>
> Fujitsu Siemens Computers GmbH
> Heinz-Nixdorf-Ring 1
> 33106 Paderborn
> Germany
>
> Tel: ++49 5251 8 15113
> Fax: ++49 5251 8 20209
> Email: mailto:[email protected]
> Internet: http://www.fujitsu-siemens.com
> Company Details: http://www.fujitsu-siemens.com/imprint.html

| [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)
|
| Non-maskable asynchronous events (e.g. SMIs) which occur during the APIC
| timer calibration can cause timer miscalibrations, sometimes by large amounts.
| This patch fixes this by making sure that no significant interruption occurs
| between APIC and TSC reads. SMIs may still occur at some stage in the
| calibration loop, causing the loop to last longer than intended. This
| doesn't matter though, as long as the start and end values are both
| taken simultaneously.
|
| Changed wrt take 2: Use max. possible start value for the APIC timer
| to avoid underflow.
|
| Signed-off-by: Martin Wilck <[email protected]>
| Signed-off-by: Gerhard Wichert <[email protected]>
|
| --- arch/x86/kernel/apic_64.c 2008-07-25 15:39:51.000000000 +0200
| +++ arch/x86/kernel/apic_64.c.new 2008-07-25 15:55:08.000000000 +0200
| @@ -300,6 +300,31 @@ static void setup_APIC_timer(void)
| }
|
| /*
| + * Helper function for calibrate_APIC_clock(): Make sure that
| + * APIC TMCTT and TSC are read at the same time, to reasonable
| + * accuracy. On any sane system, the retry loop won't need more
| + * than a single retry, given that the rdtsc/apic_read/rdtsc
| + * sequence won't take more than a few cycles.
| + */
| +
| +#define MAX_DIFFERENCE 1000UL
| +#define MAX_ITER 10
| +static inline int
| +__read_tsc_and_apic(unsigned long *tsc, unsigned *apic)
| +{
| + unsigned long tsc0, tsc1, diff;
| + int i = 0;
| + do {
| + rdtscll(tsc0);
| + *apic = apic_read(APIC_TMCCT);
| + rdtscll(tsc1);
| + diff = tsc1 - tsc0;
| + } while (diff > MAX_DIFFERENCE && ++i < MAX_ITER);
| + *tsc = tsc0 + (diff >> 1);
| + return diff > MAX_DIFFERENCE ? -EIO : 0;
| +}
| +
| +/*
| * In this function we calibrate APIC bus clocks to the external
| * timer. Unfortunately we cannot use jiffies and the timer irq
| * to calibrate, since some later bootup code depends on getting
| @@ -318,7 +343,7 @@ static void __init calibrate_APIC_clock(
| {
| unsigned apic, apic_start;
| unsigned long tsc, tsc_start;
| - int result;
| + int result, err_start, err;
|
| local_irq_disable();
|
| @@ -329,25 +354,27 @@ static void __init calibrate_APIC_clock(
| *
| * No interrupt enable !
| */
| - __setup_APIC_LVTT(250000000, 0, 0);
| + __setup_APIC_LVTT(0xffffffff, 0, 0);
|
| - apic_start = apic_read(APIC_TMCCT);
| #ifdef CONFIG_X86_PM_TIMER
| if (apic_calibrate_pmtmr && pmtmr_ioport) {
| + apic_start = apic_read(APIC_TMCCT);
| pmtimer_wait(5000); /* 5ms wait */
| apic = apic_read(APIC_TMCCT);
| result = (apic_start - apic) * 1000L / 5;
| } else
| #endif
| {
| - rdtscll(tsc_start);
| + err_start = __read_tsc_and_apic(&tsc_start, &apic_start);
|
| do {
| - apic = apic_read(APIC_TMCCT);
| - rdtscll(tsc);
| + err = __read_tsc_and_apic(&tsc, &apic);
| } while ((tsc - tsc_start) < TICK_COUNT &&
| (apic_start - apic) < TICK_COUNT);
|
| + if (err_start || err)
| + printk(KERN_CRIT "calibrate_APIC_clock: SMI flood - "
| + "the APIC timer calibration may be wrong!\n");
| result = (apic_start - apic) * 1000L * tsc_khz /
| (tsc - tsc_start);
| }

Thanks, Martin! Patch looks good for me (actually I would better
wait for Maciej opinion since I'm not that specialist in this area).
Thanks!

- Cyrill -

2008-07-25 16:32:47

by Olaf Dabrunz

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 2)

On 25-Jul-08, Martin Wilck wrote:
> I wrote:
>
>> This patch fixes this by two separate measures:
>> a) make sure that no significant interruption occurs between APIC and
>> TSC reads
>> b) make sure that the measurement loop isn't significantly longer
>> than originally intended.
>
> Here is a new, simplified version of our patch that only uses measure a).
> We verified that this is sufficient for accurate calibration.
>
> If we fail to determine the start or end time of the calibration correctly
> 10 times in a row, we will print a critical error message and go on. One
> might as well argue that this should cause a kernel panic (it is impossible
> to run on the CPU for only a few cycles without being interrupted by an
> SMI!), but Cyrill probably won't agree.

Note that the SMIs may be triggered when the APIC is read. This may
change after the first (or the first few) SMIs have been triggered. So
the "too many SMIs" case during the calibration does not necessarily
mean that the system can not run normally after the calibration is done.

This is why I would prefer the solution with the error message.

Regards,

--
Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg

2008-07-26 15:40:52

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)


* Martin Wilck <[email protected]> wrote:

> [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)
>
> Non-maskable asynchronous events (e.g. SMIs) which occur during the
> APIC timer calibration can cause timer miscalibrations, sometimes by
> large amounts. This patch fixes this by making sure that no
> significant interruption occurs between APIC and TSC reads. SMIs may
> still occur at some stage in the calibration loop, causing the loop to
> last longer than intended. This doesn't matter though, as long as the
> start and end values are both taken simultaneously.
>
> Changed wrt take 2: Use max. possible start value for the APIC timer
> to avoid underflow.
>
> Signed-off-by: Martin Wilck <[email protected]>
> Signed-off-by: Gerhard Wichert <[email protected]>
>
> --- arch/x86/kernel/apic_64.c 2008-07-25 15:39:51.000000000 +0200
> +++ arch/x86/kernel/apic_64.c.new 2008-07-25 15:55:08.000000000 +0200

nice - could you please implement it symmetrically on 32-bit APIC
calibration as well?

Ingo

2009-03-12 09:41:47

by Jean Delvare

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

Hi all,

Le vendredi 25 juillet 2008, Martin Wilck a ?crit?:
> Cyrill Gorcunov wrote:
> > Martin, it seems you forgot to attach the patch :)
>
> I am sorry, here it is. Thanks :)

Sorry for replying to such an old thread, but did this patch go
anywhere? I can't seem to find it in git. Was it somehow obsoleted
by a different fix for the same issue (SMI flood during APIC
calibration)? Or is the upstream kernel still affected?

Thanks,
--
Jean Delvare
Suse L3

2009-03-12 13:48:50

by Martin Wilck

[permalink] [raw]
Subject: Re: [PATCH] x86 (64): make calibrate_APIC_clock() SMI-safe (take 3)

Hi Jean,

same answer as on Bugzilla...

> Sorry for replying to such an old thread, but did this patch go
> anywhere? I can't seem to find it in git. Was it somehow obsoleted
> by a different fix for the same issue (SMI flood during APIC
> calibration)? Or is the upstream kernel still affected?

We were asked for a solution for all affected architectures (in
particular, also i386) which was more than we were able to do
at the time, given that the pressure had been reduced by finding a BIOS
fix for the particular situation on system in question.

The APIC clock calibration code on i386 is more complex than x86_64, we
saw no way to provide a high-quality, regression-safe, tested patch for
that to upstream with a reasonable amount of effort.

That aside, I still think the x86_64 patch is fine and won't cause
regressions. It just doesn't help on 32bit systems.

Martin

--
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DEV 6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel: ++49 5251 525 2796
Fax: ++49 5251 525 2820
Email: mailto:[email protected]
Internet: http://www.fujitsu-siemens.com
Company Details: http://www.fujitsu-siemens.com/imprint.html