2005-09-19 19:17:04

by john stultz

Subject: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Andrew,
This patch should resolve the issue seen in bugme bug #5105, where it
is assumed that dualcore x86_64 systems have synced TSCs. This is not
the case, and alternate timesources should be used instead.

For more details, see:
http://bugzilla.kernel.org/show_bug.cgi?id=5105


Please consider for inclusion in your tree.

thanks
-john

diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
are handled in the OEM check above. */
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
return 0;
- /* All in a single socket - should be synchronized */
- if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
- return 0;
#endif
/* Assume multi socket systems are not synchronized */
return num_online_cpus() > 1;
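
The effect of the hunk above can be modelled in userspace. This is a simplified sketch, not the kernel function: struct cpu_topology and both function names are hypothetical stand-ins for the kernel globals used in the diff, and the OEM check and the preprocessor conditional that closes at #endif are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for the kernel state read by the diff. */
struct cpu_topology {
    bool vendor_is_intel;   /* boot_cpu_data.x86_vendor == X86_VENDOR_INTEL */
    int  cores_in_socket0;  /* cpus_weight(cpu_core_map[0]) */
    int  online_cpus;       /* num_online_cpus() */
};

/* Tail of the check before the patch: a single socket is trusted. */
int unsynchronized_tsc_old(const struct cpu_topology *t)
{
    assert(t != 0 && t->online_cpus >= 1);
    if (t->vendor_is_intel)
        return 0;
    /* All in a single socket - assumed synchronized */
    if (t->cores_in_socket0 == t->online_cpus)
        return 0;
    return t->online_cpus > 1;
}

/* Tail of the check after the patch: the single-socket shortcut is gone. */
int unsynchronized_tsc_new(const struct cpu_topology *t)
{
    assert(t != 0 && t->online_cpus >= 1);
    if (t->vendor_is_intel)
        return 0;
    return t->online_cpus > 1;
}
```

For a non-Intel machine with two cores in one socket, the old model returns 0 (TSC trusted) and the new one returns 1 (TSC treated as unsynchronized); a uniprocessor machine returns 0 in both.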



2005-09-19 19:31:11

by Andi Kleen

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> This patch should resolve the issue seen in bugme bug #5105, where it
> is assumed that dualcore x86_64 systems have synced TSCs. This is not
> the case, and alternate timesources should be used instead.


I asked AMD some time ago and they told me it was synchronized.
The TSC on K8 is C state invariant, but not P state invariant,
but P states always happen synchronized on dual cores.

So I'm not quite convinced of your explanation yet.

Most likely you are working around some other bug by switching to pmtimer,
or just changing the timing enough, because pmtimer is incredibly
slow. It would be better to find the other bug.


>
> For more details, see:
> http://bugzilla.kernel.org/show_bug.cgi?id=5105
>
>
> Please consider for inclusion in your tree.

Please don't for now.

-Andi

>
> thanks
> -john
>
> diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> --- a/arch/x86_64/kernel/time.c
> +++ b/arch/x86_64/kernel/time.c
> @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
> are handled in the OEM check above. */
> if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> return 0;
> - /* All in a single socket - should be synchronized */
> - if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> - return 0;
> #endif
> /* Assume multi socket systems are not synchronized */
> return num_online_cpus() > 1;
>
>

--

2005-09-19 19:42:20

by john stultz

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > This patch should resolve the issue seen in bugme bug #5105, where it
> > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > the case, and alternate timesources should be used instead.
>
>
> I asked AMD some time ago and they told me it was synchronized.
> The TSC on K8 is C state invariant, but not P state invariant,
> but P states always happen synchronized on dual cores.
>
> So I'm not quite convinced of your explanation yet.

Would a little userspace test checking the TSC synchronization maybe
shed additional light on the issue?

thanks
-john
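
The test proposed above is not reproduced in this thread. As a rough illustration of what such a check could compare, here is a minimal sketch under stated assumptions: struct tsc_sample, read_tsc() and count_cross_cpu_regressions() are hypothetical names, and the samples are assumed to be listed in the real order in which they were taken, for example by one thread that moves between cores with sched_setaffinity() and reads the counter after each move. With synchronized TSCs a later read should never be smaller than an earlier read from the other core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time stamp counter of the CPU this code is running on. */
uint64_t read_tsc(void)
{
    return __rdtsc();
}
#endif

/* One sample: which CPU took it and the counter value it read. */
struct tsc_sample {
    int      cpu;
    uint64_t tsc;
};

/*
 * Count the places where a sample is smaller than the sample taken
 * immediately before it on a different CPU. The array must be in the
 * order the samples were really taken.
 */
size_t count_cross_cpu_regressions(const struct tsc_sample *s, size_t n)
{
    size_t bad = 0;

    assert(s != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (s[i].cpu != s[i - 1].cpu && s[i].tsc < s[i - 1].tsc)
            bad++;
    }
    return bad;
}
```

A non-zero count means one core's counter appeared to run behind the other's. A zero count over a short run does not prove the counters stay in sync.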



2005-09-19 19:49:36

by Andi Kleen

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Mon, Sep 19, 2005 at 12:42:16PM -0700, john stultz wrote:
> On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > This patch should resolve the issue seen in bugme bug #5105, where it
> > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > the case, and alternate timesources should be used instead.
> >
> >
> > I asked AMD some time ago and they told me it was synchronized.
> > The TSC on K8 is C state invariant, but not P state invariant,
> > but P states always happen synchronized on dual cores.
> >
> > So I'm not quite convinced of your explanation yet.
>
> Would a little userspace test checking the TSC synchronization maybe
> shed additional light on the issue?

Sure you can try it.

-Andi

2005-09-20 18:59:50

by john stultz

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Mon, 2005-09-19 at 21:49 +0200, Andi Kleen wrote:
> On Mon, Sep 19, 2005 at 12:42:16PM -0700, john stultz wrote:
> > On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> > > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > > This patch should resolve the issue seen in bugme bug #5105, where it
> > > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > > the case, and alternate timesources should be used instead.
> > >
> > >
> > > I asked AMD some time ago and they told me it was synchronized.
> > > The TSC on K8 is C state invariant, but not P state invariant,
> > > but P states always happen synchronized on dual cores.
> > >
> > > So I'm not quite convinced of your explanation yet.
> >
> > Would a little userspace test checking the TSC synchronization maybe
> > shed additional light on the issue?
>
> Sure you can try it.

So, bugzilla.kernel.org has (temporarily at least) lost the reports from
yesterday, but from the email I got, folks using my TSC consistency
check that I posted were seeing what appears to be unsynched TSCs on
dualcore AMD systems.

Personally I suspect that the powernow driver is putting the cores
independently into low power sleep and the TSCs are being independently
halted, causing them to become unsynchronized.

Do you still feel there is some other issue here? Any ideas for shaking
out whatever else might be in play?

thanks
-john



2005-09-21 04:03:50

by Daniel Jacobowitz

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Tue, Sep 20, 2005 at 11:59:45AM -0700, john stultz wrote:
> So, bugzilla.kernel.org has (temporarily at least) lost the reports from
> yesterday, but from the email I got, folks using my TSC consistency
> check that I posted were seeing what appears to be unsynched TSCs on
> dualcore AMD systems.
>
> Personally I suspect that the powernow driver is putting the cores
> independently into low power sleep and the TSCs are being independently
> halted, causing them to become unsynchronized.
>
> Do you still feel there is some other issue here? Any ideas for shaking
> out whatever else might be in play?

FYI, at least I have reproduced this without powernow loaded.

--
Daniel Jacobowitz
CodeSourcery, LLC

2005-09-21 14:55:24

by Ray Bryant

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Tuesday 20 September 2005 23:03, Daniel Jacobowitz wrote:

>
> FYI, at least I have reproduced this without powernow loaded.

There are cases we are aware of where the TSC counts more slowly while the
processor is halted. This can make the TSCs get out of sync on dual cores.

I wonder if you can reproduce this problem while also running a pair of cpu
bound tasks on your dual core box. If you can't, then this is the culprit.

In general, however, on multisocket systems, you can't depend on TSCs being
synchronized between sockets, so all of this is moot. We just have to deal
with it.

--
Ray Bryant
AMD Performance Labs Austin, Tx
512-602-0038 (o) 512-507-7807 (c)
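
The experiment suggested above needs both cores kept busy so that neither one halts in the idle loop. A minimal sketch, assuming plain busy-looping processes are enough: spin_for_ms() and run_spinners() are hypothetical names, no CPU pinning is done (the scheduler is merely expected, not guaranteed, to spread two runnable processes over two cores), and the loop deadline comes from clock_gettime(), so it depends on the time source under test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Busy-loop without sleeping for roughly `ms` milliseconds. */
uint64_t spin_for_ms(long ms)
{
    struct timespec now, end;
    volatile uint64_t iterations = 0;

    assert(ms >= 0);
    clock_gettime(CLOCK_MONOTONIC, &end);
    end.tv_sec += ms / 1000;
    end.tv_nsec += (ms % 1000) * 1000000L;
    if (end.tv_nsec >= 1000000000L) {
        end.tv_sec++;
        end.tv_nsec -= 1000000000L;
    }
    do {
        iterations++;
        clock_gettime(CLOCK_MONOTONIC, &now);
    } while (now.tv_sec < end.tv_sec ||
             (now.tv_sec == end.tv_sec && now.tv_nsec < end.tv_nsec));
    return iterations;
}

/* Fork `n` spinning children, wait for them, return how many exited cleanly. */
int run_spinners(int n, long ms)
{
    int started = 0, clean = 0;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            spin_for_ms(ms);
            _exit(0);
        }
        if (pid > 0)
            started++;
    }
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

Calling run_spinners(2, 60000) would keep two processes busy for about a minute while the TSC comparison runs alongside.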

2005-09-21 15:04:06

by Andi Kleen

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Wed, Sep 21, 2005 at 10:15:08AM -0500, Ray Bryant wrote:
> On Tuesday 20 September 2005 23:03, Daniel Jacobowitz wrote:
>
> >
> > FYI, at least I have reproduced this without powernow loaded.
>
> There are cases that we are aware of where the TSC will count slower while the
> processor is halted. This can make TSC's get out of sync on dual cores.

OK, thanks for the confirmation. I guess John's patch is OK then.
The drawback is a much slower to extremely slow gettimeofday (depending
on whether the chipset/BIOS has a usable HPET; most seem not to).

>
> I wonder if you can reproduce this problem while also running a pair of cpu
> bound tasks on your dual core box. If you can't, then this is the culprit.
>
> In general, however, on multisocket systems, you can't depend on TSC's being
> synchronized between sockets, so all of this is moot. We just have to deal
> with it.

We handle this, but single socket dual core was special cased because
I was told previously it should be ok.

-Andi

2005-09-21 15:26:36

by Ray Bryant

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Wednesday 21 September 2005 10:04, Andi Kleen wrote:

>
> We handle this, but single socket dual core was special cased because
> I was told previously it should be ok.
>
> -Andi

AFAIK there is a processor state bit that enables/disables this behavior.
Apparently some BIOSes set it one way for desktop systems and the
other way for servers. If it is thought to be important, I can track it
down and see if it can be externally documented. (It may actually be in the
BIOS and Kernel Developer's Guide...)

--
Ray Bryant
AMD Performance Labs Austin, Tx
512-602-0038 (o) 512-507-7807 (c)

2005-09-21 20:19:06

by Andrew Morton

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Andi Kleen wrote:
>
> On Wed, Sep 21, 2005 at 10:15:08AM -0500, Ray Bryant wrote:
> > On Tuesday 20 September 2005 23:03, Daniel Jacobowitz wrote:
> >
> > >
> > > FYI, at least I have reproduced this without powernow loaded.
> >
> > There are cases that we are aware of where the TSC will count slower while the
> > processor is halted. This can make TSC's get out of sync on dual cores.

You mean a single `hlt' instruction? I guess that rules out resyncing them.

> Ok thanks for the confirmation. I guess John's patch is ok then.
> Drawback is much slower to extremely slow gettimeofday (depending
> if the chipset/BIOS has usable HPET, most seem not to)

That's a really big drawback. Will this affect many CPU types?

If the user was prepared to use `idle=poll' then they could get their fast
gettimeofday() back, perhaps.

2005-09-22 07:57:57

by Jonas Oreland

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Ray Bryant wrote:
> On Wednesday 21 September 2005 10:04, Andi Kleen wrote:
>
>
>>We handle this, but single socket dual core was special cased because
>>I was told previously it should be ok.
>>
>>-Andi
>
>
> AFAIK there is a processor state bit that enables/disables this behavior.
> Apparently some BIOS's are setting this one way for desktop systems and the
> other way for servers. If it is thought to be important I can track that
> down and see if it can be externally documented. (It may actually be in the
> bios and kernel developer guide...)
>

Hi,

This would be very good (for us single-socket dual-core users).
I tried a very small benchmark:

clock_gettime(CLOCK_REALTIME): elapsed 7336657 -> 733.665700ns/call
clock_gettime(CLOCK_PROCESS_CPUTIME_ID): elapsed 763247 -> 76.324700ns/call

It would be a factor of 10 faster if the TSCs were in sync.

/Jonas
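
The benchmark source is not included in the message. A minimal sketch of how such a per-call figure could be measured, under stated assumptions: timespec_diff_ns() and ns_per_call() are hypothetical names, and the call count is a parameter. The reported figures are consistent with 10,000 calls and an elapsed time in nanoseconds, but that is an inference from the numbers, not something the message states.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Difference b - a in nanoseconds. */
int64_t timespec_diff_ns(struct timespec a, struct timespec b)
{
    return (int64_t)(b.tv_sec - a.tv_sec) * 1000000000LL
         + (int64_t)(b.tv_nsec - a.tv_nsec);
}

/*
 * Average cost in nanoseconds of one clock_gettime(clk) call, measured
 * over `calls` back-to-back calls. Returns -1.0 if a clock is unavailable.
 */
double ns_per_call(clockid_t clk, long calls)
{
    struct timespec start, end, scratch;

    assert(calls > 0);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (long i = 0; i < calls; i++) {
        if (clock_gettime(clk, &scratch) != 0)
            return -1.0;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;
    return (double)timespec_diff_ns(start, end) / (double)calls;
}
```

Comparing ns_per_call(CLOCK_REALTIME, 10000) with ns_per_call(CLOCK_PROCESS_CPUTIME_ID, 10000) would give the same kind of pair of numbers as above. The absolute values depend on the kernel and the active time source.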

2005-10-07 12:26:39

by Vladimir B. Savkin

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Bootdata ok (command line is root=/dev/sda3 ro idle=poll )
Linux version 2.6.13.2 (vsavkin@forum) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #6 SMP Fri Oct 7 16:17:05 MSD 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bffb0000 (usable)
BIOS-e820: 00000000bffb0000 - 00000000bffc0000 (ACPI data)
BIOS-e820: 00000000bffc0000 - 00000000bfff0000 (ACPI NVS)
BIOS-e820: 00000000bfff0000 - 00000000c0000000 (reserved)
BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
ACPI: RSDP (v002 ACPIAM ) @ 0x00000000000fa7c0
ACPI: XSDT (v001 A M I OEMXSDT 0x05000519 MSFT 0x00000097) @ 0x00000000bffb0100
ACPI: FADT (v003 A M I OEMFACP 0x05000519 MSFT 0x00000097) @ 0x00000000bffb0290
ACPI: MADT (v001 A M I OEMAPIC 0x05000519 MSFT 0x00000097) @ 0x00000000bffb0390
ACPI: OEMB (v001 A M I OEMBIOS 0x05000519 MSFT 0x00000097) @ 0x00000000bffc0040
ACPI: DSDT (v001 A0036 A0036001 0x00000001 MSFT 0x0100000d) @ 0x0000000000000000
On node 0 totalpages: 1048399
DMA zone: 3999 pages, LIFO batch:1
Normal zone: 1044400 pages, LIFO batch:31
HighMem zone: 0 pages, LIFO batch:1
Looks like a VIA chipset. Disabling IOMMU. Overwrite with "iommu=allowed"
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:11 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:11 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at c0000000 (gap: c0000000:3f780000)
Built 1 zonelists
Kernel command line: root=/dev/sda3 ro idle=poll
using polling idle threads.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 2002.578 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Placing software IO TLB between 0x633d000 - 0x833d000
Memory: 4070468k/5242880k available (2282k kernel code, 122840k reserved, 1211k data, 532k init)
Calibrating delay using timer specific routine.. 4012.03 BogoMIPS (lpj=2006016)
Security Framework v1.0.0 initialized
Capability LSM initialized
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(2) -> Node 0 -> Core 0
mtrr: v2.0 (20020519)
tbxface-0120 [02] acpi_load_tables : ACPI Tables successfully acquired
Parsing all Control Methods:...................................................................................................................................................
Table [DSDT](id F004) - 547 Objects with 51 Devices 147 Methods 25 Regions
ACPI Namespace successfully loaded at root ffffffff804c8000
evxfevnt-0096 [03] acpi_enable : Transition to ACPI mode successful
Using local APIC timer interrupts.
Detected 12.516 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4004.57 BogoMIPS (lpj=2002289)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 1(2) -> Node 0 -> Core 1
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 568 cycles)
Brought up 2 CPUs
time.c: Using PIT/TSC based timekeeping.
testing NMI watchdog ... OK.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Subsystem revision 20050408
evgpeblk-1016 [06] ev_create_gpe_block : GPE 00 to 0F [_GPE] 2 regs on int 0x9
evgpeblk-1024 [06] ev_create_gpe_block : Found 7 Wake, Enabled 0 Runtime GPEs in this block
Completing Region/Field/Buffer/Package initialization:............................................................................................................................
Initialized 24/25 Regions 44/44 Fields 41/41 Buffers 15/16 Packages (556 nodes)
Executing all Device _STA and_INI methods:.......................................................
55 Devices found containing: 55 _STA, 0 _INI methods
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
Boot video device is 0000:01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 10 11 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
agpgart: Detected AGP bridge 0
agpgart: AGP aperture is 64M @ 0xdc000000
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
PCI: Bridge: 0000:00:01.0
IO window: e000-efff
MEM window: fbe00000-fbffffff
PREFETCH window: e0000000-faffffff
acpi_bus-0212 [01] acpi_bus_set_power : Device is not power manageable
PCI: Setting latency timer of device 0000:00:01.0 to 64
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
audit: initializing netlink socket (disabled)
audit(1128687641.426:1): initialized
Total HugeTLB memory allocated, 0
Initializing Cryptographic API
Real Time Clock Driver v1.12
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
acpi_bus-0212 [01] acpi_bus_set_power : Device is not power manageable
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 169
PCI: Via IRQ fixup for 0000:00:0f.1, from 255 to 9
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
hdc: IC35L080AVVA07-0, ATA DISK drive
isa bounce pool size: 16 pages
ide1 at 0x170-0x177,0x376 on irq 15
hdc: max request size: 128KiB
hdc: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=65535/16/63, UDMA(100)
hdc: cache flushes supported
hdc: hdc1 hdc2 hdc3 hdc4
libata version 1.12 loaded.
sata_via version 1.1
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 169
PCI: Via IRQ fixup for 0000:00:0f.0, from 10 to 9
sata_via(0000:00:0f.0): routed to hard irq line 9
ata1: SATA max UDMA/133 cmd 0xC400 ctl 0xC002 bmdma 0xB000 irq 169
ata2: SATA max UDMA/133 cmd 0xB800 ctl 0xB402 bmdma 0xB008 irq 169
ata1: no device found (phy stat 00000000)
scsi0 : sata_via
ata2: dev 0 cfg 49:2f00 82:74eb 83:7fea 84:4023 85:74e8 86:3c02 87:4023 88:203f
ata2: dev 0 ATA, max UDMA/100, 488397168 sectors: lba48
ata2: dev 0 configured for UDMA/100
scsi1 : sata_via
Vendor: ATA Model: HDS722525VLSA80 Rev: V36O
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 >
Attached scsi disk sda at scsi1, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi1, channel 0, id 0, lun 0, type 0
mice: PS/2 mouse device common for all mice
input: PC Speaker
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: [email protected]
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 532k freed
input: AT Translated Set 2 keyboard on isa0060/serio0
Adding 7912004k swap on /dev/sda2. Priority:2 extents:1
Adding 3906496k swap on /dev/hdc3. Priority:4 extents:1
802.1Q VLAN Support v1.8 Ben Greear <[email protected]>
All bugs added by David S. Miller <[email protected]>
8139too Fast Ethernet driver 0.9.27
ACPI: PCI Interrupt 0000:00:0c.0[A] -> GSI 17 (level, low) -> IRQ 177
eth0: RealTek RTL8139 at 0xffffc20000012000, 00:c0:26:a1:92:f5, IRQ 177
eth0: Identified 8139 chip type 'RTL-8139C'
ReiserFS: sda9: found reiserfs format "3.6" with standard journal
ReiserFS: sda9: using ordered data mode
ReiserFS: sda9: journal params: device sda9, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda9: checking transaction log (sda9)
ReiserFS: sda9: Using r5 hash to sort names
ReiserFS: hdc4: found reiserfs format "3.6" with standard journal
ReiserFS: hdc4: using ordered data mode
ReiserFS: hdc4: journal params: device hdc4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdc4: checking transaction log (hdc4)
ReiserFS: hdc4: Using r5 hash to sort names
ReiserFS: sda11: found reiserfs format "3.6" with standard journal
ReiserFS: sda11: using ordered data mode
ReiserFS: sda11: journal params: device sda11, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda11: checking transaction log (sda11)
ReiserFS: sda11: Using r5 hash to sort names
ReiserFS: sda8: found reiserfs format "3.6" with standard journal
ReiserFS: sda8: using ordered data mode
ReiserFS: sda8: journal params: device sda8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda8: checking transaction log (sda8)
ReiserFS: sda8: Using r5 hash to sort names
ReiserFS: hdc1: found reiserfs format "3.6" with standard journal
ReiserFS: hdc1: using ordered data mode
ReiserFS: hdc1: journal params: device hdc1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdc1: checking transaction log (hdc1)
ReiserFS: hdc1: Using r5 hash to sort names
ReiserFS: sda10: found reiserfs format "3.6" with standard journal
ReiserFS: sda10: using ordered data mode
ReiserFS: sda10: journal params: device sda10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda10: checking transaction log (sda10)
ReiserFS: sda10: Using r5 hash to sort names
ReiserFS: sda5: found reiserfs format "3.6" with standard journal
ReiserFS: sda5: using ordered data mode
ReiserFS: sda5: journal params: device sda5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda5: checking transaction log (sda5)
ReiserFS: sda5: Using r5 hash to sort names
ReiserFS: sda6: found reiserfs format "3.6" with standard journal
ReiserFS: sda6: using ordered data mode
ReiserFS: sda6: journal params: device sda6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda6: checking transaction log (sda6)
ReiserFS: sda6: Using r5 hash to sort names
ReiserFS: hdc2: found reiserfs format "3.6" with standard journal
ReiserFS: hdc2: using ordered data mode
ReiserFS: hdc2: journal params: device hdc2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdc2: checking transaction log (hdc2)
ReiserFS: hdc2: Using r5 hash to sort names
eth0: link up, 100Mbps, full-duplex, lpa 0x41E1
vlan0169: add 01:00:5e:00:00:01 mcast address to master interface
vlan0170: add 01:00:5e:00:00:01 mcast address to master interface
NET: Registered protocol family 10
vlan0169: add 33:33:00:00:00:01 mcast address to master interface
vlan0169: add 33:33:ff:a1:92:f5 mcast address to master interface
vlan0170: add 33:33:00:00:00:01 mcast address to master interface
vlan0170: add 33:33:ff:a1:92:f5 mcast address to master interface
IPv6 over IPv4 tunneling driver


Attachments:
(No filename) (1.35 kB)
dmesg.forum (13.21 kB)
dmesg

2005-10-07 12:30:05

by Andi Kleen

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Friday 07 October 2005 14:26, Vladimir B. Savkin wrote:
> On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > Andrew,
> > This patch should resolve the issue seen in bugme bug #5105, where it
> > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > the case, and alternate timesources should be used instead.
> >
> > For more details, see:
> > http://bugzilla.kernel.org/show_bug.cgi?id=5105
>
> I too have a box that shows the symptoms from bugzilla entry above.
> The system is Asus A8V Deluxe MB with
> "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
>
> The patch below did not fix the problem, while "idle=poll" did.
> Hope this helps, dmesg attached.

Are you running the latest BIOS?

-Andi

>
> > Please consider for inclusion in your tree.
> >
> > thanks
> > -john
> >
> > diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> > --- a/arch/x86_64/kernel/time.c
> > +++ b/arch/x86_64/kernel/time.c
> > @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
> > are handled in the OEM check above. */
> > if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> > return 0;
> > - /* All in a single socket - should be synchronized */
> > - if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> > - return 0;
> > #endif
> > /* Assume multi socket systems are not synchronized */
> > return num_online_cpus() > 1;
>
> ~
>
> :wq
>
> With best regards,
> Vladimir Savkin.

2005-10-07 14:15:34

by Vladimir B. Savkin

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Fri, Oct 07, 2005 at 02:31:46PM +0200, Andi Kleen wrote:
> > I too have a box that shows the symptoms from bugzilla entry above.
> > The system is Asus A8V Deluxe MB with
> > "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
> >
> > The patch below did not fix the problem, while "idle=poll" did.
> > Hope this helps, dmesg attached.
>
> Are you running the latest BIOS?

Well, I think not.
The Asus file download page has been unavailable since yesterday.

>
> -Andi
>
~
:wq
With best regards,
Vladimir Savkin.

2005-10-07 14:22:15

by Velu Erwan

Subject: Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Vladimir B. Savkin wrote:

>Well, I think not.
>Asus file download page is unavailable since yesterday.
>
>
Agreed but ftp://ftp.asus.com:/pub/ASUS/mb/socket939/a8v-deluxe is still
available ;)

2005-10-08 10:12:24

by Vladimir B. Savkin

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Fri, Oct 07, 2005 at 02:31:46PM +0200, Andi Kleen wrote:
> On Friday 07 October 2005 14:26, Vladimir B. Savkin wrote:
> > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > Andrew,
> > > This patch should resolve the issue seen in bugme bug #5105, where it
> > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > the case, and alternate timesources should be used instead.
> > >
> > > For more details, see:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=5105
> >
> > I too have a box that shows the symptoms from bugzilla entry above.
> > The system is Asus A8V Deluxe MB with
> > "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
> >
> > The patch below did not fix the problem, while "idle=poll" did.
> > Hope this helps, dmesg attached.
>
> Are you running the latest BIOS?

Just upgraded to the latest BIOS (revision 1014); nothing changed.
Timers run normally only with "idle=poll".

>
> > > Please consider for inclusion in your tree.
> > >
> > > thanks
> > > -john
> > >
> > > diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> > > --- a/arch/x86_64/kernel/time.c
> > > +++ b/arch/x86_64/kernel/time.c
> > > @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
> > > are handled in the OEM check above. */
> > > if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> > > return 0;
> > > - /* All in a single socket - should be synchronized */
> > > - if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> > > - return 0;
> > > #endif
> > > /* Assume multi socket systems are not synchronized */
> > > return num_online_cpus() > 1;
> >
> > ~
~
:wq
With best regards,
Vladimir Savkin.

2005-10-10 18:03:35

by john stultz

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Sat, 2005-10-08 at 14:11 +0400, Vladimir B. Savkin wrote:
> On Fri, Oct 07, 2005 at 02:31:46PM +0200, Andi Kleen wrote:
> > On Friday 07 October 2005 14:26, Vladimir B. Savkin wrote:
> > > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > > Andrew,
> > > > This patch should resolve the issue seen in bugme bug #5105, where it
> > > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > > the case, and alternate timesources should be used instead.
> > > >
> > > > For more details, see:
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=5105
> > >
> > > I too have a box that shows the symptoms from bugzilla entry above.
> > > The system is Asus A8V Deluxe MB with
> > > "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
> > >
> > > The patch below did not fix the problem, while "idle=poll" did.
> > > Hope this helps, dmesg attached.
> >
> > Are you running the latest BIOS?
>
> Just upgraded to the latest BIOS (revision 1014); nothing changed.
> Timers run normally only with "idle=poll".

From your dmesg, it appears that there are no timesources other
than the TSC available on your hardware. So I'm guessing idle=poll is
keeping the CPUs from halting the TSC and keeping them synched.


I would think that the ACPI PM timer would be supported, but I don't see
anything about it in your dmesg. Could you make sure it is properly
configured in?

thanks
-john
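
The only time-source line in the dmesg posted earlier in the thread is "time.c: Using PIT/TSC based timekeeping." As a small illustration of pulling that name out of a log line, here is a sketch with a hypothetical helper, parse_timekeeping_source(). It only matches the wording seen in that dmesg; the exact text printed for other time sources is not shown in this thread, so the helper makes no assumption about it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If `line` contains "time.c: Using <name> based timekeeping", copy
 * <name> into `out` and return 1. Return 0 when the line does not match
 * or <name> does not fit in `outlen` bytes.
 */
int parse_timekeeping_source(const char *line, char *out, size_t outlen)
{
    static const char prefix[] = "time.c: Using ";
    static const char suffix[] = " based timekeeping";
    const char *start, *end;
    size_t len;

    assert(line != NULL && out != NULL && outlen > 0);
    start = strstr(line, prefix);
    if (start == NULL)
        return 0;
    start += sizeof(prefix) - 1;
    end = strstr(start, suffix);
    if (end == NULL)
        return 0;
    len = (size_t)(end - start);
    if (len >= outlen)
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

Fed the dmesg line quoted above, the helper yields "PIT/TSC". Lines such as "time.c: Using 1.193182 MHz PIT timer." are rejected because they lack the "based timekeeping" suffix.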


2005-10-10 18:12:23

by Vladimir B. Savkin

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Mon, Oct 10, 2005 at 11:03:24AM -0700, john stultz wrote:
> From your dmesg, it appears that there are no timesources other
> than the TSC available on your hardware. So I'm guessing idle=poll is
> keeping the CPUs from halting the TSC and keeping them synched.
>
>
> I would think that the ACPI PM timer would be supported, but I don't see
> anything about it in your dmesg. Could you make sure it is properly
> configured in?

Yes, I tried different combinations of PM_TIMER and HPET options.
In this attempt, PM_TIMER was definitely enabled in the kernel config.

What kind of kernel message did you expect from a working PM timer?

~
:wq
With best regards,
Vladimir Savkin.

2005-10-10 18:20:47

by Jonas Oreland

Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Hi,

check http://bugzilla.kernel.org/show_bug.cgi?id=5283

/Jonas

Vladimir B. Savkin wrote:
> On Mon, Oct 10, 2005 at 11:03:24AM -0700, john stultz wrote:
>
>>From your dmesg, it appears that there are no timesources other
>>than the TSC available on your hardware. So I'm guessing idle=poll is
>>keeping the CPUs from halting the TSC and keeping them synched.
>>
>>
>>I would think that the ACPI PM timer would be supported, but I don't see
>>anything about it in your dmesg. Could you make sure it is properly
>>configured in?
>
>
> Yes, I tried different combinations of PM_TIMER and HPET options.
> In this try, PM_TIMER was definitely enabled in kernel config.
>
> What kind of kernel message did you expect from a working PM timer?
>
> ~
> :wq
> With best regards,
> Vladimir Savkin.

2005-10-11 07:35:39

by Vladimir B. Savkin

[permalink] [raw]
Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

On Mon, Oct 10, 2005 at 08:19:42PM +0200, Jonas Oreland wrote:
> Hi,
>
> check http://bugzilla.kernel.org/show_bug.cgi?id=5283

Excuse me for possibly dumb question, but is it safe to leave TSCs
unsynchronized when using other time source?
How will other subsystems e.g. traffic queueing disciplines react?

~
:wq
With best regards,
Vladimir Savkin.

2005-10-11 08:06:43

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

"Vladimir B. Savkin" <[email protected]> writes:

> On Mon, Oct 10, 2005 at 08:19:42PM +0200, Jonas Oreland wrote:
> > Hi,
> >
> > check http://bugzilla.kernel.org/show_bug.cgi?id=5283
>
> Excuse me for possibly dumb question, but is it safe to leave TSCs
> unsynchronized when using other time source?
> How will other subsystems e.g. traffic queueing disciplines react?

They might see hiccups, but normally they all have relatively
benign failure modes so I wouldn't worry too much.

If you use it on an Opteron with frequency scaling and multiple sockets
it would be safer to patch them to use do_gettimeofday() or better
monotonic_clock(), because the differences can be very large there
(CPUs running with completely different frequencies). The drawback would
be that it would be slower. On systems without frequency scaling
you would likely only see problems, if at all, after a long uptime.
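
To put a rough number on "very large", here is a minimal sketch. It assumes each TSC ticks at its core's current frequency and that both counters started from the same value. The helper name tsc_drift_cycles and the example frequencies are made up for illustration; they are not taken from the kernel or from this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): number of cycles by which
 * two TSCs drift apart after `seconds`, assuming each counter ticks at
 * its core's current frequency and both started from the same value.
 */
uint64_t tsc_drift_cycles(uint64_t hz_a, uint64_t hz_b, uint64_t seconds)
{
    uint64_t diff_hz = hz_a > hz_b ? hz_a - hz_b : hz_b - hz_a;

    /* Guard against 64-bit overflow in the multiplication below. */
    assert(seconds == 0 || diff_hz <= UINT64_MAX / seconds);

    return diff_hz * seconds;
}
```

With one core at an assumed 2.0 GHz and another scaled down to 1.0 GHz, the counters would be a full 10^9 cycles apart after one second, which is why a clock that is not per-CPU looks safer in that configuration.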

For some subsystems it is ok, e.g. the scheduler, which also uses
TSCs, specifically deals with unsynchronized clocks.

-Andi

2005-10-11 16:28:11

by Jonas Oreland

[permalink] [raw]
Subject: Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs

Vladimir B. Savkin wrote:
> On Mon, Oct 10, 2005 at 08:19:42PM +0200, Jonas Oreland wrote:
>
>>Hi,
>>
>>check http://bugzilla.kernel.org/show_bug.cgi?id=5283
>
>
> Excuse me for possibly dumb question, but is it safe to leave TSCs
> unsynchronized when using other time source?
> How will other subsystems e.g. traffic queueing disciplines react?

Excuse me for possibly dumb answer: (i'm not a kernel hacker)

yes, I would guess that this will be handled as any other
SMP machine where TSCs aren't in sync.

/Jonas

2005-10-25 07:32:36

by Jonas Oreland

[permalink] [raw]
Subject: x86-64: Syncing dualcore cpus TSCs

Hi,

This might be a very bad suggestion, but here it is:

On dualcore cpus (amd64) the TSCs will get out of sync when the hlt instruction is executed.
Booting with idle=poll means hlt is never executed, hence the TSCs stay in sync.
Booting with notsc makes the kernel use another time source...but this is slower
(this is the default after "[PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs")

How about syncing TSC after hlt?

If the cost of syncing the TSCs is smaller than the cost of using another time source, this might be an alternative.
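
The break-even condition can be written down as a small sketch. Every name and parameter below is hypothetical, and the costs would have to be measured on real hardware; nothing here reflects actual kernel code or measured numbers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical comparison (not kernel code). Returns 1 if re-syncing the
 * TSCs after every hlt wakeup costs less per second than the extra time
 * spent reading a slower, non-TSC time source.
 *
 *   resync_ns       cost of one TSC re-sync after hlt
 *   wakeups_per_sec hlt wakeups per second that need a re-sync
 *   tsc_read_ns     cost of one TSC-based time read
 *   alt_read_ns     cost of one read from the other time source
 *   reads_per_sec   time reads per second
 */
int resync_is_cheaper(uint64_t resync_ns, uint64_t wakeups_per_sec,
                      uint64_t tsc_read_ns, uint64_t alt_read_ns,
                      uint64_t reads_per_sec)
{
    /* The comparison only makes sense if the other source is the slower one. */
    assert(alt_read_ns >= tsc_read_ns);

    uint64_t resync_total = resync_ns * wakeups_per_sec;
    uint64_t alt_extra = (alt_read_ns - tsc_read_ns) * reads_per_sec;

    return resync_total < alt_extra;
}
```

The answer depends heavily on the workload: many wakeups with few time reads favour the slower time source, while few wakeups with many time reads favour re-syncing.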

/Jonas

2005-10-25 07:41:41

by Andi Kleen

[permalink] [raw]
Subject: Re: x86-64: Syncing dualcore cpus TSCs

On Tuesday 25 October 2005 09:35, Jonas Oreland wrote:
> Hi,
>
> This might be a very bad suggestion, but here it is:
>
> On dualcore cpus (amd64) the TSC will get out of sync when executing hlt
> instruction. booting with idle=poll, will make it never to execute hlt,
> hence TSC will be in sync. booting with notsc will make it use other time
> source...but this is slower (this is default after "[PATCH] x86-64: Fix bad
> assumption that dualcore cpus have synced TSCs")
>
> How about syncing TSC after hlt?
>
> If cost of syncing TSC's is smaller than cost of using other time source
> this might be an alternative.

I very much doubt it is. Syncing TSCs requires stopping multiple CPUs for a longer
time. It is unlikely you can make that up.

-Andi

2005-10-26 00:06:46

by David Lang

[permalink] [raw]
Subject: Re: x86-64: Syncing dualcore cpus TSCs

On Tue, 25 Oct 2005, Andi Kleen wrote:

> On Tuesday 25 October 2005 09:35, Jonas Oreland wrote:
>> Hi,
>>
>> This might be a very bad suggestion, but here it is:
>>
>> On dualcore cpus (amd64) the TSC will get out of sync when executing hlt
>> instruction. booting with idle=poll, will make it never to execute hlt,
>> hence TSC will be in sync. booting with notsc will make it use other time
>> source...but this is slower (this is default after "[PATCH] x86-64: Fix bad
>> assumption that dualcore cpus have synced TSCs")
>>
>> How about syncing TSC after hlt?
>>
>> If cost of syncing TSC's is smaller than cost of using other time source
>> this might be an alternative.
>
> I very much doubt it is. Syncing TSCs requires stopping multiple CPUs for a longer
> time. It is unlikely you can make that up.

I may be misunderstanding things, but as I understand it the reason for
calling hlt is to save power.

if you really care about the last bit of performance then using idle=poll
to make the TSCs stay synced makes perfect sense.

it's cases where you care about saving power that you would want to use
hlt. can the power management be reasonably configured so that when things
are running close to full-bore hlt isn't called, when things are more idle
it switches to using hlt and a non-TSC timesource or re-syncing the TSC
on wakeup, and then if it's more idle than that it goes into the more
traditional power saving modes where it works to shut down individual CPUs
(obviously having to re-sync the TSC when they wake up)?
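
The three tiers described above could be sketched as a simple load-based selector. This is only an illustration of the idea: the enum, the function name and the percentage thresholds are all invented for the example and do not correspond to any existing kernel interface.

```c
#include <assert.h>

/* Hypothetical idle policies matching the three tiers described above. */
enum idle_policy {
    IDLE_POLL_TSC,      /* near full load: never hlt, TSCs stay in sync */
    IDLE_HLT_ALT_CLOCK, /* lighter load: hlt, with a non-TSC timesource
                           or a TSC re-sync on wakeup */
    IDLE_DEEP_SLEEP     /* mostly idle: traditional power saving,
                           re-sync the TSC when a CPU wakes up */
};

/* Thresholds are made-up percentages, not tuned values. */
enum idle_policy pick_idle_policy(unsigned int load_pct)
{
    assert(load_pct <= 100);

    if (load_pct >= 80)
        return IDLE_POLL_TSC;
    if (load_pct >= 20)
        return IDLE_HLT_ALT_CLOCK;
    return IDLE_DEEP_SLEEP;
}
```

A real implementation would also need some hysteresis so the system does not flip between policies on every small load change, and it would have to account for the cost of switching timesources at runtime.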

David Lang

--
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare