2004-01-27 18:09:16

by Jan Kasprzak

Subject: SMP AMD64 (Tyan S2882) problems.

Hello, world!\n

I have a new Opteron server based on Tyan Thunder K8S Pro (S2882GNR) board.
The storage is 3ware 7506 controller with 4 disks in RAID-5. The system
has 4GB of RAM, and runs 2.6.1 kernel.

Problem 1:
I have tried to run a test load (multiple kernel compiles using
make -j4, copying filesystem subtrees, etc.), and I have noticed that
all IRQs go to CPU0 only. My /proc/interrupts says:

# cat /proc/interrupts
           CPU0       CPU1
  0:    3188815          0    IO-APIC-edge  timer
  1:      10387          0    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  9:          0          0    IO-APIC-edge  acpi
 24:      46802          0   IO-APIC-level  eth0
 27:     384890          0   IO-APIC-level  3ware Storage Controller
NMI:     903323     881935
LOC:    3188129    3188158
ERR:          0
MIS:          0

Is it normal? How can I set up some IRQ balancing (or at least hard-wire
3ware for CPU1 and eth0 for CPU0)?
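
The symptom described above can also be checked programmatically. What follows is a minimal Python sketch, not something posted in the thread: the helper names parse_interrupts and cpu0_only are hypothetical, and the sample text is the two-CPU listing quoted above. It assumes the usual /proc/interrupts layout (a header row naming the CPUs, then one counter per CPU on each IRQ row).

```python
SAMPLE = """\
           CPU0       CPU1
  0:    3188815          0    IO-APIC-edge  timer
  1:      10387          0    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  9:          0          0    IO-APIC-edge  acpi
 24:      46802          0   IO-APIC-level  eth0
 27:     384890          0   IO-APIC-level  3ware Storage Controller
NMI:     903323     881935
LOC:    3188129    3188158
ERR:          0
MIS:          0
"""


def parse_interrupts(text):
    """Map each row label to its list of per-CPU counters."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        per_cpu = rest.split()[:ncpus]
        # Rows such as ERR and MIS carry a single total, not one value per CPU.
        if len(per_cpu) == ncpus and all(f.isdigit() for f in per_cpu):
            counts[label.strip()] = [int(f) for f in per_cpu]
    return counts


def cpu0_only(counts):
    """Return numbered IRQs that fired on CPU0 and on no other CPU."""
    return [
        label
        for label, per_cpu in counts.items()
        if label.isdigit() and per_cpu[0] > 0 and not any(per_cpu[1:])
    ]


print(cpu0_only(parse_interrupts(SAMPLE)))
```

On a live system the same functions could be fed the contents of /proc/interrupts instead of SAMPLE.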

Output of "dmesg" is at http://www.fi.muni.cz/~kas/tmp/dmesg-K8SPro.txt

Other problems are not so important or lkml-related:

Problem 2: the 3ware controller does not work correctly on the first
PCI bus (slot 1 and 2) - in slot 1 it hangs under bigger load (e.g.
an array rebuild), in slot 2 it hangs during boot in 3ware BIOS.
It is probably not Linux-specific, but has anyone seen the same problem?

Problem 3:
What does the "PCI-DMA: Disabling IOMMU." message in the dmesg output mean?

Problem 4:
Does Linux support the hardware sensors on this board? The i2c driver
AMD8111 seems to be working, but what sensors driver should I use?

Problem 5:
Is there a 3ware configuration program (tw_cli), which works on AMD64?

Thanks,

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
| I actually have a lot of admiration and respect for the PATA knowledge |
| embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |


2004-01-27 18:36:48

by Andi Kleen

Subject: Re: SMP AMD64 (Tyan S2882) problems.

Jan Kasprzak writes:

You don't say if you run a 32bit or a 64bit kernel. I will assume 64bit.

> Is it normal? How can I set up some IRQ balancing (or at least hard-wire
> 3ware for CPU1 and eth0 for CPU0)?

Run irqbalanced

> Problem 2: the 3ware controller does not work correctly on the first
> PCI bus (slot 1 and 2) - in slot 1 it hangs under bigger load (e.g.
> an array rebuild), in slot 2 it hangs during boot in 3ware BIOS.
> It is probably not Linux-specific, but has anyone seen the same problem?

I haven't seen it.

You can try if it goes away when you disable ACPI PCI routing
(pci=noacpi)

> Problem 3:
> What does the "PCI-DMA: Disabling IOMMU." message in the dmesg output mean?

Ok you run a 64bit kernel. You don't have enough memory to require
the IOMMU. That's fine.

> Problem 4:
> Does Linux support the hardware sensors on this board? The i2c driver
> AMD8111 seems to be working, but what sensors driver should I use?

Most likely the Winbond W83627HF

iirc it's not possible in 2.6.1 to enable it. You have to drop the
ISA dependency for "I2C_ISA" in drivers/i2c/busses/Kconfig

> Problem 5:
> Is there a 3ware configuration program (tw_cli), which works on AMD64?

You can try if the 32bit program works. If not ask 3ware.

-Andi

2004-01-27 21:49:37

by Jan Kasprzak

Subject: Re: SMP AMD64 (Tyan S2882) problems.

Andi Kleen wrote:
: Jan Kasprzak writes:
:
: You don't say if you run a 32bit or a 64bit kernel. I will assume 64bit.
:
Correct.

: > Is it normal? How can I set up some IRQ balancing (or at least hard-wire
: > 3ware for CPU1 and eth0 for CPU0)?
:
: Run irqbalanced
:
Thanks (on my system - Fedora Core 0.96 - it is "irqbalance", without
the "d" at the end, from the kernel-utils package). When I ran it, it migrated
some IRQs to CPU1, so this is probably OK.
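
For the hard-wiring variant asked about earlier (3ware on CPU1, eth0 on CPU0), the usual manual route is to write a hexadecimal CPU bitmask to /proc/irq/<n>/smp_affinity. Below is a minimal Python sketch under stated assumptions: the helper names affinity_mask and pin_irq are hypothetical, the IRQ numbers 27 and 24 are taken from the /proc/interrupts listing in this thread, writing requires root, and a running irqbalance may later overwrite the setting.

```python
import os


def affinity_mask(cpus):
    """Build the hex bitmask used by smp_affinity: bit N set means CPU N."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_irq(irq, cpus, proc_irq="/proc/irq"):
    """Write the mask for `cpus` to <proc_irq>/<irq>/smp_affinity.

    proc_irq is a parameter only so the function can be tried against a
    scratch directory; on a real system it needs root privileges.
    """
    path = os.path.join(proc_irq, str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(affinity_mask(cpus) + "\n")
    return path


# Intended use on the machine discussed here (run as root):
#   pin_irq(27, [1])   # 3ware Storage Controller -> CPU1
#   pin_irq(24, [0])   # eth0 -> CPU0
print(affinity_mask([1]), affinity_mask([0]))
```

Whether the chipset honours the mask for a given interrupt is hardware-dependent, so it is worth re-reading /proc/interrupts afterwards to confirm the counters move.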

: > Problem 2: the 3ware controller does not work correctly on the first
: > PCI bus (slot 1 and 2) - in slot 1 it hangs under bigger load (e.g.
: > an array rebuild), in slot 2 it hangs during boot in 3ware BIOS.
: > It is probably not Linux-specific, but has anyone seen the same problem?
:
: I haven't seen it.
:
: You can try if it goes away when you disable ACPI PCI routing
: (pci=noacpi)

I will try tomorrow.

: > Problem 3:
: > What does the "PCI-DMA: Disabling IOMMU." message in the dmesg output mean?
:
: Ok you run a 64bit kernel. You don't have enough memory to require
: the IOMMU. That's fine.

This is what I expected. OK.

: > Problem 4:
: > Does Linux support the hardware sensors on this board? The i2c driver
: > AMD8111 seems to be working, but what sensors driver should I use?
:
: Most likely the Winbond W83627HF
:
: iirc it's not possible in 2.6.1 to enable it. You have to drop the
: ISA dependency for "I2C_ISA" in drivers/i2c/busses/Kconfig

Exactly. Thanks for the hint.

: > Problem 5:
: > Is there a 3ware configuration program (tw_cli), which works on AMD64?
:
: You can try if the 32bit program works. If not ask 3ware.

Does not work:

ioctl32(tw_cli:32216): Unknown cmd fd(3) cmd(0000001f){00} arg(080dd2e0) on /dev/twe0

I have asked 3ware.

Thanks for your help,

-Yenya


2004-01-27 22:27:19

by Andi Kleen

Subject: Re: SMP AMD64 (Tyan S2882) problems.

On Tue, 27 Jan 2004 22:49:31 +0100
Jan Kasprzak wrote:


>
> Does not work:
>
> ioctl32(tw_cli:32216): Unknown cmd fd(3) cmd(0000001f){00} arg(080dd2e0) on /dev/twe0
>
> I have asked 3ware.

You can probably fix that yourself by adding ioctl translation to the 3ware driver.
See http://www.firstfloor.org/~andi/writing-ioctl32 for details.
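
Before writing a translation handler it helps to know exactly which command is being rejected. The following is a minimal Python sketch, not part of the original exchange, that splits the logged cmd value into the standard ioctl number fields. It assumes the x86 bit layout (8 bits number, 8 bits type, 14 bits size, 2 bits direction) and a log format inferred only from the single line quoted in this thread; the name decode_ioctl32 is hypothetical.

```python
import re

# Pattern inferred from the one log line in this thread (an assumption).
IOCTL32_RE = re.compile(
    r"ioctl32\((?P<comm>[^:]+):(?P<pid>\d+)\): Unknown cmd "
    r"fd\((?P<fd>\d+)\) cmd\((?P<cmd>[0-9a-fA-F]+)\)\{[^}]*\} "
    r"arg\((?P<arg>[0-9a-fA-F]+)\) on (?P<path>\S+)"
)


def decode_ioctl32(line):
    """Return the fields of an 'ioctl32 ... Unknown cmd' line, or None."""
    m = IOCTL32_RE.search(line)
    if m is None:
        return None
    cmd = int(m.group("cmd"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fd": int(m.group("fd")),
        "path": m.group("path"),
        "arg": int(m.group("arg"), 16),
        "cmd": cmd,
        "nr": cmd & 0xFF,              # command number
        "type": (cmd >> 8) & 0xFF,     # "magic" byte, shown in braces in the log
        "size": (cmd >> 16) & 0x3FFF,  # argument size encoded by _IOR/_IOW
        "dir": (cmd >> 30) & 0x3,      # transfer direction bits
    }


LINE = ("ioctl32(tw_cli:32216): Unknown cmd fd(3) cmd(0000001f){00} "
        "arg(080dd2e0) on /dev/twe0")
print(decode_ioctl32(LINE))
```

For the line above, the type, size and direction fields all decode to zero, so the command number itself says nothing about how the argument is laid out. That is consistent with the advice that the translation has to be added in the driver, which knows the structure behind the argument pointer.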

-Andi

2004-01-28 08:02:16

by Martin Polak

Subject: Re: SMP AMD64 (Tyan S2882) problems.

Andi Kleen wrote:
> Jan Kasprzak writes:
>
> You don't say if you run a 32bit or a 64bit kernel. I will assume 64bit.
>
>
>>Is it normal? How can I set up some IRQ balancing (or at least hard-wire
>>3ware for CPU1 and eth0 for CPU0)?
>
>
> Run irqbalanced
>
>
Well, I posted that thing two weeks ago, occurring on a dual 240
K8T-Master from MSI, and yes: running irqbalance works fine, but still I
believe that there is some sort of weirdness in the initialization code of
the kernel (2.6), because on 2.4 kernels smp-affinity defaults to every
CPU and on 2.6 it doesn't.

Martin Polak

2004-01-28 15:18:29

by Andi Kleen

Subject: Re: SMP AMD64 (Tyan S2882) problems.

On Wed, 28 Jan 2004 09:02:13 +0100
Martin Polak wrote:

> Andi Kleen wrote:
> > Jan Kasprzak writes:
> >
> > You don't say if you run a 32bit or a 64bit kernel. I will assume 64bit.
> >
> >
> >>Is it normal? How can I set up some IRQ balancing (or at least hard-wire
> >>3ware for CPU1 and eth0 for CPU0)?
> >
> >
> > Run irqbalanced
> >
> >
> Well, I posted that thing two weeks ago, occurring on a dual 240
> K8T-Master from MSI, and yes: running irqbalance works fine, but still I
> believe that there is some sort of weirdness in the initialization code of
> the kernel (2.6), because on 2.4 kernels smp-affinity defaults to every
> CPU and on 2.6 it doesn't.

Opteron doesn't support IRQ balancing in hardware (at least not with the
AMD chipset). 2.4/x86-64 kernels didn't have it either.

Some versions of the 32bit kernel do automatic IRQ balancing in the kernel though.

-Andi

2004-01-28 17:07:29

by Jan Kasprzak

Subject: Re: SMP AMD64 (Tyan S2882) problems.

Jan Kasprzak wrote:
: : > Problem 2: the 3ware controller does not work correctly on the first
: : > PCI bus (slot 1 and 2) - in slot 1 it hangs under bigger load (e.g.
: : > an array rebuild), in slot 2 it hangs during boot in 3ware BIOS.
: : > It is probably not Linux-specific, but has anyone seen the same problem?
: :
: : I haven't seen it.
: :
: : You can try if it goes away when you disable ACPI PCI routing
: : (pci=noacpi)
:
: I will try tomorrow.
:

With pci=noacpi the system does not boot: it hangs during
the 3ware initialization - prints the following message:

3w-xxxx: scsi0: UNIT #0: Command (000001002645b0) timed out, resetting card
3w-xxxx: scsi0: UNIT #0: Command (000001002645b0) timed out, resetting card

Then it tries the same with UNIT #1 (whatever it is) and then the system
locks up.

With acpi=off it boots correctly. I will try it in another
Tyan S2882 box - it may be a faulty mainboard.

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
Any compiler or language that likes to hide things like memory allocations
behind your back just isn't a good choice for a kernel. --Linus Torvalds

2004-01-28 17:31:52

by Andi Kleen

Subject: Re: SMP AMD64 (Tyan S2882) problems.

On Wed, 28 Jan 2004 18:07:03 +0100
Jan Kasprzak wrote:

> With pci=noacpi the system does not boot: it hangs during
> the 3ware initialization - prints the following message:
>
> 3w-xxxx: scsi0: UNIT #0: Command (000001002645b0) timed out, resetting card
> 3w-xxxx: scsi0: UNIT #0: Command (000001002645b0) timed out, resetting card
>
> Then it tries the same with UNIT #1 (whatever it is) and then the system
> locks up.
>
> With acpi=off it boots correctly. I will try it in another
> Tyan S2882 box - it may be a faulty mainboard.

It's probably an ACPI bug. I don't have time to look into it right now though.
You can file a bug in kernel bugzilla so that it isn't forgotten.

-Andi

2004-01-28 18:08:34

by Jan Kasprzak

Subject: Re: SMP AMD64 (Tyan S2882) problems.

Andi Kleen wrote:
: It's probably an ACPI bug. I don't have time to look into it right now though.
: You can file a bug in kernel bugzilla so that it isn't forgotten.

I did. http://bugzilla.kernel.org/show_bug.cgi?id=1966

-Yenya
