2007-11-29 02:44:25

by bdupree

Subject: Dynticks Causing High Context Switch Rate in ksoftirqd

I built the same dynticks-enabled 2.6.23.9 kernel on a nearly identical
system with minor changes to reflect the slightly different hardware.
These two systems have identical MSI E7210 MasterX FA6R motherboards (same
model and revision.) The differences are as follows:

behemoth (using Slackware 10.2)
-----------------------------------------------------------------
dual 2.4 GHz Xeons 400 MHz FSB
LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
Newer SATA/PATA Intel PIIX drivers

titan (using Slackware 11.0)
-----------------------------------------------------------------
dual 2.0 GHz Xeons 533 MHz FSB
Creative Labs SB Audigy LS (using ALSA driver)
Older IDE PATA Intel PIIX drivers

The result is that "behemoth" continues to exhibit 155,000 context
switches per second at idle while "titan" shows about 25-30 context
switches per second. Note that the motherboard BIOSes are at the same
revision and configured identically.
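
For anyone who wants to reproduce the idle measurement: the thread does
not say which tool produced the figures above, so the following is only a
minimal sketch, assuming the system-wide count is taken from the "ctxt"
line of /proc/stat and sampled twice. The function names and the 5-second
interval are illustrative choices, not anything from the original report.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")


def switches_per_second(before, after, interval):
    """Convert two cumulative counts into an average rate."""
    return (after - before) / interval


def sample(path="/proc/stat", interval=5.0):
    """Read the counter twice, `interval` seconds apart (assumed helper)."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return switches_per_second(before, after, interval)


if __name__ == "__main__":
    print(f"{sample():.0f} context switches/s")
```

Running this on an otherwise idle box gives a number that can be compared
directly between the two machines.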

I guess (ugh) it's time for me to pull the MPT-Fusion U320 HBA and the
SATA disks out of "behemoth" and configure it with old style IDE drives to
be as close as possible to "titan." Then I can add parts back and see when
the problem occurs.

Best regards,

--Bill DuPree

P.S. See attachment for config details and other data.



Attachments:
config_info.txt (159.98 kB)

2007-11-29 15:41:00

by Ray Lee

Subject: Re: Dynticks Causing High Context Switch Rate in ksoftirqd

On Nov 28, 2007 6:44 PM, <[email protected]> wrote:
> I built the same dynticks-enabled 2.6.23.9 kernel on a nearly identical
> system with minor changes to reflect the slightly different hardware.
> These two systems have identical MSI E7210 MasterX FA6R motherboards (same
> model and revision.) The differences are as follows:
>
> behemoth (using Slackware 10.2)
> -----------------------------------------------------------------
> dual 2.4 GHz Xeons 400 MHz FSB
> LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
> Newer SATA/PATA Intel PIIX drivers
>
> titan (using Slackware 11.0)
> -----------------------------------------------------------------
> dual 2.0 GHz Xeons 533 MHz FSB
> Creative Labs SB Audigy LS (using ALSA driver)
> Older IDE PATA Intel PIIX drivers
>
> The result is that "behemoth" continues to exhibit 155,000 context
> switches per second at idle while "titan" shows about 25 - 30 context
> switches per second. Note that motherboard BIOSes are at the same
> revision and configured identically.
>
> I guess (ugh) it's time for me to pull the MPT-Fusion U320 HBA and the
> SATA disks out of "behemoth" and configure it with old style IDE drives to
> be as close as possible to "titan." Then I can add parts back and see when
> the problem occurs.

Well, the first thing that seems obvious is that you're using
different versions of userspace, and the newer userspace is on the
system that behaves better.

The second thing that pops to mind: are you doing all these
measurements booted into single user mode (init=/bin/bash or
somesuch)? If not, then I don't think we can pin this on the hardware
quite yet.

2007-11-29 17:38:48

by Ray Lee

Subject: Re: Dynticks Causing High Context Switch Rate in ksoftirqd

On Nov 29, 2007 9:11 AM, <[email protected]> wrote:
>
> These are good points. However, on the Slack 10.2 box I repeated these
> measurements with all userspace code quiesced. No daemons running except
> for those that are kernel threads. Secondly, I do run dynticks kernels on
> other Slackware 10.2 boxen without these issues. The hardware may not be
> identical, e.g. Xeons with Intel E7501 chipsets or Opterons with AMD 8131
> chipsets, but I don't see any of this weirdness. Maybe I'll fire up Slack
> 10.2 on a spare partition on the other (almost) identical machine and see
> if it exhibits this problem.

Any way you can narrow down the problem space will help, as there are
a lot of variables right now.

Also, please keep the kernel list CC:d so others who are lurking can
see what's going on, and not ask for duplicate data.
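
One way to narrow things down from the software side, sketched here under
assumptions: list which tasks (ksoftirqd included) have accumulated the
most context switches, using the voluntary_ctxt_switches and
nonvoluntary_ctxt_switches fields of /proc/<pid>/status. Those fields
should be present on 2.6.23 and later kernels; the helper names below are
made up for illustration.

```python
import glob

WANTED = {"Name", "voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"}


def parse_ctxt_fields(status_text):
    """Extract the task name and context-switch counters from status text."""
    out = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in WANTED:
            out[key] = value.strip()
    return out


def top_switchers(limit=10):
    """Return (total_switches, task_name) pairs, largest first."""
    rows = []
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as f:
                info = parse_ctxt_fields(f.read())
        except OSError:
            continue  # the task exited while we were scanning
        total = (int(info.get("voluntary_ctxt_switches", 0))
                 + int(info.get("nonvoluntary_ctxt_switches", 0)))
        rows.append((total, info.get("Name", "?")))
    return sorted(rows, reverse=True)[:limit]


if __name__ == "__main__":
    for total, name in top_switchers():
        print(f"{total:>12}  {name}")
```

The counters are cumulative since each task started, so taking two
snapshots a few seconds apart and comparing them gives a per-task rate.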