2005-04-05 19:40:14

by Ross Biro

Subject: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

diff -ur linux-2.6.11/drivers/pci/Kconfig linux-2.6.11-new/drivers/pci/Kconfig
--- linux-2.6.11/drivers/pci/Kconfig 2005-03-01 23:37:51.000000000 -0800
+++ linux-2.6.11-new/drivers/pci/Kconfig 2005-04-01 07:19:32.000000000 -0800
@@ -47,3 +47,38 @@

When in doubt, say Y.

+choice
+ prompt "Enable PCI Master Abort Mode"
+ depends on PCI
+ default PCI_MASTER_ABORT_DEFAULT
+ help
+ On PCI systems, when a bus is unavailable to a bus master, a
+ master abort occurs. Older bridges satisfy the master request
+ with all 0xFF's. This can lead to silent data corruption. Newer
+ bridges can send a target abort to the bus master. Some PCI
+ hardware cannot handle the target abort. Some x86 BIOSes configure
+ the buses in a suboptimal way. This option allows you to override
+ the BIOS setting. If unsure chose default. This choice can be
+ overridden at boot time with the pci_enable_master_abort={default,
+ enable, disable}
+
+config PCI_MASTER_ABORT_DEFAULT
+ bool "Default"
+ help
+ Choose this option if you are unsure, or believe your
+ firmware does the right thing.
+
+config PCI_MASTER_ABORT_ENABLE
+ bool "Enable"
+ help
+ Choose this option if it is more important for you to prevent
+ silent data loss than to have more hardware configurations work.
+
+
+config PCI_MASTER_ABORT_DISABLE
+ bool "Disable"
+ help
+ Choose this option if it is more important for you to have more
+ hardware configurations work than to prevent silent data loss.
+
+endchoice
diff -ur linux-2.6.11/drivers/pci/probe.c linux-2.6.11-new/drivers/pci/probe.c
--- linux-2.6.11/drivers/pci/probe.c 2005-03-01 23:38:13.000000000 -0800
+++ linux-2.6.11-new/drivers/pci/probe.c 2005-04-05 12:07:53.000000000 -0700
@@ -28,6 +28,15 @@

LIST_HEAD(pci_devices);

+/* used to force master abort mode on or off at runtime.
+ PCI_MASTER_ABORT_DEFAULT means leave alone, the BIOS got it correct.
+ PCI_MASTER_ABORT_ENABLE means turn it on everywhere.
+ PCI_MASTER_ABORT_DISABLE means turn it off everywhere.
+*/
+
+static int pci_enable_master_abort=PCI_MASTER_ABORT_VAL;
+
+
#ifdef HAVE_PCI_LEGACY
/**
* pci_create_legacy_files - create legacy I/O port and memory files
@@ -429,6 +438,20 @@
pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);

+ /* Some BIOSes disable master abort mode, even though it's
+ usually a good thing (prevents silent data corruption).
+ Unfortunately some hardware (buggy e-1000 chips for
+ example) require Master Abort Mode to be off, or they will
+ not function properly. So we enable master abort mode
+ unless the user told us not to. The default value
+ for pci_enable_master_abort is set in the config file,
+ but can be overridden at setup time. */
+ if (pci_enable_master_abort == PCI_MASTER_ABORT_ENABLE) {
+ bctl |= PCI_BRIDGE_CTL_MASTER_ABORT;
+ } else if (pci_enable_master_abort == PCI_MASTER_ABORT_DISABLE) {
+ bctl &= ~PCI_BRIDGE_CTL_MASTER_ABORT;
+ }
+
pci_enable_crs(dev);

if ((buses & 0xffff00) && !pcibios_assign_all_busses() && !is_cardbus) {
@@ -932,6 +955,22 @@
kfree(b);
return NULL;
}
+
+static int __devinit pci_enable_master_abort_setup(char *str)
+{
+ if (strcmp(str, "enable") == 0) {
+ pci_enable_master_abort = PCI_MASTER_ABORT_ENABLE;
+ } else if (strcmp(str, "disable") == 0) {
+ pci_enable_master_abort = PCI_MASTER_ABORT_DISABLE;
+ } else if (strcmp(str, "default") == 0) {
+ pci_enable_master_abort = PCI_MASTER_ABORT_DEFAULT;
+ } else {
+ printk (KERN_ERR "PCI: Unknown Master Abort Mode (%s).", str);
+ }
+}
+
+__setup("pci_enable_master_abort=", pci_enable_master_abort_setup);
+
EXPORT_SYMBOL(pci_scan_bus_parented);

#ifdef CONFIG_HOTPLUG
diff -ur linux-2.6.11/include/linux/pci.h linux-2.6.11-new/include/linux/pci.h
--- linux-2.6.11/include/linux/pci.h 2005-03-01 23:38:08.000000000 -0800
+++ linux-2.6.11-new/include/linux/pci.h 2005-04-01 07:19:18.000000000 -0800
@@ -1064,5 +1064,17 @@
#define PCIPCI_VSFX 16
#define PCIPCI_ALIMAGIK 32

+#define PCI_MASTER_ABORT_DEFAULT 0
+#define PCI_MASTER_ABORT_ENABLE 1
+#define PCI_MASTER_ABORT_DISABLE 2
+
+#if defined(CONFIG_PCI_MASTER_ABORT_ENABLE)
+# define PCI_MASTER_ABORT_VAL PCI_MASTER_ABORT_ENABLE
+#elif defined(CONFIG_PCI_MASTER_ABORT_DISABLE)
+# define PCI_MASTER_ABORT_VAL PCI_MASTER_ABORT_DISABLE
+#else
+# define PCI_MASTER_ABORT_VAL PCI_MASTER_ABORT_DEFAULT
+#endif
+
#endif /* __KERNEL__ */
#endif /* LINUX_PCI_H */


Attachments:
master-abort.patch (4.38 kB)
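
Editorial sketch (not part of the posted patch): the three-way policy that the probe.c hunk applies to the bridge control word can be isolated as a pure function, which makes the intended behaviour of each mode easy to check. The helper name `apply_master_abort_policy` is hypothetical. The mode constants are copied from the pci.h hunk, and the value 0x20 is assumed to match the kernel's `PCI_BRIDGE_CTL_MASTER_ABORT` definition.

```c
#include <stdint.h>

/* Bit 5 of the Bridge Control register; assumed to match the kernel's
 * PCI_BRIDGE_CTL_MASTER_ABORT definition. */
#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20

/* Mode values as defined in the pci.h hunk of the patch. */
#define PCI_MASTER_ABORT_DEFAULT 0
#define PCI_MASTER_ABORT_ENABLE  1
#define PCI_MASTER_ABORT_DISABLE 2

/*
 * Hypothetical helper: return the bridge control word that should be
 * written back for the given mode. DEFAULT (and any unknown value)
 * leaves the firmware's setting untouched.
 */
static uint16_t apply_master_abort_policy(uint16_t bctl, int mode)
{
	switch (mode) {
	case PCI_MASTER_ABORT_ENABLE:
		return (uint16_t)(bctl | PCI_BRIDGE_CTL_MASTER_ABORT);
	case PCI_MASTER_ABORT_DISABLE:
		return (uint16_t)(bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
	default:
		return bctl;
	}
}
```

Only the master abort bit is touched in any mode; all other bridge control bits pass through unchanged.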

2005-04-05 21:21:32

by Randy.Dunlap

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

Ross Biro wrote:
>
> Currently Linux 2.6 assumes the BIOS (or firmware) sets the master abort
> mode flag on PCI bridge chips in a coherent fashion. This is not always
> the case and the consequences of getting this flag incorrect can cause
> hardware to fail or silent data corruption. This patch lets the user
> override the BIOS master abort setting at boot time and the distro
> maintainer to set a default according to their target audience.
>
> The comments in the patch are probably a bit too verbose, but I think it
> is a good patch to start discussions around. If it is decided that
> something should be done about this problem, this patch could be
> included in a -mm release and migrate into Linus's kernel as appropriate.

The comments were helpful to me.

> This incarnation of the patch has had minimal testing. For our internal
> kernels, we always force the master abort mode to 1 and then let the
> device drivers for hardware we know can't handle target aborts switch
> the master abort mode to 0. This does not seem appropriate for general
> release.
>
> Some background for those who do not spend most of their waking hours
> exploring buses and what can go wrong.

Is this related (or could it be -- or should it be) at all to the
current discussion on the linux-pci mailing list
([email protected]) about "PCI Error Recovery
API Proposal" ?

> The master abort flag tells a PCI bridge what to do when a bus master
> behind the bridge requests the bus and the bridge is unable to get the
> bus. With the flag clear, for master reads the bridge returns all
> 0xff's (hence silent data corruption) and for master writes, it throws
> the data away. With the bit set, the bridge sends a target abort to the
> master. This can only happen when the system is heavily loaded.

or a PCI device isn't playing nicely?

> The problem with always setting the bit is that some PCI hardware,
> notably some Intel E-1000 chips (Ethernet controller: Intel Corporation:
> Unknown device 1076) cannot properly handle the target abort bit. In
> the case of the E-1000 chip, the driver must reset the chip to recover.
> This usually leads to the machine being off the network for several
> seconds, or sometimes even minutes, which can be bad for servers.
>
> I even have a single motherboard with both a device that cannot handle
> the target abort and an IDE controller that can handle the target abort
> behind the same bridge. For this motherboard, I have to choose the
> lesser of two evils, network hiccups or potential data corruption.
> For the record, I have seen both occur. Other people may wish to
> make a different choice than we did, hence this patch allows the user to
> choose the mode at runtime.
>
> Ross
>
> ------------------------------------------------------------------------
>
> diff -ur linux-2.6.11/drivers/pci/Kconfig linux-2.6.11-new/drivers/pci/Kconfig
> --- linux-2.6.11/drivers/pci/Kconfig 2005-03-01 23:37:51.000000000 -0800
> +++ linux-2.6.11-new/drivers/pci/Kconfig 2005-04-01 07:19:32.000000000 -0800
> @@ -47,3 +47,38 @@
>
> When in doubt, say Y.
>
> +choice
> + prompt "Enable PCI Master Abort Mode"
> + depends on PCI
> + default PCI_MASTER_ABORT_DEFAULT
> + help
> + On PCI systems, when a bus is unavailable to a bus master, a
> + master abort occurs. Older bridges satisfy the master request
> + with all 0xFF's. This can lead to silent data corruption. Newer
> + bridges can send a target abort to the bus master. Some PCI
> + hardware cannot handle the target abort. Some x86 BIOSes configure
> + the buses in a suboptimal way. This option allows you to override
^^^ extra spaces

> + the BIOS setting. If unsure chose default. This choice can be
choose
> + overridden at boot time with the pci_enable_master_abort={default,
> + enable, disable}
boot option.

> +
> +config PCI_MASTER_ABORT_DEFAULT
> + bool "Default"
> + help
> + Choose this option if you are unsure, or believe your
> + firmware does the right thing.
> +
> +config PCI_MASTER_ABORT_ENABLE
> + bool "Enable"
> + help
> + Choose this option if it is more important for you to prevent
> + silent data loss than to have more hardware configurations work.
^^^^ ??

> +
> +
> +config PCI_MASTER_ABORT_DISABLE
> + bool "Disable"
> + help
> + Choose this option if it is more important for you to have more
^^^^
The phrase "have more hardware configurations work" needs something....
Maybe add something like: "Some devices are known not to work with
PCI Master Aborts. If you have one of these devices, you probably
want to Disable this option."


> + hardware configurations work than to prevent silent data loss.
> +
> +endchoice
> diff -ur linux-2.6.11/drivers/pci/probe.c linux-2.6.11-new/drivers/pci/probe.c
> --- linux-2.6.11/drivers/pci/probe.c 2005-03-01 23:38:13.000000000 -0800
> +++ linux-2.6.11-new/drivers/pci/probe.c 2005-04-05 12:07:53.000000000 -0700
> @@ -28,6 +28,15 @@
>
> LIST_HEAD(pci_devices);
>
> +/* used to force master abort mode on or off at runtime.
> + PCI_MASTER_ABORT_DEFAULT means leave alone, the BIOS got it correct.
> + PCI_MASTER_ABORT_ENABLE means turn it on everywhere.
> + PCI_MASTER_ABORT_DISABLE means turn it off everywhere.
> +*/
> +
> +static int pci_enable_master_abort=PCI_MASTER_ABORT_VAL;

Nitpick: spaces around the '=' would enhance readability and be
appreciated.

> @@ -429,6 +438,20 @@
> pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
> bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
>
> + /* Some BIOSes disable master abort mode, even though it's
> + usually a good thing (prevents silent data corruption).
> + Unfortunately some hardware (buggy e-1000 chips for
> + example) require Master Abort Mode to be off, or they will
> + not function properly. So we enable master abort mode
> + unless the user told us not to. The default value
> + for pci_enable_master_abort is set in the config file,
> + but can be overridden at setup time. */
Nit #2: kernel long-comment style is:
/*
* line1
* line2
*/

> + if (pci_enable_master_abort == PCI_MASTER_ABORT_ENABLE) {
> + bctl |= PCI_BRIDGE_CTL_MASTER_ABORT;
> + } else if (pci_enable_master_abort == PCI_MASTER_ABORT_DISABLE) {
> + bctl &= ~PCI_BRIDGE_CTL_MASTER_ABORT;
> + }
> +
> pci_enable_crs(dev);
>
> if ((buses & 0xffff00) && !pcibios_assign_all_busses() && !is_cardbus) {
> @@ -932,6 +955,22 @@
> kfree(b);
> return NULL;
> }
> +
> +static int __devinit pci_enable_master_abort_setup(char *str)
Why __devinit? Looks to me like __init would be fine.

> +{
> + if (strcmp(str, "enable") == 0) {
> + pci_enable_master_abort = PCI_MASTER_ABORT_ENABLE;
> + } else if (strcmp(str, "disable") == 0) {
> + pci_enable_master_abort = PCI_MASTER_ABORT_DISABLE;
> + } else if (strcmp(str, "default") == 0) {
> + pci_enable_master_abort = PCI_MASTER_ABORT_DEFAULT;
> + } else {
> + printk (KERN_ERR "PCI: Unknown Master Abort Mode (%s).", str);
> + }
> +}
> +
> +__setup("pci_enable_master_abort=", pci_enable_master_abort_setup);

--
~Randy
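
Editorial sketch (not from the thread): as posted, the setup handler is declared `int` but falls off the end without returning a value, and its error message lacks a trailing newline. The user-space stand-in below shows the same parsing with those two points addressed. It assumes the usual kernel convention that a `__setup()` handler returns 1 once it has consumed the option; in the kernel, the function would be marked `__init` as suggested above and would use `printk` rather than `fprintf`.

```c
#include <stdio.h>
#include <string.h>

/* Mode values as defined in the pci.h hunk of the patch. */
#define PCI_MASTER_ABORT_DEFAULT 0
#define PCI_MASTER_ABORT_ENABLE  1
#define PCI_MASTER_ABORT_DISABLE 2

static int pci_enable_master_abort = PCI_MASTER_ABORT_DEFAULT;

/*
 * User-space stand-in for the boot option handler. An unrecognised
 * value is reported and the current mode is left unchanged. The
 * function always returns 1 to signal that the option was consumed.
 */
static int pci_enable_master_abort_setup(const char *str)
{
	if (strcmp(str, "enable") == 0)
		pci_enable_master_abort = PCI_MASTER_ABORT_ENABLE;
	else if (strcmp(str, "disable") == 0)
		pci_enable_master_abort = PCI_MASTER_ABORT_DISABLE;
	else if (strcmp(str, "default") == 0)
		pci_enable_master_abort = PCI_MASTER_ABORT_DEFAULT;
	else
		fprintf(stderr, "PCI: Unknown Master Abort Mode (%s).\n", str);

	return 1;
}
```

With this shape, booting with `pci_enable_master_abort=disable` would select the disable mode, while a misspelled value would log an error and keep whatever default was compiled in.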

2005-04-06 13:47:56

by Ross Biro

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

Randy.Dunlap wrote:
>
>
> Is this related (or could it be -- or should it be) at all to the
> current discussion on the linux-pci mailing list
> ([email protected]) about "PCI Error Recovery
> API Proposal" ?


I'm not familiar with the proposal, but this is not related to error
recovery since master aborts are a way of life on the PCI bus and things
just need to deal. The only question is how.

>
>> the master. This can only happen when the system is heavily loaded.
>
>
> or a PCI device isn't playing nicely?

Yes, but at least then you could blame the device in that case.

[ style and grammar comments noted ]

One thing I did fail to mention in my original post is that all of this
could be done by rc scripts from user space, but that seems unclean to me.

Ross
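
Editorial sketch (not from the thread): the user-space route mentioned above could look roughly like the following, assuming the bridge's configuration space is exposed as a writable binary file such as the sysfs `config` attribute of a PCI device. The function names are hypothetical. The register offsets (header type at 0x0e, bridge control at 0x3e) and the bit value 0x20 are assumed to match the standard PCI-to-PCI bridge header layout. Writing to a real device needs root and should be treated with care.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PCI_HEADER_TYPE             0x0e
#define PCI_HEADER_TYPE_BRIDGE      0x01
#define PCI_BRIDGE_CONTROL          0x3e
#define PCI_BRIDGE_CTL_MASTER_ABORT 0x20

/* Read len bytes at config-space offset off; returns 0 on success. */
static int cfg_read(int fd, off_t off, uint8_t *buf, size_t len)
{
	if (lseek(fd, off, SEEK_SET) != off)
		return -1;
	return read(fd, buf, len) == (ssize_t)len ? 0 : -1;
}

/* Write len bytes at config-space offset off; returns 0 on success. */
static int cfg_write(int fd, off_t off, const uint8_t *buf, size_t len)
{
	if (lseek(fd, off, SEEK_SET) != off)
		return -1;
	return write(fd, buf, len) == (ssize_t)len ? 0 : -1;
}

/*
 * Hypothetical helper: set or clear the master abort bit of the bridge
 * whose config space is at config_path. Returns 0 on success and -1 on
 * any error, including a device that is not a PCI-to-PCI bridge.
 */
static int set_master_abort(const char *config_path, int enable)
{
	uint8_t hdr, raw[2];
	uint16_t bctl;
	int ret = -1;
	int fd = open(config_path, O_RDWR);

	if (fd < 0)
		return -1;

	/* Only header type 1 has a Bridge Control register. */
	if (cfg_read(fd, PCI_HEADER_TYPE, &hdr, 1) != 0 ||
	    (hdr & 0x7f) != PCI_HEADER_TYPE_BRIDGE)
		goto out;

	if (cfg_read(fd, PCI_BRIDGE_CONTROL, raw, 2) != 0)
		goto out;

	/* Config space is little-endian. */
	bctl = (uint16_t)(raw[0] | (raw[1] << 8));
	if (enable)
		bctl |= PCI_BRIDGE_CTL_MASTER_ABORT;
	else
		bctl &= (uint16_t)~PCI_BRIDGE_CTL_MASTER_ABORT;

	raw[0] = (uint8_t)(bctl & 0xff);
	raw[1] = (uint8_t)(bctl >> 8);
	ret = cfg_write(fd, PCI_BRIDGE_CONTROL, raw, 2);
out:
	close(fd);
	return ret;
}
```

An rc script could call such a helper once per bridge early in boot. The catch is that transfers occurring before the script runs still see the firmware's setting.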

2005-04-06 20:45:05

by Daniel Egger

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On 05.04.2005, at 21:33, Ross Biro wrote:

> The problem with always setting the bit is that some PCI hardware,
> notably some Intel E-1000 chips (Ethernet controller: Intel
> Corporation: Unknown device 1076) cannot properly handle the target
> abort bit. In the case of the E-1000 chip, the driver must reset the
> chip to recover. This usually leads to the machine being off the
> network for several seconds, or sometimes even minutes, which can be
> bad for servers.

This sounds *exactly* like my problem since I swapped
motherboards. I'll see whether there's some option in
the BIOS that fixes it and if not bite the bullet and
compile a generic kernel....

Thanks a lot for investigating this.

Servus,
Daniel



2005-04-10 13:29:15

by Andi Kleen

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

Ross Biro <[email protected]> writes:
>
> I even have a single motherboard with both a device that cannot handle
> the target abort and an IDE controller that can handle the target
> abort behind the same bridge. For this motherboard, I have to choose
> the lesser of two evils, network hiccups or potential data corruption.
> For the record, I have seen both occur. Other people may wish to
> make a different choice than we did, hence this patch allows the user
> to choose the mode at runtime.

I think it is totally wrong to make this Configs and boot options.
Nobody can do anything with such obscure boot configurations
and it is bad to require kernel recompiles for such things.

The right way to do this would be to have sysfs knobs that allow
to change these bits, and then let a user space tool change
it depending on PCI-ID. If the issue is critical enough
that it happens very often then it should be added to kernel
pci quirks - but again be unconditional.

-Andi

2005-04-13 18:42:46

by Andi Kleen

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On Tue, Apr 12, 2005 at 10:52:55AM -0400, Ross Biro wrote:
> On Apr 10, 2005 9:29 AM, Andi Kleen <[email protected]> wrote:
> >
> >
> > The right way to do this would be to have sysfs knobs that allow
> > to change these bits, and then let a user space tool change
> > it depending on PCI-ID. If the issue is critical enough
> > that it happens very often then it should be added to kernel
> > pci quirks - but again be unconditional.
>
>
> Using user space knobs has advantages, but nothing can depend on just the
> hardware configuration. The application the machine is being used for also
> matters. Imagine you have one of the bad NICs and an IDE controller behind the
> same bridge. Then you have to choose between silent data corruption and the
> NIC locking up for up to a few minutes once in a while. The correct choice
> depends on the application.
>
> For the way we use machines, we are better off with a compile time option
> and no boot line override. That's clearly wrong for general use.

That is definitely wrong for general use. In fact the Linux kernel
has been moving away from the old "put weird workarounds into CONFIG"
for quite some time now. One big reason is that actually most
users use binary kernels these days, but even for us who recompile
kernels regularly it is inconvenient to recompile kernels just for
such things.

If you want it compiled in for your use case I would recommend
that you add a local patch or add a patch for a compiled in kernel
command line in config (some non i386 archs have this already)

>
> Your argument that no one can make sense of such options is totally off
> base. Once you are having a problem, it's pretty easy to see if it's related

I don't think it is in any way helpful to put such highly obscure
things into Config. Nearly nobody can make any sense of it.

If you take a look at quirks.c and DMI options you will see we have quite a lot
of workarounds for various hardware bugs. Just imagine there were
CONFIG options for all of this. It would be a big mess!

> to a wrong master abort mode setting. If you see data that is all 0xff's
> somewhere it shouldn't be, for example on a hard drive sector (it usually
> occurs in the file system meta data and not in the data itself) you need to
> force master abort mode on. If you have a mis-behaving PCI device and
> every time it misbehaves, the saw target abort bit is set, then you need to
> force master abort mode off. First line tech support people should be able
> to tell users to use these settings.

Yeah, but that is impossible if it is a CONFIG - they would need
to explain to the users first how to recompile a kernel, which would
be totally wasted time because it can be set fine without any recompilation
if done properly.

>
> I actually don't see any reason you would ever want master abort mode off,
> other than you have buggy hardware. Unfortunately when you are working with
> PC's you have to assume you always have buggy hardware. I don't have much
> experience with other platforms, so I'll assume they are better (those of
> you with experience, please do not disillusion me.)

Probably yes.

What you could do is to put an experimental patch that forces this always
into -mm* for a few weeks and see if there are any bad reports.

-Andi

2005-04-13 23:00:17

by Ross Biro

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On 13 Apr 2005 20:37:25 +0200, Andi Kleen <[email protected]> wrote:
> >
> > Your argument that no one can make sense of such options is totally off
> > base. Once you are having a problem, it's pretty easy to see if it's related
>
> I don't think it is in any way helpful to put such highly obscure
> things into Config. Nearly nobody can make any sense of it.
>
> If you take a look at quirks.c and DMI options you will see we have quite a lot
> of workarounds for various hardware bugs. Just imagine there were
> CONFIG options for all of this. It would be a big mess!

The config option is for distro maintainers to use to set a policy
for their particular distribution. The boot line option is for end
users to adjust it. Last I heard, most distro makers compile their
own kernels and select options appropriately. I really don't think
it's too much to ask an end user to adjust their grub.conf or
lilo.conf file to work around a bug in their hardware, especially
since there is *no way* to work around the bug in all cases without
user intervention.

As I said before, the quirks routines cannot handle it since there is
no way to know what the correct setting is unless you know what
application is going to be run and what the user's tolerance to
particular problems is. In a perfect world, master abort mode would
always be set to on, but that is not practical in the real world. If
you are suggesting that something in the quirks file stop the boot and
ask the user some questions about how they intend to use the system
and what their tolerance for certain types of errors is, then I think
you are suggesting an even bigger mess.

Someone creating a distro for enterprise use would most likely compile
the kernel with master abort mode enabled to prevent silent data loss.
Someone building the system for desktop use would choose either
default or disabled, to prevent spurious error messages, or hardware
lock ups. If users report problems that look like they are caused by
the master abort mode setting, a tech support person could easily ask
the end user to add a boot time command line option to see if the
problem goes away. The end user would then have the *option* of
adjusting the config file, or just using the boot time option.

I would agree with you if it were not for the fact that the correct
setting of this bit is really a judgement call, so it must be simple
for anyone who needs to make the call to be able to. The people
building distros will need to be able to change the default setting
easily at compile time and the end user needs to be able to change the
setting at boot time or run time.

Someone on the PCI mailing list has suggested that it is enough to let
the distro maintainer edit the header file and adjust the setting
there. To do so would mean that many distro maintainers would have to
maintain an additional patch for very little reason. Perhaps the
correct solution is to keep it as a config option and add a
CONFIG_OBSCURE so that most people don't ever see the option, but the few
that need to can.

Ross

2005-04-13 23:28:39

by Dave Jones

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On Wed, Apr 13, 2005 at 07:00:06PM -0400, Ross Biro wrote:

> > If you take a look at quirks.c and DMI options you will see we have quite a lot
> > of workarounds for various hardware bugs. Just imagine there were
> > CONFIG options for all of this. It would be a big mess!
>
> The config option is for distro maintainers to use to set a policy
> for their particular distribution. The boot line option is for end
> users to adjust it. Last I heard, most distro makers compile their
> own kernels and select options appropriately. I really don't think
> it's too much to ask an end user to adjust their grub.conf or
> lilo.conf file to work around a bug in their hardware, especially
> since there is *no way* to work around the bug in all cases without
> user intervention.

The thing is, most users won't have a clue about this option,
and that is a good thing. They just want stuff to work, not have
to poke random bits and pieces.

> As I said before, the quirks routines cannot handle it since there is
> no way to know what the correct setting is unless you know what
> application is going to be run and what the user's tolerance to
> particular problems is. In a perfect world, master abort mode would
> always be set to on, but that is not practical in the real world. If
> you are suggesting that something in the quirks file stop the boot and
> ask the user some questions about how they intend to use the system
> and what their tolerance for certain types of errors is, then I think
> you are suggesting an even bigger mess.

You don't need to ask the user anything (they won't know the answers anyway)
You already mentioned that E1000's cause this problem, so you have the
basis for the beginning of a blacklist. A patch to explicitly enable
this feature in -mm for a while will probably shake out most of the
common problematic hardware pretty quickly.

> Someone creating a distro for enterprise use would most likely compile
> the kernel with master abort mode enabled to prevent silent data loss.
> Someone building the system for desktop use would choose either
> default or disabled, to prevent spurious error messages, or hardware
> lock ups.

So it's ok for enterprise use to spew error msgs and have hardware lockups ?
See the problem with setting it either on/off ? We need to take
additional factors into consideration, or we're left with something
that's essentially useless.

> If users report problems that look like they are caused by
> the master abort mode setting, a tech support person could easily ask
> the end user to add a boot time command line option to see if the
> problem goes away. The end user would then have the *option* of
> adjusting the config file, or just using the boot time option.

A lock-up could be caused by any number of problems, and I'll put money
on even the best support guys not knowing about this option 6 months
after it got merged. Obscure toggles for esoteric features like this
get forgotten about quickly. It's more likely the support bod would
chase down other avenues before ever hitting upon this.

> I would agree with you if it were not for the fact that the correct
> setting of this bit is really a judgement call, so it must be simple
> for anyone who needs to make the call to be able to. The people
> building distros will need to be able to change the default setting
> easily at compile time and the end user needs to be able to change the
> setting at boot time or run time.

As someone who builds distro kernels I disagree.
End users need things to 'just work'. 99% of end-users don't know, or care
about quirks in their hardware. If we start expecting the bulk of
them to have to go editing their grub/lilo/etc configs, we've lost.

> Someone on the PCI mailing list has suggested that it is enough to let
> the distro maintainer edit the header file and adjust the setting
> there. To do so would mean that many distro maintainers would have to
> maintain an additional patch for very little reason. Perhaps the
> correct solution is to keep it as a config option and add a
> CONFIG_OBSCURE so that most people don't ever see the option, but the few
> that need to can.

If we have a situation where we screw a subset of users with the
config option =y and a different subset with =n, how is this improving
the situation any over what we have today ?

Dave
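
Editorial sketch (not from the thread): the blacklist idea above amounts to a small ID table consulted before master abort mode is turned on for a bridge. The structure and function names below are hypothetical, and the single entry is the only device identified in this thread (Intel, vendor ID 0x8086, device 0x1076). A real quirk table would be populated from test reports, such as those a stint in -mm might produce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one PCI vendor/device pair. */
struct master_abort_id {
	uint16_t vendor;
	uint16_t device;
};

/* Devices reported not to cope with target aborts. The only entry is
 * the E-1000 variant named in this thread. */
static const struct master_abort_id master_abort_blacklist[] = {
	{ 0x8086, 0x1076 },
};

/* Return 1 if the device is on the blacklist, 0 otherwise. */
static int on_master_abort_blacklist(uint16_t vendor, uint16_t device)
{
	size_t i;
	size_t n = sizeof(master_abort_blacklist) /
		   sizeof(master_abort_blacklist[0]);

	for (i = 0; i < n; i++) {
		if (master_abort_blacklist[i].vendor == vendor &&
		    master_abort_blacklist[i].device == device)
			return 1;
	}
	return 0;
}
```

A table like this does not by itself settle the mixed case raised earlier in the thread, where a blacklisted NIC and a disk controller share one bridge. Some policy is still needed to pick which device's needs win.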

2005-04-14 17:25:41

by Ross Biro

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On 4/13/05, Dave Jones <[email protected]> wrote:

> If we have a situation where we screw a subset of users with the
> config option =y and a different subset with =n, how is this improving
> the situation any over what we have today ?

This is exactly the case, and this is better than what we have today
because it makes it easy to choose =y or =n. Rather than making
things work for subset 1 and screwing subset 2, each distro can choose
which subset to screw by default and make it easy for them to unscrew
themselves.

Just to be clear, we can have two users A and B with the exact same
hardware. A setting of =y will screw user A and a setting of =n will
screw user B. Ideally, they would both get better hardware, but that
is not always an option.

Ross


2005-04-14 17:35:08

by Tim Hockin

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On 4/13/05, Dave Jones <[email protected]> wrote:

> If we have a situation where we screw a subset of users with the
> config option =y and a different subset with =n, how is this improving
> the situation any over what we have today ?

Dave,

What's a good alternative? Do we need to keep a whitelist of hardware
that is known to work? A blacklist is pretty risky, since this is a very
hard problem to find.

What if it was always on, except when the command line was passed
(eliminate the CONFIG option)? Really 'leet hacks could tweak a #define
if they don't like the command line option..

2005-04-14 18:02:13

by Andi Kleen

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

> What if it was always on, except when the command line was passed
> (eliminate the CONFIG option)? Really 'leet hacks could tweak a #define
> if they don't like the command line option..

That is basically what I suggested. But test it for a month
in -mm* first and figure out if it needs more black/whitelisting

-Andi

2005-04-14 18:34:08

by Dave Jones

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On Thu, Apr 14, 2005 at 08:02:02PM +0200, Andi Kleen wrote:
> > What if it was always on, except when the command line was passed
> > (eliminate the CONFIG option)? Really 'leet hacks could tweak a #define
> > if they don't like the command line option..
>
> That is basically what I suggested. But test it for a month
> in -mm* first and figure out if it needs more black/whitelisting

Indeed. I'm in full agreement with Andi's suggestion.

Dave

2005-04-14 19:15:02

by Daniel Egger

Subject: Re: [RFC/Patch 2.6.11] Take control of PCI Master Abort Mode

On 14.04.2005, at 19:25, Ross Biro wrote:

> Just to be clear, we can have two users A and B with the exact same
> hardware. A setting of =y will screw user A and a setting of =n will
> screw user B. Ideally, they would both get better hardware, but that
> is not always an option.

You tell me a better[1] 32bit GigE PCI adapter than Intel E1000
and I'll surely do this. It's pretty interesting to see that those
who buy some not-so-cheeep hardware are being screwed in this
case; it should be in Intel's best interest to help fix this
issue ASAP and permanently for all users.

[1] better performance at less CPU utilization + good diagnostics
and negotiation capabilities

Servus,
Daniel

