2007-12-04 19:36:14

by Junichi Nomura

Subject: [PATCH] pci: Omit error message for benign allocation failure

Hi,

On a system with PCI-to-PCI bridges, the following errors are observed:

PCI: Failed to allocate mem resource #8:100000@d8200000 for 0000:02:00.0
PCI: Failed to allocate mem resource #6:10000@0 for 0000:03:01.0

'#6' is the expansion ROM and '#8' is the bridge behind which the device
with the expansion ROM is connected.
But I think the failures are benign because the allocation is
not necessary for these resources.

The allocation failure message is informative when the failure
is a real problem and needs diagnosis.
However, if the resource is an expansion ROM, the message is just
confusing.

This patch omits the error message if the resource is an expansion
ROM or a bridge.

Thanks,
--
Jun'ichi Nomura, NEC Corporation of America


Signed-off-by: Jun'ichi Nomura <[email protected]>

--- linux-2.6.24-rc3/drivers/pci/setup-res.c.orig 2007-12-04 00:24:11.000000000 -0500
+++ linux-2.6.24-rc3/drivers/pci/setup-res.c 2007-12-04 00:42:14.000000000 -0500
@@ -158,11 +158,12 @@ int pci_assign_resource(struct pci_dev *
 	}

 	if (ret) {
-		printk(KERN_ERR "PCI: Failed to allocate %s resource "
-			"#%d:%llx@%llx for %s\n",
-			res->flags & IORESOURCE_IO ? "I/O" : "mem",
-			resno, (unsigned long long)size,
-			(unsigned long long)res->start, pci_name(dev));
+		if (resno < PCI_ROM_RESOURCE)
+			printk(KERN_ERR "PCI: Failed to allocate %s resource "
+				"#%d:%llx@%llx for %s\n",
+				res->flags & IORESOURCE_IO ? "I/O" : "mem",
+				resno, (unsigned long long)size,
+				(unsigned long long)res->start, pci_name(dev));
 	} else if (resno < PCI_BRIDGE_RESOURCES) {
 		pci_update_resource(dev, res, resno);
 	}
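
For reference, the filter added by the patch can be restated as a tiny
standalone predicate. This is only a sketch: the constant values are
assumed to match the 2.6.24-era include/linux/pci.h, and
failure_is_reported() is a hypothetical name, not a kernel function.

```c
#include <stdbool.h>

/* Assumed values, mirroring include/linux/pci.h of that era. */
enum {
	PCI_ROM_RESOURCE     = 6,	/* expansion ROM BAR */
	PCI_BRIDGE_RESOURCES = 7,	/* first bridge window resource */
};

/*
 * Hypothetical helper: true when the patched pci_assign_resource()
 * would still print the "Failed to allocate" message for this index.
 */
bool failure_is_reported(int resno)
{
	return resno < PCI_ROM_RESOURCE;
}
```

Under these assumptions, resource #6 (the expansion ROM) and #8 (a bridge
window) from the log above are both silenced, while failures on the
standard BARs #0 to #5 are still printed.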


2007-12-04 21:49:31

by Gary Hade

Subject: Re: [PATCH] pci: Omit error message for benign allocation failure

On Tue, Dec 04, 2007 at 02:35:48PM -0500, Jun'ichi Nomura wrote:
> Hi,
>
> On a system with PCI-to-PCI bridges, following errors are observed:
>
> PCI: Failed to allocate mem resource #8:100000@d8200000 for 0000:02:00.0
> PCI: Failed to allocate mem resource #6:10000@0 for 0000:03:01.0
>
> '#6' is for expansion ROM and '#8' for the bridge where the device
> with the expansion ROM is connected.

I believe there is a good chance that this is another instance
of the regression caused by my "Avoid creating P2P prefetch
window for expansion ROMs" patch, which was recently reported by
Jan Beulich.
http://marc.info/?l=linux-kernel&m=119555581103023&w=2

You might want to try reverting my changes to see if the
problem disappears.

I am working on a better fix for the problem that the patch
was attempting to address, but this is turning out to be much
more difficult than I expected. If I don't have a solution
very soon, I plan to publish a revert patch.

> But I think the failure is benign because the allocation is
> not necessary for these resources.

This is an interesting idea. Could you elaborate? As far
as I can tell, the kernel always tries to allocate memory
for expansion ROMs which it also exports to user level.
I have assumed that some drivers or user level apps may
need to access this space. Is this not true?

If this is due to my change, sorry for the trouble.

Thanks,
Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[email protected]
http://www.ibm.com/linux/ltc

2007-12-04 23:26:21

by Junichi Nomura

Subject: Re: [PATCH] pci: Omit error message for benign allocation failure

Hi Gary,

Gary Hade wrote:
> On Tue, Dec 04, 2007 at 02:35:48PM -0500, Jun'ichi Nomura wrote:
>> On a system with PCI-to-PCI bridges, following errors are observed:
>>
>> PCI: Failed to allocate mem resource #8:100000@d8200000 for 0000:02:00.0
>> PCI: Failed to allocate mem resource #6:10000@0 for 0000:03:01.0
>>
>> '#6' is for expansion ROM and '#8' for the bridge where the device
>> with the expansion ROM is connected.
>
> I believe there is a good chance that may be another instance
> of the regression caused by my "Avoid creating P2P prefetch
> window for expansion ROMs" patch that was recently reported by
> Jan Beulich.
> http://marc.info/?l=linux-kernel&m=119555581103023&w=2

Sorry, I was not clear about that.
I've seen the above errors even with 2.6.22 and RHEL5, which is
based on 2.6.18.
So it's not a regression.

> I am working on a better fix for the problem that the patch
> was attempting to address but this is turning out to be much
> more difficult than I expected. If I don't have a solution
> very soon I plan to publish a revert patch.
>
>> But I think the failure is benign because the allocation is
>> not necessary for these resources.
>
> This is an interesting idea. Could you elaborate? As far
> as I can tell, the kernel always tries to allocate memory
> for expansion ROMs which it also exports to user level.

The kernel always tries to, but only on a best-effort basis, IMO.
(Maybe your patch is going to fix that?)
In the 1st stage (pcibios_allocate_resources, etc.),
it allocates resources based on the PCI configuration provided by the BIOS,
except for expansion ROMs.
In the 2nd stage (pci_assign_unassigned_resources, etc.),
it allocates the resources that are still unallocated, including expansion ROMs.
The 2nd stage doesn't reprogram the bridge settings.
So if the expansion ROM is under a bridge where other resources
are already allocated, the allocation fails.
There are comments in drivers/pci/setup-bus.c:
/* Helper function for sizing routines: find first available
bus resource of a given type. Note: we intentionally skip
the bus resources which have already been assigned (that is,
have non-NULL parent resource). */

Fixing the above might be a better solution.
But I don't know if there are any users who need it.
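
As a rough illustration of the two stages described above, here is a toy
model (not kernel code) of a bridge window whose range is fixed once the
BIOS has programmed it. The struct, the function name and the bump
allocation strategy are all assumptions made for this sketch; only the
"window is never grown" behavior comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A fixed address range, standing in for a BIOS-programmed bridge window. */
struct window {
	uint64_t start;	/* first address in the window */
	uint64_t end;	/* last address in the window (inclusive) */
	uint64_t next;	/* first address not yet handed out */
};

/*
 * Carve `size` bytes (a power of two, naturally aligned as PCI BARs are)
 * out of the window. The window itself is never enlarged, mirroring
 * "the 2nd stage doesn't reprogram the bridge settings".
 */
bool window_alloc(struct window *w, uint64_t size, uint64_t *addr)
{
	uint64_t aligned;

	assert(size != 0 && (size & (size - 1)) == 0);
	aligned = (w->next + size - 1) & ~(size - 1);

	if (aligned < w->next || aligned > w->end || w->end - aligned < size - 1)
		return false;	/* no room left under this bridge */

	*addr = aligned;
	w->next = aligned + size;
	return true;
}
```

With a hypothetical 1 MB window that a BIOS-assigned BAR already fills, a
later request for a 0x10000-byte expansion ROM has nowhere to go and fails,
which is the situation the error message reports.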

> I have assumed that some drivers or user level apps may
> need to access this space. Is this not true?

I haven't heard of applications that depend on the kernel's resource
allocation for expansion ROMs.
X is a big user of expansion ROMs, but I heard it resolves the resource
conflicts itself.
If there is a counterexample, I would like to know about it.

Thanks,
--
Jun'ichi Nomura, NEC Corporation of America

2007-12-05 01:45:28

by Gary Hade

Subject: Re: [PATCH] pci: Omit error message for benign allocation failure

On Tue, Dec 04, 2007 at 06:23:32PM -0500, Jun'ichi Nomura wrote:
> Hi Gary,
>
> Gary Hade wrote:
> > On Tue, Dec 04, 2007 at 02:35:48PM -0500, Jun'ichi Nomura wrote:
> >> On a system with PCI-to-PCI bridges, following errors are observed:
> >>
> >> PCI: Failed to allocate mem resource #8:100000@d8200000 for 0000:02:00.0
> >> PCI: Failed to allocate mem resource #6:10000@0 for 0000:03:01.0
> >>
> >> '#6' is for expansion ROM and '#8' for the bridge where the device
> >> with the expansion ROM is connected.
> >
> > I believe there is a good chance that may be another instance
> > of the regression caused by my "Avoid creating P2P prefetch
> > window for expansion ROMs" patch that was recently reported by
> > Jan Beulich.
> > http://marc.info/?l=linux-kernel&m=119555581103023&w=2
>
> Sorry, I was not clear about that.
> I've seen the above errors even with 2.6.22 and RHEL5, which is
> based on 2.6.18.
> So it's not a regression.

That's a relief. Jan's report was embarrassing enough. :)

>
> > I am working on a better fix for the problem that the patch
> > was attempting to address but this is turning out to be much
> > more difficult than I expected. If I don't have a solution
> > very soon I plan to publish a revert patch.
> >
> >> But I think the failure is benign because the allocation is
> >> not necessary for these resources.
> >
> > This is an interesting idea. Could you elaborate? As far
> > as I can tell, the kernel always tries to allocate memory
> > for expansion ROMs which it also exports to user level.
>
> Kernel always tries to. But it's best effort basis, IMO.
> (Maybe your patch is going to fix that?)

If you are seeing the allocation failures both with and
without my original patch, I doubt that an improved version
would help. In my case, the BIOS had set aside sufficient
space for the expansion ROMs but was expecting the kernel
to take it from the non-prefetch window instead of the prefetch window.
In your case, it sounds like there just isn't enough space
available for expansion ROMs in either window.

> In the 1st stage (pcibios_allocate_resources, etc.),
> it allocates resources based on PCI configuration provided by BIOS,
> except for expansion ROMs.
> In the 2nd stage (pci_assign_unassigned_resources, etc.),
> it allocates resources for unallocated ones including expansion ROMs.
> The 2nd stage doesn't reprogram the bridge settings.
> So if the expansion ROM is under the bridge where other resource
> is already allocated, the allocation failure occurs.
> There are comments in drivers/pci/setup-bus.c:
> /* Helper function for sizing routines: find first available
> bus resource of a given type. Note: we intentionally skip
> the bus resources which have already been assigned (that is,
> have non-NULL parent resource). */
>
> Fixing the above might be a better solution.

Like I said above, I doubt that an improved version of my fix
would help your situation.

> But I don't know if there are any users who need it.
>
> > I have assumed that some drivers or user level apps may
> > need to access this space. Is this not true?
>
> I haven't heard of applications which depend on the kernel resource
> allocation for expansion ROMs.
> X is a big user of expansion ROMs but I heard it solves the resource
> conflict itself.
> If there is a counter example, I would like to know it.

Me too.

Thanks,
Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[email protected]
http://www.ibm.com/linux/ltc

2007-12-05 15:19:11

by Junichi Nomura

Subject: Re: [PATCH] pci: Omit error message for benign allocation failure

Gary Hade wrote:
> On Tue, Dec 04, 2007 at 06:23:32PM -0500, Jun'ichi Nomura wrote:
>> Kernel always tries to. But it's best effort basis, IMO.
>> (Maybe your patch is going to fix that?)
>
> If you are seeing the allocation failures both with and
> without my original patch I doubt that an improved version
> would work. In my case, the BIOS had allowed sufficient
> resource for the expansion ROMs but was expecting the kernel
> to get it from the non-prefetch instead of prefetch window.

OK. Then it's different from mine.
Thanks for the info.

Regards,
--
Jun'ichi Nomura, NEC Corporation of America

2007-12-05 22:48:09

by Gary Hade

Subject: Re: [PATCH] pci: Omit error message for benign allocation failure

On Wed, Dec 05, 2007 at 10:18:24AM -0500, Jun'ichi Nomura wrote:
> Gary Hade wrote:
> > On Tue, Dec 04, 2007 at 06:23:32PM -0500, Jun'ichi Nomura wrote:
> >> Kernel always tries to. But it's best effort basis, IMO.
> >> (Maybe your patch is going to fix that?)
> >
> > If you are seeing the allocation failures both with and
> > without my original patch I doubt that an improved version
> > would work. In my case, the BIOS had allowed sufficient
> > resource for the expansion ROMs but was expecting the kernel
> > to get it from the non-prefetch instead of prefetch window.
>
> OK. Then it's different from mine.
> Thanks for the info.

No problem. Since our goals (eliminating confusing PCI memory
allocation failure messages) are the same, and since your patch
would partially (see below) fix the issue that my patch was
attempting to address, I am _very_ interested to see how others
who know more about expansion ROMs than I do react to
your proposal. In fact, if what you are saying is correct,
I'm wondering why the kernel even needs to attempt (by default)
to obtain space for expansion ROMs that on some systems could
be better utilized elsewhere. Perhaps the default behavior
could be changed to skip the expansion ROM allocation
attempts, with a new kernel option added to enable the current
behavior for those who might want it. I think this would
solve both of our problems.

I will now bore you with my story and how it relates
to your change.

The issue that my patch was attempting to address involves
a PCIe adapter that contains a p2p bridge above a SCSI storage
controller:
[root@elm3a9 ~]# lspci -s 0b:00.0
0b:00.0 PCI bridge: PLX Technology, Inc. PEX 8114 PCI Express-to-PCI/PCI-X Bridge (rev bc)
[root@elm3a9 ~]# lspci -ts 0b:00.0
-+-[0000:0b]---00.0-[0000:0c]--
 \-[0000:00]-
[root@elm3a9 ~]# lspci -s 0c:04.0
0c:04.0 SCSI storage controller: Adaptec ASC-29320ALP U320 (rev 10)

Case 1: (non-hotplug)
Without my patch and in non-hotplug context (adapter
installed at boot time), we see an allocation failure for
the prefetch window that was created because the BIOS left
the expansion ROM BAR on the SCSI controller unassigned:
PCI: Failed to allocate mem resource #9:100000@f2800000 for 0000:0b:00.0
This happens because the BIOS allowed space in the
non-prefetch window for the expansion ROM and provided
no extra space for the unexpected kernel-created prefetch
window.

Case 2: (hotplug)
Without my patch and in hotplug context (same adapter
hotplugged into the same slot that it was in at boot time),
we see an allocation failure for BAR 0 of the bridge:
PCI: Failed to allocate mem resource #0:2000@f2800000 for 0000:0b:00.0
This happens because the allocations for both the
non-prefetch window and the expansion-ROM-motivated prefetch
window succeeded, consuming all of the memory that the BIOS
had provided for the adapter. This left no memory for
bridge BAR 0.

Your patch eliminates Case 1 by hiding the prefetch
window allocation failure. Case 2 is unfortunately
still alive and well because the expansion-ROM-motivated
prefetch window still consumes the memory that the
BIOS provided for bridge BAR 0.
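
The two cases can be read as the same fixed budget being spent in a
different order. The sketch below is a simplified accounting model, not
kernel code: the 0x2000 and 0x100000 sizes come from the log lines above,
while the slot budget, the non-prefetch window size and the allocation
orders are assumptions chosen for illustration, and alignment is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define BRIDGE_BAR0_SIZE    0x2000ULL	/* from the Case 2 log line */
#define PREF_WINDOW_SIZE    0x100000ULL	/* from the Case 1 log line */
#define NONPREF_WINDOW_SIZE 0x100000ULL	/* hypothetical */
#define SLOT_BUDGET         0x200000ULL	/* hypothetical BIOS allowance */

/* Take `size` bytes from `*remaining` if they are still available. */
bool take(uint64_t *remaining, uint64_t size)
{
	if (size > *remaining)
		return false;
	*remaining -= size;
	return true;
}

/* Case 1: BAR 0 and the non-prefetch window are placed first. */
bool case1_prefetch_window_fits(void)
{
	uint64_t remaining = SLOT_BUDGET;

	take(&remaining, BRIDGE_BAR0_SIZE);
	take(&remaining, NONPREF_WINDOW_SIZE);
	return take(&remaining, PREF_WINDOW_SIZE);
}

/* Case 2: the window(s) are placed first; bridge BAR 0 comes last. */
bool case2_bar0_fits(bool with_prefetch_window)
{
	uint64_t remaining = SLOT_BUDGET;

	take(&remaining, NONPREF_WINDOW_SIZE);
	if (with_prefetch_window)
		take(&remaining, PREF_WINDOW_SIZE);
	return take(&remaining, BRIDGE_BAR0_SIZE);
}
```

In this model the prefetch window is the loser in Case 1 and BAR 0 is the
loser in Case 2; if no prefetch window is created, BAR 0 fits.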

After seeing Jan's report it became obvious that expansion
ROMs cannot always be directed to a single type (prefetch
or non-prefetch) of window. I know of no direct way of
determining where the BIOS expects expansion ROMs to land
so some sort of intelligent choice based on other information
needs to be made. This is what I have been struggling with.
I can handle a simple case where expansion ROM(s) are
directed to the non-prefetch window if the calculated
non-prefetch window size both with and without expansion
ROM(s) is the same. I haven't yet figured out how to handle
cases where the with/without expansion ROM non-prefetch
window sizes differ. In these cases the BIOS's intention
does not appear to be clear. This is why I really like the
"skip the default expansion ROM allocation attempts" idea
that your proposal has spawned. :)
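
The "simple case" test described above might look roughly like the
following. This is a hedged sketch only: the function names are
hypothetical, and window sizing is reduced to "sum of the BAR sizes
rounded up to the 1 MB granularity of a P2P bridge memory window", which
ignores the alignment handling that real sizing code has to do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_GRANULE 0x100000ULL	/* bridge memory windows are 1 MB granular */

/* Add up a list of resource sizes. */
uint64_t sum_sizes(const uint64_t *sizes, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += sizes[i];
	return total;
}

/* Round a byte count up to the next whole window granule. */
uint64_t round_up_window(uint64_t total)
{
	return (total + WINDOW_GRANULE - 1) & ~(WINDOW_GRANULE - 1);
}

/*
 * True when the non-prefetch window has the same size with and without
 * the expansion ROMs, i.e. the ROMs fit into the rounding slack.
 */
bool roms_go_to_nonpref(const uint64_t *bars, size_t nbars,
			const uint64_t *roms, size_t nroms)
{
	uint64_t bar_total = sum_sizes(bars, nbars);
	uint64_t rom_total = sum_sizes(roms, nroms);

	return round_up_window(bar_total) ==
	       round_up_window(bar_total + rom_total);
}
```

When the function returns false, the two window sizes differ, which is the
unresolved case where the BIOS's intention is unclear.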

Thanks for your excellent idea.

Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[email protected]
http://www.ibm.com/linux/ltc