2006-10-06 05:50:26

by Shawn Starr

Subject: [2.6.19-rc1][AGP] Regression - amd_k7_agp no longer detected

Hello,

I noticed that my AGP chipset is no longer detected with 2.6.18/2.6.19-rc1.
The last kernel known to work was 2.6.15-rc5 (see end of email).

I checked the BIOS and the settings are fine. The aperture is manually set to 128MB; the video card has only 64MB of video RAM.

Here's the information:

The motherboard is an A7M266-D board from Asus.

AGP/PCI information:
===================
00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge

PCI IDs:
=======
00:00.0 0600: 1022:700c (rev 11)
00:01.0 0604: 1022:700d

Problem:
========
When loading amd_k7_agp, nothing appears from the kernel: no information about the AGP chipset, aperture size, etc.
Even printks placed inside the driver's probe() function never fire. So it seems the agpgart core is aborting silently.
In the chipset-specific AGP driver, the bridge's PCI ID is matched, but I see no further checks being done.

When the X Window System starts I get the following:

[ 1084.678461] Linux agpgart interface v0.101 (c) Dave Jones
[ 1301.755691] [drm] Initialized drm 1.0.1 20051102
[ 1301.761525] ACPI: PCI Interrupt 0000:01:05.0[A] -> GSI 16 (level, low) -> IRQ 20
[ 1301.762563] [drm] Initialized radeon 1.25.0 20060524 on minor 0
[ 1303.005775] [drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held
[ 1303.005988] [drm:drm_unlock] *ERROR* Process 5215 using kernel context 0

From the Xorg log:

(WW) RADEON(0): [agp] AGP not available
(EE) RADEON(0): [agp] AGP failed to initialize. Disabling the DRI.
(II) RADEON(0): [agp] You may want to make sure the agpgart kernel module is loaded before the radeon kernel module.

What should be expected
From 2.6.15-rc5:

[ 62.814823] Linux agpgart interface v0.101 (c) Dave Jones
[ 62.841388] agpgart: Detected AMD 760MP chipset
[ 62.870371] agpgart: AGP aperture is 128M @ 0xf0000000

Looking at the differences, I noticed some changes in generic.c for determining the AGP speed. I don't know
if this has anything to do with this breaking. This video card is a Radeon 7500 AiW 64MB DDR and can do
AGP4x and BIOS has AGP4x turned on by default. But this all would fail even before X is started if agpgart finds no chipset.

Thanks,
Shawn.


2006-10-06 06:08:09

by Dave Jones

Subject: Re: [2.6.19-rc1][AGP] Regression - amd_k7_agp no longer detected

On Fri, Oct 06, 2006 at 01:50:19AM -0400, Shawn Starr wrote:

> When loading amd_k7_agp nothing appears from kernel, no information about the AGP chipset/aperture size etc.
> Even putting printks inside the probe() function of the driver does not get called.

Even as the first thing in agp_amdk7_probe() ?
What is pci_register_driver returning ?

> So it seems core agpgart is aborting silently.

When we modprobe the chipset driver, and run through the ->probe, it's all pci layer
stuff really, up until we agp_alloc_bridge(). But if you're not getting that far,
the core agpgart stuff doesn't even come into play.

> In the specific agp chipset driver, the PCI ID bridge is matched but I see no further checks being done.
>
> When the X Window System starts I get the following:
>
> [ 1084.678461] Linux agpgart interface v0.101 (c) Dave Jones
> [ 1301.755691] [drm] Initialized drm 1.0.1 20051102
> [ 1301.761525] ACPI: PCI Interrupt 0000:01:05.0[A] -> GSI 16 (level, low) -> IRQ 20
> [ 1301.762563] [drm] Initialized radeon 1.25.0 20060524 on minor 0
> [ 1303.005775] [drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held
> [ 1303.005988] [drm:drm_unlock] *ERROR* Process 5215 using kernel context 0

This is the behaviour I'd expect if agp-amdk7.ko hadn't been loaded, but agpgart.ko had.

It's something of a mystery to me, as that driver hasn't changed in ages aside
from spelling fixes and other trivialities.

> What should be expected
> From 2.6.15-rc5:

Damn, that's going back a bit..
But again, this driver hasn't really changed much since 2.5.x, so I'm
wondering if this is a side-effect of some change in another subsystem.
Can you narrow it down to a specific kernel version where it broke ?
2.6.15 -> 2.6.18 is such a huge delta it's not even worth looking at.
Narrow the scope, and I'll eyeball the pci changes etc.

> [ 62.814823] Linux agpgart interface v0.101 (c) Dave Jones
> [ 62.841388] agpgart: Detected AMD 760MP chipset
> [ 62.870371] agpgart: AGP aperture is 128M @ 0xf0000000

I don't have any AMD hardware to test any more, so I've no chance of
trying to reproduce this. All I can suggest is to try and narrow
down where it's failing, and then maybe I'll have enough clues to hazard
a guess at the cause.

> Looking at the differences, I noticed some changes in generic.c for determining the AGP speed. I don't know
> if this has anything to do with this breaking. This video card is a Radeon 7500 AiW 64MB DDR and can do
> AGP4x and BIOS has AGP4x turned on by default. But this all would fail even before X is started if agpgart finds no chipset.

That code runs later when /dev/agpgart is open()'d, so it shouldn't
affect this. It shouldn't be hard to revert though if you want to
try it. Also, that only changed the AGPx8 path, which no K7 chipsets can do.
If you ended up running that code, something is deeply screwed.

Dave

--
http://www.codemonkey.org.uk

2006-10-06 06:59:58

by Shawn Starr

Subject: Re: [2.6.19-rc1][AGP] Regression - amd_k7_agp no longer detected

On Friday 06 October 2006 2:08 am, Dave Jones wrote:
> On Fri, Oct 06, 2006 at 01:50:19AM -0400, Shawn Starr wrote:
> > When loading amd_k7_agp nothing appears from kernel, no information
> > about the AGP chipset/aperture size etc. Even putting printks inside
> > the probe() function of the driver does not get called.
>
> Even as the first thing in agp_amdk7_probe() ?

Nada, nothing appears even if I put a printk before we do any actual probing.

> What is pci_register_driver returning ?

Don't know yet. I haven't walked through agp_amdk7_init() and dumped out the
values yet.

> When we modprobe the chipset driver, and run through the ->probe, it's all
> pci layer stuff really, up until we agp_alloc_bridge(). But if you're not
> getting that far, the core agpgart stuff doesn't even come into play.

> It's something of a mystery to me as that driver hasn't changed in ages
> aside from spelling fixes and other trivialities.

It does appear to be PCI. Yes, I don't see any significant changes in the agp
code (other than the one mentioned below)

> Damn, that's going back a bit..
> But again, this driver hasn't really changed much since 2.5.x, so I'm
> wondering if this is a side-effect of some change in another subsystem.
> Can you narrow it down to a specific kernel version where it broke ?
> 2.6.15 -> 2.6.18 is such a huge delta it's not even worth looking at.
> Narrow the scope, and I'll eyeball the pci changes etc.

> I don't have any AMD hardware to test any more, so I've no chance of
> trying to reproduce this. All I can suggest is to try and narrow
> down where it's failing, and then maybe I'll have enough clues to hazard
> a guess at the cause.

I'm going to do some git bisect fun (best time to learn how to do it) and
narrow this down later when I get back from work. We should find the culprit
later today.

> > Looking at the differences, I noticed some changes in generic.c for
> > determining the AGP speed. I don't know if this has anything to do with
> > this breaking. This video card is a Radeon 7500 AiW 64MB DDR and can do
> > AGP4x and BIOS has AGP4x turned on by default. But this all would fail
> > even before X is started if agpgart finds no chipset.
>
> That code runs later when /dev/agpgart is open()'d, so it shouldn't
> affect this. It shouldn't be hard to revert though if you want to
> try it. Also, that only changed the AGPx8 path, which no K7 chipsets can
> do. If you ended up running that code, something is deeply screwed.

> Dave

I can certainly do a quick debug on that later today to confirm whether or not
it is hitting that code path.

Thanks Dave. I'll provide you more info once I narrow things down a bit.

Shawn.

2006-12-11 02:07:24

by Kevin Puetz

Subject: Re: [2.6.19-rc1][AGP] Regression - amd_k7_agp no longer detected

Shawn Starr <shawn.starr <at> rogers.com> writes:

>
> On Friday 06 October 2006 2:08 am, Dave Jones wrote:
> > On Fri, Oct 06, 2006 at 01:50:19AM -0400, Shawn Starr wrote:
> > > When loading amd_k7_agp nothing appears from kernel, no information
> > > about the AGP chipset/aperture size etc. Even putting printks inside
> > > the probe() function of the driver does not get called.
> >
> > Even as the first thing in agp_amdk7_probe() ?
> ... (http://thread.gmane.org/gmane.linux.kernel/453869)

I'm hitting this problem too, and as it's still present in the 2.6.19 final, I'm
assuming you never got enough information to chase it. I found the following
note in the debian BTS that seems relevant:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=363682

I can confirm that if I remove the amd76x_edac module and reload amd_k7_agp, it
detects the aperture. If I then reload radeon.ko and X, I get DRI (and AIGLX).
So hopefully that's a lead to what might have changed...
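
One way to check which driver currently holds the host bridge is to read the
"driver" symlink in the device's sysfs directory. The following is only a
minimal sketch in C: the helper name bound_driver() is made up for this
example, and the device address 0000:00:00.0 is an assumption taken from the
lspci output earlier in the thread (domain 0000 assumed).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the name of the driver bound to a PCI device
 * into out. dev_dir is the device's sysfs directory, for example
 * "/sys/bus/pci/devices/0000:00:00.0" (assumed address).
 * Returns 0 on success, -1 if no driver is bound or the link is unreadable.
 */
int bound_driver(const char *dev_dir, char *out, size_t out_len)
{
    char link[512];
    char target[512];
    const char *base;
    ssize_t n;
    int len;

    assert(dev_dir != NULL && out != NULL && out_len > 0);

    len = snprintf(link, sizeof link, "%s/driver", dev_dir);
    if (len < 0 || (size_t)len >= sizeof link)
        return -1;                      /* path too long */

    n = readlink(link, target, sizeof target - 1);
    if (n < 0)
        return -1;                      /* no "driver" link: device unclaimed */
    target[n] = '\0';

    /* The link points at .../drivers/<name>; keep only the last component. */
    base = strrchr(target, '/');
    base = (base != NULL) ? base + 1 : target;
    snprintf(out, out_len, "%s", base);
    return 0;
}
```

Calling bound_driver("/sys/bus/pci/devices/0000:00:00.0", buf, sizeof buf)
on an affected machine should name whichever module owns the bridge, or
return -1 if nothing has claimed it.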

2006-12-12 18:06:53

by Dave Jones

Subject: Re: [2.6.19-rc1][AGP] Regression - amd_k7_agp no longer detected

On Mon, Dec 11, 2006 at 02:06:19AM +0000, Kevin Puetz wrote:
> Shawn Starr <shawn.starr <at> rogers.com> writes:
>
> >
> > On Friday 06 October 2006 2:08 am, Dave Jones wrote:
> > > On Fri, Oct 06, 2006 at 01:50:19AM -0400, Shawn Starr wrote:
> > > > When loading amd_k7_agp nothing appears from kernel, no information
> > > > about the AGP chipset/aperture size etc. Even putting printks inside
> > > > the probe() function of the driver does not get called.
> > >
> > > Even as the first thing in agp_amdk7_probe() ?
> > ... (http://thread.gmane.org/gmane.linux.kernel/453869)
>
> I'm hitting this problem too, and as it's still present in the 2.6.19 final, I'm
> assuming you never got enough information to chase it. I found the following
> note in the debian BTS that seems relevant:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=363682
>
> I can confirm that if I remove the amd76x_edac module and reload amd_k7_agp, it
> detects the aperture. If I then reload radeon.ko and X, I get DRI (and AIGLX).
> So hopefully that's a lead to what might have changed...

This is increasingly becoming a problem. For cases where we have >1 driver
trying to 'own' a single PCI ID, the first to init generally wins.

Similar problems exist with intel agp vs edac, intel agp vs intel watchdog,
matroxfb vs W1, and probably more..

What's needed is a governing module that claims the device and arbitrates
for drivers 'below' it.

Dave

--
http://www.codemonkey.org.uk
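
The "first to init wins" behaviour described above can be illustrated with a
small userspace model. This is a toy sketch, not kernel code: the toy_*
types and functions are invented for the example, and the only details taken
from the thread are the PCI ID 1022:700c and the two module names. It shows
why a printk at the top of probe() never fires for the second driver: once
the device is claimed, probe() is not called at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of PCI driver binding. All names here are illustrative. */
struct toy_driver {
    const char *name;
    uint16_t vendor, device;    /* the single PCI ID this driver matches */
    int probe_calls;            /* how many times probe() would have run */
};

struct toy_device {
    uint16_t vendor, device;
    const struct toy_driver *bound;   /* NULL while the device is unclaimed */
};

/*
 * Offer dev to drv. probe() only runs when the ID matches and no other
 * driver already owns the device. Returns 1 if drv ends up bound.
 */
int toy_register(struct toy_driver *drv, struct toy_device *dev)
{
    assert(drv != NULL && dev != NULL);

    if (dev->vendor != drv->vendor || dev->device != drv->device)
        return 0;               /* ID mismatch */
    if (dev->bound != NULL)
        return 0;               /* already claimed: probe() is skipped */

    drv->probe_calls++;         /* stands in for calling drv->probe() */
    dev->bound = drv;
    return 1;
}

/* Release the device if drv owns it (models unloading the module). */
void toy_unregister(struct toy_driver *drv, struct toy_device *dev)
{
    assert(drv != NULL && dev != NULL);

    if (dev->bound == drv)
        dev->bound = NULL;
}
```

Registering a model of amd76x_edac first and amd_k7_agp second against a
1022:700c device leaves the second driver with probe_calls == 0; after
toy_unregister() on the first, registering the second again succeeds, which
matches the rmmod-and-reload workaround reported in the thread.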