2005-02-22 03:13:35

by Benjamin Herrenschmidt

Subject: POSTing of video cards (WAS: Solo Xgl..)


> Ben, since I'm not getting any help on LKML maybe you can answer this.
> Secondary cards need a reset. After looking at a bunch of fbdev drivers
> their code assumes the card has been reset when their probe() function
> runs. So this means that we have to run the VBIOS reset before probe
> is called.

I'm putting back LKML on CC since I intended to reply to your post there
once I got a bit of time.

No, we can't really do that _before_ probe is called. It's the
responsibility of the driver to do what it needs at probe time or later.
Some drivers need that reset (and not that many in fact), some don't.

We may provide a "helper" to use from the probe() for that purpose, to
make things easier.

Whether the reset code is kernel based or userland based, I would avoid
having it run synchronously anyway. If a driver detects that its card
hasn't been properly initialized by the firmware (and the driver should
be able to detect that), I suggest its probe routine call the
appropriate helper, providing it with ways to get to the ROM (in some
cases, the same helper will be needed for resume from sleep, and the ROM
may not be the PCI BAR one, but the memory shadow, though that will not
always work afaik).

Look at the firmware download helper, that's very similar. I want an
asynchronous interface though (that is you get a callback when the reset
is complete or timed out) rather than synchronous since it's wrong to
synchronously rely on userland being available (power management,
pre-root mount, etc...)
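A minimal sketch of what such an asynchronous interface could look like,
modelled in plain userspace C. Every name here (post_request,
post_request_start, post_request_reply, post_request_tick, demo_driver) is
hypothetical and not an existing kernel API; time is an arbitrary tick
counter supplied by the caller. The point is only the shape: starting a
request returns immediately, and the driver's callback fires exactly once,
either when the helper replies or when the deadline passes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: a userspace model of an asynchronous
 * POST helper, not an existing kernel interface. */
enum post_status { POST_PENDING, POST_OK, POST_FAILED, POST_TIMED_OUT };

struct post_request;
typedef void (*post_done_fn)(struct post_request *req, enum post_status status);

struct post_request {
    post_done_fn done;        /* invoked exactly once per request */
    void *driver_data;        /* opaque pointer handed back to the driver */
    unsigned long deadline;   /* absolute time in arbitrary ticks */
    enum post_status status;
};

/* Record the callback and deadline, then return without waiting. */
static void post_request_start(struct post_request *req, post_done_fn done,
                               void *driver_data, unsigned long now,
                               unsigned long timeout)
{
    assert(req != NULL && done != NULL);
    req->done = done;
    req->driver_data = driver_data;
    req->deadline = now + timeout;
    req->status = POST_PENDING;
}

/* Complete a pending request; later completions are ignored. */
static void post_request_finish(struct post_request *req, enum post_status status)
{
    if (req->status != POST_PENDING)
        return;
    req->status = status;
    req->done(req, status);
}

/* Called when the helper (userland or in-kernel) reports its result. */
static void post_request_reply(struct post_request *req, int ok)
{
    post_request_finish(req, ok ? POST_OK : POST_FAILED);
}

/* Called periodically; times the request out once the deadline is reached. */
static void post_request_tick(struct post_request *req, unsigned long now)
{
    if (now >= req->deadline)
        post_request_finish(req, POST_TIMED_OUT);
}

/* Example driver-side state and callback, for illustration only. */
struct demo_driver {
    int callbacks;
    enum post_status last;
};

static void demo_post_done(struct post_request *req, enum post_status status)
{
    struct demo_driver *drv = req->driver_data;
    drv->callbacks++;
    drv->last = status;
}
```

A driver would call post_request_start() from probe() or resume, return,
and continue bringing the card up from demo_post_done(). Whether the reply
comes from userland or from an in-kernel emulator is invisible to the
driver in this model.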

> So where can I hook up the call to run the VBIOS up in the kernel? You
> can't trigger it on module load since the module may support multiple
> identical adapters. One adapter may already have the module loaded and
> then a second shows up via hotplug. In this case the module won't get
> loaded again and the second card doesn't get reset.
>
> If using a user space reset program what do you do if the user space
> program is missing or does not complete running? Somehow you have to
> stop the probe function from being called.

That's ok. We deal with that in the firmware loader code already. Just
timeout or check for errors from call_usermodehelper. You basically run
the user program and wait for it to write a "reply" via sysfs. In fact,
the existing firmware loading facility could be re-used.

> Another case, you have a card and load the module for it, this causes
> reset. Now unload the module and load it again. This probably should
> not reset the card a second time. You also have to make sure you don't
> reset the primary card.

It's up to each driver to detect whether its card needs to be POSTed or
not. Anything else would mean infinite breakage.

> One solution is to track in pci_dev if the card has been reset. This
> preserves the state across module load/unload. I'm then tempted to put
> an in-kernel emu86 (I have a 40K one) into the pci driver. PCI would
> use this to reset the card before calling probe(). If the VBIOS/emu86
> has an error it simply wouldn't call the probe function. Doing this
> in-kernel makes everything synchronous but GregKH would probably have
> some choice words about the emulator in the PCI driver.

No, again, it's up to the driver to decide whether it needs to POST or
not (I prefer that to "reset"). I have no preference for the emulator to
be in-kernel or userland. I suppose it's easier in userland, just
re-using the existing infrastructure for firmware loading.
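A rough illustration of leaving that decision to the driver, as a userspace
C model. The names (demo_card, mem_ctrl, demo_card_needs_post, demo_probe)
are hypothetical, and the "register is still zero" test is an invented
stand-in for whatever chip-specific check a real driver would use. Because
the check looks at hardware state, unloading and reloading the module does
not trigger a second POST, and no tracking bit is needed outside the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical card model: one fake register standing in for MMIO state. */
struct demo_card {
    uint32_t mem_ctrl;    /* invented register that POST would program */
    int post_requests;    /* how many times the driver asked for a POST */
    int ready;            /* driver finished bringing the card up */
};

/* Chip-specific check (assumption: an unprogrammed register reads zero). */
static int demo_card_needs_post(const struct demo_card *card)
{
    return card->mem_ctrl == 0;
}

/* Probe: the driver itself decides whether the POST helper is needed.
 * Returns 1 if the card is ready, 0 if a POST has been requested. */
static int demo_probe(struct demo_card *card)
{
    assert(card != NULL);
    card->ready = 0;
    if (demo_card_needs_post(card)) {
        card->post_requests++;   /* would kick off the POST helper here */
        return 0;
    }
    card->ready = 1;
    return 1;
}
```

A card the firmware already initialized (the primary card, or a secondary
one on a platform that POSTs everything) goes straight to ready.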

> I am leaning toward putting this into the PCI driver. At boot the PCI
> driver would reset any cards it finds. The PCI driver also implements
> hotplug so now I have a place to do reset before calling probe in this
> case. Doing it in-kernel fixes the synchronization problem. Right now
> there is no way to suspend calling the probe function while we wait
> for a hotplug event to finish.
>
> I have all of the pieces needed to build this. I just can't figure out
> where to hook it into the kernel. Worst case is that I have to go
> modify 75 framebuffer drivers to explicitly support reset.

No. You don't. A lot of them simply don't care. Just adapt radeonfb, and
maybe the other ATIs and rivafb, period. If somebody wants to adapt to
your facility, it's up to that person to adapt the framebuffer of their
dream. You provide infrastructure that _adds_ a functionality not
previously present. You don't need to implement it in all drivers
yourself, just do it in a few that matter, and let people who want the
feature catch up as long as "old" drivers don't _lose_ functionality.

Also, a lot of those FB's are embedded things, or ppc/pmac things,
etc... and they simply don't fit into your scheme anyway (and mostly
don't have the problem in the first place).

Cheers,
Ben.



2005-02-22 04:42:14

by Jon Smirl

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 22 Feb 2005 14:12:48 +1100, Benjamin Herrenschmidt
<[email protected]> wrote:
> It's up to each driver to detect whether its card needs to be POSTed or
> not. Anything else would mean infinite breakage.

Your approach is that it is a per driver problem. I was taking a
different tack and looking at it as a BIOS deficiency that should be
compensated for. There is already code in the kernel for identifying
the boot video device. I was working on the assumption that all PCI
based, VGA class hardware that is not the boot device needs to be
posted. And that the posting should occur before the drivers are
loaded. In other words the BIOS should have provided initialized
hardware but since it didn't we can apply a fixup in the PCI driver. I
also suspect there may be SCSI disk controller cards that need the
same procedure.

I have no strong opinions on how to fix the post problem, I just want
to make sure the problem is fully discussed by the relevant people and
a consensus solution is achieved. I'm not sure that all of the core
kernel developers that might be impacted by this have considered all
of the options. I would like to try and get a consensus design and
avoid reimplementing everything ten times.

--
Jon Smirl
[email protected]

2005-02-22 04:56:17

by Alex Deucher

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 22 Feb 2005 14:12:48 +1100, Benjamin Herrenschmidt
<[email protected]> wrote:
>
> > Ben, since I'm not getting any help on LKML maybe you can answer this.
> > Secondary cards need a reset. After looking at a bunch of fbdev drivers
> > their code assumes the card has been reset when their probe() function
> > runs. So this means that we have to run the VBIOS reset before probe
> > is called.
>
> I'm putting back LKML on CC since I intended to reply to your post there
> once I got a bit of time.
>
> No, we can't really do that _before_ probe is called. It's the
> responsibility of the driver to do what it needs at probe time or later.
> Some drivers need that reset (and not that many in fact), some don't.
>
> We may provide a "helper" to use from the probe() for that purpose, to
> make things easier.
>
> Whether the reset code is kernel based or userland based, I would avoid
> having it run synchronously anyway. If a driver detects that its card
> hasn't been properly initialized by the firmware (and the driver should
> be able to detect that), I suggest its probe routine call the
> appropriate helper, providing it with ways to get to the ROM (in some
> cases, the same helper will be needed for resume from sleep, and the ROM
> may not be the PCI BAR one, but the memory shadow, though that will not
> always work afaik).
>
> Look at the firmware download helper, that's very similar. I want an
> asynchronous interface though (that is you get a callback when the reset
> is complete or timed out) rather than synchronous since it's wrong to
> synchronously rely on userland being available (power management,
> pre-root mount, etc...)
>
> > So where can I hook up the call to run the VBIOS up in the kernel? You
> > can't trigger it on module load since the module may support multiple
> > identical adapters. One adapter may already have the module loaded and
> > then a second shows up via hotplug. In this case the module won't get
> > loaded again and the second card doesn't get reset.
> >
> > If using a user space reset program what do you do if the user space
> > program is missing or does not complete running? Somehow you have to
> > stop the probe function from being called.
>
> That's ok. We deal with that in the firmware loader code already. Just
> timeout or check for errors from call_usermodehelper. You basically run
> the user program and wait for it to write a "reply" via sysfs. In fact,
> the existing firmware loading facility could be re-used.
>
> > Another case, you have a card and load the module for it, this causes
> > reset. Now unload the module and load it again. This probably should
> > not reset the card a second time. You also have to make sure you don't
> > reset the primary card.
>
> It's up to each driver to detect whether its card needs to be POSTed or
> not. Anything else would mean infinite breakage.
>
> > One solution is to track in pci_dev if the card has been reset. This
> > preserves the state across module load/unload. I'm then tempted to put
> > an in-kernel emu86 (I have a 40K one) into the pci driver. PCI would
> > use this to reset the card before calling probe(). If the VBIOS/emu86
> > has an error it simply wouldn't call the probe function. Doing this
> > in-kernel makes everything synchronous but GregKH would probably have
> > some choice words about the emulator in the PCI driver.
>
> No, again, it's up to the driver to decide whether it needs to POST or
> not (I prefer that to "reset"). I have no preference for the emulator to
> be in-kernel or userland. I suppose it's easier in userland, just
> re-using the existing infrastructure for firmware loading.
>

another advantage of the emulator would be that "PC" vga cards could
be used in non-x86 platforms, which I'm sure would be quite popular...

Alex

2005-02-22 05:10:04

by Benjamin Herrenschmidt

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Mon, 2005-02-21 at 23:42 -0500, Jon Smirl wrote:
> On Tue, 22 Feb 2005 14:12:48 +1100, Benjamin Herrenschmidt
> <[email protected]> wrote:
> > It's up to each driver to detect whether its card needs to be POSTed or
> > not. Anything else would mean infinite breakage.
>
> Your approach is that it is a per driver problem. I was taking a
> different tack and looking at it as a BIOS deficiency that should be
> compensated for. There is already code in the kernel for identifying
> the boot video device.

Your assumption is rather specific to a given platform... what if you
have a card with no ROM (embedded system) but your kernel has a copy of
what should be the ROM at hand ? (flash is expensive, heh :)

> I was working on the assumption that all PCI based, VGA class hardware
> that is not the boot device needs to be posted.

That isn't the case on all platforms. Also, I like the flexibility of
having a userland helper since that doesn't "tie" us to the semantics of
an x86 platform & BIOS (we could have an OF emulator too, or whatever
binary program provided by the vendor in userspace to reinit the card
without having to link that with the kernel).

I think my approach is the most flexible in the long run.

> And that the posting should occur before the drivers are
> loaded. In other words the BIOS should have provided initialized
> hardware but since it didn't we can apply a fixup in the PCI driver. I
> also suspect there may be SCSI disk controller cards that need the
> same procedure.

I don't think we have to make these assumptions. It should really be under
driver control.

> I have no strong opinions on how to fix the post problem, I just want
> to make sure the problem is fully discussed by the relevant people and
> a consensus solution is achieved. I'm not sure that all of the core
> kernel developers that might be impacted by this have considered all
> of the options. I would like to try and get a consensus design and
> avoid reimplementing everything ten times.

Agreed.

Ben.


2005-02-22 05:14:23

by Benjamin Herrenschmidt

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Mon, 2005-02-21 at 23:56 -0500, Alex Deucher wrote:

> another advantage of the emulator would be that "PC" vga cards could
> be used in non-x86 platforms, which I'm sure would be quite popular...

That's implied indeed... though Jon's approach would require the common
code to "know" that we are on a platform that didn't run the x86 BIOS
on this or that card...

Some non-x86 platforms do already have an emulator in the firmware, some
do POST all cards, some don't, it's really tricky to try to "know" from
the generic code what to do here and will probably lead us into endless
trouble. (We may want to avoid some cards, broken BIOSes, etc... and do
it all by hand).

I think that the driver is the "chief" here and the one to know what to
do with the cards it drives. It can detect a non-POSTed card and deal
with it.

What we can/should provide is a nice helper to do the job once the
driver decides to have a go at it. I think userspace is the right
solution, similar to the firmware loader helpers, as I wrote earlier.
There are a few issues related to trying to run these before / is
mounted or during the sleep process, but those are things I plan to work
on & fix sooner or later. (Which is also why it has to be an
asynchronous API, so that the helper can call back "later" when the
helper has been found).

Ben.


2005-02-22 06:03:21

by Jon Smirl

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 22 Feb 2005 16:13:36 +1100, Benjamin Herrenschmidt
<[email protected]> wrote:
> On Mon, 2005-02-21 at 23:56 -0500, Alex Deucher wrote:
> I think that the driver is the "chief" here and the one to know what to
> do with the cards it drives. It can detect a non-POSTed card and deal
> with it.

What about the x86 case of VGA devices that run without a driver being
loaded? Do we force people to load an fbdev driver to get the reset?
The BIOS deficiency strategy works for these devices.

--
Jon Smirl
[email protected]

2005-02-22 06:05:37

by Jon Smirl

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 22 Feb 2005 16:13:36 +1100, Benjamin Herrenschmidt
<[email protected]> wrote:
> What we can/should provide is a nice helper to do the job once the
> driver decides to have a go at it. I think userspace is the right
> solution, similar to the firmware loader helpers, as I wrote earlier.
> There are a few issues related to trying to run these before / is
> mounted or during the sleep process, but those are things I plan to work
> on & fix sooner or later. (Which is also why it has to be an
> asynchronous API, so that the helper can call back "later" when the
> helper has been found).

Can a userspace solution solve the problem of cards that need to be
posted when they are coming out of suspend?

--
Jon Smirl
[email protected]

2005-02-22 06:33:27

by Benjamin Herrenschmidt

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 2005-02-22 at 01:03 -0500, Jon Smirl wrote:
> On Tue, 22 Feb 2005 16:13:36 +1100, Benjamin Herrenschmidt
> <[email protected]> wrote:
> > On Mon, 2005-02-21 at 23:56 -0500, Alex Deucher wrote:
> > I think that the driver is the "chief" here and the one to know what to
> > do with the cards it drives. It can detect a non-POSTed card and deal
> > with it.
>
> What about the x86 case of VGA devices that run without a driver being
> loaded? Do we force people to load an fbdev driver to get the reset?
> The BIOS deficiency strategy works for these devices.

Do we need to deal with those at all ? (I mean _really_: do we care ?)

And even if we did, then we could have the vga "legacy" driver use the
firmware loader to "boot" them. And again, you seem to dismiss all my
other arguments...


Ben.


2005-02-22 06:35:44

by Benjamin Herrenschmidt

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 2005-02-22 at 01:05 -0500, Jon Smirl wrote:
> On Tue, 22 Feb 2005 16:13:36 +1100, Benjamin Herrenschmidt
> <[email protected]> wrote:
> > What we can/should provide is a nice helper to do the job once the
> > driver decides to have a go at it. I think userspace is the right
> > solution, similar to the firmware loader helpers, as I wrote earlier.
> > There are a few issues related to trying to run these before / is
> > mounted or during the sleep process, but those are things I plan to work
> > on & fix sooner or later. (Which is also why it has to be an
> > asynchronous API, so that the helper can call back "later" when the
> > helper has been found).
>
> Can a userspace solution solve the problem of cards that need to be
> posted when they are coming out of suspend?

Yes, though they'll come up a bit later than the rest of the world if
the driver can't resume them itself (which is what should happen
normally, running the BIOS on resume is a hack).

Also, as I wrote earlier, what we care about now is the API in the form
of a helper. It fits well to have that helper just do something similar
to the firmware loader, running the stuff in userspace, but that isn't
mandatory, we could change later to be in-kernel, partially, or even
have a CONFIG option for whether to have the emulator in kernel or not :)

The driver doesn't have to care if we provide a suitable API. And a
userspace helper is a good enough implementation to start with.

Ben.


2005-02-22 06:42:30

by Jon Smirl

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 22 Feb 2005 17:32:40 +1100, Benjamin Herrenschmidt
<[email protected]> wrote:
> And even if we did, then we could have the vga "legacy" driver use the
> firmware loader to "boot" them. And again, you seem to dismiss all my
> other arguments...

I'm not dismissing them, I'm in agreement with doing it in the
drivers if we are sure we have thought through all of the different
cases where we might need to post.

--
Jon Smirl
[email protected]

2005-02-22 06:52:05

by Jon Smirl

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

Does the kernel need to keep a bit that says the device has been
posted, don't do it again? Should removing/inserting a driver cause a
repost? I was going to add a bit in pci_dev that tracks the reset status
so that it will persist across unloads. Do we have code to tell if
hardware needs a reset without the tracking bit?

On x86, DRM will run without fbdev loaded. So DRM also needs to be
able to do the post as well as fbdev. Or we can just leave the old
drivers alone and only implement this in a merged fbdev/drm driver?

When current X loads it's going to reset the cards again, that may
stomp anything the driver has set up.

--
Jon Smirl
[email protected]

2005-02-22 06:58:42

by Benjamin Herrenschmidt

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 2005-02-22 at 01:52 -0500, Jon Smirl wrote:
> Does the kernel need to keep a bit that says the device has been
> posted, don't do it again?

No. The kernel has no idea about what POSTing means in fact. That is
also driver specific.

> Should removing/inserting a driver cause a repost?

The driver should be able to determine that looking at the state of the
device. Things like vgacon may need some massaging, but that is not
something we need to care too much about :)

> I was going to add a bit in pci_dev that tracks the reset status
> so that it will persist across unloads. Do we have code to tell if
> hardware needs a reset without the tracking bit?

That doesn't have room in pci_dev, that only concerns a minority of HW
and I don't think we need to track it across load/unload.

> On x86, DRM will run without fbdev loaded. So DRM also needs to be
> able to do the post as well as fbdev. Or we can just leave the old
> drivers alone and only implement this in a merged fbdev/drm driver?

I think we need _at_least_ to make a common "stub" driver for fbdev/drm,
and if possible, only implement that in the merged driver when that
happens. We are talking about the future here. Existing users already
have X happily POST'ing their cards.

> When current X loads it's going to reset the cards again, that may
> stomp anything the driver has set up.

Yes, and X needs to be fixed for that, this is _WRONG_, one of the
numerous x86-centric assumptions in X. Note that the fbdev driver is
currently aware that anything can happen to the card when in KD_GRAPHICS
mode (and thus, the driver loses ownership). I restore as much as I need
hopefully when coming back. So we may end up having a non-issue there.
Once we have an Xgl on top of mesa solo, the problem will not happen.

Ben.




2005-02-22 19:19:39

by Linus Torvalds

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)



On Mon, 21 Feb 2005, Jon Smirl wrote:
>
> I was working on the assumption that all PCI based, VGA class hardware
> that is not the boot device needs to be posted.

I don't think that's true. We certainly don't _want_ it to be true in the
long run - and even now there are cards that we can initialize fully
without using the BIOS at all.

> And that the posting should occur before the drivers are
> loaded.

Personally, I'd much rather let the driver be involved in the decision.

That may mean that the probe routine knows how to initialize the card, but
it may mean that it does an "exec_usermodehelper()" kind of thing.
Actually, I'd prefer it if this was largely up to "udev": if the driver
notices that it can't initialize the card, why not just enumerate it
enough that "udev" knows about it (that's pretty much automatic), and let
the driver just ignore the card until some (possibly much later) date when
the user level scripts have found it and initialized it.

That would imply that the driver has some "re-attach" entrypoint (which
might be an ioctl, but might also be just a /sysfs file access), which is
userland's way of saying "try again - I've now initialized the
hardware".

The advantage of that kind of "disconnected" initialization is that you
don't _need_ to have the card initialization in initramfs or other "very
early boot" sequence. It gets _detected_ early on, but you can then delay
initializing it arbitrarily long, and it obviously won't be usable until
that point (but who cares? The ones that do care can put the things in
their initramfs, others may decide to do it only once the system is
up-and-running and /usr has been NFS-mounted).
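A minimal sketch of that "disconnected" flow, as a userspace C model. All
names (card, card_probe, card_reattach, userland_post) are hypothetical;
card_reattach() stands in for whatever an ioctl or sysfs write handler
would end up calling, and userland_post() for the user-level script that
initializes the hardware. Probe never waits on userland: it leaves the
card detected but unattached, and a later "try again" completes the job.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model in plain userspace C; none of these are kernel APIs. */
enum card_state {
    CARD_DETECTED,   /* enumerated and visible to userland, not usable yet */
    CARD_ATTACHED    /* hardware initialized and driver bound */
};

struct card {
    int hw_initialized;      /* stands in for "the hardware has been POSTed" */
    enum card_state state;
};

/* Shared by probe and re-attach: bind only if the hardware is ready. */
static int card_try_attach(struct card *c)
{
    assert(c != NULL);
    if (!c->hw_initialized)
        return -1;           /* still uninitialized; keep ignoring the card */
    c->state = CARD_ATTACHED;
    return 0;
}

/* Probe enumerates the card and attaches only if it is already usable. */
static void card_probe(struct card *c)
{
    c->state = CARD_DETECTED;
    (void)card_try_attach(c);
}

/* Stand-in for the user-level script that initializes the hardware. */
static void userland_post(struct card *c)
{
    c->hw_initialized = 1;
}

/* "Try again" entry point that userland triggers after initializing. */
static int card_reattach(struct card *c)
{
    return card_try_attach(c);
}
```

The time between card_probe() and a successful card_reattach() can be
arbitrarily long in this model, which is what lets the initialization live
outside initramfs.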

Linus

2005-02-22 19:39:46

by Dmitry Torokhov

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

On Tue, 22 Feb 2005 11:19:10 -0800 (PST), Linus Torvalds
<[email protected]> wrote:
>
>
> On Mon, 21 Feb 2005, Jon Smirl wrote:
> >
> > I was working on the assumption that all PCI based, VGA class hardware
> > that is not the boot device needs to be posted.
>
> I don't think that's true. We certainly don't _want_ it to be true in the
> long run - and even now there are cards that we can initialize fully
> without using the BIOS at all.
>
> > And that the posting should occur before the drivers are
> > loaded.
>
> Personally, I'd much rather let the driver be involved in the decision.
>
> That may mean that the probe routine knows how to initialize the card, but
> it may mean that it does an "exec_usermodehelper()" kind of thing.
> Actually, I'd prefer it if this was largely up to "udev": if the driver
> notices that it can't initialize the card, why not just enumerate it
> enough that "udev" knows about it (that's pretty much automatic), and let
> the driver just ignore the card until some (possibly much later) date when
> the user level scripts have found it and initialized it.
>
> That would imply that the driver has some "re-attach" entrypoint (which
> might be an ioctl, but might also be just a /sysfs file access), which is
> userland's way of saying "try again - I've now initialized the
> hardware".
>

This sounds awfully like firmware loader that seems to be working just
fine for a range of network cards and other devices.

--
Dmitry

2005-02-22 20:46:27

by Linus Torvalds

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)



On Tue, 22 Feb 2005, Dmitry Torokhov wrote:
>
> This sounds awfully like firmware loader that seems to be working just
> fine for a range of network cards and other devices.

Yes. HOWEVER - and note how firmware loading for this case is not validly
done at device discovery, but at "ifconfig" time.

Ie device discovery (probing) is a _separate_ phase entirely, and happens
much earlier. We should initialize the hardware only when it actually gets
"actively used" some way by user space.
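A small sketch of keeping discovery and initialization as separate phases,
again as a userspace C model with hypothetical names (lazy_dev, lazy_probe,
lazy_open, lazy_release). Probe only records that the device exists; the
hardware is brought up when the first user opens it. Re-initializing after
the last user has gone away is an assumption of this sketch, not something
stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: discovery and initialization are separate phases. */
struct lazy_dev {
    int discovered;   /* set at probe time, costs nothing */
    int users;        /* current number of openers */
    int init_count;   /* how many times the hardware was brought up */
};

/* Discovery: note that the device exists, touch nothing else. */
static void lazy_probe(struct lazy_dev *d)
{
    assert(d != NULL);
    d->discovered = 1;
    d->users = 0;
    d->init_count = 0;
}

/* First active use triggers initialization; later opens just count. */
static int lazy_open(struct lazy_dev *d)
{
    if (!d->discovered)
        return -1;
    if (d->users == 0)
        d->init_count++;   /* firmware load / POST would happen here */
    d->users++;
    return 0;
}

/* Drop one user; the hardware may be left idle once the count hits zero. */
static void lazy_release(struct lazy_dev *d)
{
    if (d->users > 0)
        d->users--;
}
```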

Linus

2005-02-28 15:06:25

by Pavel Machek

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)

Hi!

> > I think that the driver is the "chief" here and the one to know what to
> > do with the cards it drives. It can detect a non-POSTed card and deal
> > with it.
>
> What about the x86 case of VGA devices that run without a driver being
> loaded? Do we force people to load an fbdev driver to get the reset?
> The BIOS deficiency strategy works for these devices.

Yes, I think we do force people to load fbdev...

...because different BIOSes will be broken in slightly different ways,
and you'll probably need to know which BIOS you are trying to load...

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-28 16:07:09

by Vladimir Dergachev

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)



On Mon, 28 Feb 2005, Pavel Machek wrote:

> Hi!
>
>>> I think that the driver is the "chief" here and the one to know what to
>>> do with the cards it drives. It can detect a non-POSTed card and deal
>>> with it.
>>
>> What about the x86 case of VGA devices that run without a driver being
>> loaded? Do we force people to load an fbdev driver to get the reset?
>> The BIOS deficiency strategy works for these devices.
>
> Yes, I think we do force people to load fbdev...
>
> ...because different BIOSes will be broken in slightly different ways,
> and you'll probably need to know which BIOS you are trying to load...

I agree. For example, on my Dell notebook the graphics card is not
reinitialized properly on return from resume. At some point I'll get
bothered enough to write code that does it.

best

Vladimir Dergachev

2005-02-28 16:51:51

by Keith Packard

Subject: Re: POSTing of video cards (WAS: Solo Xgl..)


Around 11 o'clock on Feb 28, Vladimir Dergachev wrote:

> I agree. For example, on my Dell notebook the graphics card is not
> reinitialized properly on return from resume. At some point I'll get
> bothered enough to write code that does it.

# vbetool post

Run from your suspend script while on a text VT.

-keith


