2003-01-12 17:58:41

by Alan Cox

Subject: Linux 2.4.21-pre3-ac4

Various compile fixes, and the next stage of IDE updating. In particular
the PIIX driver has been subjected to a full review of Intel chipset
errata. I've also added the framework to fix the problems Ross Biro found
but not yet enabled its use. Once enabled that should fix a lot of the
problems with shared IRQ IDE. Handle with care as usual.

Linux 2.4.21pre3-ac4
o Finish verifying PIIX/ICH drivers versus errata (me)
o Fix handling of DMA0 MWDMA on early ICH (me)
o Fix compile in kernel for Aurora SIO16 (Adrian Bunk)
o Clean up various Configure.help bits (Adrian Bunk)
o Disallow write combining on 450NX (me)
o Ensure rev C0 450NX has restreaming off (me)
o Don't do IDE DMA on rev B0 450NX or later (me)
  450NX without BIOS workarounds for the hang
o Update Configure.help for HPT IDE (Adrian Bunk)
o Fix harmless code error in sb_mixer (Jeff Garzik)
o Fix ethernet padding on via-rhine (Roger Luethi)
o Add ndelay functionality for x86 (me)
| Based on Ross Biro's code
o Add ide_execute_command (me)
| Again based on Ross Biro's changes. Not yet used
| This will be the new correct way to kick off an
| IDE command from non IRQ context
o Matroxfb compile fix for one option combination (Petr Vandrovec)

Linux 2.4.21pre3-ac3
o Address comments on wcache value/issuing (me)
  cache flush requests
o Update credits entry for Stelian Pop (Stelian Pop)
o Backport some sonypi improvements from 2.5 (Kunihiko IMAI)
o Fix pdcraid/silraid symbol clash (Arjan van de Ven)
o Fix ehci build with older gcc (Greg Kroah-Hartman)
o Fix via 8233/5 hang (me)
o Fix non SMP cpufreq build (Eyal Lebidinsky)
o Fix sbp2 build with some config options (Eyal Lebidinsky)
o Fix ATM build bugs (Francois Romieu)
o Fix an ipc/sem.c race (Bernhard Kaindl)
o Fix toshiba keyboard double release (Unknown)
o CPUFreq updates/fixes (Dominik Brodowski)
o Natsemi Geode/Cyrix MediaGX cpufreq support (Hiroshi Miura, Zwane Mwaikambo)
o Add frequency table helpers to CPUfreq (Dominik Brodowski)

Linux 2.4.21pre3-ac2
o Fix the dumb bug in skb_pad (Dave Miller)
o Confirm some sparc bits are wrong and drop them (Dave Miller)
o Remove a wrong additional copyright comment (Dave Miller)
o Upgrade IPMI driver to v16 (Corey Minyard)
o Fix 3c523 compile (Francois Romieu)
o Handle newer rpm where -ta is rpmbuild not rpm (me)
o Driver for Aurora Sio16 PCI adapter series (Joachim Martillo)
  (SIO8000P, 16000P, and CPCI)
| Initial merge
o Backport Hammer 32bit mtrr/nmi changes (Andi Kleen)
o Add the fast IRQ path to via 8233/5 audio (me)

Linux 2.4.21pre3-ac1
+ Handle battery quirk on the Vaio Z600-RE (Paul Mitcheson)
* EHCI USB updates (David Brownell)
+ IDE Raid support for AMI/SI 'Medley' IDE Raid (Arjan van de Ven)
+ NVIDIA nForce2 IDE PCI identifiers (Johannes Deisenhofer, Tim Krieglstein)
* CPU bitmask truncation fix (Bjorn Helgaas)
o HP100 cleanup (Pavel Machek)
o Fix initial capslock handling on USB keyboard (Pete Zaitcev)
+ Update dscc4 driver for new wan (Francois Romieu)
+ Fix boot on Chaintech 4BEA/4BEA-R and (Alexander Achenbach)
  Gigabyte 9EJL by handling wacky E820 memory
  reporting
o SysKonnect driver updates (Mirko Lindner)
o Fix memory leak in n_hdlc (Paul Fulghum)
o Fix missing mtd dependency (Herbert Xu)
+ Clean up ide-tape printk stuff (Pete Zaitcev)
+ IDE tape fixes (Pete Zaitcev)
o Fix size reporting of large disks in scsi (Andries Brouwer)
+ Fix excessive stack usage in NMI handlers (Mikael Pettersson)
+ Add support for Epson 785EPX USB printer pcmcia (Khalid Aziz)
* Quirk handler to sort out IDE compatibility (Ivan Kokshaysky)
  mishandling
+ Model 1 is valid for PIV in MP table (Egenera)
+ Ethernet padding fixes for various drivers (me)
o Allow trident codec setup to time out (Ian Soboroff)
  This can happen with non PM codecs
o Fix broken documentation link (Henning Meier-Geinitz)
o Update video4linux docbook (William Stimson)
o Correct kmalloc check in dpt_i2o (Pablo Menichini)
o Shrink kmap area to required space only (Manfred Spraul)
o Fix irq balancing (Ben LaHaise)
o CPUfreq updates (Dominik Brodowski)
o Fix typo in pmagb fb (John Bradford)
o EDD backport (Matt Domsch)


REMOVED FOR NOW

- RMAP

REMOVED FOR GOOD

- LLC (See 2.5)
- VaryIO (Never accepted mainstream)


2003-01-12 19:19:10

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 19:07, Alan Cox wrote:
> Various compile fixes, and the next stage of IDE updating. In particular
> the PIIX driver has been subjected to a full review of Intel chipset
> errata. I've also added the framework to fix the problems Ross Biro found
> but not yet enabled its use. Once enabled that should fix a lot of the
> problems with shared IRQ IDE. Handle with care as usual.

It seems it lacks the code to actually blast the cmd byte passed
as a parameter...

Also, how are we supposed to use it exactly ? We shall fill the taskfile
first I believe, so I suppose we shall call ide_spin_wait_hwgroup()
first which gets us a channel locked and not busy, right ?

That would do something like:

    if (ide_spin_wait_hwgroup(drive))
        return -EBUSY;
    hwgroup->busy = 1;
    .. put stuffs in taskfile regs ..
    spin_unlock_irq(&io_request_lock);
    ide_execute_command(drive, ...)

And then hopefully wait (wait_for_completion() ?) for some action by
the expiry or the handler, right ?

I'm just trying to make sure I understand the purpose of this new function...

Ben.



2003-01-12 19:45:03

by Alan

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 19:51, Benjamin Herrenschmidt wrote:
> What about PCI write posting ? How can we enforce the 400ns delay here ?

For i/o space it is ok as in*/out* are synchronous. For mmio right now I
don't know. I need to talk to Andre about that for SATA. I guess for the
PPC it's going to be fun

> > 2. The code is racey in some situations with a shared IRQ because we
> > may get an IRQ after we set the handler but before we send the command,
> > or implemented the other way the command can complete before we set the
> > handler.
>
> Yup, that's an old problem indeed.

Time for it to die....

2003-01-12 19:47:08

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 20:51, Benjamin Herrenschmidt wrote:
> On Sun, 2003-01-12 at 21:27, Alan Cox wrote:
>
> > which currently has two problems Ross found
> >
> > 1. The processors are so fast we have to enforce the 400nS delay nowadays
>
> What about PCI write posting ? How can we enforce the 400ns delay here ?
> I suspect we can't read back from the taskfile registers after writing
> the command. Especially when using DMA, I think I remember Andre telling
> me even tapping alt status might not be safe... So we need to issue
> a read from the same bus path, but not on any taskfile register from
> this channel... hrm... any idea ?

Actually, do we really need that delay as we are waiting for an
interrupt anyway ? my understanding is that this delay is the required
before we start polling for BSY bit (that is the max time the drive may
take to assert BSY after getting the command), but in our case, unless
we have other bugs, we shall have the channel marked busy, so nobody
will tap it, except the actual interrupt coming in. Or will the case of
shared interrupt potentially cause a read of status at the wrong time ?

Ben.


2003-01-12 19:41:01

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 21:27, Alan Cox wrote:

> which currently has two problems Ross found
>
> 1. The processors are so fast we have to enforce the 400nS delay nowadays

What about PCI write posting ? How can we enforce the 400ns delay here ?
I suspect we can't read back from the taskfile registers after writing
the command. Especially when using DMA, I think I remember Andre telling
me even tapping alt status might not be safe... So we need to issue
a read from the same bus path, but not on any taskfile register from
this channel... hrm... any idea ?

> 2. The code is racey in some situations with a shared IRQ because we
> may get an IRQ after we set the handler but before we send the command,
> or implemented the other way the command can complete before we set the
> handler.

Yup, that's an old problem indeed.

Ben.

2003-01-12 19:55:30

by Alan

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 19:57, Benjamin Herrenschmidt wrote:
> Actually, do we really need that delay as we are waiting for an
> interrupt anyway ? my understanding is that this delay is the required
> before we start polling for BSY bit (that is the max time the drive may
> take to assert BSY after getting the command), but in our case, unless
> we have other bugs, we shall have the channel marked busy, so nobody
> will tap it, except the actual interrupt coming in. Or will the case of
> shared interrupt potentially cause a read of status at the wrong time ?

Precisely. Or a random IRQ from a drive power change or hotplug that
passed our command in the other direction.

We could actually address this another way which might even be easier,
that is in the IRQ path to wait the 400nS if BSY isn't asserted. I need
to go reread the spec to check if we can poll it before the timeout
but not trust the data, or cannot poll it.

Alan

2003-01-12 19:52:37

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 21:40, Alan Cox wrote:
> On Sun, 2003-01-12 at 19:51, Benjamin Herrenschmidt wrote:
> > What about PCI write posting ? How can we enforce the 400ns delay here ?
>
> For i/o space it is ok as in*/out* are synchronous. For mmio right now I
> don't know. I need to talk to Andre about that for SATA. I guess for the
> PPC its going to be fun

Thinking about it, it might make sense to provide an hwif->IOSYNC iop along
with {IN/OUT}{B,W,L} that would be a no-op by default for that.

Ben.


2003-01-13 16:44:25

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Sun, 2003-01-12 at 21:51, Alan Cox wrote:
> On Sun, 2003-01-12 at 19:57, Benjamin Herrenschmidt wrote:
> > Actually, do we really need that delay as we are waiting for an
> > interrupt anyway ? my understanding is that this delay is the required
> > before we start polling for BSY bit (that is the max time the drive may
> > take to assert BSY after getting the command), but in our case, unless
> > we have other bugs, we shall have the channel marked busy, so nobody
> > will tap it, except the actual interrupt coming in. Or will the case of
> > shared interrupt potentially cause a read of status at the wrong time ?
>
> Precisely. Or a random IRQ from a drive power change or hotplug that
> passed our command in the other direction.
>
> We could actually address this another way which might even be easier,
> that is in the IRQ path to wait the 400nS if BSY isn't asserted. I need
> to go reread the spec to check if we can poll it before the timeout
> but not trust the data, or cannot poll it.

Now I think that would be bad as that would mean waiting in the normal
case. But if you look at the various access patterns, I think there is
no real problem in fact, though I believe only Andre can confirm.

We have 3 cases to deal with:

1 - Command with no data transfer. Here, there should be no problem
just doing a read from status or alt_status, right ? now, reading status
might clear the IRQ if we are slow enough, so we may want alt_status
instead. (We can probably safely ignore controllers that have no alt
status here, or route them via some hwif->IOSYNC() macro). Or maybe just
re-read the select register for making everybody happy.

2 - Command with a data transfer not using DMA (that is either PIO or
the command part of an ATAPI command in ide-cd). I don't think there's
any problem reading alt status or select here, is there Andre ? So we
can also safely do that before waiting and thus make sure the bus path
to the controller is ok

3 - Command with a data transfer using DMA (new ide-scsi, ide-hd with
dma, etc...). Here, I believe there is just no problem, we don't need to
wait at all, since the interrupt handler will check for the DMA
controller to have completed via hwif->ide_dma_test_irq before reading
the status reg at all, right ?

Ben.
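
The three cases above can be restated as a small decision function. This is only a sketch of Ben's proposal as written in this message, not code from the IDE driver: the names xfer_kind, sync_action and sync_after_command are made up for illustration, and whether a dummy alt status read is actually safe on every controller is exactly what the thread is still debating.

```c
/* Hypothetical classification of a command by its data phase. */
enum xfer_kind {
	XFER_NONE,  /* case 1: command with no data transfer */
	XFER_PIO,   /* case 2: PIO data, or the command part of an ATAPI packet */
	XFER_DMA    /* case 3: data moved by the DMA engine */
};

/* What to do between writing the command byte and waiting for the IRQ. */
enum sync_action {
	SYNC_READ_ALT_STATUS, /* dummy read of alt status (or select), then wait */
	SYNC_NOTHING          /* the IRQ handler tests the DMA engine first */
};

/*
 * Cases 1 and 2 get a dummy read so the command byte is known to have
 * reached the controller. Case 3 relies on the handler consulting the
 * DMA status (hwif->ide_dma_test_irq in the message above) before it
 * ever touches the status register.
 */
static enum sync_action sync_after_command(enum xfer_kind kind)
{
	switch (kind) {
	case XFER_NONE:
	case XFER_PIO:
		return SYNC_READ_ALT_STATUS;
	case XFER_DMA:
		return SYNC_NOTHING;
	}
	/* Unknown kinds take the conservative path. */
	return SYNC_READ_ALT_STATUS;
}
```

A caller would consult sync_after_command() right after writing the command register and before arming any wait.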

2003-01-13 18:42:24

by Ross Biro

Subject: Re: Linux 2.4.21-pre3-ac4

Benjamin Herrenschmidt wrote:

>On Sun, 2003-01-12 at 21:27, Alan Cox wrote:
>
>
>
>>which currently has two problems Ross found
>>
>>1. The processors are so fast we have to enforce the 400nS delay nowadays
>>

The reason we need to enforce the 400nS delay is because of what is
going on on the other processor. If the other processor is in ide_intr
trying to grab the spinlock and we do not give the drive time to assert
the busy bit and the other processor makes it to the call to
drive_is_ready, then the drive could still return not busy and we could
think the command is done. This code path is probably less than 50
instructions, so I don't think it's taken anywhere near 400ns for a long
time.

DMA is slightly different. We don't actually have to delay the 400ns if
we call ide_dma_begin from inside the spinlock.

Ross
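
The window Ross describes can be modelled in user space with a virtual clock. This is a toy model under stated assumptions, not driver code: sim_drive, sim_issue_command, sim_ndelay and drive_looks_done are hypothetical names, and the drive is assumed to take the full 400ns worst case before it raises BSY.

```c
#include <stdbool.h>
#include <stdint.h>

#define BSY_LATENCY_NS 400u /* assumed worst case for the drive to raise BSY */
#define STAT_BUSY  0x80u
#define STAT_READY 0x40u

/* Toy drive with a virtual clock instead of real time. */
struct sim_drive {
	uint64_t now_ns;        /* current virtual time */
	uint64_t cmd_issued_ns; /* when the command byte arrived */
	bool cmd_pending;       /* a command was issued and has not finished */
};

static void sim_issue_command(struct sim_drive *d)
{
	d->cmd_issued_ns = d->now_ns;
	d->cmd_pending = true;
}

/* Stand-in for a short delay: just advances the virtual clock. */
static void sim_ndelay(struct sim_drive *d, uint64_t ns)
{
	d->now_ns += ns;
}

static uint8_t sim_read_status(const struct sim_drive *d)
{
	if (d->cmd_pending && d->now_ns - d->cmd_issued_ns >= BSY_LATENCY_NS)
		return STAT_BUSY;
	/* Too early: the drive has not reacted yet and still looks idle. */
	return STAT_READY;
}

/* The question the interrupt handler on the other CPU effectively asks. */
static bool drive_looks_done(const struct sim_drive *d)
{
	return (sim_read_status(d) & STAT_BUSY) == 0;
}
```

In this model a status check 50ns after the command reports "done" even though the command has barely started, while the same check after the full 400ns sees BSY.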


2003-01-13 18:51:41

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, 2003-01-13 at 19:49, Ross Biro wrote:

> The reason we need to enforce the 400nS delay is because of what is
> going on on the other processor. If the other processor is in ide_intr
> trying to grab the spinlock and we do not give the drive time to assert
> the busy bit and the other processor makes it to the call to
> drive_is_ready, then the drive could still return not busy and we could
> think the command is done. This code path is probably less than 50
> instructions, so I don't think it's taken anywhere near 400ns for a long
> time.
>
> DMA is slightly different. We don't actually have to delay the 400ns if
> we call ide_dma_begin from inside the spinlock.

Exactly. My problem right now is with enforcing that 400ns delay on
non-DMA path as with PCI write posting on one side, and other fancy bus
store queues etc... you are really not sure when your outb for the
command byte will really reach the disk.

So the problem turns down to: is it safe for commands with no data
transfer and commands with a PIO data transfer to read back from
some other task file register right after issuing the command byte
(the select register looks like a good choice, better than status
for sure) and before doing the delay of 400ns ? On any sane bus
architecture, that read will make sure the previous write will
have reached the device or your IO accessors are broken...

Ben.





2003-01-13 19:12:17

by Ross Biro

Subject: Re: Linux 2.4.21-pre3-ac4

Benjamin Herrenschmidt wrote:

>
>
>Exactly. My problem right now is with enforcing that 400ns delay on
>non-DMA path as with PCI write posting on one side, and other fancy bus
>store queues etc... you are really not sure when your outb for the
>command byte will really reach the disk.
>
>So the problem turns down to: is it safe for commands with no data
>transfer and commands with a PIO data transfer to read back from
>some other task file register right after issuing the command byte
>(the select register looks like a good choice, better than status
>for sure) and before doing the delay of 400ns ? On any sane bus
>architecture, that read will make sure the previous write will
>have reached the device or your IO accessors are broken...
>
>
>
Ahh, good point. My experience with the promise controller says that it
is not safe to talk to the drive after a start of a DMA command. For
non-dma commands, it should be safe but I believe it would be an ATA
spec violation to do so. In particular from the ata-6 spec

HPIOI0: INTRQ_Wait State: This state is entered when the host has
written a PIO data-in command to the device and nIEN is cleared to zero,
or at the completion of a DRQ data block transfer if all the data for
the command has not been transferred and nIEN is cleared to zero. When
in this state, the host shall wait for INTRQ to be asserted.

So technically we are not allowed to talk to the drive, but must wait
for an irq. The problem becomes how to tell if the irq is meant for us.
My guess is that most drives will not care if nIEN is set or not at
this point and we can use

HPIOI1: Check_Status State: This state is entered when the host has
written a PIO data-in command to the device and nIEN is set to one, or
when INTRQ is asserted. When in this state, the host shall read the
device Status register. When entering this state from the HI4 state, the
host shall wait 400 ns before reading the Status register. When entering
this state from the HPIOI2 state, the host shall wait one PIO transfer
cycle time before reading the Status register. The wait may be
accomplished by reading the Alternate Status register and ignoring the
result.

and read the alt status register to get a delay.

This is technically a spec violation, but it's probably safe. I'm going
to send an email to a couple of the drive manufacturers and see what
they think.

Ross

2003-01-13 19:25:34

by John Alvord

Subject: Re: Linux 2.4.21-pre3-ac4

On 13 Jan 2003 20:03:29 +0100, Benjamin Herrenschmidt wrote:

>On Mon, 2003-01-13 at 19:49, Ross Biro wrote:
>
>> The reason we need to enforce the 400nS delay is because of what is
>> going on on the other processor. If the other processor is in ide_intr
>> trying to grab the spinlock and we do not give the drive time to assert
>> the busy bit and the other processor makes it to the call to
>> drive_is_ready, then the drive could still return not busy and we could
>> think the command is done. This code path is probably less than 50
>> instructions, so I don't think it's taken anywhere near 400ns for a long
>> time.
>>
>> DMA is slightly different. We don't actually have to delay the 400ns if
>> we call ide_dma_begin from inside the spinlock.
>
>Exactly. My problem right now is with enforcing that 400ns delay on
>non-DMA path as with PCI write posting on one side, and other fancy bus
>store queues etc... you are really not sure when your outb for the
>command byte will really reach the disk.
>
>So the problem turns down to: is it safe for commands with no data
>transfer and commands with a PIO data transfer to read back from
>some other task file register right after issuing the command byte
>(the select register looks like a good choice, better than status
>for sure) and before doing the delay of 400ns ? On any sane bus
>architecture, that read will make sure the previous write will
>have reached the device or your IO accessors are broken...
>
You could simplify the problem somewhat by forcing all interaction and
interrupt processing to a single CPU.

john

2003-01-13 19:24:51

by Ross Biro

Subject: Re: Linux 2.4.21-pre3-ac4


One thing we could do to solve this entire problem is wait for the
interrupt to finish before sending the command to the drive in the first
place. Basically in ide_do_request we just have to change

    if (masked_irq && hwif->irq != masked_irq)
        disable_irq_nosync(hwif->irq);

to

    if (masked_irq && hwif->irq != masked_irq) {
        BUG();
    }

    if (!masked_irq) {
        disable_irq_sync(hwif->irq);
    }

But that means that we will only queue up new commands to a drive when
its interrupt is not currently active.

Ross


Ross Biro wrote:

> Benjamin Herrenschmidt wrote:
>
>>
>>
>> Exactly. My problem right now is with enforcing that 400ns delay on
>> non-DMA path as with PCI write posting on one side, and other fancy bus
>> store queues etc... you are really not sure when your outb for the
>> command byte will really reach the disk.
>>
>> So the problem turns down to: is it safe for commands with no data
>> transfer and commands with a PIO data transfer to read back from
>> some other task file register right after issuing the command byte
>> (the select register looks like a good choice, better than status
>> for sure) and before doing the delay of 400ns ? On any sane bus
>> architecture, that read will make sure the previous write will
>> have reached the device or your IO accessors are broken...
>>
>>
>>
> Ahh, good point. My experience with the promise controller says that
> it is not safe to talk to the drive after a start of a DMA command.
> For non-dma commands, it should be safe but I believe it would be an
> ATA spec violation to do so. In particular from the ata-6 spec
>
> HPIOI0: INTRQ_Wait State: This state is entered when the host has
> written a PIO data-in command to the device and nIEN is cleared to
> zero, or at the completion of a DRQ data block transfer if all the
> data for the command has not been transferred and nIEN is cleared to
> zero. When in this state, the host shall wait for INTRQ to be asserted.
>
> So technically we are not allowed to talk to the drive, but must wait
> for an irq. The problem becomes how to tell if the irq is meant for
> us. My guess is that most drives will not care if nIEN is set or not
> at this point and we can use
>
> HPIOI1: Check_Status State: This state is entered when the host has
> written a PIO data-in command to the device and nIEN is set to one, or
> when INTRQ is asserted. When in this state, the host shall read the
> device Status register. When entering this state from the HI4 state,
> the host shall wait 400 ns before reading the Status register. When
> entering this state from the HPIOI2 state, the host shall wait one PIO
> transfer cycle time before reading the Status register. The wait may
> be accomplished by reading the Alternate Status register and ignoring
> the result.
>
> and read the alt status register to get a delay.
>
> This is technically a spec violation, but it's probably safe. I'm
> going to send an email to a couple of the drive manufacturers and see
> what they think.
>
> Ross
>



2003-01-13 19:37:32

by Alan Cox

Subject: Re: Linux 2.4.21-pre3-ac4

> host shall wait 400 ns before reading the Status register. When entering
> this state from the HPIOI2 state, the host shall wait one PIO transfer
> cycle time before reading the Status register. The wait may be
> accomplished by reading the Alternate Status register and ignoring the
> result.

Fatal on PIIX PIO

2003-01-13 19:41:00

by Ross Biro

Subject: Re: Linux 2.4.21-pre3-ac4

John Alvord wrote:

>On 13 Jan 2003 20:03:29 +0100, Benjamin Herrenschmidt wrote:
>
>
>
>>Exactly. My problem right now is with enforcing that 400ns delay on
>>non-DMA path as with PCI write posting on one side, and other fancy bus
>>store queues etc... you are really not sure when your outb for the
>>command byte will really reach the disk.
>>
>>So the problem turns down to: is it safe for commands with no data
>>transfer and commands with a PIO data transfer to read back from
>>some other task file register right after issuing the command byte
>>(the select register looks like a good choice, better than status
>>for sure) and before doing the delay of 400ns ? On any sane bus
>>architecture, that read will make sure the previous write will
>>have reached the device or your IO accessors are broken...
>>
>>
>>
>You could simplify the problem somewhat by forcing all interaction and
>interrupt processing to a single CPU.
>
>john
>
>
I don't think that helps. The problem can still occur if the PCI write
post is long enough for an interrupt to get through. We could read the
alt status twice in drive_is_ready and only take the second one.

Ross
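
Ross's "read alt status twice, keep the second" idea might look roughly like this. It is a sketch with a fake register, not the real drive_is_ready: drive_is_ready_twice, read_reg_fn and fake_port are hypothetical names. Note Alan's remark earlier in the thread that reading alt status in this window is fatal on PIIX PIO, so this is not a universally safe fix.

```c
#include <stdbool.h>
#include <stdint.h>

#define STAT_BUSY  0x80u
#define STAT_READY 0x40u

/* Hypothetical accessor type: reads the alternate status register. */
typedef uint8_t (*read_reg_fn)(void *ctx);

/*
 * The first read is used only as a delay (and to push out any posted
 * write of the command byte); its value is thrown away. Only the
 * second read is trusted.
 */
static bool drive_is_ready_twice(read_reg_fn read_alt_status, void *ctx)
{
	uint8_t stat;

	(void)read_alt_status(ctx);
	stat = read_alt_status(ctx);
	return (stat & STAT_BUSY) == 0;
}

/* Fake port: the first read is stale "ready", later reads show BSY. */
struct fake_port {
	unsigned int reads;
};

static uint8_t fake_alt_status(void *ctx)
{
	struct fake_port *p = ctx;

	return p->reads++ == 0 ? STAT_READY : STAT_BUSY;
}
```

With the fake port, a single read would have reported the drive as ready; the two-read version sees BSY and reports not ready.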


2003-01-13 20:09:39

by Alan

Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, 2003-01-13 at 19:32, Ross Biro wrote:
> One thing we could do to solve this entire problem is wait for the
> interrupt to finish before sending the command to the drive in the first
> place. Basically in ide_do_request we just have to change


> if (!masked_irq) {
> disable_irq_sync(hwif->irq);
> }

You cannot disable an IRQ synchronously holding a spin lock taken by an
IRQ handler

2003-01-13 20:29:25

by Ross Biro

Subject: Re: Linux 2.4.21-pre3-ac4

Alan Cox wrote:

>On Mon, 2003-01-13 at 19:32, Ross Biro wrote:
>
>
>>One thing we could do to solve this entire problem is wait for the
>>interrupt to finish before sending the command to the drive in the first
>>place. Basically in ide_do_request we just have to change
>>
>>
>
>
>
>
>> if (!masked_irq) {
>> disable_irq_sync(hwif->irq);
>> }
>>
>>
>
>You cannot disable an IRQ synchronously holding a spin lock taken by an
>IRQ handler
>
>

You are correct, you have to drop the spinlock first. And it doesn't
really help anyway.

Ross

2003-01-13 20:48:23

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, 2003-01-13 at 20:19, Ross Biro wrote:

> and read the alt status register to get a delay.
>
> This is technically a spec violation, but it's probably safe. I'm going
> to send an email to a couple of the drive manufacturers and see what
> they think.

Or get back to my original idea of an IOSYNC() callback in hwif. For
standard PCI controllers with DMA, it's enough to read the dma_status
register which is on the same bus path. Others will have to provide
some implementation or be unsafe on some non-x86. What do you think ?

Ben.
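
A per-interface IOSYNC hook along the lines Ben suggests could be shaped like this. It is a user-space sketch only: hwif_sketch is a simplified, hypothetical stand-in for the real hwif structure, the dma_status field merely models the bus-master status register, and the sync_reads counter exists purely so the effect can be observed.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the real hwif structure. */
struct hwif_sketch {
	uint8_t dma_status;      /* models the bus-master DMA status register */
	unsigned int sync_reads; /* counts flushing reads, illustration only */
	void (*iosync)(struct hwif_sketch *hwif);
};

/* Default: port I/O is synchronous, so there is nothing to flush. */
static void iosync_noop(struct hwif_sketch *hwif)
{
	(void)hwif;
}

/* Posted-write variant: read a harmless register on the same bus path. */
static void iosync_read_dma_status(struct hwif_sketch *hwif)
{
	volatile uint8_t *reg = &hwif->dma_status;

	(void)*reg; /* value ignored; the read itself is the point */
	hwif->sync_reads++;
}

static void hwif_init(struct hwif_sketch *hwif, int posted_writes)
{
	hwif->dma_status = 0;
	hwif->sync_reads = 0;
	hwif->iosync = posted_writes ? iosync_read_dma_status : iosync_noop;
}

/* Write the command byte, then make sure it has left any posting buffer. */
static void issue_command(struct hwif_sketch *hwif, uint8_t *command_reg,
			  uint8_t cmd)
{
	*command_reg = cmd;
	hwif->iosync(hwif);
	/* Any 400ns delay would be counted from this point. */
}
```

The design choice is that callers never need to know which kind of bus they sit on: they always call hwif->iosync(), and interfaces without posted writes pay nothing for it.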


2003-01-13 21:15:22

by Jeff Garzik

Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, Jan 13, 2003 at 08:03:29PM +0100, Benjamin Herrenschmidt wrote:
> Exactly. My problem right now is with enforcing that 400ns delay on
> non-DMA path as with PCI write posting on one side, and other fancy bus
> store queues etc... you are really not sure when your outb for the
> command byte will really reach the disk.

As a slight tangent, PCI write posting is quite annoying because on some
hardware one simply cannot perform a read immediately after a write,
without pausing for a hardware-specified amount of time.

...but, at the same time, who knows how long the write posting may take,
so one doesn't know how long the delay really needs to be.

It would be nice if there was an arch-specific flush-posted-writes hook
[wmb_mmio() ?], if that was possible on write-posting CPUs. Currently
right now the canonical solution ("MMIO read") doesn't work in some
situations, and I don't think we have a solution at all for those "some
situations."

Jeff




2003-01-13 21:28:48

by Richard B. Johnson

Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, 13 Jan 2003, Jeff Garzik wrote:

> On Mon, Jan 13, 2003 at 08:03:29PM +0100, Benjamin Herrenschmidt wrote:
> > Exactly. My problem right now is with enforcing that 400ns delay on
> > non-DMA path as with PCI write posting on one side, and other fancy bus
> > store queues etc... you are really not sure when your outb for the
> > command byte will really reach the disk.
>
> As a slight tangent, PCI write posting is quite annoying because on some
> hardware one simply cannot perform a read immediately after a write,
> without pausing for a hardware-specified amount of time.
>
> ...but, at the same time, who knows how long the write posting may take,
> so one doesn't know how long the delay really needs to be.
>
> It would be nice if there was an arch-specific flush-posted-writes hook
> [wmb_mmio() ?], if that was possible on write-posting CPUs. Currently
> right now the canonical solution ("MMIO read") doesn't work in some
> situations, and I don't think we have a solution at all for those "some
> situations."
>
> Jeff
>

There is a well-defined procedure for this. Any "read" anywhere
in the PCI address space, will force all posted writes to complete.
However, the "read" will not be the data one would obtain after
the write completes. Therefore, to guarantee that all posted
writes complete before you read, for instance, status that could
be affected by that write, you execute a dummy read anywhere in
PCI address space, somewhere that will not screw up your
status. In other words, you don't read your device status twice,
once to post the writes and once to get the status because some
hardware will detect the read and fail to give you the correct
status on the second read. Instead, you read some 'harmless' register
that your hardware will decode, but not muck up the status. You
don't want to read a nonexistent register because this will cause
a lonnnnnnng bus-timeout. It will work to flush pending writes, but
it's slow.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-01-13 21:30:27

by Ross Biro

Subject: Re: Linux 2.4.21-pre3-ac4

Benjamin Herrenschmidt wrote:

>On Mon, 2003-01-13 at 20:19, Ross Biro wrote:
>
>
>
>>and read the alt status register to get a delay.
>>
>>This is technically a spec violation, but it's probably safe. I'm going
>>to send an email to a couple of the drive manufacturers and see what
>>they think.
>>
>>
>
>Or get back to my original idea of an IOSYNC() callback in hwif. For
>standard PCI controllers with DMA, it's enough to read the dma_status
>register which is on the same bus path. Others will have to provide
>some implementation or be unsafe on some non-x86. What do you think ?
>

I think that's a very good idea provided that we know that the
dma_status register exists and is on the same bus path. That should be
true for all modern IDE controllers on the x86. But is not a completely
general solution.

One thing that we should keep in mind, is that the IDE controller could
buffer the write as well. I've seen some evidence that Promise chips
might attempt to buffer things like resets until a UDMA burst is
complete. I guess we have to assume that any controller that does such
a thing will also provide a way of knowing when the command has actually
been sent to the drive.

If anyone is curious, I believe I've got the hardware to see how long
after the PCI bus sees an i/o command that it makes it to the drive, but
this would only be trivia that applies to the motherboard we test it on
with the settings currently in place and should not be relied on.

Ross


2003-01-13 21:53:57

by Benjamin Herrenschmidt

Subject: Re: Linux 2.4.21-pre3-ac4

Jeff Garzik wrote:

> ...but, at the same time, who knows how long the write posting may take,
> so one doesn't know how long the delay really needs to be.
>
> It would be nice if there was an arch-specific flush-posted-writes hook
> [wmb_mmio() ?], if that was possible on write-posting CPUs. Currently
> right now the canonical solution ("MMIO read") doesn't work in some
> situations, and I don't think we have a solution at all for those "some
> situations."

In what situations would a read from the same bus path not work ? wmb_mmio
can't work, it's really a matter of bus path, you have to read from the
same bus segment your device is on, preferably the same device.

On Mon, 2003-01-13 at 22:40, Richard B. Johnson wrote:

> There is a well-defined procedure for this. Any "read" anywhere
> in the PCI address space, will force all posted writes to complete.
> However, the "read" will not be the data one would obtain after
> the write completes. Therefore, to guarantee that all posted
> writes complete before you read, for instance, status that could
> be affected by that write, you execute a dummy read anywhere in
> PCI address space, somewhere that will not screw up your
> status. In other words, you don't read your device status twice,
> once to post the writes and once to get the status because some
> hardware will detect the read and fail to give you the correct
> status on the second read. Instead, you read some 'harmless' register
> that your hardware will decode, but not muck up the status. You
> don't want to read a nonexistent register because this will cause
> a lonnnnnnng bus-timeout. It will work to flush pending writes, but
> it's slow.

Hrm, in fact that's definitely not generic ;) A nonexistent register
on some archs will send you right to Machine Check -> Oops ;)

Also, a read from "anywhere on PCI" won't do the trick; you have to
read from the exact same bus path, crossing the same host & P2P bridges.

The problem is more generic than just MMIO on PCI. For example, even
with IO, I don't think we can guarantee anything, since IO is basically
MMIO (we have some error recovery stuff in ppc32 that might make it
+/- sync, but that's not something that I would count on). Also, the CPU
itself may well have a store queue acting as an "all busses" write
posting buffer. On PPC (again ;) we guarantee ordering, so if you do a
read next, you'll flush previous writes to the bus, but without that
read, the write might well stay a few cycles in your CPU store queue.

Ben.


2003-01-13 21:59:11

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, 2003-01-13 at 22:40, Richard B. Johnson wrote:

> There is a well-defined procedure for this. Any "read" anywhere
> in the PCI address space, will force all posted writes to complete.
> However, the "read" will not be the data one would obtain after
> the write completes.

Just to avoid confusion, the above is obviously wrong. The read will
indeed force pending store queues to complete _in order_: the read will
reach the device after the stores are complete, and you'll read the
value you would normally get after the write. At least on PCI ;)

Ben.
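
The ordering rule argued for above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not kernel code and not
any real bridge's behaviour: struct bridge, bridge_write(), bridge_read()
and the FIFO depth are all hypothetical names and values chosen for the
illustration. It assumes a bridge that queues (posts) writes and drains
that queue, oldest first, before it lets a read through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POST_DEPTH 4   /* assumed depth of the bridge's posted-write FIFO */
#define NUM_REGS   4   /* assumed number of device registers */

struct posted_write {
    unsigned reg;
    uint32_t val;
};

/* Toy model: one bridge sitting in front of one device. */
struct bridge {
    struct posted_write fifo[POST_DEPTH];
    size_t count;                 /* writes posted but not yet delivered */
    uint32_t dev_regs[NUM_REGS];  /* what the device has actually seen */
};

/* Deliver every posted write to the device, oldest first. */
static void bridge_flush(struct bridge *b)
{
    for (size_t i = 0; i < b->count; i++)
        b->dev_regs[b->fifo[i].reg] = b->fifo[i].val;
    b->count = 0;
}

/* A write returns to the CPU as soon as it is queued in the bridge. */
static void bridge_write(struct bridge *b, unsigned reg, uint32_t val)
{
    assert(reg < NUM_REGS);
    if (b->count == POST_DEPTH)
        bridge_flush(b);          /* FIFO full: drain before accepting more */
    b->fifo[b->count].reg = reg;
    b->fifo[b->count].val = val;
    b->count++;
}

/* A read may not pass posted writes: the bridge drains its FIFO first. */
static uint32_t bridge_read(struct bridge *b, unsigned reg)
{
    assert(reg < NUM_REGS);
    bridge_flush(b);
    return b->dev_regs[reg];
}
```

In this model a bridge_write() leaves dev_regs untouched until the next
bridge_read() on the same struct bridge, and that read returns the
post-write value. Two separate struct bridge instances stand in for two
bus paths: a read through one does nothing for writes still queued in
the other, which is the "same bus path" point made earlier.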

2003-01-14 01:07:28

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On 13 Jan 2003, Benjamin Herrenschmidt wrote:

> On Mon, 2003-01-13 at 22:40, Richard B. Johnson wrote:
>
> > There is a well-defined procedure for this. Any "read" anywhere
> > in the PCI address space, will force all posted writes to complete.
> > However, the "read" will not be the data one would obtain after
> > the write completes.
>
> Just to avoid confusion, the above is obviously wrong, the read will
> indeed force pending store queues to complete _in order_, the read will
> reach the device after the stores are complete and you'll read the value
> you would get after the write normally. At least on PCI ;)
>
> Ben.
>

It is not, as you say, "obviously wrong". It is, in fact, correct.
If you think you will get, as previously stated, the current status
by reading the status register of a device, while a posted-write
is in-progress, the code is broken. There are warnings all over
PCI device hardware specifications about this.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-01-14 01:28:07

by Ross Biro

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

Ross Biro wrote:

>>>
>>> This is technically a spec violation, but it's probably safe. I'm
>>> going to send an email to a couple of the drive manufacturers and
>>> see what they think.
>>>
>>
I just heard back from one ide controller chip vendor and they think we
should disable PCI write posting. From the tone of the response, I
believe that they may not have thought of this before and it may be a
problem in their non-opensource drivers as well.

Ross


2003-01-14 01:38:11

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4


Ross,

It is wonderful how you cannot say who the vendor is that is violating
the spec. *sigh*

On Mon, 13 Jan 2003, Ross Biro wrote:

> Ross Biro wrote:
>
> >>>
> >>> This is technically a spec violation, but it's probably safe. I'm
> >>> going to send an email to a couple of the drive manufacturers and
> >>> see what they think.
> >>>
> >>
> I just heard back from one ide controller chip vendor and they think we
> should disable PCI write posting. From the tone of the response, I
> believe that they may not have thought of this before and it may be a
> problem in their non-opensource drivers as well.
>
> Ross
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

Andre Hedrick
LAD Storage Consulting Group

2003-01-14 08:57:34

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-14 at 02:20, Richard B. Johnson wrote:

> It is not, as you say; "obviously wrong". It is, in fact correct.
> If you think you will get, as previously stated, the current status
> by reading the status register of a device, while a posted-write
> is in-progress, the code is broken. There are warnings all over
> PCI device hardware specifications about this.

Can you point me to such a warning in the PCI2.1 or 2.2 spec please ?

Ben.


2003-01-14 09:09:38

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-14 at 02:35, Ross Biro wrote:
> Ross Biro wrote:
>
> >>>
> >>> This is technically a spec violation, but it's probably safe. I'm
> >>> going to send an email to a couple of the drive manufacturers and
> >>> see what they think.
> >>>
> >>
> I just heard back from one ide controller chip vendor and they think we
> should disable PCI write posting. From the tone of the response, I
> believe that they may not have thought of this before and it may be a
> problem in their non-opensource drivers as well.

Argh... Well, I don't think that's a solution unfortunately. The
"posting" can be done at various levels down the path to the device and
we don't always know how to (or want to) tweak it to disable any kind of
posting. It can be done on P2P bridges, it can be done in the host
bridge (for which we may have no specs in some cases) and it can be done
at the CPU level (not counting the IDE chipset itself that might want to
play tricks).

So what can we do at this point ? I believe the only sane solution is to
provide that hwif->IOSYNC. Normal PCI-DMA controllers would set it to
dma_base by default, I know what to do for ide-pmac, and others will
have to find some way to get it right for their platform (legacy x86 IO
ports might not be a problem since, as Alan pointed out, those IOs are
fully synchronous). Maybe we should initialize that to some default
provided by asm/ide.h (I don't like that much though).

Ben.
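
A minimal sketch of what such a per-interface hook could look like,
under loud assumptions: struct hwif_sketch and every function name below
are hypothetical and are not the kernel's ide_hwif_t or any API that
existed at the time. The sketch only shows the shape of the idea: the
default reads from dma_base when the controller has one, and does
nothing for fully synchronous port I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the kernel's interface structure. */
struct hwif_sketch {
    volatile uint8_t *dma_base;   /* mapped DMA register block, or NULL */
    void (*iosync)(struct hwif_sketch *hwif);
    unsigned sync_reads;          /* bookkeeping for this sketch only */
};

/* Fully synchronous port I/O: nothing is posted, so nothing to flush. */
static void iosync_none(struct hwif_sketch *hwif)
{
    (void)hwif;
}

/* Dummy read on the controller's own bus path; the value is discarded. */
static void iosync_read_dma_base(struct hwif_sketch *hwif)
{
    assert(hwif->dma_base != NULL);
    (void)*hwif->dma_base;        /* volatile access, so the read happens */
    hwif->sync_reads++;
}

/* Pick a default; a driver such as ide-pmac would install its own hook. */
static void hwif_set_default_iosync(struct hwif_sketch *hwif)
{
    hwif->iosync = hwif->dma_base ? iosync_read_dma_base : iosync_none;
}
```

Callers would invoke hwif->iosync(hwif) after writing a command and
before starting any timing-sensitive wait. Keeping it a function pointer
lets each controller choose a register that is safe to read on its own
hardware.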


2003-01-14 13:54:22

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On 14 Jan 2003, Benjamin Herrenschmidt wrote:

> On Tue, 2003-01-14 at 02:20, Richard B. Johnson wrote:
>
> > It is not, as you say; "obviously wrong". It is, in fact correct.
> > If you think you will get, as previously stated, the current status
> > by reading the status register of a device, while a posted-write
> > is in-progress, the code is broken. There are warnings all over
> > PCI device hardware specifications about this.
>
> Can you point me to such a warning in the PCI2.1 or 2.2 spec please ?
>
> Ben.


Ok. The "bible" that most of us use is PCI SYSTEM ARCHITECTURE,
Fourth Edition, MindShare, Inc. ISBN 0-201-30974-2
Page 94, "Posting Improves Memory Write Performance", General,
second paragraph;

"When a bridge implements a posted-write buffer, a potential problem exists.
Another bus master (or the same one) may initiate a memory read from the
target of the posted write before the data is actually written to the memory
target. If this were permitted, the master performing the read would not
receive the freshest copy of the information. In order to prevent this
from occurring, before permitting a read to occur on the target bus, the
bridge designer must first flush all posted writes to their destination
targets. A device driver can ensure that all memory data has been written
to its device by performing a read from the device. This will force the
flushing of all posted write buffers in bridges that reside between the
master performing the read and the target device before the read is
permitted to complete."

Now, some persons think that a "memory target" is RAM. Not so. It
is "memory-mapped I/O" that they are talking about. No bridge is
allowed to post writes to devices in I/O space. In other words any
'out' and 'in' instructions are assured, in their words, "real-time
communication". That's in the last sentence of the quoted reference.

There is at least one other such warning in this book, but I
can't locate it at the moment. However, in the AMD Elan SC520
Micro-controller User's manual (the SC520 embeds a PCI bridge),
Page 9-21, pp 9.5.4.8, fourth paragraph; "The PCI bus specification
recommends that the CPU perform a read to the interrupting PCI bus
device, to force all system posted write buffers to flush (including
PCI bus bridges)".

Most of my other references are at home because I finished the
successful design of some fairly complex stuff using PCI devices.
I even wrote the BIOS to initialize everything from scratch.
I brought most of my reference material, necessary to design
the product, home.

When writing a device driver, it is not always necessary to
perform an extra read of the device. It all depends upon the
device. If you are writing to I/O registers, you never have
any problem. But suppose your device is going to produce an
interrupt as soon as it has written a data-packet from a SNIC
(Serial Network Interface Controller) hardware to a memory-mapped
buffer. When your ISR gets control, it MUST read from the
device first to make sure that the posted writes complete
before it actually uses the data. This usually happens from
normal programming practice because you usually copy (read)
the data. But, if you decided to write the data back to the
device (you wrote a new hardware destination address) and
transmitted the modified packet, all bets are off. The same
problem occurs with hardware data that is used as a semaphore.

A similar problem occurs when an I/O bit is used to show that
a memory-mapped I/O operation has occurred. The I/O mapped bit
may show that data are available long before they are. Many
Network device drivers expose this potential problem by looping
in the ISR, so-called interrupt mitigation. This is one of the
reasons for some PCI SNICs locking up and having to be restarted
by a timer.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-01-14 14:25:59

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-14 at 15:04, Richard B. Johnson wrote:

> "When a bridge implements a posted-write buffer, a potential problem exists.
> Another bus master (or the same one) may initiate a memory read from the
> target of the posted write before the data is actually written to the memory
> target. If this were permitted, the master performing the read would not
> receive the freshest copy of the information. In order to prevent this
> from occurring, before permitting a read to occur on the target bus, the
> bridge designer must first flush all posted writes to their destination
> targets. A device driver can ensure that all memory data has been written
> to its device by performing a read from the device. This will force the
> flushing of all posted write buffers in bridges that reside between the
> master performing the read and the target device before the read is
> permitted to complete."

That is ok. What you read is also in the PCI spec, that is, the PCI
bridge doing write buffering has to flush its write buffer before
letting any read go. This is why we can use a read from the same bus
path to flush the write buffer, but nowhere in the above text does it
say that the actual data read would be incorrect. It would have been
_if_ the bridge didn't respect the described behaviour, but this
behaviour is quite an important part of the PCI spec and I have yet to
encounter a bridge not doing it properly.

> Now, some persons think that a "memory target" is RAM. Not so. It
> is "memory-mapped I/O" that they are talking about. No bridge is
> allowed to post writes to devices in I/O space. In other words any
> 'out' and 'in' instructions are assured, in their words; "real-time
> communication". That's in the last sentence of the quoted reference.

Good. That means that in theory, as Alan pointed out, IO is safe from
PCI posting issues. Unfortunately, regarding the IDE code, we still have
to deal with MMIO controllers (ide-pmac like or just new MMIO PCI
controllers that tend to appear on the PC market). We also have to deal
with platforms that don't do IO at the CPU level (all non-x86 afaik) and
have a write buffer on the bus path before reaching the PCI host. So the
problem remains for a bunch of controllers.

> There is at least one other such warning in this book, but I
> can't locate it at the moment. However, in the AMD Elan SC520
> Micro-controller User's manual (the SC520 embeds a PCI bridge),
> Page 9-21, pp 9.5.4.8, fourth paragraph; "The PCI bus specification
> recommends that the CPU perform a read to the interrupting PCI bus
> device, to force all system posted write buffers to flush (including
> PCI bus bridges)".

Again, that doesn't say that the data read afterwards would be
incorrect. Believe me, it is correct ;)

> Most of my other references are at home because I finished the
> successful design of some fairly complex stuff using PCI devices.
> I even wrote the BIOS to initialize everything from scratch.
> I brought most of my reference material, necessary to design
> the product, home.
>
> When writing a device driver, it is not always necessary to
> perform an extra read of the device. It all depends upon the
> device. If you are writing to I/O registers, you never have
> any problem. But suppose your device is going to produce an
> interrupt as soon as it has written a data-packet from a SNIC
> (Serial Network Interface Controller) hardware to a memory-mapped
> buffer. When your ISR gets control, it MUST read from the
> device first to make sure that the posted writes complete
> before it actually uses the data. This usually happens from
> normal programming practice because you usually copy (read)
> the data. But, if you decided to write the data back to the
> device (you wrote a new hardware destination address) and
> transmitted the modified packet, all bets are off. The same
> problem occurs with hardware data that is used as a semaphore.

Yah, that's a typical issue with posted writes on PCI, though not the
only one. Also don't forget that interrupts are basically asynchronous.
That is, disabling IRQ emission on your device, even with the proper read
to flush posted writes, doesn't guarantee that you won't get an
interrupt right after. The interrupt may well have been issued prior
to your write and been buffered for a few ns in its path up to your CPU
(interrupt controller, CPU itself has asynchronous IRQ handling, etc...)

> A similar problem occurs when an I/O bit is used to show that
> a memory-mapped I/O operation has occurred. The I/O mapped bit
> may show that data are available long before it is. Many
> Network device drivers expose this potential problem by looping
> in the ISR, so-called interrupt mitigation. This is one of the
> reasons for some PCI SNICs locking up and having to be restarted
> by a timer.

2003-01-14 17:28:15

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Mon, 2003-01-13 at 20:46, Alan Cox wrote:
> > host shall wait 400 ns before reading the Status register. When entering
> > this state from the HPIOI2 state, the host shall wait one PIO transfer
> > cycle time before reading the Status register. The wait may be
> > accomplished by reading the Alternate Status register and ignoring the
> > result.
>
> Fatal on PIIX PIO

Ok, but PIIX runs on intel platforms with real IOs, so there is no need
to perform a read... If we go the hwif->IOSYNC() way, we might well set
it up to no-op on x86 PIO iops by default and read of alt-status on
other archs if it's safe enough on other controllers/drives...

Ben.


2003-01-14 17:42:54

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-14 at 18:49, Ross Biro wrote:
> Benjamin Herrenschmidt wrote:
>
> >Ok, but PIIX runs on intel platforms with real IOs, so there is no need
> >to perform a read... If we go the hwif->IOSYNC() way, we might well set
> >it up to no-op on x86 PIO iops by default and read of alt-status on
> >other archs if it's safe enough on other controllers/drives...
> >
> I believe that this will corrupt any inprogress UDMA transfer on the
> promise 20265 chip and probably others. It would be better to read the
> dma registers for the Promise controllers.

You mean on the chip's other channel ? As we discussed earlier, we don't
need to enforce this delay at all for DMA as we wait for the DMA
controller to complete in the interrupt anyway. Or did I miss a race ?

Ben.


2003-01-14 17:41:40

by Ross Biro

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

Benjamin Herrenschmidt wrote:

>Ok, but PIIX runs on intel platforms with real IOs, so there is no need
>to perform a read... If we go the hwif->IOSYNC() way, we might well set
>it up to no-op on x86 PIO iops by default and read of alt-status on
>other archs if it's safe enough on other controllers/drives...
>
I believe that this will corrupt any in-progress UDMA transfer on the
Promise 20265 chip and probably others. It would be better to read the
DMA registers for the Promise controllers.

Ross


2003-01-14 21:11:36

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4



Ben, the fact that there does not appear to be a race in the code does
not tell us anything about the hardware.

On 14 Jan 2003, Benjamin Herrenschmidt wrote:

> On Tue, 2003-01-14 at 18:49, Ross Biro wrote:
> > Benjamin Herrenschmidt wrote:
> >
> > >Ok, but PIIX runs on intel platforms with real IOs, so there is no need
> > >to perform a read... If we go the hwif->IOSYNC() way, we might well set
> > >it up to no-op on x86 PIO iops by default and read of alt-status on
> > >other archs if it's safe enough on other controllers/drives...
> > >
> > I believe that this will corrupt any inprogress UDMA transfer on the
> > promise 20265 chip and probably others. It would be better to read the
> > dma registers for the Promise controllers.
>
> You mean on the chip's other channel ? As we discussed earlier, we don't
> need to enforce this delay at all for DMA as we wait for the DMA
> controller to complete in the interrupt anyway. Or did I miss a race ?
>
> Ben.
>

Andre Hedrick
LAD Storage Consulting Group

2003-01-15 00:14:31

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-14 at 22:17, Andre Hedrick wrote:
> Ben, just because there does not appear to be a race in the code, does not
> provide any information about the hardware.

Of course, which is why we rely on people like you to tell us about what
kind of strange HW behaviour we might have to deal with in this specific
case ;)

Ben.


2003-01-15 05:02:00

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4


Hi Ben,

Well I could do more if I was not having to resolve a few issues, and
finding any means possible to squelch the racket and noise in the OT
threads. This includes digging out the personality and going to town.
That old nasty thing does not fit anymore and it makes me more pointed
than ever to wear it. So maybe the noise level is now settled and the
signal of progress will move again. Oh and nobody asked me to put it back
on, but I did not ask for thread pollution.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On 15 Jan 2003, Benjamin Herrenschmidt wrote:

> On Tue, 2003-01-14 at 22:17, Andre Hedrick wrote:
> > Ben, just because there does not appear to be a race in the code, does not
> > provide any information about the hardware.
>
> Of course, which is why we rely on people like you to tell us about what
> kind of strange HW behaviour we might have to deal with in this specific
> case ;)
>
> Ben.

2003-01-21 07:08:36

by Alan

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-14 at 01:35, Ross Biro wrote:
> Ross Biro wrote:
> I just heard back from one ide controller chip vendor and they think we
> should disable PCI write posting. From the tone of the response, I
> believe that they may not have thought of this before and it may be a
> problem in their non-opensource drivers as well.

Thankfully it's a SATA specific issue. However on plenty of chipsets we
cannot disable write posting even if we wanted to (which we don't!)

More evil plotting required (eg doing the read from the ROM resource 8))

2003-01-21 11:37:32

by jurriaan

[permalink] [raw]
Subject: small patch for Via Pro 266T agp-support

Here is a small patch to allow my motherboard to use the AGP port. It
didn't work with agp_try_unsupported=1 on the command-line.

It's a dual-tualatin motherboard, so it's kind of exotic.

00:00.0 Host bridge: VIA Technologies, Inc. VT8653 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]
00:08.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:0a.0 SCSI storage controller: LSI Logic / Symbios Logic 53c860 (rev 13)
00:0b.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
00:0c.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:0c.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:0e.0 Unknown mass storage controller: Promise Technology, Inc. 20265 (rev 02)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233 PCI to ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586/B/686A/B PIPC Bus Master IDE (rev 06)
00:11.2 USB Controller: VIA Technologies, Inc. USB (rev 1b)
00:11.3 USB Controller: VIA Technologies, Inc. USB (rev 1b)
00:11.4 USB Controller: VIA Technologies, Inc. USB (rev 1b)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 03)

Kind regards,
Jurriaan

diff -urN -X dontdiff linux-2.4.21p3ac4/drivers/char/agp/agpgart_be.c linux-2.4.21p3ac4-pro266t/drivers/char/agp/agpgart_be.c
--- linux-2.4.21p3ac4/drivers/char/agp/agpgart_be.c 2003-01-21 12:04:59.000000000 +0100
+++ linux-2.4.21p3ac4-pro266t/drivers/char/agp/agpgart_be.c 2003-01-15 20:07:53.000000000 +0100
@@ -4700,6 +4700,12 @@
"Via",
"Apollo Pro KT400",
via_generic_setup },
+ { PCI_DEVICE_ID_VIA_8653_0,
+ PCI_VENDOR_ID_VIA,
+ VIA_APOLLO_PRO266T,
+ "Via",
+ "Apollo Pro 266T",
+ via_generic_setup },
{ 0,
PCI_VENDOR_ID_VIA,
VIA_GENERIC,
diff -urN -X dontdiff linux-2.4.21p3ac4/drivers/char/drm/drm_agpsupport.h linux-2.4.21p3ac4-pro266t/drivers/char/drm/drm_agpsupport.h
--- linux-2.4.21p3ac4/drivers/char/drm/drm_agpsupport.h 2003-01-21 12:04:59.000000000 +0100
+++ linux-2.4.21p3ac4-pro266t/drivers/char/drm/drm_agpsupport.h 2003-01-21 12:11:39.000000000 +0100
@@ -281,6 +281,8 @@
break;
case VIA_APOLLO_KT400: head->chipset = "VIA Apollo KT400";
break;
+ case VIA_APOLLO_PRO266T: head->chipset = "VIA Apollo Pro 266T";
+ break;
case VIA_APOLLO_PRO: head->chipset = "VIA Apollo Pro";
break;

diff -urN -X dontdiff linux-2.4.21p3ac4/drivers/char/drm-4.0/agpsupport.c linux-2.4.21p3ac4-pro266t/drivers/char/drm-4.0/agpsupport.c
--- linux-2.4.21p3ac4/drivers/char/drm-4.0/agpsupport.c 2003-01-21 12:05:00.000000000 +0100
+++ linux-2.4.21p3ac4-pro266t/drivers/char/drm-4.0/agpsupport.c 2003-01-15 20:07:53.000000000 +0100
@@ -277,6 +277,8 @@
break;
case VIA_APOLLO_KT400: head->chipset = "VIA Apollo KT400";
break;
+ case VIA_APOLLO_PRO266T: head->chipset = "VIA Apollo PRO266T";
+ break;
#endif

case VIA_APOLLO_PRO: head->chipset = "VIA Apollo Pro";
diff -urN -X dontdiff linux-2.4.21p3ac4/include/linux/agp_backend.h linux-2.4.21p3ac4-pro266t/include/linux/agp_backend.h
--- linux-2.4.21p3ac4/include/linux/agp_backend.h 2003-01-21 12:05:02.000000000 +0100
+++ linux-2.4.21p3ac4-pro266t/include/linux/agp_backend.h 2003-01-15 20:07:53.000000000 +0100
@@ -61,6 +61,7 @@
VIA_APOLLO_KX133,
VIA_APOLLO_KT133,
VIA_APOLLO_KT400,
+ VIA_APOLLO_PRO266T,
SIS_GENERIC,
AMD_GENERIC,
AMD_IRONGATE,
diff -urN -X dontdiff linux-2.4.21p3ac4/include/linux/pci_ids.h linux-2.4.21p3ac4-pro266t/include/linux/pci_ids.h
--- linux-2.4.21p3ac4/include/linux/pci_ids.h 2003-01-21 12:05:02.000000000 +0100
+++ linux-2.4.21p3ac4-pro266t/include/linux/pci_ids.h 2003-01-15 20:07:53.000000000 +0100
@@ -1011,6 +1011,7 @@
#define PCI_DEVICE_ID_VIA_82C686_6 0x3068
#define PCI_DEVICE_ID_VIA_8233_0 0x3074
#define PCI_DEVICE_ID_VIA_8633_0 0x3091
+#define PCI_DEVICE_ID_VIA_8653_0 0x3101
#define PCI_DEVICE_ID_VIA_8367_0 0x3099
#define PCI_DEVICE_ID_VIA_8622 0x3102
#define PCI_DEVICE_ID_VIA_8233C_0 0x3109
--
"And remember: Evil will always prevail, because Good is dumb."
Spaceballs
GNU/Linux 2.4.21-pre3-ac4 SMP/ReiserFS 2x2785 bogomips load av: 0.41 0.50 0.37

2003-01-21 17:12:16

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Linux 2.4.21-pre3-ac4

On Tue, 2003-01-21 at 00:31, Alan Cox wrote:
> On Tue, 2003-01-14 at 17:38, Benjamin Herrenschmidt wrote:
> > > Fatal on PIIX PIO
> >
> > Ok, but PIIX runs on intel platforms with real IOs, so there is no need
> > to perform a read... If we go the hwif->IOSYNC() way, we might well set
> > it up to no-op on x86 PIO iops by default and read of alt-status on
> > other archs if it's safe enough on other controllers/drives...
>
> PIIX is also found on MIPS, IA-64 and some other platforms. I also fear
> that bug may be far from unique

Then we may want to have the actual implementation of the default iops
IOSYNC() for PIO & MMIO be arch specific. I would suggest nothing for
PIO on x86 and at least an mb() on others. For MMIO, then, we have the
choice of reading the alt status, or reading the dma_base if any.

Of course these are defaults, I expect controllers like ide-pmac to
implement their own IOSYNC.

That means that we must _both_ call IOSYNC and do the 400ns delay. I
don't trust reading alt status to be enough, especially on MMIO based
controllers; also, on ide-pmac, I plan to go read some other register
instead.

What do you think ?

Ben.
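
The defaults and the ordering proposed above can be written down as a
small decision table. This is a sketch under stated assumptions, not an
implementation: the enum and function names are hypothetical, and
preferring dma_base over alt status for MMIO when both are available is
a choice made only for this illustration (the message leaves that open).
The point it shows is the order: flush first, then start the 400ns wait,
so the delay is counted from when the command has actually left for the
device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What a default sync hook would do; names are illustrative only. */
enum iosync_action {
    IOSYNC_NOTHING,         /* synchronous port I/O, nothing to flush */
    IOSYNC_BARRIER,         /* CPU memory barrier, e.g. mb() */
    IOSYNC_READ_DMA_BASE,   /* dummy read of the controller's DMA registers */
    IOSYNC_READ_ALT_STATUS  /* dummy read of the Alternate Status register */
};

/* Arch- and access-type-specific default, following the proposal above. */
static enum iosync_action default_iosync(bool is_x86, bool is_mmio,
                                         bool has_dma_base)
{
    if (!is_mmio)
        return is_x86 ? IOSYNC_NOTHING : IOSYNC_BARRIER;
    /* Assumption of this sketch: prefer dma_base when one exists. */
    return has_dma_base ? IOSYNC_READ_DMA_BASE : IOSYNC_READ_ALT_STATUS;
}

enum step {
    STEP_WRITE_COMMAND,
    STEP_IOSYNC,
    STEP_DELAY_400NS,
    STEP_READ_STATUS
};

/* Fills 'out' with the order of operations; returns the number of steps. */
static int command_sequence(enum iosync_action sync, enum step out[4])
{
    int n = 0;

    assert(out != NULL);
    out[n++] = STEP_WRITE_COMMAND;
    if (sync != IOSYNC_NOTHING)
        out[n++] = STEP_IOSYNC;    /* push the write out before timing it */
    out[n++] = STEP_DELAY_400NS;   /* the delay is always performed */
    out[n++] = STEP_READ_STATUS;
    return n;
}
```

A controller driver with its own IOSYNC would bypass default_iosync()
entirely; only the sequence (write, sync, delay, read status) stays the
same.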