After activating the MSI support by adding sata_sil24.msi=1 to the
kernel command line, the first write to a drive attached to the SiI
3132 controller results in the following errors:
[ 138.950074] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen
[ 138.961023] ata2.00: failed command: WRITE FPDMA QUEUED
[ 138.970034] ata2.00: cmd 61/00:00:a5:95:4a/04:00:01:00:00/40 tag 0
ncq 524288 out
[ 138.970037] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 138.992467] ata2.00: status: { DRDY }
[ 138.999864] ata2.00: failed command: WRITE FPDMA QUEUED
[ 139.008830] ata2.00: cmd 61/00:08:a5:91:4a/04:00:01:00:00/40 tag 1
ncq 524288 out
[ 139.008833] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 139.031370] ata2.00: status: { DRDY }
[ 139.038906] ata2.00: failed command: WRITE FPDMA QUEUED
[ 139.048055] ata2.00: cmd 61/00:10:a5:8d:4a/04:00:01:00:00/40 tag 2
ncq 524288 out
[ 139.048058] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 139.070828] ata2.00: status: { DRDY }
[ 139.078383] ata2.00: failed command: WRITE FPDMA QUEUED
[ 139.087506] ata2.00: cmd 61/00:18:a5:99:4a/04:00:01:00:00/40 tag 3
ncq 524288 out
[ 139.087509] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 139.110281] ata2.00: status: { DRDY }
[ 139.117877] ata2: hard resetting link
[ 141.310057] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[ 146.310037] ata2.00: qc timeout (cmd 0xec)
[ 146.318032] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[ 146.328027] ata2.00: revalidation failed (errno=-5)
[ 146.336815] ata2: hard resetting link
[ 148.530055] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[ 158.530035] ata2.00: qc timeout (cmd 0xec)
[ 158.538066] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[ 158.548122] ata2.00: revalidation failed (errno=-5)
[ 158.556955] ata2: limiting SATA link speed to 1.5 Gbps
[ 158.566052] ata2: hard resetting link
[ 160.760071] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
[ 190.760046] ata2.00: qc timeout (cmd 0xec)
[ 190.768137] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[ 190.778219] ata2.00: revalidation failed (errno=-5)
[ 190.787085] ata2.00: disabled
[ 190.794002] ata2.00: device reported invalid CHS sector 0
[ 190.803368] ata2.00: device reported invalid CHS sector 0
[ 190.812704] ata2.00: device reported invalid CHS sector 0
[ 190.821965] ata2.00: device reported invalid CHS sector 0
[ 190.831231] ata2: hard resetting link
[ 193.030050] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
[ 193.030073] ata2: EH complete
[ 193.030112] sd 1:0:0:0: [sdb] Unhandled error code
[ 193.030120] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET
driverbyte=DRIVER_OK
[ 193.030127] sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 01 4a 99 a5 00 04 00 00
[ 193.030141] end_request: I/O error, dev sdb, sector 21666213
[ 193.039854] sd 1:0:0:0: [sdb] Unhandled error code
[ 193.039857] sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET
driverbyte=DRIVER_OK
[ 193.039862] sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 01 4a 8d a5 00 04 00 00
[ 193.039874] end_request: I/O error, dev sdb, sector 21663141
[ 193.040083] sd 1:0:0:0: [sdb] Unhandled error code
This repeats many times, and then the XFS filesystem is shut down.
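As a cross-check of the log above, the failing Write(10) CDBs can be decoded by hand. The sketch below is only an illustration: the helper name `decode_write10` and the struct are made up for this example, and the byte layout is the standard SCSI WRITE(10) one (opcode 0x2a, big-endian LBA in bytes 2..5, big-endian transfer length in bytes 7..8). It assumes 512-byte logical blocks when relating the block count to the "ncq 524288 out" figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a SCSI WRITE(10) command descriptor block. */
struct write10 {
	uint32_t lba;    /* bytes 2..5 of the CDB, big-endian */
	uint16_t blocks; /* bytes 7..8 of the CDB, big-endian */
};

/*
 * Hypothetical helper for this example only.
 * Returns 0 on success, -1 if the opcode is not WRITE(10) (0x2a).
 */
static int decode_write10(const uint8_t cdb[10], struct write10 *out)
{
	assert(cdb != NULL && out != NULL);

	if (cdb[0] != 0x2a)
		return -1;

	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	out->blocks = (uint16_t)(((unsigned)cdb[7] << 8) | cdb[8]);
	return 0;
}
```

Fed the first CDB from the log (2a 00 01 4a 99 a5 00 04 00 00), this yields LBA 0x014a99a5, which is the sector 21666213 reported by end_request, and a length of 0x0400 blocks, which at 512 bytes per block is the 524288 bytes shown for each NCQ write.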
More information about my system is in my first report:
http://lkml.org/lkml/2009/12/19/82
lspci -vv output for both the normal, working case and the broken MSI
case is in my second attempt to report this:
http://lkml.org/lkml/2009/12/25/12
Please ask if you need more information.
Torsten
On 01/06/2010 03:37 AM, Torsten Kaiser wrote:
> After activating the MSI support by adding sata_sil24.msi=1 to the
> kernel command line, the first write to a drive attached to the SiI
> 3132 controller results in the following errors:
>
> [ 138.950074] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen
> [ 138.961023] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 138.970034] ata2.00: cmd 61/00:00:a5:95:4a/04:00:01:00:00/40 tag 0
> ncq 524288 out
> [ 138.970037] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
Looking at the code in sata_sil24 and the SiI3132 datasheet, there's a
control bit which doesn't seem to be handled in the driver, global
control register bit 30: "MSI Acknowledge (W). Writing a one to this bit
acknowledges a Message Signaled Interrupt and permits generation of
another MSI. This bit is cleared immediately after the acknowledgement
is recognized by the control logic, hence the bit will always be read as
a zero. If all interrupt conditions are removed subsequent to an MSI, it
is not necessary to assert this Acknowledge; another MSI will be
generated when an interrupt condition occurs."
The way the interrupt handler for this driver works is that we check the
global IRQ status register, and then based on what ports indicated an
interrupt in that register, we check the individual port command
completion registers. The issue would seem to be that if a port got an
interrupt condition in between these two operations, we'd miss it, and
the MSI logic described above then wouldn't generate any more interrupts
since we didn't remove all interrupt conditions.
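The suspected race can be illustrated with a small userspace toy model. This is a sketch under stated assumptions, not driver or hardware code: all names (`msi_model`, `raise_irq`, `clear_irq`, `ack_msi`, `run_handler`) are invented for the example, the "armed" rule is only a reading of the datasheet text quoted above, and the behaviour of an acknowledge while conditions are still pending (send another MSI immediately) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the MSI gating described in the datasheet quote. */
struct msi_model {
	unsigned pending;   /* bitmask of ports with an interrupt condition */
	bool armed;         /* true when the device may send another MSI */
	unsigned msi_count; /* number of MSIs sent so far */
};

/* A port raises an interrupt condition; an MSI is sent only if armed. */
static void raise_irq(struct msi_model *m, unsigned port)
{
	assert(port < 32);
	m->pending |= 1u << port;
	if (m->armed) {
		m->armed = false;
		m->msi_count++;
	}
}

/* Software removes conditions; the device re-arms once none remain. */
static void clear_irq(struct msi_model *m, unsigned mask)
{
	m->pending &= ~mask;
	if (m->pending == 0)
		m->armed = true;
}

/* MSI acknowledge. Assumption: a still-pending condition fires at once. */
static void ack_msi(struct msi_model *m)
{
	if (m->pending)
		m->msi_count++;
	else
		m->armed = true;
}

/*
 * Handler pattern from the text: snapshot the global status, then service
 * only the ports seen in the snapshot. late_port >= 0 injects a condition
 * that arrives between the two steps.
 */
static void run_handler(struct msi_model *m, int late_port, bool write_ack)
{
	unsigned snapshot = m->pending;

	if (late_port >= 0)
		raise_irq(m, (unsigned)late_port);
	clear_irq(m, snapshot);
	if (write_ack)
		ack_msi(m);
}
```

In this model, if port 0 raises an interrupt and port 1 raises one while the handler is between its two reads, the handler without the acknowledge leaves port 1 pending with no further MSI, while the handler that writes the acknowledge gets a second MSI. Whether the real controller behaves this way is exactly what the patch is meant to test.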
Can you try this patch and see if it helps? (Might be whitespace damaged
but hopefully you can apply manually in that case.)
diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c
index 1370df6..d3d8dec 100644
--- a/drivers/ata/sata_sil24.c
+++ b/drivers/ata/sata_sil24.c
@@ -102,6 +102,7 @@ enum {
 	HOST_CTRL_STOP		= (1 << 18), /* latched PCI STOP */
 	HOST_CTRL_DEVSEL	= (1 << 19), /* latched PCI DEVSEL */
 	HOST_CTRL_REQ64		= (1 << 20), /* latched PCI REQ64 */
+	HOST_CTRL_MSIACK	= (1 << 30), /* MSI acknowledge */
 	HOST_CTRL_GLOBAL_RST	= (1 << 31), /* global reset */
 
 	/*
@@ -1168,6 +1169,7 @@ static irqreturn_t sil24_interrupt(int irq, void *dev_instance)
 				       ": interrupt from disabled port %d\n", i);
 		}
 
+	writel(IRQ_STAT_4PORTS | HOST_CTRL_MSIACK, host_base + HOST_CTRL);
 	spin_unlock(&host->lock);
  out:
 	return IRQ_RETVAL(handled);
On Thu, Jan 7, 2010 at 1:59 AM, Robert Hancock <[email protected]> wrote:
> On 01/06/2010 03:37 AM, Torsten Kaiser wrote:
>>
>> After activating the MSI support by adding sata_sil24.msi=1 to the
>> kernel command line, the first write to a drive attached to the SiI
>> 3132 controller results in the following errors:
>>
>> [  138.950074] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6
>> frozen
>> [  138.961023] ata2.00: failed command: WRITE FPDMA QUEUED
>> [  138.970034] ata2.00: cmd 61/00:00:a5:95:4a/04:00:01:00:00/40 tag 0
>> ncq 524288 out
>> [  138.970037]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
>> 0x4 (timeout)
>
> Looking at the code in sata_sil24 and the SiI3132 datasheet, there's a
> control bit which doesn't seem to be handled in the driver, global control
> register bit 30: "MSI Acknowledge (W). Writing a one to this bit
> acknowledges a Message Signaled Interrupt and permits generation of another
> MSI. This bit is cleared immediately after the acknowledgement is recognized
> by the control logic, hence the bit will always be read as a zero. If all
> interrupt conditions are removed subsequent to an MSI, it is not necessary
> to assert this Acknowledge; another MSI will be generated when an interrupt
> condition occurs."
>
> The way the interrupt handler for this driver works is that we check the
> global IRQ status register, and then based on what ports indicated an
> interrupt in that register, we check the individual port command completion
> registers. The issue would seem to be that if a port got an interrupt
> condition in between these two operations, we'd miss it, and the MSI logic
> described above then wouldn't generate any more interrupts since we didn't
> remove all interrupt conditions.
>
> Can you try this patch and see if it helps? (Might be whitespace damaged but
> hopefully you can apply manually in that case.)
Tried it, but writing still fails:
[ 53.467694] XFS mounting filesystem sdb2
[ 141.010058] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen
[ 141.020361] ata2.00: failed command: WRITE FPDMA QUEUED
[ 141.028718] ata2.00: cmd 61/00:00:5d:cd:48/04:00:01:00:00/40 tag 0
ncq 524288 out
[ 141.028721] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 141.049895] ata2.00: status: { DRDY }
[ 141.056715] ata2.00: failed command: WRITE FPDMA QUEUED
[ 141.065133] ata2.00: cmd 61/00:08:5d:c5:48/04:00:01:00:00/40 tag 1
ncq 524288 out
[ 141.065135] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 141.086492] ata2.00: status: { DRDY }
[ 141.093313] ata2.00: failed command: WRITE FPDMA QUEUED
[ 141.101679] ata2.00: cmd 61/00:10:5d:c9:48/04:00:01:00:00/40 tag 2
ncq 524288 out
[ 141.101682] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 141.122813] ata2.00: status: { DRDY }
[ 141.129522] ata2.00: failed command: WRITE FPDMA QUEUED
[ 141.137769] ata2.00: cmd 61/00:18:5d:d1:48/04:00:01:00:00/40 tag 3
ncq 524288 out
[ 141.137771] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[ 141.158660] ata2.00: status: { DRDY }
[ 141.165313] ata2: hard resetting link
[ 143.370049] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[ 148.370031] ata2.00: qc timeout (cmd 0xec)
[ 148.377198] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 148.386450] ata2.00: revalidation failed (errno=-5)
[ 148.394504] ata2: hard resetting link
[ 150.600064] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[ 160.600038] ata2.00: qc timeout (cmd 0xec)
[ 160.607451] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 160.616913] ata2.00: revalidation failed (errno=-5)
[ 160.625181] ata2: limiting SATA link speed to 1.5 Gbps
[ 160.633746] ata2: hard resetting link
[ 162.830049] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
...
Please note that in my first report I also mentioned that I get the
same behavior with sata_nv. If I use sata_nv.msi=1, writing to the
drives attached to the MCP55 fails. The sata_nv problem is not new;
it never worked for me, but I only retried it with 2.6.33-rc1.
Other drivers can use MSI successfully (tg3, hda-intel, radeon).
> diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c
> index 1370df6..d3d8dec 100644
> --- a/drivers/ata/sata_sil24.c
> +++ b/drivers/ata/sata_sil24.c
> @@ -102,6 +102,7 @@ enum {
>        HOST_CTRL_STOP          = (1 << 18), /* latched PCI STOP */
>        HOST_CTRL_DEVSEL        = (1 << 19), /* latched PCI DEVSEL */
>        HOST_CTRL_REQ64         = (1 << 20), /* latched PCI REQ64 */
> +      HOST_CTRL_MSIACK        = (1 << 30), /* MSI acknowledge */
>        HOST_CTRL_GLOBAL_RST    = (1 << 31), /* global reset */
>
>        /*
> @@ -1168,6 +1169,7 @@ static irqreturn_t sil24_interrupt(int irq, void
> *dev_instance)
>                                       ": interrupt from disabled port %d\n",
> i);
>                }
>
> +      writel(IRQ_STAT_4PORTS | HOST_CTRL_MSIACK, host_base + HOST_CTRL);
>        spin_unlock(&host->lock);
>  out:
>        return IRQ_RETVAL(handled);
>
On 01/06/2010 08:27 PM, Torsten Kaiser wrote:
> On Thu, Jan 7, 2010 at 1:59 AM, Robert Hancock<[email protected]> wrote:
>> On 01/06/2010 03:37 AM, Torsten Kaiser wrote:
>>>
>>> After activating the MSI support by adding sata_sil24.msi=1 to the
>>> kernel command line, the first write to a drive attached to the SiI
>>> 3132 controller results in the following errors:
>>>
>>> [ 138.950074] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6
>>> frozen
>>> [ 138.961023] ata2.00: failed command: WRITE FPDMA QUEUED
>>> [ 138.970034] ata2.00: cmd 61/00:00:a5:95:4a/04:00:01:00:00/40 tag 0
>>> ncq 524288 out
>>> [ 138.970037] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
>>> 0x4 (timeout)
>>
>> Looking at the code in sata_sil24 and the SiI3132 datasheet, there's a
>> control bit which doesn't seem to be handled in the driver, global control
>> register bit 30: "MSI Acknowledge (W). Writing a one to this bit
>> acknowledges a Message Signaled Interrupt and permits generation of another
>> MSI. This bit is cleared immediately after the acknowledgement is recognized
>> by the control logic, hence the bit will always be read as a zero. If all
>> interrupt conditions are removed subsequent to an MSI, it is not necessary
>> to assert this Acknowledge; another MSI will be generated when an interrupt
>> condition occurs."
>>
>> The way the interrupt handler for this driver works is that we check the
>> global IRQ status register, and then based on what ports indicated an
>> interrupt in that register, we check the individual port command completion
>> registers. The issue would seem to be that if a port got an interrupt
>> condition in between these two operations, we'd miss it, and the MSI logic
>> described above then wouldn't generate any more interrupts since we didn't
>> remove all interrupt conditions.
>>
>> Can you try this patch and see if it helps? (Might be whitespace damaged but
>> hopefully you can apply manually in that case.)
>
> Tried it, but writing still fails:
> [ 53.467694] XFS mounting filesystem sdb2
> [ 141.010058] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen
> [ 141.020361] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 141.028718] ata2.00: cmd 61/00:00:5d:cd:48/04:00:01:00:00/40 tag 0
> ncq 524288 out
> [ 141.028721] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
> [ 141.049895] ata2.00: status: { DRDY }
> [ 141.056715] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 141.065133] ata2.00: cmd 61/00:08:5d:c5:48/04:00:01:00:00/40 tag 1
> ncq 524288 out
> [ 141.065135] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
> [ 141.086492] ata2.00: status: { DRDY }
> [ 141.093313] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 141.101679] ata2.00: cmd 61/00:10:5d:c9:48/04:00:01:00:00/40 tag 2
> ncq 524288 out
> [ 141.101682] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
> [ 141.122813] ata2.00: status: { DRDY }
> [ 141.129522] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 141.137769] ata2.00: cmd 61/00:18:5d:d1:48/04:00:01:00:00/40 tag 3
> ncq 524288 out
> [ 141.137771] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
> [ 141.158660] ata2.00: status: { DRDY }
> [ 141.165313] ata2: hard resetting link
> [ 143.370049] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
> [ 148.370031] ata2.00: qc timeout (cmd 0xec)
> [ 148.377198] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 148.386450] ata2.00: revalidation failed (errno=-5)
> [ 148.394504] ata2: hard resetting link
> [ 150.600064] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
> [ 160.600038] ata2.00: qc timeout (cmd 0xec)
> [ 160.607451] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 160.616913] ata2.00: revalidation failed (errno=-5)
> [ 160.625181] ata2: limiting SATA link speed to 1.5 Gbps
> [ 160.633746] ata2: hard resetting link
> [ 162.830049] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
> ...
>
> Please note, that in my first report I also mentioned that I get the
> same behavior with sata_nv. If I use sata_nv.msi=1 writing to the
> drives attached to the MCP55 fail. The sata_nv problem is not new,
> that never worked for me, but I only retried it with 2.6.33-rc1.
> Other drivers can use MSI successfull (tg3, hda-intel, radeon).
>
>> diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c
>> index 1370df6..d3d8dec 100644
>> --- a/drivers/ata/sata_sil24.c
>> +++ b/drivers/ata/sata_sil24.c
>> @@ -102,6 +102,7 @@ enum {
>> HOST_CTRL_STOP = (1<< 18), /* latched PCI STOP */
>> HOST_CTRL_DEVSEL = (1<< 19), /* latched PCI DEVSEL */
>> HOST_CTRL_REQ64 = (1<< 20), /* latched PCI REQ64 */
>> + HOST_CTRL_MSIACK = (1<< 30), /* MSI acknowledge */
>> HOST_CTRL_GLOBAL_RST = (1<< 31), /* global reset */
>>
>> /*
>> @@ -1168,6 +1169,7 @@ static irqreturn_t sil24_interrupt(int irq, void
>> *dev_instance)
>> ": interrupt from disabled port %d\n",
>> i);
>> }
>>
>> + writel(IRQ_STAT_4PORTS | HOST_CTRL_MSIACK, host_base + HOST_CTRL);
>> spin_unlock(&host->lock);
>> out:
>> return IRQ_RETVAL(handled);
>>
Hmm, well presumably the problem isn't related to that then. I was
looking at your lspci output though:
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
(prog-if 85 [Master SecO PriO])
Subsystem: ASUSTeK Computer Inc. Device 81f0
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0 (750ns min, 250ns max)
Interrupt: pin A routed to IRQ 30
Region 0: I/O ports at cc00 [size=8]
Region 1: I/O ports at c880 [size=4]
Region 2: I/O ports at c800 [size=8]
Region 3: I/O ports at c480 [size=4]
Region 4: I/O ports at c400 [size=16]
Region 5: Memory at efafb000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] MSI: Enable+ Count=1/4 Maskable- 64bit+
Address: 00000000fee0f00c Data: 4189
Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
The HT MSI Mapping capability is not enabled on the device. I'm thinking
it should be, but I'm not sure. And it's also not enabled on the bus
which has the Silicon Image controller:
04:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA
Raid II Controller (rev 01)
on its subordinate bus:
00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
(prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: efe00000-efefffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Subsystem: nVidia Corporation Device 0000
Capabilities: [48] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
Address: 00000000fee0f00c Data: 4149
Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
CCing some people that might have some idea about this..
On Thu, Jan 7, 2010 at 4:05 AM, Robert Hancock <[email protected]> wrote:
> Hmm, well presumably the problem isn't related to that then. I was looking
> at your lspci output though:
>
> 00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
> (prog-if 85 [Master SecO PriO])
>        Subsystem: ASUSTeK Computer Inc. Device 81f0
>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
>        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>        Latency: 0 (750ns min, 250ns max)
>        Interrupt: pin A routed to IRQ 30
>        Region 0: I/O ports at cc00 [size=8]
>        Region 1: I/O ports at c880 [size=4]
>        Region 2: I/O ports at c800 [size=8]
>        Region 3: I/O ports at c480 [size=4]
>        Region 4: I/O ports at c400 [size=16]
>        Region 5: Memory at efafb000 (32-bit, non-prefetchable) [size=4K]
>        Capabilities: [44] Power Management version 2
>                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>        Capabilities: [b0] MSI: Enable+ Count=1/4 Maskable- 64bit+
>                Address: 00000000fee0f00c  Data: 4189
>        Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
>
> The HT MSI Mapping capability is not enabled on the device. I'm thinking it
> should be, but I'm not sure. And it's also not enabled on the bus which has
> the Silicon Image controller:
>
> 04:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA
> Raid II Controller (rev 01)
>
> on its subordinate bus:
>
> 00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
> (prog-if 00 [Normal decode])
>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
>        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>        Latency: 0, Cache Line Size: 64 bytes
>        Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
>        I/O behind bridge: 0000e000-0000efff
>        Memory behind bridge: efe00000-efefffff
>        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- <SERR- <PERR-
>        BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
>                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>        Capabilities: [40] Subsystem: nVidia Corporation Device 0000
>        Capabilities: [48] Power Management version 2
>                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>        Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
>                Address: 00000000fee0f00c  Data: 4149
>        Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
>                Mapping Address Base: 00000000fee00000
>
> CCing some people that might have some idea about this..
part of the PCI tree:
+-0b.0-[04]----00.0 Silicon Image, Inc. SiI 3132 Serial
ATA Raid II Controller
+-0c.0-[03]----00.0 Broadcom Corporation NetXtreme BCM5754
Gigabit Ethernet PCI Express
+-0d.0-[02]----00.0 Broadcom Corporation NetXtreme BCM5754
Gigabit Ethernet PCI Express
+-0f.0-[01]--+-00.0 ATI Technologies Inc RV370 5B60
[Radeon X300 (PCIE)]
| \-00.1 ATI Technologies Inc RV370 [Radeon X300SE]
The three devices attached to 0c.0, 0d.0 and 0f.0 work correctly with MSI.
But each of these PCI Express bridges also has this Mapping disabled:
Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
This capability seems only to be enabled at the root:
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
Subsystem: ASUSTeK Computer Inc. Device 81f0
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- S
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort
Latency: 0
Capabilities: [44] HyperTransport: Slave or Primary Interface
Command: BaseUnitID=0 UnitCnt=15 MastHost- DefDir- DUL-
Link Control 0: CFlE+ CST- CFE- <LkFail- Init+ EOC-
TXO- <CRCErr=0 Isoc
Link Config 0: MLWI=16bit DwFcIn- MLWO=16bit DwFcOut-
LWI=16bit DwFcInE
Link Control 1: CFlE- CST- CFE- <LkFail+ Init- EOC+
TXO+ <CRCErr=0 Isoc
Link Config 1: MLWI=8bit DwFcIn- MLWO=8bit DwFcOut-
LWI=8bit DwFcInEn-
Revision ID: 1.03
Link Frequency 0: 1.0GHz
Link Error 0: <Prot- <Ovfl- <EOC- CTLTm-
Link Frequency Capability 0: 200MHz+ 300MHz+ 400MHz+
500MHz+ 600MHz+ 80
Feature Capability: IsocFC+ LDTSTOP+ CRCTM- ECTLT- 64bA- UIDRD-
Link Frequency 1: 200MHz
Link Error 1: <Prot- <Ovfl- <EOC- CTLTm-
Link Frequency Capability 1: 200MHz- 300MHz- 400MHz-
500MHz- 600MHz- 80
Error Handling: PFlE+ OFlE+ PFE- OFE- EOCFE- RFE-
CRCFE- SERRFE- CF- RE
Prefetchable memory behind bridge Upper: 00-00
Bus Number: 00
Capabilities: [dc] HyperTransport: MSI Mapping Enable+ Fixed-
Mapping Address Base: 00000000fee00000
From my dmesg:
[ 1.636318] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.641854] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.647420] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.652946] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.658505] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.664055] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.669597] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.675172] pci 0000:00:00.0: Found enabled HT MSI Mapping
[ 1.680715] pci 0000:00:00.0: Found enabled HT MSI Mapping
I found this output very strange, as it always referred to the same
pci device, but looking at the code, that might only be a visual nit.
The output is from msi_ht_cap_enabled() in drivers/pci/quirks.c. This
will be called via nv_ht_enable_msi_mapping(), but always to check the
'host_bridge', not the devices that __nv_msi_ht_cap_quirk() loops
over.
But I do not have the knowledge to decide whether this is just
overeager debug output, or whether the check should be switched to
test each device.
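The call pattern described above can be reproduced in a small userspace sketch. This is a toy model of the description only, not the code in drivers/pci/quirks.c: the struct, the log buffer and the function names (`ht_cap_enabled_toy`, `quirk_all_toy`) are hypothetical, and the device names in the usage are just examples taken from the lspci output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for a PCI device in this toy model. */
struct toy_dev {
	const char *name;   /* e.g. "0000:00:00.0" */
	int ht_msi_enabled; /* non-zero if HT MSI mapping is enabled */
};

/* Collected log output, so the effect can be inspected afterwards. */
static char log_buf[1024];

/* Logs the device it was handed when the mapping is enabled. */
static int ht_cap_enabled_toy(const struct toy_dev *d)
{
	assert(d != NULL);
	if (!d->ht_msi_enabled)
		return 0;

	size_t used = strlen(log_buf);
	snprintf(log_buf + used, sizeof(log_buf) - used,
		 "pci %s: Found enabled HT MSI Mapping\n", d->name);
	return 1;
}

/*
 * One quirk pass per device, but the check is always applied to the
 * host bridge, so every log line names the host bridge.
 */
static int quirk_all_toy(const struct toy_dev *host_bridge,
			 const struct toy_dev *devs, size_t n)
{
	int hits = 0;

	for (size_t i = 0; i < n; i++) {
		(void)devs[i]; /* the visited device itself is never logged */
		hits += ht_cap_enabled_toy(host_bridge);
	}
	return hits;
}
```

With a host bridge named 0000:00:00.0 and several other devices passed in, this prints one identical line per visited device, which matches the repeated dmesg lines. That would support reading the output as cosmetic rather than as a sign that only one device is being checked.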
Torsten
On 01/06/2010 07:28 PM, Torsten Kaiser wrote:
> On Thu, Jan 7, 2010 at 4:05 AM, Robert Hancock <[email protected]> wrote:
>> Hmm, well presumably the problem isn't related to that then. I was looking
>> at your lspci output though:
>>
>> 00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
>> (prog-if 85 [Master SecO PriO])
>> Subsystem: ASUSTeK Computer Inc. Device 81f0
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>> Latency: 0 (750ns min, 250ns max)
>> Interrupt: pin A routed to IRQ 30
>> Region 0: I/O ports at cc00 [size=8]
>> Region 1: I/O ports at c880 [size=4]
>> Region 2: I/O ports at c800 [size=8]
>> Region 3: I/O ports at c480 [size=4]
>> Region 4: I/O ports at c400 [size=16]
>> Region 5: Memory at efafb000 (32-bit, non-prefetchable) [size=4K]
>> Capabilities: [44] Power Management version 2
>> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
>> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>> Capabilities: [b0] MSI: Enable+ Count=1/4 Maskable- 64bit+
>> Address: 00000000fee0f00c Data: 4189
>> Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
>>
>> The HT MSI Mapping capability is not enabled on the device. I'm thinking it
>> should be, but I'm not sure. And it's also not enabled on the bus which has
>> the Silicon Image controller:
>>
>> 04:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA
>> Raid II Controller (rev 01)
>>
>> on its subordinate bus:
>>
>> 00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
>> (prog-if 00 [Normal decode])
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>> Latency: 0, Cache Line Size: 64 bytes
>> Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
>> I/O behind bridge: 0000e000-0000efff
>> Memory behind bridge: efe00000-efefffff
>> Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- <SERR- <PERR-
>> BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
>> PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>> Capabilities: [40] Subsystem: nVidia Corporation Device 0000
>> Capabilities: [48] Power Management version 2
>> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>> Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
>> Address: 00000000fee0f00c Data: 4149
>> Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
>> Mapping Address Base: 00000000fee00000
>>
>> CCing some people that might have some idea about this..
>
> part of the PCI tree:
> +-0b.0-[04]----00.0 Silicon Image, Inc. SiI 3132 Serial
> ATA Raid II Controller
> +-0c.0-[03]----00.0 Broadcom Corporation NetXtreme BCM5754
> Gigabit Ethernet PCI Express
> +-0d.0-[02]----00.0 Broadcom Corporation NetXtreme BCM5754
> Gigabit Ethernet PCI Express
> +-0f.0-[01]--+-00.0 ATI Technologies Inc RV370 5B60
> [Radeon X300 (PCIE)]
> | \-00.1 ATI Technologies Inc RV370 [Radeon X300SE]
>
> The three devices attached to 0c.0, 0d.0 and 0f.0 work correctly with MSI.
> But each of these PCI Express bridges also has this Mapping disabled:
> Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
> Mapping Address Base: 00000000fee00000
So that could be a SiI silicon problem or a driver problem.
>
> This capability seems only to be enabled at the root:
> 00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
> Subsystem: ASUSTeK Computer Inc. Device 81f0
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- S
> Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort
> Latency: 0
> Capabilities: [44] HyperTransport: Slave or Primary Interface
> Command: BaseUnitID=0 UnitCnt=15 MastHost- DefDir- DUL-
> Link Control 0: CFlE+ CST- CFE- <LkFail- Init+ EOC-
> TXO- <CRCErr=0 Isoc
> Link Config 0: MLWI=16bit DwFcIn- MLWO=16bit DwFcOut-
> LWI=16bit DwFcInE
> Link Control 1: CFlE- CST- CFE- <LkFail+ Init- EOC+
> TXO+ <CRCErr=0 Isoc
> Link Config 1: MLWI=8bit DwFcIn- MLWO=8bit DwFcOut-
> LWI=8bit DwFcInEn-
> Revision ID: 1.03
> Link Frequency 0: 1.0GHz
> Link Error 0: <Prot- <Ovfl- <EOC- CTLTm-
> Link Frequency Capability 0: 200MHz+ 300MHz+ 400MHz+
> 500MHz+ 600MHz+ 80
> Feature Capability: IsocFC+ LDTSTOP+ CRCTM- ECTLT- 64bA- UIDRD-
> Link Frequency 1: 200MHz
> Link Error 1: <Prot- <Ovfl- <EOC- CTLTm-
> Link Frequency Capability 1: 200MHz- 300MHz- 400MHz-
> 500MHz- 600MHz- 80
> Error Handling: PFlE+ OFlE+ PFE- OFE- EOCFE- RFE-
> CRCFE- SERRFE- CF- RE
> Prefetchable memory behind bridge Upper: 00-00
> Bus Number: 00
> Capabilities: [dc] HyperTransport: MSI Mapping Enable+ Fixed-
> Mapping Address Base: 00000000fee00000
>
> From my dmesg:
> [ 1.636318] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.641854] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.647420] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.652946] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.658505] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.664055] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.669597] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.675172] pci 0000:00:00.0: Found enabled HT MSI Mapping
> [ 1.680715] pci 0000:00:00.0: Found enabled HT MSI Mapping
>
> I found this output very strange, as it always referred to the same
> pci device, but looking at the code, that might only be a visual nit.
>
> The output is from msi_ht_cap_enabled() in drivers/pci/quirks.c. This
> will be called via nv_ht_enable_msi_mapping(), but always to check the
> 'host_bridge', not the devices that __nv_msi_ht_cap_quirk() loops
> over.
If the host bridge has HT MSI mapping enabled, then there is no need to
enable it on the bridges under it.
YH
Hello Kyle,
I'm also using a SiI3234 (sil24 driver) on a P2020 and am encountering
problems. Before starting my own investigation, I searched with Google
and found this old email thread.
Have you found a more recent working solution to your problem?
Regards,
Leon.
On Sat, May 29, 2010 at 2:05 AM, Moffett, Kyle D
<[email protected]> wrote:
> My advance apologies if this email gets badly MIME-mangled...
>
> On 2010/01/06 20:59, "Robert Hancock" <[email protected]> wrote:
>> On 01/06/2010 03:37 AM, Torsten Kaiser wrote:
>>> After activating the MSI support by adding sata_sil24.msi=1 to the
>>> kernel command line, the first write to a drive attached to the SiI
>>> 3132 controller results in the following errors:
>>>
>>> [  138.950074] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6
>>> frozen
>>> [  138.961023] ata2.00: failed command: WRITE FPDMA QUEUED
>>> [  138.970034] ata2.00: cmd 61/00:00:a5:95:4a/04:00:01:00:00/40 tag 0
>>> ncq 524288 out
>>> [  138.970037]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
>>> 0x4 (timeout)
>>
>> Looking at the code in sata_sil24 and the SiI3132 datasheet, there's a
>> control bit which doesn't seem to be handled in the driver, global
>> control register bit 30: "MSI Acknowledge (W). Writing a one to this bit
>> acknowledges a Message Signaled Interrupt and permits generation of
>> another MSI. This bit is cleared immediately after the acknowledgement
>> is recognized by the control logic, hence the bit will always be read as
>> a zero. If all interrupt conditions are removed subsequent to an MSI, it
>> is not necessary to assert this Acknowledge; another MSI will be
>> generated when an interrupt condition occurs."
>>
>> The way the interrupt handler for this driver works is that we check the
>> global IRQ status register, and then based on what ports indicated an
>> interrupt in that register, we check the individual port command
>> completion registers. The issue would seem to be that if a port got an
>> interrupt condition in between these two operations, we'd miss it, and
>> the MSI logic described above then wouldn't generate any more interrupts
>> since we didn't remove all interrupt conditions.
>>
>> Can you try this patch and see if it helps? (Might be whitespace damaged
>> but hopefully you can apply manually in that case.)
>
> I've got this custom board that uses the sata_sil24 driver (off a P2020
> processor). My current kernel is a slightly patched 2.6.32 kernel
> (including the sata_sil24 enable-MSI patch).
>
> Unfortunately when I turn MSI on, I get the exact same hang described here,
> boot log included as dmesg1.txt.
>
> With this patch applied, it seems to get a little further (dmesg2.txt), but
> still dies miserably.
>
> I'm relatively sure that MSI works on this chipset as I also have an e1000e
> controller off an adjacent PCI-E bus which works correctly with MSI.
>
> It's relatively critical for me to get MSI working, because the legacy-PCI
> INTx interrupt for that PCI-E port happens to share an IRQ line with a
> device that is very unfriendly to shared IRQs (it has no internal IRQ
> disable register). I'd rather not have to go in there with a soldering iron
> and some scraps of wire to make it work. :-D
>
> Cheers,
> Kyle Moffett
>
>
--
Leon
Hi Leon,
I have the same problem with the sata_sil24 driver:
using a SiI 3132 and enabling MSI causes an NMI.
http://marc.info/?l=linux-ide&m=129138342901236&w=2
Can't there be a workaround for this erratum?
The sata_sil24 interrupt is shared on my system with
a cx88 TV card, which often produces an IRQ storm.
-Tobias