2010-11-29 18:08:12

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Thu, Oct 21, 2010 at 09:23:02PM -0400, Jeff Garzik wrote:
> 1) Link Power Management (LPM) rewrite
>

Hrm, this is causing issues for me... When I resume from suspend on my
thinkpad X201s I get:

[ 2454.478673] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2454.487736] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 2454.487742] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
filtered out
[repeated a few times]
[ 2454.492688] ata1.00: configured for UDMA/100
[ 2454.506690] ata1.00: configured for UDMA/100
[ 2454.506696] ata1: EH complete
[...]
[ 2455.097874] ata1.00: configured for UDMA/100
[ 2455.097882] ata1: EH complete
[...]
[ 2455.386756] ata1.00: failed to enable DIPM, Emask 0x100
[ 2455.386768] ata1: hard resetting link
[...]
[ 2455.847925] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2455.849372] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 2455.849380] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ditto]
[ 2455.854171] ata1.00: configured for UDMA/100
[ 2455.854480] ata1.00: failed to enable DIPM, Emask 0x100
[ 2455.854484] ata1: disabling LPM on the link
[ 2455.854489] ata1: limiting SATA link speed to 1.5 Gbps
[ 2455.854494] ata1.00: limiting speed to UDMA/100:PIO3

Tejun, any ideas how I can debug this?

regards, Kyle


2010-11-30 14:14:16

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

Hello,

On 11/29/2010 07:07 PM, Kyle McMartin wrote:
> On Thu, Oct 21, 2010 at 09:23:02PM -0400, Jeff Garzik wrote:
>> 1) Link Power Management (LPM) rewrite
>>
>
> Hrm, this is causing issues for me... When I resume from suspend on my
> thinkpad X201s I get:
>
> [ 2454.478673] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [ 2454.487736] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
> succeeded
> [ 2454.487742] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
> filtered out
> [repeated a few times]
> [ 2454.492688] ata1.00: configured for UDMA/100
> [ 2454.506690] ata1.00: configured for UDMA/100
> [ 2454.506696] ata1: EH complete
> [...]
> [ 2455.097874] ata1.00: configured for UDMA/100
> [ 2455.097882] ata1: EH complete
> [...]
> [ 2455.386756] ata1.00: failed to enable DIPM, Emask 0x100
> [ 2455.386768] ata1: hard resetting link
> [...]
> [ 2455.847925] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [ 2455.849372] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
> succeeded
> [ 2455.849380] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
> LOCK) filtered out
> [ditto]
> [ 2455.854171] ata1.00: configured for UDMA/100
> [ 2455.854480] ata1.00: failed to enable DIPM, Emask 0x100
> [ 2455.854484] ata1: disabling LPM on the link
> [ 2455.854489] ata1: limiting SATA link speed to 1.5 Gbps
> [ 2455.854494] ata1.00: limiting speed to UDMA/100:PIO3
>
> Tejun, any ideas how I can debug this?

Hmm... DIPM commands are failing with AC_ERR_OTHER. Other than DIPM
not being configured and speed capped at 1.5Gbps, the machine works
fine afterwards, right? Are you up for applying debug patches?

Thanks.

--
tejun

2010-11-30 15:38:53

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Tue, Nov 30, 2010 at 03:13:58PM +0100, Tejun Heo wrote:
> > Tejun, any ideas how I can debug this?
>
> Hmm... DIPM commands are failing with AC_ERR_OTHER. Other than DIPM
> not being configured and speed capped at 1.5Gbps, the machine works
> fine afterwards, right? Are you up for applying debug patches?
>

Yup, it's chugging along happily, though I notice kernel builds are
taking a minute or two longer (but I don't bother to time them so it's
purely subjective.)

Be happy to apply any debug patches, but I plan on replacing the drive
with a faster one at some point in the next few weeks.

--Kyle

2010-11-30 16:29:42

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On 11/30/2010 04:38 PM, Kyle McMartin wrote:
> On Tue, Nov 30, 2010 at 03:13:58PM +0100, Tejun Heo wrote:
>>> Tejun, any ideas how I can debug this?
>>
>> Hmm... DIPM commands are failing with AC_ERR_OTHER. Other than DIPM
>> not being configured and speed capped at 1.5Gbps, the machine works
>> fine afterwards, right? Are you up for applying debug patches?
>>
>
> Yup, it's chugging along happily, though I notice kernel builds are
> taking a minute or two longer (but I don't bother to time them so it's
> purely subjective.)
>
> Be happy to apply any debug patches, but I plan on replacing the drive
> with a faster one at some point in the next few weeks.

Can you please apply the following patch and report the resulting
kernel log? You're on ahci, right?

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ebc08d6..b1c39db 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1560,6 +1560,10 @@ static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
}

/* okay, let's hand over to EH */
+ if (active_qc && ata_tag_internal(active_qc->tag))
+ ata_dev_printk(active_qc->dev, KERN_WARNING,
+ "ahci: internal command failure, irq_stat=0x%x\n",
+ irq_stat);

if (irq_stat & PORT_IRQ_FREEZE)
ata_port_freeze(ap);
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7f77c67..7cf236b 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1668,6 +1668,10 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,

/* perform minimal error analysis */
if (qc->flags & ATA_QCFLAG_FAILED) {
+ ata_dev_printk(dev, KERN_WARNING,
+ "internal command failure: stat=0x%x ehi_desc=\"%s\"\n",
+ qc->tf.command, dev->link->eh_info.desc);
+
if (qc->result_tf.command & (ATA_ERR | ATA_DF))
qc->err_mask |= AC_ERR_DEV;

2010-11-30 16:31:57

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Tue, Nov 30, 2010 at 05:29:06PM +0100, Tejun Heo wrote:
> > Be happy to apply any debug patches, but I plan on replacing the drive
> > with a faster one at some point in the next few weeks.
>
> Can you please apply the following patch and report the resulting
> kernel log? You're on ahci, right?
>

Yup, building now.

> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index ebc08d6..b1c39db 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -1560,6 +1560,10 @@ static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
> }
>
> /* okay, let's hand over to EH */
> + if (active_qc && ata_tag_internal(active_qc->tag))
> + ata_dev_printk(active_qc->dev, KERN_WARNING,
> + "ahci: internal command failure, irq_stat=0x%x\n",
> + irq_stat);
>
> if (irq_stat & PORT_IRQ_FREEZE)
> ata_port_freeze(ap);
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 7f77c67..7cf236b 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -1668,6 +1668,10 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
>
> /* perform minimal error analysis */
> if (qc->flags & ATA_QCFLAG_FAILED) {
> + ata_dev_printk(dev, KERN_WARNING,
> + "internal command failure: stat=0x%x ehi_desc=\"%s\"\n",
> + qc->tf.command, dev->link->eh_info.desc);
> +
> if (qc->result_tf.command & (ATA_ERR | ATA_DF))
> qc->err_mask |= AC_ERR_DEV;
>
>

2010-11-30 17:53:25

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Tue, Nov 30, 2010 at 11:31:50AM -0500, Kyle McMartin wrote:
> On Tue, Nov 30, 2010 at 05:29:06PM +0100, Tejun Heo wrote:
> > > Be happy to apply any debug patches, but I plan on replacing the drive
> > > with a faster one at some point in the next few weeks.
> >
> > Can you please apply the following patch and report the resulting
> > kernel log? You're on ahci, right?
> >
>
> Yup, building now.
>

Ok, running it, I haven't seen the issue across three suspends... Will
update when it inevitably happens.

--Kyle

2010-11-30 21:09:51

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Tue, Nov 30, 2010 at 12:53:17PM -0500, Kyle McMartin wrote:
> On Tue, Nov 30, 2010 at 11:31:50AM -0500, Kyle McMartin wrote:
> > On Tue, Nov 30, 2010 at 05:29:06PM +0100, Tejun Heo wrote:
> > > > Be happy to apply any debug patches, but I plan on replacing the drive
> > > > with a faster one at some point in the next few weeks.
> > >
> > > Can you please apply the following patch and report the resulting
> > > kernel log? You're on ahci, right?
> > >
> >
> > Yup, building now.
> >
>
> Ok, running it, I haven't seen the issue across three suspends... Will
> update when it inevitably happens.
>

[10272.544858] ata1.00: ahci: internal command failure,
irq_stat=0x400001
[10272.544943] ata1.00: internal command failure: stat=0xef
ehi_desc="irq_stat 0x00400001, PHY RDY changed"
[10272.544951] ata1.00: failed to enable DIPM, Emask 0x100
[10272.544960] ata1: hard resetting link
[10273.004718] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[10273.006108] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[10273.006116] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[10273.006410] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[10273.006417] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[10273.009337] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[10273.009344] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[10273.009611] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[10273.009631] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[10273.010945] ata1.00: configured for UDMA/100
[10273.011206] ata1.00: ahci: internal command failure,
irq_stat=0x400001
[10273.011241] ata1.00: internal command failure: stat=0xef
ehi_desc="irq_stat 0x00400001, PHY RDY changed"
[10273.011247] ata1.00: failed to enable DIPM, Emask 0x100
[10273.011250] ata1: disabling LPM on the link
[10273.011255] ata1: limiting SATA link speed to 1.5 Gbps
[10273.011259] ata1.00: limiting speed to UDMA/100:PIO3

regards, Kyle

2010-12-01 11:17:48

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

Hello,

On 11/30/2010 10:09 PM, Kyle McMartin wrote:
>> Ok, running it, I haven't seen the issue across three suspends... Will
>> update when it inevitably happens.
>
> [10273.011241] ata1.00: internal command failure: stat=0xef
> ehi_desc="irq_stat 0x00400001, PHY RDY changed"

Hmmm, so it's a timing problem. For some reason, a PHY event is being
reported after reset is complete which unfortunately gets reported
while DIPM command is in progress. Hmmm... what's broken is that the
PHY event happens after reset is complete. Can you please attach the
output of "lspci -nn" and "hdparm -I /dev/sda"? Looks a lot like a
timing issue. Maybe we'll have to add a delay somewhere. :-(

--
tejun

2010-12-01 12:44:55

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Wed, Dec 01, 2010 at 12:17:52PM +0100, Tejun Heo wrote:
> > [10273.011241] ata1.00: internal command failure: stat=0xef
> > ehi_desc="irq_stat 0x00400001, PHY RDY changed"
>
> Hmmm, so it's a timing problem. For some reason, a PHY event is being
> reported after reset is complete which unfortunately gets reported
> while DIPM command is in progress. Hmmm... what's broken is that the
> PHY event happens after reset is complete. Can you please attach the
> output of "lspci -nn" and "hdparm -I /dev/sda"? Looks a lot like a
> timing issue. Maybe we'll have to add a delay somewhere. :-(
>
>

Sure, np:

00:1f.2 SATA controller [0106]: Intel Corporation 5 Series/3400 Series
Chipset 6 port SATA AHCI Controller [8086:3b2f] (rev 06) (prog-if 01
[AHCI 1.0])
Subsystem: Lenovo Device [17aa:2168]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 42
Region 0: I/O ports at 1860 [size=8]
Region 1: I/O ports at 1814 [size=4]
Region 2: I/O ports at 1818 [size=8]
Region 3: I/O ports at 1810 [size=4]
Region 4: I/O ports at 1840 [size=32]
Region 5: Memory at f2727000 (32-bit, non-prefetchable)
[size=2K]
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee0100c Data: 4161
Capabilities: [70] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
Capabilities: [b0] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: ahci

/dev/sda:

ATA device, with non-removable media
Model Number: HITACHI HTS725032A9A364
Serial Number: 100407PCKC04VPHVB55K
Firmware Revision: PC3ZC70F
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II
Extensions, SATA Rev 2.5, SATA Rev 2.6; Revision: ATA8-AST T13 Project
D1697 Revision 0b
Standards:
Used: unknown (minor revision code 0x0028)
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 625142448
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
device size with M = 1024*1024: 305245 MBytes
device size with M = 1000*1000: 320072 MBytes (320 GB)
cache/buffer size = 15151 KBytes (type=DualPortCache)
Form Factor: 2.5 inch
Nominal Media Rotation Rate: 7200
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Vendor, no device specific
minimum
R/W multiple sector transfer: Max = 16 Current = ?
Advanced power management level: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* DOWNLOAD_MICROCODE
* Advanced Power Management feature set
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* 64-bit World wide name
* IDLE_IMMEDIATE with UNLOAD
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* Idle-Unload when NCQ is active
* NCQ priority information
* DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
supported: enhanced erase
88min for SECURITY ERASE UNIT. 88min for ENHANCED SECURITY ERASE
UNIT.
Logical Unit WWN Device Identifier: 5000cca5b7da1312
NAA : 5
IEEE OUI : 000cca
Unique ID : 5b7da1312
Checksum: correct

regards, Kyle

2010-12-01 14:20:08

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

Hello, again.

On 12/01/2010 01:44 PM, Kyle McMartin wrote:
> 00:1f.2 SATA controller [0106]: Intel Corporation 5 Series/3400 Series
...
> Model Number: HITACHI HTS725032A9A364

Nothing peculiar. Hmm... okay, I think I've found what went wrong.
Can you please try the following patch and see whether the problem
goes away?

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 5e59050..17a6378 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3275,6 +3275,7 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
struct ata_port *ap = ata_is_host_link(link) ? link->ap : NULL;
struct ata_eh_context *ehc = &link->eh_context;
struct ata_device *dev, *link_dev = NULL, *lpm_dev = NULL;
+ enum ata_lpm_policy old_policy = link->lpm_policy;
unsigned int hints = ATA_LPM_EMPTY | ATA_LPM_HIPM;
unsigned int err_mask;
int rc;
@@ -3338,6 +3339,14 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
goto fail;
}

+ /*
+ * Low level driver acked the transition. Issue DIPM command
+ * with the new policy set.
+ */
+ link->lpm_policy = policy;
+ if (ap && ap->slave_link)
+ ap->slave_link->lpm_policy = policy;
+
/* host config updated, enable DIPM if transitioning to MIN_POWER */
ata_for_each_dev(dev, link, ENABLED) {
if (policy == ATA_LPM_MIN_POWER && ata_id_has_dipm(dev->id)) {
@@ -3353,12 +3362,14 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
}
}

- link->lpm_policy = policy;
- if (ap && ap->slave_link)
- ap->slave_link->lpm_policy = policy;
return 0;

fail:
+ /* restore the old policy */
+ link->lpm_policy = old_policy;
+ if (ap && ap->slave_link)
+ ap->slave_link->lpm_policy = old_policy;
+
/* if no device or only one more chance is left, disable LPM */
if (!dev || ehc->tries[dev->devno] <= 2) {
ata_link_printk(link, KERN_WARNING,

2010-12-01 15:50:35

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Wed, Dec 01, 2010 at 03:19:06PM +0100, Tejun Heo wrote:
> Hello, again.
>
> On 12/01/2010 01:44 PM, Kyle McMartin wrote:
> > 00:1f.2 SATA controller [0106]: Intel Corporation 5 Series/3400 Series
> ...
> > Model Number: HITACHI HTS725032A9A364
>
> Nothing peculiar. Hmm... okay, I think I've found what went wrong.
> Can you please try the following patch and see whether the problem
> goes away?
>

Thanks, I'll let you know when/if it drops to 1.5gbps.

--Kyle

2010-12-01 20:09:21

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Wed, Dec 01, 2010 at 10:50:26AM -0500, Kyle McMartin wrote:
> > Nothing peculiar. Hmm... okay, I think I've found what went wrong.
> > Can you please try the following patch and see whether the problem
> > goes away?
> >
>
> Thanks, I'll let you know when/if it drops to 1.5gbps.
>

Looks to stay at 3.0 Gbps now... but gives up on EH after 5 retries?

[ 4342.048572] ata1: hard resetting link
[ 4342.507282] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4342.510118] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.510127] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4342.510413] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.510421] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4342.514766] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.514774] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4342.515024] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.515031] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4342.516697] ata1.00: configured for UDMA/100
[ 4342.516977] ata1: hard resetting link
[ 4342.976018] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4342.978892] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.978900] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4342.979186] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.979193] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4342.983577] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.983585] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4342.983865] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4342.983872] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4342.984989] ata1.00: configured for UDMA/100
[ 4342.985408] ata1: hard resetting link
[ 4343.445045] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4343.447845] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.447856] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4343.448119] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.448126] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4343.451322] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.451330] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4343.451628] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.451635] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4343.452961] ata1.00: configured for UDMA/100
[ 4343.453269] ata1: hard resetting link
[ 4343.912801] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4343.915808] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.915816] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4343.916102] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.916110] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4343.921403] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.921411] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4343.921679] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4343.921686] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4343.923580] ata1.00: configured for UDMA/100
[ 4343.923841] ata1: hard resetting link
[ 4344.383113] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 4344.386020] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4344.386029] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4344.386309] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4344.386320] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4344.390462] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4344.390470] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
LOCK) filtered out
[ 4344.390737] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES)
succeeded
[ 4344.390744] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
filtered out
[ 4344.392217] ata1.00: configured for UDMA/100
[ 4344.392496] ata1: EH pending after 5 tries, giving up
[ 4344.392507] ata1: EH complete

regards, Kyle

2010-12-02 10:17:43

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On 12/01/2010 09:09 PM, Kyle McMartin wrote:
> On Wed, Dec 01, 2010 at 10:50:26AM -0500, Kyle McMartin wrote:
>>> Nothing peculiar. Hmm... okay, I think I've found what went wrong.
>>> Can you please try the following patch and see whether the problem
>>> goes away?
>>>
>>
>> Thanks, I'll let you know when/if it drops to 1.5gbps.
>>
>
> Looks to stay at 3.0 Gbps now... but gives up on EH after 5 retries?

Hmmm... something is setting EH_PENDING while EH is in progress.
Let's see what's doing it. Can you please apply the following patch
on top of the previous one and report the kernel log?

Thanks.

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 5e59050..8f0d854 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -896,6 +896,10 @@ static void ata_eh_set_pending(struct ata_port *ap, int fastdrain)

ap->pflags |= ATA_PFLAG_EH_PENDING;

+ ata_port_printk(ap, KERN_WARNING, "XXX SET EH PENDING, fastdrain=%d\n",
+ fastdrain);
+ dump_stack();
+
if (!fastdrain)
return;

2010-12-02 12:56:29

by Stanley Gan

[permalink] [raw]
Subject: debug kernel using two laptops without serial port

Hi All,
I have two dell laptops 1015. For studying Linux kernel, I want to
let one run debugging Linux as client, the other as host. The two laptops
don't have serial port, just have LAN and usb ports. How can I configure
them to get early kernel print message? I ever asked a lot of people, but
don't get answers. Please give me some practical advices, thanks in advance.


stanley

2010-12-02 13:17:06

by Borislav Petkov

[permalink] [raw]
Subject: Re: debug kernel using two laptops without serial port

On Thu, Dec 02, 2010 at 08:56:19PM +0800, Stanley Gan wrote:
> Hi All,
> I have two dell laptops 1015. For studying Linux kernel, I want to
> let one run debugging Linux as client, the other as host. The two laptops
> don't have serial port, just have LAN and usb ports. How can I configure
> them to get early kernel print message? I ever asked a lot of people, but
> don't get answers. Please give me some practical advices, thanks in advance.

Take a look at <Documentation/networking/netconsole.txt> in the kernel
sources.

--
Regards/Gruss,
Boris.

2010-12-02 14:41:12

by Stanley Gan

[permalink] [raw]
Subject: 答复: debug kernel using two lapto ps without serial port

Hi Borislav,
I will check your recommend document. Thank you so much.

Best wishes,
Stanley
-----邮件原件-----
发件人: Borislav Petkov [mailto:[email protected]]
发送时间: 2010年12月2日 21:17
收件人: Stanley Gan
抄送: 'LKML'
主题: Re: debug kernel using two laptops without serial port

On Thu, Dec 02, 2010 at 08:56:19PM +0800, Stanley Gan wrote:
> Hi All,
> I have two dell laptops 1015. For studying Linux kernel, I want to
> let one run debugging Linux as client, the other as host. The two laptops
> don't have serial port, just have LAN and usb ports. How can I configure
> them to get early kernel print message? I ever asked a lot of people, but
> don't get answers. Please give me some practical advices, thanks in advance.

Take a look at <Documentation/networking/netconsole.txt> in the kernel
sources.

--
Regards/Gruss,
Boris.

2010-12-02 21:24:16

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Thu, Dec 02, 2010 at 11:17:09AM +0100, Tejun Heo wrote:
> Hmmm... something is setting EH_PENDING while EH is in progress.
> Let's see what's doing it. Can you please apply the following patch
> on top of the previous one and report the kernel log?
>

This gets triggered a bunch on boot-up:

[ 36.981370] ata1: XXX SET EH PENDING, fastdrain=0
[ 36.981380] Pid: 0, comm: swapper Not tainted
2.6.37-0.rc4.git1.2.fc15.x86_64 #1
[ 36.981386] Call Trace:
[ 36.981393] <IRQ> [<ffffffff8131d3df>] ata_eh_set_pending+0x3e/0x85
[ 36.981407] [<ffffffff8131db36>] ata_do_link_abort+0x44/0x94
[ 36.981412] [<ffffffff8131dbac>] ata_port_abort+0x10/0x12
[ 36.981421] [<ffffffff8132b0ab>] ahci_interrupt+0x467/0x5a1
[ 36.981428] [<ffffffff810afbc1>] ?
check_for_new_grace_period.clone.21+0x8b/0x97
[ 36.981434] [<ffffffff8106fb68>] ? sched_clock_cpu+0x42/0xc6
[ 36.981440] [<ffffffff810abbfe>] handle_IRQ_event+0x58/0x122
[ 36.981447] [<ffffffff81024d64>] ? apic_write+0x16/0x18
[ 36.981452] [<ffffffff810adf27>] handle_edge_irq+0xfa/0x145
[ 36.981458] [<ffffffff8100d2dd>] handle_irq+0x88/0x90
[ 36.981463] [<ffffffff8147d508>] do_IRQ+0x58/0xb0
[ 36.981468] [<ffffffff81477513>] ret_from_intr+0x0/0x11
[ 36.981470] <EOI> [<ffffffff81274eab>] ? intel_idle+0x115/0x13d
[ 36.981480] [<ffffffff81274e8a>] ? intel_idle+0xf4/0x13d
[ 36.981485] [<ffffffff8139f894>] cpuidle_idle_call+0x9e/0x104
[ 36.981489] [<ffffffff8100932e>] cpu_idle+0xae/0x105
[ 36.981493] [<ffffffff8145f2ce>] rest_init+0x72/0x74
[ 36.981498] [<ffffffff81b53c31>] start_kernel+0x3e0/0x3eb
[ 36.981503] [<ffffffff81b532cb>] x86_64_start_reservations+0xb6/0xba
[ 36.981507] [<ffffffff81b533d5>] x86_64_start_kernel+0x106/0x115
[ 37.200683] ata1: XXX SET EH PENDING, fastdrain=1
[ 37.200692] Pid: 1465, comm: ck-history Not tainted
2.6.37-0.rc4.git1.2.fc15.x86_64 #1
[ 37.200696] Call Trace:
[ 37.200699] <IRQ> [<ffffffff8131d3df>] ata_eh_set_pending+0x3e/0x85
[ 37.200715] [<ffffffff8131dae5>] ata_port_schedule_eh+0x43/0x50
[ 37.200720] [<ffffffff813150d7>] ata_qc_complete+0xd2/0x152
[ 37.200723] [<ffffffff8131521d>] ata_qc_complete_multiple+0xc6/0xcb
[ 37.200727] [<ffffffff8132b141>] ahci_interrupt+0x4fd/0x5a1
[ 37.200737] [<ffffffff8105c784>] ? run_timer_softirq+0x28f/0x29e
[ 37.200742] [<ffffffff810114bb>] ? native_sched_clock+0x35/0x37
[ 37.200747] [<ffffffff8106fb68>] ? sched_clock_cpu+0x42/0xc6
[ 37.200752] [<ffffffff810abbfe>] handle_IRQ_event+0x58/0x122
[ 37.200758] [<ffffffff81024d64>] ? apic_write+0x16/0x18
[ 37.200761] [<ffffffff810adf27>] handle_edge_irq+0xfa/0x145
[ 37.200766] [<ffffffff8100d2dd>] handle_irq+0x88/0x90
[ 37.200769] [<ffffffff8147d508>] do_IRQ+0x58/0xb0
[ 37.200774] [<ffffffff81477513>] ret_from_intr+0x0/0x11
[ 37.200775] <EOI> [<ffffffff81117bd1>] ? lock_page_cgroup+0x19/0x21
[ 37.200783] [<ffffffff811198c2>]
__mem_cgroup_commit_charge+0x28/0x97
[ 37.200786] [<ffffffff8111b961>] mem_cgroup_charge_common+0x66/0x76
[ 37.200789] [<ffffffff8111bc89>] mem_cgroup_cache_charge+0xe4/0xf9
[ 37.200794] [<ffffffff810d8cf3>] add_to_page_cache_locked+0x42/0x113
[ 37.200797] [<ffffffff810d8dee>] add_to_page_cache_lru+0x2a/0x58
[ 37.200802] [<ffffffff8114a883>] mpage_readpages+0xa3/0x113
[ 37.200805] [<ffffffff81199b1a>] ? ext4_get_block+0x0/0x18
[ 37.200808] [<ffffffff81199b1a>] ? ext4_get_block+0x0/0x18
[ 37.200811] [<ffffffff810ee22b>] ? zone_statistics+0x65/0x6a
[ 37.200816] [<ffffffff8110991a>] ? alloc_pages_current+0xb2/0xc3
[ 37.200821] [<ffffffff811965ec>] ext4_readpages+0x1d/0x1f
[ 37.200826] [<ffffffff810e1145>]
__do_page_cache_readahead+0x105/0x176
[ 37.200829] [<ffffffff810e141b>] ra_submit+0x21/0x25
[ 37.200831] [<ffffffff810e15db>] ondemand_readahead+0x1bc/0x1cb
[ 37.200834] [<ffffffff810e1665>]
page_cache_async_readahead+0x7b/0xa3
[ 37.200837] [<ffffffff810d9bbc>] generic_file_aio_read+0x2a5/0x5b6
[ 37.200840] [<ffffffff810e1c10>] ? __lru_cache_add+0x34/0x5b
[ 37.200844] [<ffffffff8111f602>] do_sync_read+0xcb/0x108
[ 37.200848] [<ffffffff811ebbdb>] ? fsnotify_perm+0x64/0x70
[ 37.200852] [<ffffffff811ec327>] ?
security_file_permission+0x2e/0x33
[ 37.200855] [<ffffffff8111fce5>] vfs_read+0xa9/0xfc
[ 37.200857] [<ffffffff8111fd82>] sys_read+0x4a/0x6e
[ 37.200860] [<ffffffff8100acc2>] system_call_fastpath+0x16/0x1b
[ 37.203575] ata1.00: configured for UDMA/100
[ 37.203582] ata1: EH complete
[ 37.239419] sd 0:0:0:0: [sda] Write cache: disabled, read cache:
enabled, doesn't support DPO or FUA
[ 38.145358] ata1: XXX SET EH PENDING, fastdrain=1
[ 38.145367] Pid: 1489, comm: sata_alpm Not tainted
2.6.37-0.rc4.git1.2.fc15.x86_64 #1
[ 38.145371] Call Trace:
[ 38.145384] [<ffffffff8131d3df>] ata_eh_set_pending+0x3e/0x85
[ 38.145391] [<ffffffff8131dae5>] ata_port_schedule_eh+0x43/0x50
[ 38.145400] [<ffffffff81476f5a>] ? _raw_spin_lock_irqsave+0x12/0x2f
[ 38.145406] [<ffffffff8131993b>] ata_scsi_lpm_store+0x89/0xa5
[ 38.145413] [<ffffffff812e3912>] dev_attr_store+0x20/0x22
[ 38.145422] [<ffffffff811798df>] sysfs_write_file+0x114/0x150
[ 38.145430] [<ffffffff8111fbe9>] vfs_write+0xac/0xff
[ 38.145436] [<ffffffff8111fdf0>] sys_write+0x4a/0x6e
[ 38.145444] [<ffffffff8100acc2>] system_call_fastpath+0x16/0x1b
[ 38.145710] ata2: XXX SET EH PENDING, fastdrain=1
[ 38.145718] Pid: 1489, comm: sata_alpm Not tainted
2.6.37-0.rc4.git1.2.fc15.x86_64 #1
[ 38.145723] Call Trace:
[ 38.145730] [<ffffffff8131d3df>] ata_eh_set_pending+0x3e/0x85
[ 38.145737] [<ffffffff8131dae5>] ata_port_schedule_eh+0x43/0x50
[ 38.145743] [<ffffffff81476f5a>] ? _raw_spin_lock_irqsave+0x12/0x2f
[ 38.145748] [<ffffffff8131993b>] ata_scsi_lpm_store+0x89/0xa5
[ 38.145754] [<ffffffff812e3912>] dev_attr_store+0x20/0x22
[ 38.145760] [<ffffffff811798df>] sysfs_write_file+0x114/0x150
[ 38.145767] [<ffffffff8111fbe9>] vfs_write+0xac/0xff
[ 38.145773] [<ffffffff8111fdf0>] sys_write+0x4a/0x6e
[ 38.145779] [<ffffffff8100acc2>] system_call_fastpath+0x16/0x1b


My whole dmesg from bootup is available at:
http://kyle.fedorapeople.org/dmesg

There's nothing interesting in my kernel wrt ATA (no patches at all
touching it aside from the two from this thread.) Maybe systemd or
something during boot-up is poking sysfs, I'll take a look.

--Kyle

2010-12-03 10:10:30

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

Hello, Kyle.

On 12/02/2010 10:23 PM, Kyle McMartin wrote:
> On Thu, Dec 02, 2010 at 11:17:09AM +0100, Tejun Heo wrote:
>> Hmmm... something is setting EH_PENDING while EH is in progress.
>> Let's see what's doing it. Can you please apply the following patch
>> on top of the previous one and report the kernel log?
>>
>
> This gets triggered a bunch on boot-up:

Yeah, that's expected. Can you please do the usual testing routine
and trigger the EH giving up message and then attach the full kernel
log?

Thanks.

--
tejun

2010-12-03 11:59:12

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Fri, Dec 03, 2010 at 11:10:12AM +0100, Tejun Heo wrote:
> > This gets triggered a bunch on boot-up:
>
> Yeah, that's expected. Can you please do the usual testing routine
> and trigger the EH giving up message and then attach the full kernel
> log?
>

Ahh, ok, sure.

Sorry 'bout that.

--Kyle

2010-12-06 16:07:46

by Kyle McMartin

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

On Fri, Dec 03, 2010 at 06:59:02AM -0500, Kyle McMartin wrote:
> On Fri, Dec 03, 2010 at 11:10:12AM +0100, Tejun Heo wrote:
> > > This gets triggered a bunch on boot-up:
> >
> > Yeah, that's expected. Can you please do the usual testing routine
> > and trigger the EH giving up message and then attach the full kernel
> > log?
> >
>

OK, attached is a couple days worth of suspend/resume, hopefully that
turns up what you're looking for.

regards, Kyle



Attachments:
(No filename) (459.00 B)
dmesg.dreadnought-2010-12-06.txt (122.39 kB)
Download all attachments

2010-12-06 17:47:39

by Tejun Heo

[permalink] [raw]
Subject: Re: [git patches] libata updates for 2.6.37

Hello,

On 12/06/2010 05:07 PM, Kyle McMartin wrote:
> On Fri, Dec 03, 2010 at 06:59:02AM -0500, Kyle McMartin wrote:
>> On Fri, Dec 03, 2010 at 11:10:12AM +0100, Tejun Heo wrote:
>>>> This gets triggered a bunch on boot-up:
>>>
>>> Yeah, that's expected. Can you please do the usual testing routine
>>> and trigger the EH giving up message and then attach the full kernel
>>> log?
>>>
>>
>
> OK, attached is a couple days worth of suspend/resume, hopefully that
> turns up what you're looking for.

Heh, yeah, actually it did. Can you please apply the following patch
on top and see whether the problem goes away?

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7f77c67..f23d6d4 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4807,9 +4807,6 @@ static void ata_verify_xfer(struct ata_queued_cmd *qc)
{
struct ata_device *dev = qc->dev;

- if (ata_tag_internal(qc->tag))
- return;
-
if (ata_is_nodata(qc->tf.protocol))
return;

@@ -4858,14 +4855,23 @@ void ata_qc_complete(struct ata_queued_cmd *qc)
if (unlikely(qc->err_mask))
qc->flags |= ATA_QCFLAG_FAILED;

- if (unlikely(qc->flags & ATA_QCFLAG_FAILED)) {
- /* always fill result TF for failed qc */
+ /*
+ * Finish internal commands without any further processing
+ * and always with the result TF filled.
+ */
+ if (unlikely(ata_tag_internal(qc->tag))) {
fill_result_tf(qc);
+ __ata_qc_complete(qc);
+ return;
+ }

- if (!ata_tag_internal(qc->tag))
- ata_qc_schedule_eh(qc);
- else
- __ata_qc_complete(qc);
+ /*
+ * Non-internal qc has failed. Fill the result TF and
+ * summon EH.
+ */
+ if (unlikely(qc->flags & ATA_QCFLAG_FAILED)) {
+ fill_result_tf(qc);
+ ata_qc_schedule_eh(qc);
return;
}