2012-10-16 14:49:36

by Borislav Petkov

Subject: ata4.00: failed to get Identify Device Data, Emask 0x1

Hi all,

a bunch of my boxes started showing this on 3.7-rc1 (and maybe earlier):

[ 4.667077] ata4.00: failed to get Identify Device Data, Emask 0x1
[ 4.675071] ata4.00: failed to get Identify Device Data, Emask 0x1

Another one:

[ 3.325371] ata4.00: failed to get Identify Device Data, Emask 0x1
[ 3.488602] ata4.00: failed to get Identify Device Data, Emask 0x1

The last one is from a laptop that suspends/resumes, so the message
appears each time the driver gets initialized:

[ 1.389734] ata1.00: failed to get Identify Device Data, Emask 0x1
[ 1.395031] ata1.00: failed to get Identify Device Data, Emask 0x1
[16825.339587] ata1.00: failed to get Identify Device Data, Emask 0x1
[16825.345835] ata1.00: failed to get Identify Device Data, Emask 0x1
[16842.294983] ata1.00: failed to get Identify Device Data, Emask 0x1
[16842.300513] ata1.00: failed to get Identify Device Data, Emask 0x1
[23628.820487] ata1.00: failed to get Identify Device Data, Emask 0x1
[23628.825556] ata1.00: failed to get Identify Device Data, Emask 0x1
[23661.555111] ata1.00: failed to get Identify Device Data, Emask 0x1
[23661.560745] ata1.00: failed to get Identify Device Data, Emask 0x1

Grepping points to:

	/* Obtain SATA Settings page from Identify Device Data Log,
	 * which contains DevSlp timing variables etc.
	 * Exclude old devices with ata_id_has_ncq()
	 */
	if (ata_id_has_ncq(dev->id)) {
		err_mask = ata_read_log_page(dev,
					     ATA_LOG_SATA_ID_DEV_DATA,
					     ATA_LOG_SATA_SETTINGS,
					     dev->sata_settings,
					     1);
		if (err_mask)
			ata_dev_dbg(dev,
				    "failed to get Identify Device Data, Emask 0x%x\n",
				    err_mask);
	}

in ata_dev_configure().
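
For anyone decoding the Emask value in these messages, here is a minimal standalone sketch. It assumes the bit layout of libata's AC_ERR_* flags in include/linux/libata.h (bit 0 = device error, bit 1 = HSM violation, and so on); the table and the decode_emask() helper are hypothetical illustrations, not kernel code, so check the names against the tree you are running.

```c
#include <stddef.h>
#include <string.h>

/* Assumed to mirror the AC_ERR_* flags in include/linux/libata.h.
 * This table is an illustration; verify it against your kernel tree. */
struct emask_bit {
	unsigned int bit;
	const char *name;
};

static const struct emask_bit emask_bits[] = {
	{ 1u << 0,  "DEV" },        /* device reported an error */
	{ 1u << 1,  "HSM" },        /* host state machine violation */
	{ 1u << 2,  "TIMEOUT" },
	{ 1u << 3,  "MEDIA" },
	{ 1u << 4,  "ATA_BUS" },
	{ 1u << 5,  "HOST_BUS" },
	{ 1u << 6,  "SYSTEM" },
	{ 1u << 7,  "INVALID" },
	{ 1u << 8,  "OTHER" },
	{ 1u << 9,  "NODEV_HINT" },
	{ 1u << 10, "NCQ" },
};

/* Hypothetical helper: write a comma-separated list of the flags set in
 * emask into buf (len must be at least 1) and return buf. */
static char *decode_emask(unsigned int emask, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(emask_bits) / sizeof(emask_bits[0]); i++) {
		if (!(emask & emask_bits[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, ", ", len - strlen(buf) - 1);
		strncat(buf, emask_bits[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

Under that assumption, decode_emask(0x1, ...) yields "DEV", which would mean the device itself rejected the log page read rather than the transfer timing out or the bus failing.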

Judging by the logs, it must've come in during this merge window:

Oct 4 11:08:13 kepek kernel: [ 4.670357] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 11:08:13 kepek kernel: [ 4.678833] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 11:29:23 kepek kernel: [ 4.690456] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 11:29:23 kepek kernel: [ 4.698473] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 11:54:20 kepek kernel: [ 4.706777] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 11:54:20 kepek kernel: [ 4.715194] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 14:30:17 kepek kernel: [ 4.666199] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 14:30:17 kepek kernel: [ 4.674742] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 15:30:17 kepek kernel: [ 4.686204] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 4 15:30:17 kepek kernel: [ 4.694194] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 10 13:09:40 kepek kernel: [ 4.670245] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 10 13:09:40 kepek kernel: [ 4.678429] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 10 15:28:56 kepek kernel: [ 4.658456] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 10 15:28:56 kepek kernel: [ 4.666485] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 12 10:38:17 kepek kernel: [ 4.694542] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 12 10:38:17 kepek kernel: [ 4.702561] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 16 14:30:26 kepek kernel: [ 4.667077] ata4.00: failed to get Identify Device Data, Emask 0x1
Oct 16 14:30:26 kepek kernel: [ 4.675071] ata4.00: failed to get Identify Device Data, Emask 0x1

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


2012-10-16 14:53:32

by Alan

Subject: Re: ata4.00: failed to get Identify Device Data, Emask 0x1

On Tue, 16 Oct 2012 16:49:32 +0200
Borislav Petkov <[email protected]> wrote:

> Hi all,
>
> a bunch of my boxes started showing this on 3.7-rc1 (and maybe earlier):
>
> [ 4.667077] ata4.00: failed to get Identify Device Data, Emask 0x1
> [ 4.675071] ata4.00: failed to get Identify Device Data, Emask 0x1

Can you check whether 3.6 works on them. I know 3.6 is horribly broken on
several brands of AHCI controller (Jmicron for example). Dunno where Jeff
is on fixing the regressions ?

Alan

2012-10-16 15:18:34

by Borislav Petkov

Subject: Re: ata4.00: failed to get Identify Device Data, Emask 0x1

On Tue, Oct 16, 2012 at 03:58:24PM +0100, Alan Cox wrote:
> Can you check whether 3.6 works on them. I know 3.6 is horribly broken
> on several brands of AHCI controller (Jmicron for example). Dunno
> where Jeff is on fixing the regressions ?

If by "works" you mean I don't see the message there, then yes, it does.
Logs say the message started appearing on Oct 4th after me building
Linus master after the merge window started.

Ok, let me test 3.6.2 just in case ...<tests>... yes, no error message
there.

Thanks.

--
Regards/Gruss,
Boris.


2012-10-17 01:38:45

by Aaron Lu

Subject: Re: ata4.00: failed to get Identify Device Data, Emask 0x1

On 10/16/2012 11:18 PM, Borislav Petkov wrote:
> On Tue, Oct 16, 2012 at 03:58:24PM +0100, Alan Cox wrote:
>> Can you check whether 3.6 works on them. I know 3.6 is horribly broken
>> on several brands of AHCI controller (Jmicron for example). Dunno
>> where Jeff is on fixing the regressions ?
>
> If by "works" you mean I don't see the message there, then yes, it does.
> Logs say the message started appearing on Oct 4th after me building
> Linus master after the merge window started.
>
> Ok, let me test 3.6.2 just in case ...<tests>... yes, no error message
> there.

This was introduced by commit 65fe1f0f66a57380229a4ced844188103135f37b,
"ahci: implement aggressive SATA device sleep support".

Shane, got time to take a look? This debug message made people
uncomfortable :-)

Thanks,
Aaron

2012-10-17 04:50:59

by Robert Hancock

Subject: Re: ata4.00: failed to get Identify Device Data, Emask 0x1

On 10/16/2012 07:38 PM, Aaron Lu wrote:
> On 10/16/2012 11:18 PM, Borislav Petkov wrote:
>> On Tue, Oct 16, 2012 at 03:58:24PM +0100, Alan Cox wrote:
>>> Can you check whether 3.6 works on them. I know 3.6 is horribly broken
>>> on several brands of AHCI controller (Jmicron for example). Dunno
>>> where Jeff is on fixing the regressions ?
>>
>> If by "works" you mean I don't see the message there, then yes, it does.
>> Logs say the message started appearing on Oct 4th after me building
>> Linus master after the merge window started.
>>
>> Ok, let me test 3.6.2 just in case ...<tests>... yes, no error message
>> there.
>
> This was introduced by commit 65fe1f0f66a57380229a4ced844188103135f37b,
> "ahci: implement aggressive SATA device sleep support".
>
> Shane, got time to take a look? This debug message made people
> uncomfortable :-)

I don't have whatever version of ATA command set defines this command,
but surely there's some identify bit which lists whether this log page
is supported. Right now checking for it is only conditional on NCQ support.

2012-10-17 06:55:58

by Aaron Lu

Subject: Re: ata4.00: failed to get Identify Device Data, Emask 0x1

On 10/17/2012 12:50 PM, Robert Hancock wrote:
> On 10/16/2012 07:38 PM, Aaron Lu wrote:
>> On 10/16/2012 11:18 PM, Borislav Petkov wrote:
>>> On Tue, Oct 16, 2012 at 03:58:24PM +0100, Alan Cox wrote:
>>>> Can you check whether 3.6 works on them. I know 3.6 is horribly broken
>>>> on several brands of AHCI controller (Jmicron for example). Dunno
>>>> where Jeff is on fixing the regressions ?
>>>
>>> If by "works" you mean I don't see the message there, then yes, it does.
>>> Logs say the message started appearing on Oct 4th after me building
>>> Linus master after the merge window started.
>>>
>>> Ok, let me test 3.6.2 just in case ...<tests>... yes, no error message
>>> there.
>>
>> This was introduced by commit 65fe1f0f66a57380229a4ced844188103135f37b,
>> "ahci: implement aggressive SATA device sleep support".
>>
>> Shane, got time to take a look? This debug message made people
>> uncomfortable :-)
>
> I don't have whatever version of ATA command set defines this command,
> but surely there's some identify bit which lists whether this log page
> is supported. Right now checking for it is only conditional on NCQ support.

Agree. If NCQ does not imply support of this log page, we should
definitely refine the check condition used here.

I suppose Shane will take care of this, but if he doesn't, I'll do that
at a later time.

Thanks,
Aaron

2012-10-18 09:11:28

by Huang, Shane

Subject: RE: ata4.00: failed to get Identify Device Data, Emask 0x1

> Agree. If NCQ does not imply support of this log page, we should
> definitely refine the check condition used here.
>
> I suppose Shane will take care of this, but if he doesn't, I'll do that
> at a later time.

I tried word 78 bit 5 (Hardware Feature Control), but it does not work:
it is 0 on my HDD sample, which supports log 30h page 08h and DevSlp.

It seems that word 78 bit 5 is only a sufficient condition, not a
necessary one. Do you guys have any suggestions?

Quoting SATA spec:
> Word 78: Serial ATA features supported
> Bit 5 If bit 5 is set to one, then Hardware Feature Control is
> supported (see 13.10). If bit 5 is cleared to zero, then Hardware
> Feature Control is not supported and IDENTIFY DEVICE data
> word 79 bit 5 shall be cleared to zero.
>
> If Hardware Feature Control is supported, then:
> a) IDENTIFY DEVICE data word 78 bit 5 (see 13.2.1.18) shall be
> set to one;
> b) the SET FEATURES Select Hardware Feature Control subcommand
> shall be supported (see 13.3.8);
> c) page 08h of the Identify Device Data log (see 13.7.7) shall
> be supported;
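
The condition in the spec text quoted above could be expressed roughly as follows. This is a standalone sketch, not the libata patch: the macro and function names are hypothetical, and only the word and bit numbers (word 78, bit 5) come from the quoted text. It gates the log page read on the Hardware Feature Control bit instead of on NCQ support.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the word and bit numbers come from the quoted
 * SATA spec text (word 78 bit 5 = Hardware Feature Control supported). */
#define ID_WORDS                 256  /* IDENTIFY DEVICE data: 256 16-bit words */
#define ID_WORD_SATA_FEAT_SUPP   78
#define ID_BIT_HW_FEATURE_CTRL   5

/* True if the IDENTIFY DEVICE data advertises Hardware Feature Control. */
static bool id_has_hw_feature_ctrl(const uint16_t id[ID_WORDS])
{
	return (id[ID_WORD_SATA_FEAT_SUPP] >> ID_BIT_HW_FEATURE_CTRL) & 1u;
}

/* Per the quoted text, Hardware Feature Control support implies that
 * page 08h of the Identify Device Data log is supported, so the bit is
 * a sufficient (though not necessarily required) reason to read it. */
static bool should_read_sata_settings(const uint16_t id[ID_WORDS])
{
	return id_has_hw_feature_ctrl(id);
}
```

A drive that implements page 08h without setting the bit, like the sample described here, would be skipped by such a check, which is the trade-off under discussion.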


Thanks,
Shane

2012-11-16 16:03:09

by Huang, Shane

Subject: RE: ata4.00: failed to get Identify Device Data, Emask 0x1

> I tried word 78 bit 5 (Hardware Feature Control), but it does not work:
> it is 0 on my HDD sample, which supports log 30h page 08h and DevSlp.
>
> It seems that word 78 bit 5 is only a sufficient condition, not a
> necessary one. Do you guys have any suggestions?

Eventually I received the confirmation from the DevSlp HDD vendor,
bit 5 should be and will be set in production drives with log 30h
page 08h supported. So I will submit a patch to use it instead.


Hi Jeff,

I don't know when I will receive some production drives to verify
my patch, are you okay if I submit my patch first without testing
so as to meet kernel 3.7 bug fix window?

Thanks,
Shane

2012-11-16 17:44:35

by Jeff Garzik

Subject: Re: ata4.00: failed to get Identify Device Data, Emask 0x1

On 11/16/2012 11:02 AM, Huang, Shane wrote:
>> I tried word 78 bit 5 (Hardware Feature Control), but it does not work:
>> it is 0 on my HDD sample, which supports log 30h page 08h and DevSlp.
>>
>> It seems that word 78 bit 5 is only a sufficient condition, not a
>> necessary one. Do you guys have any suggestions?
>
> Eventually I received the confirmation from the DevSlp HDD vendor,
> bit 5 should be and will be set in production drives with log 30h
> page 08h supported. So I will submit a patch to use it instead.
>
>
> Hi Jeff,
>
> I don't know when I will receive some production drives to verify
> my patch, are you okay if I submit my patch first without testing
> so as to meet kernel 3.7 bug fix window?

Yes, please do.

Jeff