2010-06-17 06:05:37

by Jeff Chua

Subject: What is HSM violation? Causing ata1.00 hard resetting link


Hi,

I just got an external e-SATA hard disk, and I'm testing for backup.

When I tried to "tar" 90GB to it, it caused a HSM violation, and since this
is the root file system, system is brain-dead. "Bus error" for every command
issued!


The error is (typed as captured on the screen) ...

ata1.00: exception Emask 0x2 SAct 0x7 SErr 0x3000400 action 0x6
ata1.00: irq_stat 0x44000008
ata1.00: cmd 60/30:00:37:b0:84/00:00:17:00:00/40 tag 0 ncq 24576 in
res 40/00:08:6f:b0:84/00:00:17:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 60/50:08:6f:b0:84/00:00:17:00:00/40 tag 1 ncq 40960 in
res 51/04:00:6f:b0:84/00:00:17:00:00/40 Emask 0x403 (HSM violation)
ata1.00: cmd 60/00:10:bf:b0:84/01:00:17:00:00/40 tag 2 ncq 131072 in
res 40/00:08:6f:b0:84/00:00:17:00:00/40 Emask 0x2 (HSM violation)
ata1: hard resetting link
ata1: softreset failed (device not ready)
ata1: hard resetting link
ata1: softreset failed (device not ready)
ata1: hard resetting link
ata1: link is slow to respond, please be patient (ready=0)


Running on SSD on Linux 2.6.35-rc3.

This does not happen all the time, but I've seen this if it's copying
intensively.

Thanks,
Jeff.


2010-06-17 23:41:37

by Robert Hancock

Subject: Re: What is HSM violation? Causing ata1.00 hard resetting link

On Thu, Jun 17, 2010 at 12:05 AM, Jeff Chua <[email protected]> wrote:
>
> Hi,
>
> I just got an external e-SATA hard disk, and I'm testing for backup.
>
> When I tried to "tar" 90GB to it, it caused a HSM violation, and since this
> is the root file system, system is brain-dead. "Bus error" for every command
> issued!
>
>
> Error is (typed as was captured in the screen) ...
>
> ata1.00: exception Emask 0x2 SAct 0x7 SErr 0x3000400 action 0x6
> ata1.00: irq_stat 0x44000008

The controller is reporting a protocol violation in SError, and the
IRQ status reports a taskfile error and unknown FIS received from the
device.
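
For readers who want to check the two register values themselves, here is a minimal Python sketch that expands them into named bits. The bit names are my reading of the SATA SError and AHCI port interrupt status layouts, not something stated in this thread, so verify them against the specifications before relying on them. The helper name decode_bits is made up for illustration.

```python
# Bit positions below are assumptions taken from the SATA SError and AHCI
# PxIS register layouts as I understand them; check the specs to confirm.
SERROR_BITS = {
    0: "recovered data integrity error",
    1: "recovered communications error",
    8: "transient data integrity error",
    9: "persistent communication or data integrity error",
    10: "protocol error",
    11: "internal error",
    16: "PhyRdy change",
    17: "PHY internal error",
    18: "comm wake",
    19: "10b-to-8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unrecognized FIS type",
    26: "device exchanged",
}

PORT_IRQ_BITS = {
    0: "D2H Register FIS",
    1: "PIO Setup FIS",
    2: "DMA Setup FIS",
    3: "Set Device Bits FIS",
    4: "unknown FIS",
    5: "descriptor processed",
    6: "port connect change",
    7: "device mechanical presence change",
    22: "PhyRdy change",
    23: "incorrect port multiplier",
    24: "overflow",
    26: "interface non-fatal error",
    27: "interface fatal error",
    28: "host bus data error",
    29: "host bus fatal error",
    30: "task file error",
    31: "cold port detect",
}


def decode_bits(value, names):
    """Return the names of all set bits in a 32-bit register value, lowest bit first."""
    decoded = []
    for bit in range(32):
        if value & (1 << bit):
            decoded.append(names.get(bit, f"reserved/unknown bit {bit}"))
    return decoded


print("SErr    :", decode_bits(0x3000400, SERROR_BITS))
print("irq_stat:", decode_bits(0x44000008, PORT_IRQ_BITS))
```

Read with these tables, SErr 0x3000400 has bits 10, 24 and 25 set (protocol error, transport state transition error, unrecognized FIS type), and irq_stat 0x44000008 has bits 3, 26 and 30 set (Set Device Bits FIS, interface non-fatal error, task file error). If the tables are right, the unrecognized FIS indication comes from SError rather than from the interrupt status.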

What kind of external drive is this? Seems to me like either it's
doing something not quite kosher, or there's some other problem (maybe
the eSATA cable is bad, or it doesn't seat properly in the connector -
apparently that's not uncommon).

> ata1.00: cmd 60/30:00:37:b0:84/00:00:17:00:00/40 tag 0 ncq 24576 in
>          res 40/00:08:6f:b0:84/00:00:17:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: cmd 60/50:08:6f:b0:84/00:00:17:00:00/40 tag 1 ncq 40960 in
>          res 51/04:00:6f:b0:84/00:00:17:00:00/40 Emask 0x403 (HSM violation)
> ata1.00: cmd 60/00:10:bf:b0:84/01:00:17:00:00/40 tag 2 ncq 131072 in
>          res 40/00:08:6f:b0:84/00:00:17:00:00/40 Emask 0x2 (HSM violation)
> ata1: hard resetting link
> ata1: softreset failed (device not ready)
> ata1: hard resetting link
> ata1: softreset failed (device not ready)
> ata1: hard resetting link
> ata1: link is slow to respond, please be patient (ready=0)
>
>
> Running on SSD on Linux 2.6.35-rc3.
>
> This does not happen all the time, but I've seen this if it's copying
> intensively.
>
> Thanks,
> Jeff.
>
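
The cmd/res lines quoted above can be unpacked field by field. The sketch below assumes the usual libata layout (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, with status and error taking the place of command and feature on a res line), the NCQ convention that command 0x60 carries the sector count in the feature bytes and the tag in bits 7:3 of nsect, and 512-byte sectors. The function names and the status/error bit tables are illustrative assumptions, not kernel APIs.

```python
FIELDS = ("command", "feature", "nsect", "lbal", "lbam", "lbah",
          "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
          "device")

# Assumed ATA status and error register bit names (subset).
STATUS_BITS = {0: "ERR", 3: "DRQ", 4: "DSC", 5: "DF", 6: "DRDY", 7: "BSY"}
ERROR_BITS = {2: "ABRT", 4: "IDNF", 6: "UNC", 7: "ICRC"}


def parse_taskfile(tf):
    """Split a 'cmd' or 'res' taskfile string into twelve named byte values."""
    raw = tf.replace("/", ":").split(":")
    if len(raw) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} bytes, got {len(raw)}")
    return dict(zip(FIELDS, (int(b, 16) for b in raw)))


def decode_ncq(tf, sector_size=512):
    """Decode tag, length and starting LBA of an NCQ read/write 'cmd' string."""
    f = parse_taskfile(tf)
    sectors = (f["hob_feature"] << 8) | f["feature"]
    lba = (f["hob_lbah"] << 40 | f["hob_lbam"] << 32 | f["hob_lbal"] << 24
           | f["lbah"] << 16 | f["lbam"] << 8 | f["lbal"])
    return {"tag": f["nsect"] >> 3, "sectors": sectors,
            "bytes": sectors * sector_size, "lba": lba}


def names_of(value, table):
    """List the known names of the bits set in an 8-bit register value."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


def decode_result(tf):
    """Return (status bit names, error bit names) for a 'res' string."""
    f = parse_taskfile(tf)
    return names_of(f["command"], STATUS_BITS), names_of(f["feature"], ERROR_BITS)


for cmd in ("60/30:00:37:b0:84/00:00:17:00:00/40",
            "60/50:08:6f:b0:84/00:00:17:00:00/40",
            "60/00:10:bf:b0:84/01:00:17:00:00/40"):
    print(decode_ncq(cmd))
print(decode_result("51/04:00:6f:b0:84/00:00:17:00:00/40"))
```

Under those assumptions the three commands decode to tags 0, 1 and 2 with 24576, 40960 and 131072 bytes, which matches the "ncq" byte counts printed in the log and the SAct value 0x7 (one bit per outstanding tag). The res line for tag 1 decodes to status ERR, DSC, DRDY with error ABRT.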

2010-06-18 02:53:49

by Robert Hancock

Subject: Re: What is HSM violation? Causing ata1.00 hard resetting link

On Thu, Jun 17, 2010 at 8:33 PM, Jeff Chua <[email protected]> wrote:
>
>
> On Fri, Jun 18, 2010 at 7:41 AM, Robert Hancock <[email protected]>
> wrote:
>>
>> On Thu, Jun 17, 2010 at 12:05 AM, Jeff Chua <[email protected]>
>> wrote:
>>
>> > ata1.00: exception Emask 0x2 SAct 0x7 SErr 0x3000400 action 0x6
>> > ata1.00: irq_stat 0x44000008
>>
>> The controller is reporting a protocol violation in SError, and the
>> IRQ status reports a taskfile error and unknown FIS received from the
>> device.
>>
>> What kind of external drive is this? Seems to me like either it's
>> doing something not quite kosher, or there's some other problem (maybe
>> the eSATA cable is bad, or it doesn't seat properly in the connector -
>> apparently that's not uncommon).
>
> ata1.00 is the internal AHCI connector [Intel Corporation 5 Series/3400
> Series Chipset 6 port SATA AHCI Controller (rev 06)] with a [SAMSUNG
> MMDOE56GG5MXP-OVB rev 0 with f/w VBM1801Q] (256GB SATA) attached.
>
> The errors were pointing to ata1.00 and after the HSM violation appeared,
> the disk was no longer accessible.

OK, so it was the SSD that had the problem, not the eSATA drive.

It does seem like the drive did something the controller really didn't
like, though. Is there any firmware update available for the drive?

>
> The external e-SATA has a JMicron [JMicron JMB360 AHCI Controller (rev 02)]
> with the same model Samsung SSD attached.
>
> If I do a "sync" every 3 seconds, I don't see any error. I had seen these
> errors in the past doing tar to external USB2 disks ... and it could be the
> same problem but I just didn't notice it before.
>
> Thanks,
> Jeff
>
>
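
The "sync every 3 seconds" workaround mentioned in the quoted text could be scripted roughly as follows. This is only a sketch of that workaround on Linux with Python 3.3 or later (for os.sync); it does not address whatever the drive or controller is doing wrong, and the thread does not establish why frequent syncs avoid the error. The function name, the sync_fn parameter, and the paths in the usage comment are hypothetical.

```python
import os
import subprocess
import threading


def run_with_periodic_sync(cmd, interval=3.0, sync_fn=os.sync):
    """Run cmd to completion, calling sync_fn every `interval` seconds meanwhile.

    Returns the exit status of cmd.
    """
    done = threading.Event()

    def syncer():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so this loop syncs once per interval until the command finishes.
        while not done.wait(interval):
            sync_fn()

    worker = threading.Thread(target=syncer, daemon=True)
    worker.start()
    try:
        return subprocess.run(cmd).returncode
    finally:
        done.set()
        worker.join()


# Example use (hypothetical paths):
#   run_with_periodic_sync(["tar", "-cf", "/mnt/backup/data.tar", "/data"])
```

Passing sync_fn as a parameter keeps the helper easy to exercise without flushing real disks, for instance by substituting a counter.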