2008-07-26 06:20:55

by Elias Oltmanns

[permalink] [raw]
Subject: [RFC] Disk shock protection in GNU/Linux

Hi all,

yet again, I'm trying to get this feature into the Linux kernel. This
time, I have tried to arrange the code in such a way that hardly
anything not strictly belonging to the libata or ide subsystem is
touched at all. Since former attempts to design this in a more generic
way involving the block and scsi layer have proven very difficult and
since this is all about an ATA specific feature anyway, I think the best
solution (for now at least) is to stick to those two subsystems as far
as possible.

Nevertheless, I'd welcome James' opinion on the third patch in the
series with regard to the use of vendor specific cdbs. If it is
acceptable in principle, the question remains whether we sould reserve
part of the range 0xf800 - 0xffff of vendor specific service actions for
the exclusive use by LLDDs (in much the way I did it here), or whether
all these codes should be unique in the kernel and defined in scsi.h.

Also, we have to agree upon a suitable way to deal with devices that
actually do support the UNLOAD FEATURE but don't report that capability
in their ID as specified in ATA-7. Perhaps we even want to fall back to
STANDBY IMMEDIATE on devices that definitely don't support IDLE
IMMEDIATE with UNLOAD FEATURE; this has been the case for the original
hdaps / disk-protect patch but I'm not quite sure whether anybody
actually relies on this feature and whether it would be advisable in the
first place. Regarding devices that support head unloading but don't
report it properly, there seem to be the following options:

a) Provide an attribute in sysfs for the user to indicate that the
device really does support the UNLOAD FEATURE.
b) Issue the command (for the first time) anyway and check whether the
device responds as specified in ATA-7. Make a note of the result and
act accordingly from there on.

The second option looks appealing because no user intervention is
required and it is known to work at least for some devices. On the other
hand, it may be dangerous on old hardware and I don't even know whether
all pre-ATA-7 devices supporting the UNLOAD FEATURE really do return the
correct data in the command registers.

Here is a short description of the patches in this series applying
cleanly on top of next-20080725:

1. This is a trivial patch making things slightly easier for #3. More
importantly though, it also fixes a potential memory leak and should
probably be applied regardless of the rest of the series (Jeff?).
2. This is a small patch to ata.h in order to provide a simple check for
support of the UNLOAD FEATURE as indicated in a device's ID.
3. Here disk head unloading is implemented in the libata subsystem. See
the changelog entry for current shortcomings of this patch.
4. The same for ide.
5. A little bit of documentation.

There are some questions I need your advice on that have been inserted
as FIXME comments into the code. However, here is a more general
question that interests me even though it could turn out to be a good
idea (or at least acceptable) to merge these patches before that
question has been fully resolved. Alan once mentioned that it might be
desirable to abort commands currently in progress and issue IDLE
IMMEDIATE straight away. My question is whether this is at all possible
for non SATA command protocols. In volume 2 of the ATA-7 specification I
haven't found a well defined way to abort or shor circuit, say, a DMA or
PIO command in progress. It even says that after the host has aborted
data transfer when DMA was in progress, it is supposed to initiate a
SRST. Obviously, I'd like to avoid that at all costs.

Thank you in advance for your cooperation,

Elias


2008-07-26 06:23:47

by Elias Oltmanns

[permalink] [raw]
Subject: [PATCH 1/5] Make sure that ata_force_tbl is freed in case of an error

This is just a trivial fix of a potential memory leak when ata_init()
encounters an error.

Signed-off-by: Elias Oltmanns <[email protected]>
---

drivers/ata/libata-core.c | 14 +++++++++-----
1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 9bef1a8..0a2f921 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6088,16 +6088,20 @@ static int __init ata_init(void)

ata_wq = create_workqueue("ata");
if (!ata_wq)
- return -ENOMEM;
+ goto free_force_tbl;

ata_aux_wq = create_singlethread_workqueue("ata_aux");
- if (!ata_aux_wq) {
- destroy_workqueue(ata_wq);
- return -ENOMEM;
- }
+ if (!ata_aux_wq)
+ goto free_wq;

printk(KERN_DEBUG "libata version " DRV_VERSION " loaded.\n");
return 0;
+
+free_wq:
+ destroy_workqueue(ata_wq);
+free_force_tbl:
+ kfree(ata_force_tbl);
+ return -ENOMEM;
}

static void __exit ata_exit(void)

2008-07-26 06:24:58

by Elias Oltmanns

[permalink] [raw]
Subject: [PATCH 2/5] Introduce ata_id_has_unload()

Add a function to check an ATA device's id for head unload support as
specified in ATA-7.

Signed-off-by: Elias Oltmanns <[email protected]>
---

include/linux/ata.h | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/linux/ata.h b/include/linux/ata.h
index cf4ef6d..c92ac10 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -530,6 +530,21 @@ static inline bool ata_id_has_dipm(const u16 *id)
}


+static inline int ata_id_has_unload(const u16 *id)
+{
+ /*
+ * ATA-7 specifies two places to indicate unload feature support.
+ * Since I don't really understand the difference, I'll just check
+ * both and only return zero if none of them indicates otherwise.
+ */
+ if ((id[ATA_ID_CFSSE] & 0xC000) == 0x4000
+ && id[ATA_ID_CFSSE] & (1 << 13))
+ return id[ATA_ID_CFSSE] & (1 << 13);
+ if ((id[ATA_ID_CSF_DEFAULT] & 0xC000) == 0x4000)
+ return id[ATA_ID_CSF_DEFAULT] & (1 << 13);
+ return 0;
+}
+
static inline int ata_id_has_fua(const u16 *id)
{
if ((id[ATA_ID_CFSSE] & 0xC000) != 0x4000)

2008-07-26 06:26:20

by Elias Oltmanns

[permalink] [raw]
Subject: [PATCH 3/5] libata: Implement disk shock protection support

On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
FEATURE as specified in ATA-7 is issued to the device and processing of
the request queue is stopped thereafter until the speified timeout
expires or user space asks to resume normal operation. This is supposed
to prevent the heads of a hard drive from accidentally crashing onto the
platter when a heavy shock is anticipated (like a falling laptop
expected to hit the floor). This patch simply stops processing the
request queue. In particular, it does not yet, for instance, defer an
SRST issued in order to recover from an error on the other device on the
interface.

Signed-off-by: Elias Oltmanns <[email protected]>
---

drivers/ata/ahci.c | 1
drivers/ata/ata_piix.c | 6 +
drivers/ata/libata-core.c | 6 +
drivers/ata/libata-scsi.c | 343 +++++++++++++++++++++++++++++++++++++++++++--
drivers/ata/libata.h | 11 +
include/linux/libata.h | 6 +
6 files changed, 360 insertions(+), 13 deletions(-)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index dc7596f..cbf86fa 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -316,6 +316,7 @@ static struct device_attribute *ahci_shost_attrs[] = {

static struct device_attribute *ahci_sdev_attrs[] = {
&dev_attr_sw_activity,
+ &dev_attr_unload_heads,
NULL
};

diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index a90ae03..e4cdd86 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -289,8 +289,14 @@ static struct pci_driver piix_pci_driver = {
#endif
};

+static struct device_attribute *piix_sdev_attrs[] = {
+ &dev_attr_unload_heads,
+ NULL
+};
+
static struct scsi_host_template piix_sht = {
ATA_BMDMA_SHT(DRV_NAME),
+ .sdev_attrs = piix_sdev_attrs,
};

static struct ata_port_operations piix_pata_ops = {
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0a2f921..07737ec 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6094,6 +6094,11 @@ static int __init ata_init(void)
if (!ata_aux_wq)
goto free_wq;

+ if (ata_scsi_register_pm_notifier()) {
+ destroy_workqueue(ata_aux_wq);
+ goto free_wq;
+ }
+
printk(KERN_DEBUG "libata version " DRV_VERSION " loaded.\n");
return 0;

@@ -6109,6 +6114,7 @@ static void __exit ata_exit(void)
kfree(ata_force_tbl);
destroy_workqueue(ata_wq);
destroy_workqueue(ata_aux_wq);
+ ata_scsi_unregister_pm_notifier();
}

subsys_initcall(ata_init);
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index b5085cd..4bc0334 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -46,13 +46,16 @@
#include <linux/libata.h>
#include <linux/hdreg.h>
#include <linux/uaccess.h>
+#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
+# include <linux/suspend.h>
+#endif

#include "libata.h"

#define SECTOR_SIZE 512
#define ATA_SCSI_RBUF_SIZE 4096

-static DEFINE_SPINLOCK(ata_scsi_rbuf_lock);
+static DEFINE_SPINLOCK(ata_scsi_lock);
static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE];

typedef unsigned int (*ata_xlat_func_t)(struct ata_queued_cmd *qc);
@@ -113,6 +116,82 @@ static struct scsi_transport_template ata_scsi_transport_template = {
.user_scan = ata_scsi_user_scan,
};

+#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
+static int ata_scsi_park_count = 1;
+DECLARE_WAIT_QUEUE_HEAD(ata_scsi_park_wq);
+
+static inline int ata_scsi_suspend_parking(void)
+{
+ spin_lock_irq(&ata_scsi_lock);
+ if (ata_scsi_park_count == 1)
+ ata_scsi_park_count = 0;
+ spin_unlock_irq(&ata_scsi_lock);
+ return !ata_scsi_park_count;
+}
+
+static int ata_scsi_pm_notifier(struct notifier_block *nb, unsigned long val,
+ void *null)
+{
+ switch (val) {
+ case PM_SUSPEND_PREPARE:
+ wait_event(ata_scsi_park_wq, ata_scsi_suspend_parking());
+ break;
+ case PM_POST_SUSPEND:
+ ata_scsi_park_count = 1;
+ break;
+ default:
+ return NOTIFY_DONE;
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block ata_scsi_pm_notifier_block = {
+ .notifier_call = ata_scsi_pm_notifier,
+};
+
+int ata_scsi_register_pm_notifier(void)
+{
+ return register_pm_notifier(&ata_scsi_pm_notifier_block);
+}
+
+int ata_scsi_unregister_pm_notifier(void)
+{
+ return unregister_pm_notifier(&ata_scsi_pm_notifier_block);
+}
+
+static inline void ata_scsi_signal_unpark(void)
+{
+ spin_lock(&ata_scsi_lock);
+ ata_scsi_park_count--;
+ spin_unlock(&ata_scsi_lock);
+ wake_up_all(&ata_scsi_park_wq);
+}
+
+static inline int ata_scsi_mod_park_timer(struct timer_list *timer,
+ unsigned long timeout)
+{
+ int ret;
+
+ spin_lock(&ata_scsi_lock);
+ if (likely(ata_scsi_park_count)) {
+ ret = mod_timer(timer, timeout);
+ if (!ret)
+ ata_scsi_park_count++;
+ } else
+ ret = -EBUSY;
+ spin_unlock(&ata_scsi_lock);
+ return ret;
+}
+#else /* defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION) */
+static inline void ata_scsi_signal_unpark(void) { }
+
+static inline int ata_scsi_mod_park_timer(struct timer_list *timer,
+ unsigned long timeout)
+{
+ return mod_timer(timer, timeout);
+}
+#endif /* defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION) */
+

static const struct {
enum link_pm value;
@@ -183,6 +262,102 @@ DEVICE_ATTR(link_power_management_policy, S_IRUGO | S_IWUSR,
ata_scsi_lpm_show, ata_scsi_lpm_put);
EXPORT_SYMBOL_GPL(dev_attr_link_power_management_policy);

+static ssize_t ata_scsi_park_show(struct device *device,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev = to_scsi_device(device);
+ struct ata_port *ap;
+ struct ata_device *dev;
+ struct request_queue *q = sdev->request_queue;
+ unsigned int seconds;
+
+ ap = ata_shost_to_port(sdev->host);
+ dev = ata_scsi_find_dev(ap, sdev);
+ if (!dev)
+ return -ENODEV;
+ if (!ata_id_has_unload(dev->id))
+ return -EOPNOTSUPP;
+
+ spin_lock_irq(q->queue_lock);
+ if (timer_pending(&dev->park_timer))
+ /*
+ * Adding 1 in order to guarantee nonzero value until timer
+ * has actually expired.
+ */
+ seconds = jiffies_to_msecs(dev->park_timer.expires - jiffies)
+ / 1000 + 1;
+ else
+ seconds = 0;
+ spin_unlock_irq(q->queue_lock);
+ return snprintf(buf, 20, "%u\n", seconds);
+}
+
+static void ata_scsi_issue_park_cmd(struct ata_device *dev,
+ struct request *rq, __be16 sa_op);
+
+static ssize_t ata_scsi_park_store(struct device *device,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+#define MAX_PARK_TIMEOUT 30
+ struct scsi_device *sdev = to_scsi_device(device);
+ struct ata_port *ap;
+ struct ata_device *dev;
+ struct request_queue *q = sdev->request_queue;
+ struct request *rq;
+ unsigned long seconds;
+ char *p;
+ int skipped_cmd = 0, rc = 0;
+
+ seconds = simple_strtoul((char *)buf, &p, 0);
+ if (p == buf || (*p != '\0' && (*p != '\n' || *(p + 1) != '\0'))
+ || seconds > MAX_PARK_TIMEOUT)
+ return -EINVAL;
+
+ ap = ata_shost_to_port(sdev->host);
+ dev = ata_scsi_find_dev(ap, sdev);
+ if (unlikely(!dev))
+ return -ENODEV;
+ if (!ata_id_has_unload(dev->id))
+ return -EOPNOTSUPP;
+
+ rq = blk_get_request(q, 0, __GFP_WAIT);
+ spin_lock_irq(q->queue_lock);
+ if (seconds) {
+ if (unlikely(scsi_device_get(sdev))) {
+ rc = -ENXIO;
+ skipped_cmd = 1;
+ goto free_rq;
+ }
+ rc = ata_scsi_mod_park_timer(&dev->park_timer,
+ msecs_to_jiffies(seconds * 1000)
+ + jiffies);
+ if (rc) {
+ scsi_device_put(sdev);
+ skipped_cmd = 1;
+ if (likely(rc == 1))
+ rc = 0;
+ } else
+ ata_scsi_issue_park_cmd(dev, rq, PARK_HEADS);
+ } else {
+ if (del_timer(&dev->park_timer)) {
+ ata_scsi_issue_park_cmd(dev, rq, UNPARK_HEADS);
+ ata_scsi_signal_unpark();
+ scsi_device_put(sdev);
+ } else
+ skipped_cmd = 1;
+ }
+free_rq:
+ if (skipped_cmd)
+ __blk_put_request(q, rq);
+ spin_unlock_irq(q->queue_lock);
+
+ return rc ? rc : len;
+}
+DEVICE_ATTR(unload_heads, S_IRUGO | S_IWUSR,
+ ata_scsi_park_show, ata_scsi_park_store);
+EXPORT_SYMBOL_GPL(dev_attr_unload_heads);
+
static void ata_scsi_set_sense(struct scsi_cmnd *cmd, u8 sk, u8 asc, u8 ascq)
{
cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
@@ -912,6 +1087,62 @@ static void ata_gen_ata_sense(struct ata_queued_cmd *qc)
desc[11] = block;
}

+static void ata_scsi_issue_park_cmd(struct ata_device *dev,
+ struct request *rq, __be16 sa_op)
+{
+ struct request_queue *q = dev->sdev->request_queue;
+ struct scsi_varlen_cdb_hdr *vl_cdb;
+
+ rq->cmd[0] = VARIABLE_LENGTH_CMD;
+ rq->cmd_len = 10;
+ rq->cmd_flags |= REQ_SOFTBARRIER;
+ rq->cmd_type = REQ_TYPE_BLOCK_PC;
+ rq->retries = 5;
+ rq->timeout = 10 * HZ;
+ vl_cdb = (struct scsi_varlen_cdb_hdr *)rq->cmd;
+ vl_cdb->additional_cdb_length = 2;
+ vl_cdb->service_action = sa_op;
+ __elv_add_request(q, rq, ELEVATOR_INSERT_FRONT, 0);
+ switch (sa_op) {
+ case PARK_HEADS:
+ blk_stop_queue(q);
+ q->request_fn(q);
+ break;
+ case UNPARK_HEADS:
+ blk_start_queue(q);
+ break;
+ default:
+ BUG();
+ }
+}
+
+static void ata_scsi_unpark_work(struct work_struct *work)
+{
+ struct ata_device *dev = container_of(work, struct ata_device,
+ unpark_work);
+ struct scsi_device *sdev = dev->sdev;
+ struct request_queue *q = sdev->request_queue;
+ struct request *rq;
+
+ rq = blk_get_request(q, 0, __GFP_WAIT);
+ spin_lock_irq(q->queue_lock);
+ if (likely(!timer_pending(&dev->park_timer)))
+ ata_scsi_issue_park_cmd(dev, rq, UNPARK_HEADS);
+ else
+ __blk_put_request(q, rq);
+ ata_scsi_signal_unpark();
+ spin_unlock_irq(q->queue_lock);
+ scsi_device_put(sdev);
+}
+
+static void ata_scsi_park_timeout(unsigned long data)
+{
+ struct ata_device *dev = (struct ata_device *)data;
+
+ queue_work(ata_aux_wq, &dev->unpark_work);
+ /* FIXME: Is that correct or should I use ata_wq instead? */
+}
+
static void ata_scsi_sdev_config(struct scsi_device *sdev)
{
sdev->use_10_for_rw = 1;
@@ -994,6 +1225,12 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
scsi_adjust_queue_depth(sdev, MSG_SIMPLE_TAG, depth);
}

+ /* configure disk head parking */
+ INIT_WORK(&dev->unpark_work, ata_scsi_unpark_work);
+ dev->park_timer.function = ata_scsi_park_timeout;
+ dev->park_timer.data = (unsigned long)dev;
+ init_timer(&dev->park_timer);
+
return 0;
}

@@ -1227,6 +1464,65 @@ invalid_fld:
return 1;
}

+/**
+ * ata_scsi_park_xlat - Translate private PARK_HEADS command
+ * @qc: Storage for translated ATA taskfile
+ *
+ * Sets up an ATA taskfile to issue IDLE IMMEDIATE with UNLOAD
+ * FEATURE.
+ *
+ * LOCKING:
+ * spin_lock_irqsave(host lock)
+ *
+ * RETURNS:
+ * Zero on success, non-zero on error.
+ */
+static unsigned int ata_scsi_park_xlat(struct ata_queued_cmd *qc)
+{
+ struct ata_taskfile *tf = &qc->tf;
+ struct ata_device *dev = qc->dev;
+
+ tf->flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
+ tf->protocol = ATA_PROT_NODATA;
+ tf->command = ATA_CMD_IDLEIMMEDIATE;
+ tf->feature = 0x44;
+ tf->lbal = 0x4c;
+ tf->lbam = 0x4e;
+ tf->lbah = 0x55;
+
+ dev->flags |= ATA_DFLAG_PARKED;
+ dev->sdev->max_device_blocked = 2;
+
+ return 0;
+}
+
+/**
+ * ata_scsi_unpark_xlat - Translate private UNPARK_HEADS command
+ * @qc: Storage for translated ATA taskfile
+ *
+ * Issues a CHECK POWER CONDITION command indicating to the device
+ * that it can resume normal operation.
+ *
+ * LOCKING:
+ * spin_lock_irqsave(host lock)
+ *
+ * RETURNS:
+ * Zero on success, non-zero on error.
+ */
+static unsigned int ata_scsi_unpark_xlat(struct ata_queued_cmd *qc)
+{
+ struct ata_taskfile *tf = &qc->tf;
+ struct ata_device *dev = qc->dev;
+
+ tf->flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
+ tf->protocol = ATA_PROT_NODATA;
+ tf->command = ATA_CMD_CHK_POWER;
+
+ dev->flags &= ~ATA_DFLAG_PARKED;
+ dev->sdev->max_device_blocked = 1;
+
+ return 0;
+}

/**
* ata_scsi_flush_xlat - Translate SCSI SYNCHRONIZE CACHE command
@@ -1658,6 +1954,15 @@ static int ata_scsi_translate(struct ata_device *dev, struct scsi_cmnd *cmd,

VPRINTK("ENTER\n");

+ if (unlikely(dev->flags & ATA_DFLAG_PARKED)) {
+ struct scsi_varlen_cdb_hdr *vl_cdb;
+
+ vl_cdb = (struct scsi_varlen_cdb_hdr *) cmd->cmnd;
+ if (vl_cdb->opcode != VARIABLE_LENGTH_CMD
+ || vl_cdb->service_action != UNPARK_HEADS)
+ return SCSI_MLQUEUE_DEVICE_BUSY;
+ }
+
qc = ata_scsi_qc_new(dev, cmd, done);
if (!qc)
goto err_mem;
@@ -1724,7 +2029,7 @@ defer:
* Prepare buffer for simulated SCSI commands.
*
* LOCKING:
- * spin_lock_irqsave(ata_scsi_rbuf_lock) on success
+ * spin_lock_irqsave(ata_scsi_lock) on success
*
* RETURNS:
* Pointer to response buffer.
@@ -1732,7 +2037,7 @@ defer:
static void *ata_scsi_rbuf_get(struct scsi_cmnd *cmd, bool copy_in,
unsigned long *flags)
{
- spin_lock_irqsave(&ata_scsi_rbuf_lock, *flags);
+ spin_lock_irqsave(&ata_scsi_lock, *flags);

memset(ata_scsi_rbuf, 0, ATA_SCSI_RBUF_SIZE);
if (copy_in)
@@ -1751,7 +2056,7 @@ static void *ata_scsi_rbuf_get(struct scsi_cmnd *cmd, bool copy_in,
* @copy_back is true.
*
* LOCKING:
- * Unlocks ata_scsi_rbuf_lock.
+ * Unlocks ata_scsi_lock.
*/
static inline void ata_scsi_rbuf_put(struct scsi_cmnd *cmd, bool copy_out,
unsigned long *flags)
@@ -1759,7 +2064,7 @@ static inline void ata_scsi_rbuf_put(struct scsi_cmnd *cmd, bool copy_out,
if (copy_out)
sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd),
ata_scsi_rbuf, ATA_SCSI_RBUF_SIZE);
- spin_unlock_irqrestore(&ata_scsi_rbuf_lock, *flags);
+ spin_unlock_irqrestore(&ata_scsi_lock, *flags);
}

/**
@@ -2814,9 +3119,10 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
* Pointer to translation function if possible, %NULL if not.
*/

-static inline ata_xlat_func_t ata_get_xlat_func(struct ata_device *dev, u8 cmd)
+static inline ata_xlat_func_t ata_get_xlat_func(struct ata_device *dev,
+ struct scsi_cmnd *scmd)
{
- switch (cmd) {
+ switch (scmd->cmnd[0]) {
case READ_6:
case READ_10:
case READ_16:
@@ -2841,6 +3147,18 @@ static inline ata_xlat_func_t ata_get_xlat_func(struct ata_device *dev, u8 cmd)

case START_STOP:
return ata_scsi_start_stop_xlat;
+
+ case VARIABLE_LENGTH_CMD:
+ if (scmd->cmd_len < 10)
+ break;
+ switch (((struct scsi_varlen_cdb_hdr *)
+ scmd->cmnd)->service_action) {
+ case PARK_HEADS:
+ return ata_scsi_park_xlat;
+
+ case UNPARK_HEADS:
+ return ata_scsi_unpark_xlat;
+ }
}

return NULL;
@@ -2874,7 +3192,6 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd,
void (*done)(struct scsi_cmnd *),
struct ata_device *dev)
{
- u8 scsi_op = scmd->cmnd[0];
ata_xlat_func_t xlat_func;
int rc = 0;

@@ -2882,15 +3199,15 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd,
if (unlikely(!scmd->cmd_len || scmd->cmd_len > dev->cdb_len))
goto bad_cdb_len;

- xlat_func = ata_get_xlat_func(dev, scsi_op);
+ xlat_func = ata_get_xlat_func(dev, scmd);
} else {
if (unlikely(!scmd->cmd_len))
goto bad_cdb_len;

xlat_func = NULL;
- if (likely((scsi_op != ATA_16) || !atapi_passthru16)) {
+ if (likely((scmd->cmnd[0] != ATA_16) || !atapi_passthru16)) {
/* relay SCSI command to ATAPI device */
- int len = COMMAND_SIZE(scsi_op);
+ int len = COMMAND_SIZE(scmd->cmnd[0]);
if (unlikely(len > scmd->cmd_len || len > dev->cdb_len))
goto bad_cdb_len;

@@ -2900,7 +3217,7 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd,
if (unlikely(scmd->cmd_len > 16))
goto bad_cdb_len;

- xlat_func = ata_get_xlat_func(dev, scsi_op);
+ xlat_func = ata_get_xlat_func(dev, scmd);
}
}

@@ -2913,7 +3230,7 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd,

bad_cdb_len:
DPRINTK("bad CDB len=%u, scsi_op=0x%02x, max=%u\n",
- scmd->cmd_len, scsi_op, dev->cdb_len);
+ scmd->cmd_len, scmd->cmnd[0], dev->cdb_len);
scmd->result = DID_ERROR << 16;
done(scmd);
return 0;
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index f6f9c28..1a03efa 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -31,6 +31,10 @@
#define DRV_NAME "libata"
#define DRV_VERSION "3.00" /* must be exactly four chars */

+/* private scsi opcodes */
+#define PARK_HEADS 0xf800
+#define UNPARK_HEADS 0xf801
+
struct ata_scsi_args {
struct ata_device *dev;
u16 *id;
@@ -149,6 +153,13 @@ extern void ata_scsi_hotplug(struct work_struct *work);
extern void ata_schedule_scsi_eh(struct Scsi_Host *shost);
extern void ata_scsi_dev_rescan(struct work_struct *work);
extern int ata_bus_probe(struct ata_port *ap);
+#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
+extern int ata_scsi_register_pm_notifier(void);
+extern int ata_scsi_unregister_pm_notifier(void);
+#else
+static inline int ata_scsi_register_pm_notifier(void) { return 0; }
+static inline int ata_scsi_unregister_pm_notifier(void) { return 0; }
+#endif

/* libata-eh.c */
extern unsigned long ata_internal_cmd_timeout(struct ata_device *dev, u8 cmd);
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 5b247b8..b40550d 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -150,6 +150,7 @@ enum {

ATA_DFLAG_DETACH = (1 << 24),
ATA_DFLAG_DETACHED = (1 << 25),
+ ATA_DFLAG_PARKED = (1 << 26),

ATA_DEV_UNKNOWN = 0, /* unknown device */
ATA_DEV_ATA = 1, /* ATA device */
@@ -451,6 +452,7 @@ enum link_pm {
MEDIUM_POWER,
};
extern struct device_attribute dev_attr_link_power_management_policy;
+extern struct device_attribute dev_attr_unload_heads;
extern struct device_attribute dev_attr_em_message_type;
extern struct device_attribute dev_attr_em_message;
extern struct device_attribute dev_attr_sw_activity;
@@ -592,6 +594,10 @@ struct ata_device {
u16 id[ATA_ID_WORDS]; /* IDENTIFY xxx DEVICE data */
u32 gscr[SATA_PMP_GSCR_DWORDS]; /* PMP GSCR block */
};
+
+ /* protected by block layer queue lock */
+ struct timer_list park_timer;
+ struct work_struct unpark_work;
};

/* Offset into struct ata_device. Fields above it are maintained

2008-07-26 06:27:29

by Elias Oltmanns

[permalink] [raw]
Subject: [PATCH 4/5] ide: Implement disk shock protection support

On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
FEATURE as specified in ATA-7 is issued to the device and processing of
the request queue is stopped thereafter until the speified timeout
expires or user space asks to resume normal operation. This is supposed
to prevent the heads of a hard drive from accidentally crashing onto the
platter when a heavy shock is anticipated (like a falling laptop
expected to hit the floor). This patch simply stops processing the
request queue. In particular, it does not yet, for instance, defer an
SRST issued in order to recover from an error on the other device on the
interface.

Signed-off-by: Elias Oltmanns <[email protected]>
---

drivers/ide/ide-io.c | 26 ++++++
drivers/ide/ide.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/ide.h | 5 +
3 files changed, 254 insertions(+), 0 deletions(-)

diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index ce9ecd1..888ed28 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -717,7 +717,28 @@ static ide_startstop_t execute_drive_cmd (ide_drive_t *drive,

static ide_startstop_t ide_special_rq(ide_drive_t *drive, struct request *rq)
{
+ ide_hwif_t *hwif = drive->hwif;
+ ide_task_t task;
+ struct ide_taskfile *tf = &task.tf;
+
+ memset(&task, 0, sizeof(task));
switch (rq->cmd[0]) {
+ case REQ_PARK_HEADS:
+ if (unlikely(!timer_pending(&drive->park_timer))) {
+ ide_end_request(drive, 0, 0);
+ return ide_stopped;
+ }
+ drive->sleep = drive->park_timer.expires;
+ drive->sleeping = 1;
+ tf->command = WIN_IDLEIMMEDIATE;
+ tf->feature = 0x44;
+ tf->lbal = 0x4c;
+ tf->lbam = 0x4e;
+ tf->lbah = 0x55;
+ break;
+ case REQ_UNPARK_HEADS:
+ tf->command = WIN_CHECKPOWERMODE1;
+ break;
case REQ_DRIVE_RESET:
return ide_do_reset(drive);
default:
@@ -725,6 +746,11 @@ static ide_startstop_t ide_special_rq(ide_drive_t *drive, struct request *rq)
ide_end_request(drive, 0, 0);
return ide_stopped;
}
+ task.tf_flags = IDE_TFLAG_TF | IDE_TFLAG_DEVICE;
+ task.rq = rq;
+ rq->special = &task;
+ hwif->data_phase = task.data_phase = TASKFILE_NO_DATA;
+ return do_rw_taskfile(drive, &task);
}

static void ide_check_pm_state(ide_drive_t *drive, struct request *rq)
diff --git a/drivers/ide/ide.c b/drivers/ide/ide.c
index 21b3a76..ecc9075 100644
--- a/drivers/ide/ide.c
+++ b/drivers/ide/ide.c
@@ -57,9 +57,13 @@
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/pci.h>
+#include <linux/ata.h>
#include <linux/ide.h>
#include <linux/completion.h>
#include <linux/device.h>
+#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
+# include <linux/suspend.h>
+#endif


/* default maximum number of failures */
@@ -78,6 +82,134 @@ DEFINE_MUTEX(ide_cfg_mtx);
__cacheline_aligned_in_smp DEFINE_SPINLOCK(ide_lock);
EXPORT_SYMBOL(ide_lock);

+#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
+static int ide_park_count = 1;
+DECLARE_WAIT_QUEUE_HEAD(ide_park_wq);
+
+static inline int suspend_parking(void)
+{
+ spin_lock_irq(&ide_lock);
+ if (ide_park_count == 1)
+ ide_park_count = 0;
+ spin_unlock_irq(&ide_lock);
+ return !ide_park_count;
+}
+
+static int ide_pm_notifier(struct notifier_block *nb, unsigned long val,
+ void *null)
+{
+ switch (val) {
+ case PM_SUSPEND_PREPARE:
+ wait_event(ide_park_wq, suspend_parking());
+ break;
+ case PM_POST_SUSPEND:
+ ide_park_count = 1;
+ break;
+ default:
+ return NOTIFY_DONE;
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block ide_pm_notifier_block = {
+ .notifier_call = ide_pm_notifier,
+};
+
+static inline int ide_register_pm_notifier(void)
+{
+ return register_pm_notifier(&ide_pm_notifier_block);
+}
+
+static inline int ide_unregister_pm_notifier(void)
+{
+ return unregister_pm_notifier(&ide_pm_notifier_block);
+}
+
+static inline void signal_unpark(void)
+{
+ ide_park_count--;
+ wake_up_all(&ide_park_wq);
+}
+
+static inline int ide_mod_park_timer(struct timer_list *timer,
+ unsigned long timeout)
+{
+ if (likely(ide_park_count)) {
+ if (!mod_timer(timer, timeout)) {
+ ide_park_count++;
+ return 0;
+ }
+ return 1;
+ }
+ return -EBUSY;
+}
+#else /* defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION) */
+static inline int ide_register_pm_notifier(void) { return 0; }
+
+static inline int ide_unregister_pm_notifier(void) { return 0; }
+
+static inline void signal_unpark(void) { }
+
+static inline int ide_mod_park_timer(struct timer_list *timer,
+ unsigned long timeout)
+{
+ return mod_timer(timer, timeout);
+}
+#endif /* defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION) */
+
+static void issue_park_cmd(ide_drive_t *drive, struct request *rq, u8 op_code)
+{
+ ide_hwgroup_t *hwgroup = drive->hwif->hwgroup;
+ struct request_queue *q = drive->queue;
+
+ rq->cmd[0] = op_code;
+ rq->cmd_len = 1;
+ rq->cmd_type = REQ_TYPE_SPECIAL;
+ rq->cmd_flags |= REQ_SOFTBARRIER;
+ __elv_add_request(q, rq, ELEVATOR_INSERT_FRONT, 0);
+ switch (op_code) {
+ case REQ_PARK_HEADS:
+ blk_stop_queue(q);
+ q->request_fn(q);
+ break;
+ case REQ_UNPARK_HEADS:
+ drive->sleeping = 0;
+ if (hwgroup->sleeping) {
+ del_timer(&hwgroup->timer);
+ hwgroup->sleeping = 0;
+ hwgroup->busy = 0;
+ }
+ blk_start_queue(q);
+ break;
+ default:
+ BUG();
+ }
+}
+
+static void unpark_work(struct work_struct *work)
+{
+ ide_drive_t *drive = container_of(work, ide_drive_t, unpark_work);
+ struct request_queue *q = drive->queue;
+ struct request *rq;
+
+ rq = blk_get_request(q, READ, __GFP_WAIT);
+ spin_lock_irq(&ide_lock);
+ if (likely(!timer_pending(&drive->park_timer)))
+ issue_park_cmd(drive, rq, REQ_UNPARK_HEADS);
+ else
+ __blk_put_request(q, rq);
+ signal_unpark();
+ spin_unlock_irq(&ide_lock);
+ ide_device_put(drive);
+}
+
+static void park_timeout(unsigned long data)
+{
+ ide_drive_t *drive = (ide_drive_t *)data;
+
+ schedule_work(&drive->unpark_work);
+}
+
static void ide_port_init_devices_data(ide_hwif_t *);

/*
@@ -130,6 +262,10 @@ static void ide_port_init_devices_data(ide_hwif_t *hwif)

INIT_LIST_HEAD(&drive->list);
init_completion(&drive->gendev_rel_comp);
+ INIT_WORK(&drive->unpark_work, unpark_work);
+ drive->park_timer.function = park_timeout;
+ drive->park_timer.data = (unsigned long)drive;
+ init_timer(&drive->park_timer);
}
}

@@ -770,6 +906,84 @@ static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
return sprintf(buf, "%s\n", (char *)&drive->id[ATA_ID_SERNO]);
}

+static ssize_t park_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ ide_drive_t *drive = to_ide_device(dev);
+ unsigned int seconds;
+
+ if (!ata_id_has_unload(drive->id))
+ return -EOPNOTSUPP;
+
+ spin_lock_irq(&ide_lock);
+ if (timer_pending(&drive->park_timer))
+ /*
+ * Adding 1 in order to guarantee nonzero value until timer
+ * has actually expired.
+ */
+ seconds = jiffies_to_msecs(drive->park_timer.expires - jiffies)
+ / 1000 + 1;
+ else
+ seconds = 0;
+ spin_unlock_irq(&ide_lock);
+ return snprintf(buf, 20, "%u\n", seconds);
+}
+
+static ssize_t park_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+#define MAX_PARK_TIMEOUT 30
+ ide_drive_t *drive = to_ide_device(dev);
+ struct request_queue *q = drive->queue;
+ struct request *rq;
+ unsigned long seconds;
+ char *p;
+ int skipped_cmd = 0, rc = 0;
+
+ seconds = simple_strtoul((char *)buf, &p, 0);
+ if (p == buf || (*p != '\0' && (*p != '\n' || *(p + 1) != '\0'))
+ || seconds > MAX_PARK_TIMEOUT)
+ return -EINVAL;
+ if (!ata_id_has_unload(drive->id))
+ return -EOPNOTSUPP;
+
+ rq = blk_get_request(q, READ, __GFP_WAIT);
+ spin_lock_irq(&ide_lock);
+ if (seconds) {
+ if (unlikely(ide_device_get(drive))) {
+ skipped_cmd = 1;
+ rc = -ENXIO;
+ goto free_rq;
+ }
+ rc = ide_mod_park_timer(&drive->park_timer,
+ msecs_to_jiffies(seconds * 1000)
+ + jiffies);
+ if (!rc)
+ issue_park_cmd(drive, rq, REQ_PARK_HEADS);
+ else {
+ if (likely(rc == 1)) {
+ drive->sleep = drive->park_timer.expires;
+ rc = 0;
+ }
+ ide_device_put(drive);
+ skipped_cmd = 1;
+ }
+ } else {
+ if (del_timer(&drive->park_timer)) {
+ issue_park_cmd(drive, rq, REQ_UNPARK_HEADS);
+ signal_unpark();
+ ide_device_put(drive);
+ } else
+ skipped_cmd = 1;
+ }
+free_rq:
+ if (skipped_cmd)
+ __blk_put_request(q, rq);
+ spin_unlock_irq(&ide_lock);
+
+ return rc ? rc : len;
+}
+
static struct device_attribute ide_dev_attrs[] = {
__ATTR_RO(media),
__ATTR_RO(drivename),
@@ -777,6 +991,7 @@ static struct device_attribute ide_dev_attrs[] = {
__ATTR_RO(model),
__ATTR_RO(firmware),
__ATTR(serial, 0400, serial_show, NULL),
+ __ATTR(unload_heads, 0644, park_show, park_store),
__ATTR_NULL
};

@@ -1032,6 +1247,12 @@ static int __init ide_init(void)
goto out_port_class;
}

+ ret = ide_register_pm_notifier();
+ if (ret) {
+ class_destroy(ide_port_class);
+ goto out_port_class;
+ }
+
proc_ide_create();

return 0;
@@ -1046,6 +1267,8 @@ static void __exit ide_exit(void)
{
proc_ide_destroy();

+ ide_unregister_pm_notifier();
+
class_destroy(ide_port_class);

bus_unregister(&ide_bus_type);
diff --git a/include/linux/ide.h b/include/linux/ide.h
index 2d8d21c..625db11 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -143,6 +143,8 @@ struct ide_io_ports {
* Values should be in the range of 0x20 to 0x3f.
*/
#define REQ_DRIVE_RESET 0x20
+#define REQ_PARK_HEADS 0x21
+#define REQ_UNPARK_HEADS 0x22

/*
* Check for an interrupt and acknowledge the interrupt status
@@ -477,6 +479,9 @@ struct ide_drive_s {
void (*pc_callback)(struct ide_drive_s *);

unsigned long atapi_flags;
+
+ struct timer_list park_timer; /* protected by queue_lock */
+ struct work_struct unpark_work;
};

typedef struct ide_drive_s ide_drive_t;

2008-07-26 06:29:31

by Elias Oltmanns

[permalink] [raw]
Subject: [PATCH 5/5] Add documentation for hard disk shock protection interface

Put some information (and pointers to more) into the kernel's doc tree,
describing briefly how to set up disk shock protection under GNU/Linux.

Signed-off-by: Elias Oltmanns <[email protected]>
---

Documentation/laptops/disk-shock-protection.txt | 66 +++++++++++++++++++++++
1 files changed, 66 insertions(+), 0 deletions(-)
create mode 100644 Documentation/laptops/disk-shock-protection.txt

diff --git a/Documentation/laptops/disk-shock-protection.txt b/Documentation/laptops/disk-shock-protection.txt
new file mode 100644
index 0000000..ac26040
--- /dev/null
+++ b/Documentation/laptops/disk-shock-protection.txt
@@ -0,0 +1,66 @@
+Hard disk shock protection
+==========================
+
+Author: Elias Oltmanns <[email protected]>
+Last modified: 2008-07-24
+
+
+Intro
+-----
+
+ATA/ATAPI-7 specifies the IDLE IMMEDIATE command with UNLOAD FEATURE.
+Issuing this command should cause the drive to switch to idle mode and
+unload disk heads. This feature is being used in modern laptops in
+conjunction with accelerometers and appropriate software to implement
+a shock protection facility. The idea is to stop all I/O operations on
+the internal hard drive and park its heads on the ramp when critical
+situations are anticipated. The desire to have such a feature
+available on GNU/Linux systems has been the original motivation to
+implement a generic disk head parking interface in the Linux kernel.
+
+
+The interface
+-------------
+
+The interface works as follows: Writing an integer value to
+/sys/block/*/device/unload_heads will take the heads of the respective
+drive off the platter and block all I/O operations for the specified
+number of seconds. When the timeout expires and no further disk head
+park request has been issued in the meantime, normal operation will be
+resumed.
+
+FIXME:
+Once we've addressed the problem with pre-ATA-7 drives supporting the
+UNLOAD FEATURE but not announcing it properly, we'll document it
+accordingly.
+
+
+References
+----------
+
+There are several laptops from different brands featuring shock
+protection capabilities. As manufacturers have refused to support open
+source development of the required software components so far, Linux
+support for shock protection varies considerably between different
+hardware implementations. Ideally, this section should contain a list
+of pointers at different projects aiming at an implementation of shock
+protection on different systeems. Unfortunately, I only know of a
+single project which, although still considered experimental, is fit
+for use. Please feel free to add projects that have been the victims
+of my ignorance.
+
+- http://www.thinkwiki.org/wiki/HDAPS
+ See this page for information about Linux support of the hard disk
+ active protection syystem as implemented in IBM/Lenovo Thinkpads.
+ (FIXME: The information there will have to be updated once this
+ patch has been approved or the user interface has been agreed upon
+ at least.)
+
+
+CREDITS
+-------
+
+This implementation of disk head parking has been based on a patch
+originally published by Jon Escombe <[email protected]>. Assisted by
+various kernel developers, the author of this document has rewritten
+the original patch in order to make it fit for upstream submission.

2008-07-26 09:19:38

by Sergei Shtylyov

[permalink] [raw]
Subject: Re: [PATCH 5/5] Add documentation for hard disk shock protection interface

Hello.

Elias Oltmanns wrote:
> Put some information (and pointers to more) into the kernel's doc tree,
> describing briefly how to set up disk shock protection under GNU/Linux.
>
> Signed-off-by: Elias Oltmanns <[email protected]>
>
[...]
> diff --git a/Documentation/laptops/disk-shock-protection.txt b/Documentation/laptops/disk-shock-protection.txt
> new file mode 100644
> index 0000000..ac26040
> --- /dev/null
> +++ b/Documentation/laptops/disk-shock-protection.txt
>
[...]
> +
> +- http://www.thinkwiki.org/wiki/HDAPS
> + See this page for information about Linux support of the hard disk
> + active protection syystem as implemented in IBM/Lenovo Thinkpads.
>

A typo in "system". :-)

MBR, Sergei

2008-07-26 16:19:55

by Alan

[permalink] [raw]
Subject: Re: [PATCH 2/5] Introduce ata_id_has_unload()

On Sat, 26 Jul 2008 08:24:35 +0200
Elias Oltmanns <[email protected]> wrote:

> Add a function to check an ATA device's id for head unload support as
> specified in ATA-7.

If a feature is new in ATA 7 you should also check the ATA version is >=7
in the tests. Just in case some prehistoric device has it down for a
different purpose.

Alan

2008-08-01 07:20:03

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Elias Oltmanns wrote:
> On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
> FEATURE as specified in ATA-7 is issued to the device and processing of
> the request queue is stopped thereafter until the speified timeout
> expires or user space asks to resume normal operation. This is supposed
> to prevent the heads of a hard drive from accidentally crashing onto the
> platter when a heavy shock is anticipated (like a falling laptop
> expected to hit the floor). This patch simply stops processing the
> request queue. In particular, it does not yet, for instance, defer an
> SRST issued in order to recover from an error on the other device on the
> interface.

For libata, the easiest way to achieve the above would be adding a
per-dev EH action, say, ATA_EH_UNLOAD and schedule EH w/ the action OR'd
to eh_info->action. The EH_UNLOAD handler can then issue the command
wait for the specified number of seconds and continue. This will be
pretty simple to implement as command exclusion and stuff are all
automatically handled by EH framework.

However, SATA or not, there simply isn't a way to abort commands in ATA.
Issuing random command while other commands are in progress simply is
state machine violation and there will be many interesting results
including complete system lockup (ATA controller dying while holding the
PCI bus). The only reliable way to abort in-flight commands are by
issuing hardreset. However, ATA reset protocol is not designed for
quick recovery. The machine is gonna hit the ground hard way before the
reset protocol is complete.

The only way to solve this nicely is either to build the accelerometer
into the drive and let the drive itself protect itself or implement a
sideband signal to tell it to duck for cover. For SATA, this sideband
signal can be another OOB sequence. If it's ever implemented this way,
it will be in SControl, I guess.

Well, short of that, all we can do is to wait for the currently
in-flight commands to drain and hope that it happens before the machine
hits the ground. Also, that the harddrive is not going through one of
the longish EH recovery sequences when it starts to fall. :-(

--
tejun

2008-08-01 22:53:11

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

> However, SATA or not, there simply isn't a way to abort commands in ATA.
> Issuing random command while other commands are in progress simply is
> state machine violation and there will be many interesting results
> including complete system lockup (ATA controller dying while holding the
> PCI bus). The only reliable way to abort in-flight commands are by
> issuing hardreset. However, ATA reset protocol is not designed for
> quick recovery. The machine is gonna hit the ground hard way before the
> reset protocol is complete.

Actually you cau can issue idle immediate on older ATA devices. I am not
clear if that was stuck back into the current accelerometer friendly
drives or not. Would need to check with IBLenovo

2008-08-03 03:20:45

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Alan Cox wrote:
>> However, SATA or not, there simply isn't a way to abort commands in ATA.
>> Issuing random command while other commands are in progress simply is
>> state machine violation and there will be many interesting results
>> including complete system lockup (ATA controller dying while holding the
>> PCI bus). The only reliable way to abort in-flight commands are by
>> issuing hardreset. However, ATA reset protocol is not designed for
>> quick recovery. The machine is gonna hit the ground hard way before the
>> reset protocol is complete.
>
> Actually you cau can issue idle immediate on older ATA devices. I am not
> clear if that was stuck back into the current accelerometer friendly
> drives or not. Would need to check with IBLenovo

Was that something intentional or was it a happy accident? There can be
bus ownership problem on PATA and on SATA this is much more state logic
on both sides of the cable and I think things like that would be more
difficult to work accidentally.

Thanks.

--
tejun

2008-08-03 13:23:43

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

> Was that something intentional or was it a happy accident? There can be
> bus ownership problem on PATA and on SATA this is much more state logic

IDLE IMMEDIATE is part of the "historic" specification rather than modern
ATA standards. Older drives support it happily newer ones tended to get
quite cross.

> on both sides of the cable and I think things like that would be more
> difficult to work accidentally.

We'd have to query the drive vendors to see what they expected.

2008-08-03 13:59:17

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Alan Cox wrote:
>> on both sides of the cable and I think things like that would be more
>> difficult to work accidentally.
>
> We'd have to query the drive vendors to see what they expected.

We'll have to query the controller vendors too and I think it's highly
likely to cause problems on many controllers.

--
tejun

2008-08-04 06:39:21

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 4/5] ide: Implement disk shock protection support

Hi!

> #include <linux/ide.h>
> #include <linux/completion.h>
> #include <linux/device.h>
> +#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
> +# include <linux/suspend.h>
> +#endif

This ifdef should be unneccessary.


> +#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
> +static int ide_park_count = 1;
> +DECLARE_WAIT_QUEUE_HEAD(ide_park_wq);
> +
> +static inline int suspend_parking(void)
> +{
> + spin_lock_irq(&ide_lock);
> + if (ide_park_count == 1)
> + ide_park_count = 0;
> + spin_unlock_irq(&ide_lock);
> + return !ide_park_count;
> +}
> +
> +static int ide_pm_notifier(struct notifier_block *nb, unsigned long val,
> + void *null)
> +{
> + switch (val) {
> + case PM_SUSPEND_PREPARE:
> + wait_event(ide_park_wq, suspend_parking());
> + break;
> + case PM_POST_SUSPEND:
> + ide_park_count = 1;
> + break;
> + default:
> + return NOTIFY_DONE;
> + }
> + return NOTIFY_OK;
> +}
> +
> +static struct notifier_block ide_pm_notifier_block = {
> + .notifier_call = ide_pm_notifier,
> +};
> +
> +static inline int ide_register_pm_notifier(void)
> +{
> + return register_pm_notifier(&ide_pm_notifier_block);
> +}
> +
> +static inline int ide_unregister_pm_notifier(void)
> +{
> + return unregister_pm_notifier(&ide_pm_notifier_block);
> +}

Any reason this does not use normal driver model suspend/resume
callbacks?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-08-04 13:29:44

by Elias Oltmanns

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Tejun Heo <[email protected]> wrote:
> Elias Oltmanns wrote:
>> On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
>
>> FEATURE as specified in ATA-7 is issued to the device and processing of
>> the request queue is stopped thereafter until the speified timeout
>> expires or user space asks to resume normal operation. This is supposed
>> to prevent the heads of a hard drive from accidentally crashing onto the
>> platter when a heavy shock is anticipated (like a falling laptop
>> expected to hit the floor). This patch simply stops processing the
>> request queue. In particular, it does not yet, for instance, defer an
>> SRST issued in order to recover from an error on the other device on the
>> interface.
>
> For libata, the easiest way to achieve the above would be adding a
> per-dev EH action, say, ATA_EH_UNLOAD and schedule EH w/ the action OR'd
> to eh_info->action. The EH_UNLOAD handler can then issue the command
> wait for the specified number of seconds and continue. This will be
> pretty simple to implement as command exclusion and stuff are all
> automatically handled by EH framework.

I'm rather afraid this approach is impractical or unfavourable at the
very least. Depending on the configured thresholds, a head unload
request might well be issued unintentionally, e.g. by accidentally
knocking against the table. It is quite alright for the HD to stop I/O
for a moment but if the secondary device on the interface happens to be
a CD writer, it will be very annoying to have CD writing operations fail
due to minor percussions. Also, if there are two devices on the same
port that support the UNLOAD FEATURE and you issue a head unload request
to both of them in close succession, the IDLE IMMEDIATE to the second
device will be blocked until the timeout for the first has expired.

Generally, blocking SRST and the likes on a port seems acceptable, but
stopping all I/O on a port just because a head unload request has been
issued to a single device attached to it is not an option.

>
> However, SATA or not, there simply isn't a way to abort commands in ATA.
> Issuing random command while other commands are in progress simply is
> state machine violation and there will be many interesting results
> including complete system lockup (ATA controller dying while holding the
> PCI bus). The only reliable way to abort in-flight commands are by
> issuing hardreset. However, ATA reset protocol is not designed for
> quick recovery. The machine is gonna hit the ground hard way before the
> reset protocol is complete.

Yes, I suspected as much. Thanks for the confirmation.

[...]
>
> Well, short of that, all we can do is to wait for the currently
> in-flight commands to drain and hope that it happens before the machine
> hits the ground. Also, that the harddrive is not going through one of
> the longish EH recovery sequences when it starts to fall. :-(

Yes, it's a bit unsatisfactory but it's better than nothing.

Regards,

Elias

2008-08-04 13:47:19

by Elias Oltmanns

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Alan Cox <[email protected]> wrote:
>> However, SATA or not, there simply isn't a way to abort commands in ATA.
>
>> Issuing random command while other commands are in progress simply is
>> state machine violation and there will be many interesting results
>> including complete system lockup (ATA controller dying while holding the
>> PCI bus). The only reliable way to abort in-flight commands are by
>> issuing hardreset. However, ATA reset protocol is not designed for
>> quick recovery. The machine is gonna hit the ground hard way before the
>> reset protocol is complete.
>
> Actually you cau can issue idle immediate on older ATA devices. I am not
> clear if that was stuck back into the current accelerometer friendly
> drives or not. Would need to check with IBLenovo

In ide_atapi_error() IDLE IMMEDIATE is issued even if busy bit is still
set. This made me hope that we could do something similar wrt disk head
unloading. However, since I haven't found anything about this in the
specs and considering Tejun's comments, I'm now wondering whether the
code in ide_atapi_error() isn't a little imprudent. Also, as long as
nothing definite is said about it in the specs, we can only issue IDLE
IMMEDIATE in parallel to other commands on devices we know to support
it, right?

Regards,

Elias

2008-08-04 14:12:55

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Elias Oltmanns wrote:
>> For libata, the easiest way to achieve the above would be adding a
>> per-dev EH action, say, ATA_EH_UNLOAD and schedule EH w/ the action OR'd
>> to eh_info->action. The EH_UNLOAD handler can then issue the command
>> wait for the specified number of seconds and continue. This will be
>> pretty simple to implement as command exclusion and stuff are all
>> automatically handled by EH framework.
>
> I'm rather afraid this approach is impractical or unfavourable at the
> very least. Depending on the configured thresholds, a head unload
> request might well be issued unintentionally, e.g. by accidentally
> knocking against the table. It is quite alright for the HD to stop I/O
> for a moment but if the secondary device on the interface happens to be
> a CD writer, it will be very annoying to have CD writing operations fail
> due to minor percussions.

Why would it fail?

> Also, if there are two devices on the same
> port that support the UNLOAD FEATURE and you issue a head unload request
> to both of them in close succession, the IDLE IMMEDIATE to the second
> device will be blocked until the timeout for the first has expired.

Unload can be implemented as port-wide operation so that it issues IDLE
IMMEDIATE to all drives on the port but given that this is mostly for
laptop, this discussion is a bit peripheral.

--
tejun

2008-08-04 14:19:48

by Elias Oltmanns

[permalink] [raw]
Subject: Re: [PATCH 4/5] ide: Implement disk shock protection support

Pavel Machek <[email protected]> wrote:
> Hi!
>
>> #include <linux/ide.h>
>> #include <linux/completion.h>
>> #include <linux/device.h>
>> +#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
>> +# include <linux/suspend.h>
>> +#endif
>
> This ifdef should be unneccessary.

Right.

>
>
>> +#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_HIBERNATION)
>> +static int ide_park_count = 1;
>> +DECLARE_WAIT_QUEUE_HEAD(ide_park_wq);
>> +
>> +static inline int suspend_parking(void)
>> +{
>> + spin_lock_irq(&ide_lock);
>> + if (ide_park_count == 1)
>> + ide_park_count = 0;
>> + spin_unlock_irq(&ide_lock);
>> + return !ide_park_count;
>> +}
>> +
>> +static int ide_pm_notifier(struct notifier_block *nb, unsigned long val,
>> + void *null)
>> +{
>> + switch (val) {
>> + case PM_SUSPEND_PREPARE:
>> + wait_event(ide_park_wq, suspend_parking());
>> + break;
>> + case PM_POST_SUSPEND:
>> + ide_park_count = 1;
>> + break;
>> + default:
>> + return NOTIFY_DONE;
>> + }
>> + return NOTIFY_OK;
>> +}
>> +
>> +static struct notifier_block ide_pm_notifier_block = {
>> + .notifier_call = ide_pm_notifier,
>> +};
>> +
>> +static inline int ide_register_pm_notifier(void)
>> +{
>> + return register_pm_notifier(&ide_pm_notifier_block);
>> +}
>> +
>> +static inline int ide_unregister_pm_notifier(void)
>> +{
>> + return unregister_pm_notifier(&ide_pm_notifier_block);
>> +}
>
> Any reason this does not use normal driver model suspend/resume
> callbacks?

Yes, two in fact. Firstly, it would appear that after freeze_processes()
has completed, timers don't fire anymore. That's why we have to call the
unpark hook manually. The ATA suspend callback would be the place to do
that but the scsi ULD's suspend callback issues a cache sync and a disk
start_stop command before and waits indefinitely for completion.

Secondly and more importantly, using pm_notifiers is the right thing to
do as long as user processes control when to park and unpark the disks.
This way we can hold back from suspending until user space gives the all
clear.

Regards,

Elias

2008-08-04 14:28:42

by Gabor Gombas

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

On Fri, Aug 01, 2008 at 04:19:26PM +0900, Tejun Heo wrote:

> However, SATA or not, there simply isn't a way to abort commands in ATA.
> Issuing random command while other commands are in progress simply is
> state machine violation and there will be many interesting results
> including complete system lockup (ATA controller dying while holding the
> PCI bus).

A system lockup may be an acceptable compromise if that saves the
hardware. Maybe the kernel should explicitely panic unless the
controller/drive is known to be able to recover.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------

2008-08-04 14:32:07

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

On Mon, 4 Aug 2008 16:28:32 +0200
Gabor Gombas <[email protected]> wrote:

> On Fri, Aug 01, 2008 at 04:19:26PM +0900, Tejun Heo wrote:
>
> > However, SATA or not, there simply isn't a way to abort commands in ATA.
> > Issuing random command while other commands are in progress simply is
> > state machine violation and there will be many interesting results
> > including complete system lockup (ATA controller dying while holding the
> > PCI bus).
>
> A system lockup may be an acceptable compromise if that saves the
> hardware. Maybe the kernel should explicitely panic unless the
> controller/drive is known to be able to recover.

We've already been told that the accelerometer will now and then randomly
trigger due to other shock patterns like a bump. I don't want my laptop
to panic randomly on train journeys thank you.

Alan

2008-08-04 14:36:35

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Gabor Gombas wrote:
> On Fri, Aug 01, 2008 at 04:19:26PM +0900, Tejun Heo wrote:
>
>> However, SATA or not, there simply isn't a way to abort commands in ATA.
>> Issuing random command while other commands are in progress simply is
>> state machine violation and there will be many interesting results
>> including complete system lockup (ATA controller dying while holding the
>> PCI bus).
>
> A system lockup may be an acceptable compromise if that saves the
> hardware. Maybe the kernel should explicitely panic unless the
> controller/drive is known to be able to recover.

Such lockups usually would occur before the intervening command is
successfully issued. HSM violation occurs when the driver asks the
controller to send another command while it's already processing another
command. Heh... panicking on accelerometer would be fun tho. We're
gonna get ourselves really flamewars on just about every linux news site.

--
tejun

2008-08-04 16:58:15

by Elias Oltmanns

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Tejun Heo <[email protected]> wrote:
> Elias Oltmanns wrote:
>>> For libata, the easiest way to achieve the above would be adding a
>
>>> per-dev EH action, say, ATA_EH_UNLOAD and schedule EH w/ the action OR'd
>>> to eh_info->action. The EH_UNLOAD handler can then issue the command
>>> wait for the specified number of seconds and continue. This will be
>>> pretty simple to implement as command exclusion and stuff are all
>>> automatically handled by EH framework.
>>
>> I'm rather afraid this approach is impractical or unfavourable at the
>> very least. Depending on the configured thresholds, a head unload
>> request might well be issued unintentionally, e.g. by accidentally
>> knocking against the table. It is quite alright for the HD to stop I/O
>> for a moment but if the secondary device on the interface happens to be
>> a CD writer, it will be very annoying to have CD writing operations fail
>> due to minor percussions.
>
> Why would it fail?

To be quite honest, I don't know very much about the way CD writing
works. I just assumed that delaying queue processing for a CD writer for
several seconds would have very much the same effect as the input buffer
of cdrecord getting empty prematurely. Do you mean to say that CD
writing (or any other time expensive operation I haven't thought of)
won't be affected irrecoverably by interrupted command processing?

>
>> Also, if there are two devices on the same
>> port that support the UNLOAD FEATURE and you issue a head unload request
>> to both of them in close succession, the IDLE IMMEDIATE to the second
>> device will be blocked until the timeout for the first has expired.
>
> Unload can be implemented as port-wide operation so that it issues IDLE
> IMMEDIATE to all drives on the port but given that this is mostly for
> laptop, this discussion is a bit peripheral.

We can't rule out that the HD is connected as master and CDRW as slave
to the same controller in a PATA setup. I'm not familiar with the SATA
configurations in modern laptops though.

Regards,

Elias

2008-08-04 23:27:38

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Elias Oltmanns wrote:
>>> I'm rather afraid this approach is impractical or unfavourable at the
>>> very least. Depending on the configured thresholds, a head unload
>>> request might well be issued unintentionally, e.g. by accidentally
>>> knocking against the table. It is quite alright for the HD to stop I/O
>>> for a moment but if the secondary device on the interface happens to be
>>> a CD writer, it will be very annoying to have CD writing operations fail
>>> due to minor percussions.
>> Why would it fail?
>
> To be quite honest, I don't know very much about the way CD writing
> works. I just assumed that delaying queue processing for a CD writer for
> several seconds would have very much the same effect as the input buffer
> of cdrecord getting empty prematurely. Do you mean to say that CD
> writing (or any other time expensive operation I haven't thought of)
> won't be affected irrecoverably by interrupted command processing?

Any modern cd/dvd writer can happily recover from buffer underruns. I
think the physical shock itself has better chance of screwing up the
recording. The only thing to make sure is that no command is issued to
ATAPI devices. Other than that, there should be no problem.

>>> Also, if there are two devices on the same
>>> port that support the UNLOAD FEATURE and you issue a head unload request
>>> to both of them in close succession, the IDLE IMMEDIATE to the second
>>> device will be blocked until the timeout for the first has expired.
>> Unload can be implemented as port-wide operation so that it issues IDLE
>> IMMEDIATE to all drives on the port but given that this is mostly for
>> laptop, this discussion is a bit peripheral.
>
> We can't rule out that the HD is connected as master and CDRW as slave
> to the same controller in a PATA setup. I'm not familiar with the SATA
> configurations in modern laptops though.

On most, they occupy different channels and even when they reside on the
same channel, it just doesn't really matter these days. And even on
those cases, it would be better to use EH as that will make the IDLE
IMMEDIATE command always win the bus as soon as possible while IDLE
IMMEDIATE queued at the head of the drive queue could lose to an ATAPI
command.

Thanks.

--
tejun

2008-08-05 04:05:23

by Robert Hancock

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Alan Cox wrote:
> On Mon, 4 Aug 2008 16:28:32 +0200
> Gabor Gombas <[email protected]> wrote:
>
>> On Fri, Aug 01, 2008 at 04:19:26PM +0900, Tejun Heo wrote:
>>
>>> However, SATA or not, there simply isn't a way to abort commands in ATA.
>>> Issuing random command while other commands are in progress simply is
>>> state machine violation and there will be many interesting results
>>> including complete system lockup (ATA controller dying while holding the
>>> PCI bus).
>> A system lockup may be an acceptable compromise if that saves the
>> hardware. Maybe the kernel should explicitely panic unless the
>> controller/drive is known to be able to recover.
>
> We've already been told that the accelerometer will now and then randomly
> trigger due to other shock patterns like a bump. I don't want my laptop
> to panic randomly on train journeys thank you.

Yes, from what I've seen on these laptops, it doesn't take much to
trigger the shock protection in Windows - lifting the front of the
laptop off the table an inch and dropping it will do it, as will picking
it up and suddenly tilting it. (I think the idea is to detect situations
where the laptop starts to fall, as by the time it impacts the floor
it's too late.) There's an option in the settings to "ignore repetitive
shocks as experienced in a train or automobile".

So yes, it's reasonable to expect the shock protection to be triggered
pretty often.

2008-08-05 04:18:39

by Robert Hancock

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Tejun Heo wrote:
> Elias Oltmanns wrote:
>> On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
>> FEATURE as specified in ATA-7 is issued to the device and processing of
>> the request queue is stopped thereafter until the speified timeout
>> expires or user space asks to resume normal operation. This is supposed
>> to prevent the heads of a hard drive from accidentally crashing onto the
>> platter when a heavy shock is anticipated (like a falling laptop
>> expected to hit the floor). This patch simply stops processing the
>> request queue. In particular, it does not yet, for instance, defer an
>> SRST issued in order to recover from an error on the other device on the
>> interface.
>
> For libata, the easiest way to achieve the above would be adding a
> per-dev EH action, say, ATA_EH_UNLOAD and schedule EH w/ the action OR'd
> to eh_info->action. The EH_UNLOAD handler can then issue the command
> wait for the specified number of seconds and continue. This will be
> pretty simple to implement as command exclusion and stuff are all
> automatically handled by EH framework.
>
> However, SATA or not, there simply isn't a way to abort commands in ATA.
> Issuing random command while other commands are in progress simply is
> state machine violation and there will be many interesting results
> including complete system lockup (ATA controller dying while holding the
> PCI bus). The only reliable way to abort in-flight commands are by
> issuing hardreset. However, ATA reset protocol is not designed for
> quick recovery. The machine is gonna hit the ground hard way before the
> reset protocol is complete.

How long does hardreset have to take? I only see a 1ms delay in the
COMRESET process (sata_link_hardreset). I'd think it would be feasible
to do something like:

-stop the queue to prevent new commands from being issued
-wait a certain amount of time (20ms or so?) for existing command(s) to
complete, if they do then issue the idle command
-if time runs out, trigger a hardreset and then issue the idle command

The drive is going to take a little while to actually unload the heads
anyway, so a few milliseconds delay doesn't seem like a big deal..

>
> The only way to solve this nicely is either to build the accelerometer
> into the drive and let the drive itself protect itself or implement a
> sideband signal to tell it to duck for cover. For SATA, this sideband
> signal can be another OOB sequence. If it's ever implemented this way,
> it will be in SControl, I guess.
>
> Well, short of that, all we can do is to wait for the currently
> in-flight commands to drain and hope that it happens before the machine
> hits the ground. Also, that the harddrive is not going through one of
> the longish EH recovery sequences when it starts to fall. :-(

Well, Lenovo (and others?) have implemented this in Windows somehow.. It
would be interesting to know what solution they used there (either
hardreset, issue the command even when busy, or just wait for the
commands to hopefully finish in time).

2008-08-05 07:51:37

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Robert Hancock wrote:
>> However, SATA or not, there simply isn't a way to abort commands in ATA.
>> Issuing random command while other commands are in progress simply is
>> state machine violation and there will be many interesting results
>> including complete system lockup (ATA controller dying while holding the
>> PCI bus). The only reliable way to abort in-flight commands are by
>> issuing hardreset. However, ATA reset protocol is not designed for
>> quick recovery. The machine is gonna hit the ground hard way before the
>> reset protocol is complete.
>
> How long does hardreset have to take? I only see a 1ms delay in the
> COMRESET process (sata_link_hardreset). I'd think it would be feasible
> to do something like:
>
> -stop the queue to prevent new commands from being issued
> -wait a certain amount of time (20ms or so?) for existing command(s) to
> complete, if they do then issue the idle command
> -if time runs out, trigger a hardreset and then issue the idle command
>
> The drive is going to take a little while to actually unload the heads
> anyway, so a few milliseconds delay doesn't seem like a big deal..

Two major areas of delays are...

- Post-hardreset PHY readiness delay. It depends on both the controller
and drive. Some combination might take pretty short while there are
combinations which are known to take in the order of few seconds. It's
determined by sata_deb_timing_* arrays in libata-core.c. In most cases,
sata_deb_timing_normal works fine. Currently, sil24 needs the long
variant. Using the normal one, the shortest possible timing would be a
bit above 100ms as libata determines PHY is online only after the link
state hasn't oscillate for that long.

- Device readiness (the initial TF w/ signature). It depends on how the
drive implementation. If the drive is spinning, it's usually pretty
quick but there's no guarantee. Also, there's another problem that some
controllers just can't wait for device readiness after hardreset and
thus needs to perform softreset after hard one, which adds to the delay.

Missing either of the above two can jam the reset sequence forcing a
retry. It might work with some combinations of devices but given that
we wouldn't get too much test coverage I don't really think the overhead
and risk are justifiable.

>> The only way to solve this nicely is either to build the accelerometer
>> into the drive and let the drive itself protect itself or implement a
>> sideband signal to tell it to duck for cover. For SATA, this sideband
>> signal can be another OOB sequence. If it's ever implemented this way,
>> it will be in SControl, I guess.
>>
>> Well, short of that, all we can do is to wait for the currently
>> in-flight commands to drain and hope that it happens before the machine
>> hits the ground. Also, that the harddrive is not going through one of
>> the longish EH recovery sequences when it starts to fall. :-(
>
> Well, Lenovo (and others?) have implemented this in Windows somehow.. It
> would be interesting to know what solution they used there (either
> hardreset, issue the command even when busy, or just wait for the
> commands to hopefully finish in time).

I think just waiting till the currently pending commands are complete
and then issuing IDLE_IMMEDIATE would cover most of the cases. Longer
term, I really think there needs to be an out-of-band signal if this is
gonna get done right.

--
tejun

2008-08-05 13:43:40

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

On Mon, Aug 04, 2008 at 10:05:03PM -0600, Robert Hancock wrote:
> Yes, from what I've seen on these laptops, it doesn't take much to
> trigger the shock protection in Windows - lifting the front of the
> laptop off the table an inch and dropping it will do it,

A few years ago, I had a Thinkpad T21 laptop, accidentally slip
through my butterfingers and dropped about an inch before it landed on
the table. Unfortunately, (a) the Thinkpad T21 laptop was rather
heavy (compared to modern laptops), (b) it didn't have the rubber
"bubble" on the bottom of the laptop to cushion the landing as the T22
and T23's had (and I'm sure I know why it was added), and (c) the hard
drive was active at the time. It was enough to cause a head crash and
Linux immediately started reporting an exponentially increasing number
of write errors; the hard drive was totally unusable within an hour or
so.

So there's a reason why the anti-shock protection is set at a rather
sensitive level...

The real right answer though is to buy one of the laptop drives (such
as the Seagate Momentus 7200.2 or 7200.3) which has the anti-shock
detection built directly into the hard drive. That way you don't have
to have a daemon that sits in the OS waking up the CPU some 20 to 30
times a second and burning up your battery even when the laptop is
idle.

- Ted

Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

On Tue, 05 Aug 2008, Theodore Tso wrote:
> The real right answer though is to buy one of the laptop drives (such
> as the Seagate Momentus 7200.2 or 7200.3) which has the anti-shock
> detection built directly into the hard drive. That way you don't have
> to have a daemon that sits in the OS waking up the CPU some 20 to 30
> times a second and burning up your battery even when the laptop is
> idle.

Make that 50 times a second for the kernel over a slow LPC bus IO path, plus
whatever userspace needs to process that data and feed it back as an UNLOAD
IMMEDIATE request...

Yes, doing it inside the HD itself is *MUCH* better. But only very few
disks do it, so far, I only know of those Seagates. And they are available
only on SATA. So, many thinkpad owners will want "software-based" HDAPS for
at least five more years, maybe more (thinkpads REALLY have a long useful
life, we still see a good number of T23 in heavy use...).

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-08-05 15:14:26

by Éric Piel

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Theodore Tso schreef:
> The real right answer though is to buy one of the laptop drives (such
> as the Seagate Momentus 7200.2 or 7200.3) which has the anti-shock
> detection built directly into the hard drive. That way you don't have
> to have a daemon that sits in the OS waking up the CPU some 20 to 30
> times a second and burning up your battery even when the laptop is
> idle.
Well, it's not always _that_ bad. For instance, the accelerometer in the
HP laptops (lis3lv02dl) can monitor free-falls by itself. When the
laptop starts falling, it generates one interrupt.

Eric

2008-08-05 19:58:04

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

Hi!

> > Yes, from what I've seen on these laptops, it doesn't take much to
> > trigger the shock protection in Windows - lifting the front of the
> > laptop off the table an inch and dropping it will do it,
>
> A few years ago, I had a Thinkpad T21 laptop, accidentally slip
> through my butterfingers and dropped about an inch before it landed on
> the table. Unfortunately, (a) the Thinkpad T21 laptop was rather
> heavy (compared to modern laptops), (b) it didn't have the rubber
> "bubble" on the bottom of the laptop to cushion the landing as the T22
> and T23's had (and I'm sure I know why it was added), and (c) the hard
> drive was active at the time. It was enough to cause a head crash and
> Linux immediately started reporting an exponentially increasing number
> of write errors; the hard drive was totally unusable within an hour or
> so.
>
> So there's a reason why the anti-shock protection is set at a rather
> sensitive level...
>
> The real right answer though is to buy one of the laptop drives (such
> as the Seagate Momentus 7200.2 or 7200.3) which has the anti-shock
> detection built directly into the hard drive. That way you don't have
> to have a daemon that sits in the OS waking up the CPU some 20 to 30
> times a second and burning up your battery even when the laptop is
> idle.

Hmm, when the laptop is idle, "right thing" is to spin the disk down,
and at that point you no longer need to poll the accelerometer...

Yes, doing it in drive is probably right place... but I'm told the
algorithms are quite complex.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-08-05 19:59:17

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

On Tue 2008-08-05 17:14:02, Eric Piel wrote:
> Theodore Tso schreef:
>> The real right answer though is to buy one of the laptop drives (such
>> as the Seagate Momentus 7200.2 or 7200.3) which has the anti-shock
>> detection built directly into the hard drive. That way you don't have
>> to have a daemon that sits in the OS waking up the CPU some 20 to 30
>> times a second and burning up your battery even when the laptop is
>> idle.
> Well, it's not always _that_ bad. For instance, the accelerometer in the HP
> laptops (lis3lv02dl) can monitor free-falls by itself. When the laptop
> starts falling, it generates one interrupt.

I was told that "when it starts falling" is too late.

Thinkpads at least tried to detect the movement before fall: before
notebook falls off table, it is very likely to tilt a lot.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-08-05 23:03:19

by Elias Oltmanns

[permalink] [raw]
Subject: Re: [PATCH 3/5] libata: Implement disk shock protection support

[Resending because gmane cut off the To: header in Pavel's email, sorry.]

Pavel Machek <[email protected]> wrote:
> Hi!
>
[...]
>> So there's a reason why the anti-shock protection is set at a rather

[+]
>> sensitive level...
>>
>> The real right answer though is to buy one of the laptop drives (such
>> as the Seagate Momentus 7200.2 or 7200.3) which has the anti-shock
>> detection built directly into the hard drive. That way you don't have
>> to have a daemon that sits in the OS waking up the CPU some 20 to 30
>> times a second and burning up your battery even when the laptop is
>> idle.
>
> Hmm, when the laptop is idle, "right thing" is to spin the disk down,
> and at that point you no longer need to poll the accelerometer...

Not quite, I'm afraid. Even if the disk isn't spinning, we still have to
make sure that it won't spin up in a precarious situation. Of course, if
it wasn't user space but some kernel routine that consults the
accelerometer and decides when to stop I/O, then we could indeed stop
querying the accelerometer while the disk is in standby mode and delay a
subsequent spin up for the time required to gather the necessary
accelerometer data. However, as I have explained before, it is not quite
trivial to implement all this in kernel space and I'd like to get head
unloading merged first.

Regards,

Elias