2012-06-22 06:30:20

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 0/6] scsi, async: asynchronous probing rework / fixes

Set5 of 5 patchsets to update scsi, libsas, and libata in
support of the isci driver.

Commit 43a8d39d "[SCSI] fix async probe regression" found that
async_synchronize_full() was missing async work that was scheduled to
its own domain. This led James to note:

"My theory is that this is an init problem: The assumption in a lot of
our code is that async_synchronize_full() waits for everything ... even
the domain specific async schedules, which isn't true."

...and this set aims to make that assumption true, but also with the
ability to opt-out for "private" async work.

The other async probe fix is in the area of unplug events that occur in
the scsi async scanning interval. Essentially scsi_remove_target() can
now see semi-initialized scsi_targets that have yet to be added via
device_add().

If there are no objections I'll put these in -next. But I expect at
least patch1 and patch2 will need an ack from Arjan before the set shows
up in scsi.git.

---

Dan Williams (6):
async: introduce 'async_domain' type
async: make async_synchronize_full() flush all work regardless of domain
scsi: queue async scan work to an async_schedule domain
scsi: cleanup usages of scsi_complete_async_scans
Revert "[SCSI] fix async probe regression"
scsi: fix hot unplug vs async scan race


drivers/regulator/core.c | 2 +
drivers/scsi/libsas/sas_ata.c | 2 +
drivers/scsi/scsi.c | 4 ++
drivers/scsi/scsi_priv.h | 3 +-
drivers/scsi/scsi_scan.c | 34 ++++++------------
drivers/scsi/scsi_sysfs.c | 41 ++++++++++++++--------
drivers/scsi/scsi_wait_scan.c | 15 +++-----
include/linux/async.h | 36 +++++++++++++++++--
include/scsi/scsi_scan.h | 11 ------
kernel/async.c | 76 +++++++++++++++++++++++++++++++----------
kernel/power/hibernate.c | 8 ----
kernel/power/user.c | 2 -
12 files changed, 138 insertions(+), 96 deletions(-)
delete mode 100644 include/scsi/scsi_scan.h


2012-06-22 06:30:29

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 1/6] async: introduce 'async_domain' type

This is in preparation for teaching async_synchronize_full() to sync all
pending async work, and not just on the async_running domain. This
conversion is functionally equivalent, just embedding the existing list
in a new async_domain type.

The .registered attribute is used in a later patch to distinguish
between domains that want to be flushed by async_synchronize_full()
versus those that only expect async_synchronize_{full|cookie}_domain to
be used for flushing.

Cc: Liam Girdwood <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: James Bottomley <[email protected]>
Acked-by: Mark Brown <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/regulator/core.c | 2 +-
drivers/scsi/libsas/sas_ata.c | 2 +-
drivers/scsi/scsi.c | 3 ++-
drivers/scsi/scsi_priv.h | 3 ++-
include/linux/async.h | 35 +++++++++++++++++++++++++++++++----
kernel/async.c | 35 +++++++++++++++++------------------
6 files changed, 54 insertions(+), 26 deletions(-)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 09a737c..4293aae 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -2744,7 +2744,7 @@ static void regulator_bulk_enable_async(void *data, async_cookie_t cookie)
int regulator_bulk_enable(int num_consumers,
struct regulator_bulk_data *consumers)
{
- LIST_HEAD(async_domain);
+ ASYNC_DOMAIN_EXCLUSIVE(async_domain);
int i;
int ret = 0;

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 607a35b..899d190 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -828,7 +828,7 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
void sas_ata_strategy_handler(struct Scsi_Host *shost)
{
struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(shost);
- LIST_HEAD(async);
+ ASYNC_DOMAIN_EXCLUSIVE(async);
int i;

/* it's ok to defer revalidation events during ata eh, these
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index bbbc9c9..4cade88 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -54,6 +54,7 @@
#include <linux/notifier.h>
#include <linux/cpu.h>
#include <linux/mutex.h>
+#include <linux/async.h>

#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
@@ -91,7 +92,7 @@ EXPORT_SYMBOL(scsi_logging_level);
#endif

/* sd, scsi core and power management need to coordinate flushing async actions */
-LIST_HEAD(scsi_sd_probe_domain);
+ASYNC_DOMAIN(scsi_sd_probe_domain);
EXPORT_SYMBOL(scsi_sd_probe_domain);

/* NB: These are exposed through /proc/scsi/scsi and form part of the ABI.
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 07ce3f5..eab472d 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -2,6 +2,7 @@
#define _SCSI_PRIV_H

#include <linux/device.h>
+#include <linux/async.h>

struct request_queue;
struct request;
@@ -163,7 +164,7 @@ static inline int scsi_autopm_get_host(struct Scsi_Host *h) { return 0; }
static inline void scsi_autopm_put_host(struct Scsi_Host *h) {}
#endif /* CONFIG_PM_RUNTIME */

-extern struct list_head scsi_sd_probe_domain;
+extern struct async_domain scsi_sd_probe_domain;

/*
* internal scsi timeout functions: for use by mid-layer and transport
diff --git a/include/linux/async.h b/include/linux/async.h
index 68a9530..364e7ff 100644
--- a/include/linux/async.h
+++ b/include/linux/async.h
@@ -9,19 +9,46 @@
* as published by the Free Software Foundation; version 2
* of the License.
*/
+#ifndef __ASYNC_H__
+#define __ASYNC_H__

#include <linux/types.h>
#include <linux/list.h>

typedef u64 async_cookie_t;
typedef void (async_func_ptr) (void *data, async_cookie_t cookie);
+struct async_domain {
+ struct list_head node;
+ struct list_head domain;
+ int count;
+ unsigned registered:1;
+};
+
+/*
+ * domain participates in global async_synchronize_full
+ */
+#define ASYNC_DOMAIN(_name) \
+ struct async_domain _name = { .node = LIST_HEAD_INIT(_name.node), \
+ .domain = LIST_HEAD_INIT(_name.domain), \
+ .count = 0, \
+ .registered = 1 }
+
+/*
+ * domain is free to go out of scope as soon as all pending work is
+ * complete, this domain does not participate in async_synchronize_full
+ */
+#define ASYNC_DOMAIN_EXCLUSIVE(_name) \
+ struct async_domain _name = { .node = LIST_HEAD_INIT(_name.node), \
+ .domain = LIST_HEAD_INIT(_name.domain), \
+ .count = 0, \
+ .registered = 0 }

extern async_cookie_t async_schedule(async_func_ptr *ptr, void *data);
extern async_cookie_t async_schedule_domain(async_func_ptr *ptr, void *data,
- struct list_head *list);
+ struct async_domain *domain);
extern void async_synchronize_full(void);
-extern void async_synchronize_full_domain(struct list_head *list);
+extern void async_synchronize_full_domain(struct async_domain *domain);
extern void async_synchronize_cookie(async_cookie_t cookie);
extern void async_synchronize_cookie_domain(async_cookie_t cookie,
- struct list_head *list);
-
+ struct async_domain *domain);
+#endif
diff --git a/kernel/async.c b/kernel/async.c
index bd0c168..ba5491d 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -62,7 +62,7 @@ static async_cookie_t next_cookie = 1;
#define MAX_WORK 32768

static LIST_HEAD(async_pending);
-static LIST_HEAD(async_running);
+static ASYNC_DOMAIN(async_running);
static DEFINE_SPINLOCK(async_lock);

struct async_entry {
@@ -71,7 +71,7 @@ struct async_entry {
async_cookie_t cookie;
async_func_ptr *func;
void *data;
- struct list_head *running;
+ struct async_domain *running;
};

static DECLARE_WAIT_QUEUE_HEAD(async_done);
@@ -82,13 +82,12 @@ static atomic_t entry_count;
/*
* MUST be called with the lock held!
*/
-static async_cookie_t __lowest_in_progress(struct list_head *running)
+static async_cookie_t __lowest_in_progress(struct async_domain *running)
{
struct async_entry *entry;

- if (!list_empty(running)) {
- entry = list_first_entry(running,
- struct async_entry, list);
+ if (!list_empty(&running->domain)) {
+ entry = list_first_entry(&running->domain, typeof(*entry), list);
return entry->cookie;
}

@@ -99,7 +98,7 @@ static async_cookie_t __lowest_in_progress(struct list_head *running)
return next_cookie; /* "infinity" value */
}

-static async_cookie_t lowest_in_progress(struct list_head *running)
+static async_cookie_t lowest_in_progress(struct async_domain *running)
{
unsigned long flags;
async_cookie_t ret;
@@ -119,10 +118,11 @@ static void async_run_entry_fn(struct work_struct *work)
container_of(work, struct async_entry, work);
unsigned long flags;
ktime_t uninitialized_var(calltime), delta, rettime;
+ struct async_domain *running = entry->running;

/* 1) move self to the running queue */
spin_lock_irqsave(&async_lock, flags);
- list_move_tail(&entry->list, entry->running);
+ list_move_tail(&entry->list, &running->domain);
spin_unlock_irqrestore(&async_lock, flags);

/* 2) run (and print duration) */
@@ -156,7 +156,7 @@ static void async_run_entry_fn(struct work_struct *work)
wake_up(&async_done);
}

-static async_cookie_t __async_schedule(async_func_ptr *ptr, void *data, struct list_head *running)
+static async_cookie_t __async_schedule(async_func_ptr *ptr, void *data, struct async_domain *running)
{
struct async_entry *entry;
unsigned long flags;
@@ -223,7 +223,7 @@ EXPORT_SYMBOL_GPL(async_schedule);
* Note: This function may be called from atomic or non-atomic contexts.
*/
async_cookie_t async_schedule_domain(async_func_ptr *ptr, void *data,
- struct list_head *running)
+ struct async_domain *running)
{
return __async_schedule(ptr, data, running);
}
@@ -238,20 +238,20 @@ void async_synchronize_full(void)
{
do {
async_synchronize_cookie(next_cookie);
- } while (!list_empty(&async_running) || !list_empty(&async_pending));
+ } while (!list_empty(&async_running.domain) || !list_empty(&async_pending));
}
EXPORT_SYMBOL_GPL(async_synchronize_full);

/**
* async_synchronize_full_domain - synchronize all asynchronous function within a certain domain
- * @list: running list to synchronize on
+ * @domain: running list to synchronize on
*
* This function waits until all asynchronous function calls for the
- * synchronization domain specified by the running list @list have been done.
+ * synchronization domain specified by the running list @domain have been done.
*/
-void async_synchronize_full_domain(struct list_head *list)
+void async_synchronize_full_domain(struct async_domain *domain)
{
- async_synchronize_cookie_domain(next_cookie, list);
+ async_synchronize_cookie_domain(next_cookie, domain);
}
EXPORT_SYMBOL_GPL(async_synchronize_full_domain);

@@ -261,11 +261,10 @@ EXPORT_SYMBOL_GPL(async_synchronize_full_domain);
* @running: running list to synchronize on
*
* This function waits until all asynchronous function calls for the
- * synchronization domain specified by the running list @list submitted
+ * synchronization domain specified by running list @running submitted
* prior to @cookie have been done.
*/
-void async_synchronize_cookie_domain(async_cookie_t cookie,
- struct list_head *running)
+void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain *running)
{
ktime_t uninitialized_var(starttime), delta, endtime;

2012-06-22 06:30:41

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 2/6] async: make async_synchronize_full() flush all work regardless of domain

In response to an async related regression James noted:

"My theory is that this is an init problem: The assumption in a lot of
our code is that async_synchronize_full() waits for everything ... even
the domain specific async schedules, which isn't true."

...so make this assumption true.

Each domain, including the default one, registers itself on a global domain
list when work is scheduled. Once all entries complete it exits that
list. Waiting for the list to be empty syncs all in-flight work across
all domains.

Domains can opt-out of global syncing if they are declared as exclusive
ASYNC_DOMAIN_EXCLUSIVE(). All stack-based domains have been declared
exclusive since the domain may go out of scope as soon as the last work
item completes.

Statically declared domains are mostly ok, but async_unregister_domain()
is there to close any theoretical races with pending
async_synchronize_full waiters at module removal time.

Cc: Arjan van de Ven <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: James Bottomley <[email protected]>
Reported-by: Meelis Roos <[email protected]>
Reported-by: Eldad Zack <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/scsi/scsi.c | 1 +
include/linux/async.h | 1 +
kernel/async.c | 43 +++++++++++++++++++++++++++++++++++++++++--
3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 4cade88..2936b44 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -1355,6 +1355,7 @@ static void __exit exit_scsi(void)
scsi_exit_devinfo();
scsi_exit_procfs();
scsi_exit_queue();
+ async_unregister_domain(&scsi_sd_probe_domain);
}

subsys_initcall(init_scsi);
diff --git a/include/linux/async.h b/include/linux/async.h
index 364e7ff..7a24fe9 100644
--- a/include/linux/async.h
+++ b/include/linux/async.h
@@ -46,6 +46,7 @@ struct async_domain {
extern async_cookie_t async_schedule(async_func_ptr *ptr, void *data);
extern async_cookie_t async_schedule_domain(async_func_ptr *ptr, void *data,
struct async_domain *domain);
+void async_unregister_domain(struct async_domain *domain);
extern void async_synchronize_full(void);
extern void async_synchronize_full_domain(struct async_domain *domain);
extern void async_synchronize_cookie(async_cookie_t cookie);
diff --git a/kernel/async.c b/kernel/async.c
index ba5491d..9d31183 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -63,7 +63,9 @@ static async_cookie_t next_cookie = 1;

static LIST_HEAD(async_pending);
static ASYNC_DOMAIN(async_running);
+static LIST_HEAD(async_domains);
static DEFINE_SPINLOCK(async_lock);
+static DEFINE_MUTEX(async_register_mutex);

struct async_entry {
struct list_head list;
@@ -145,6 +147,8 @@ static void async_run_entry_fn(struct work_struct *work)
/* 3) remove self from the running queue */
spin_lock_irqsave(&async_lock, flags);
list_del(&entry->list);
+ if (running->registered && --running->count == 0)
+ list_del_init(&running->node);

/* 4) free the entry */
kfree(entry);
@@ -187,6 +191,8 @@ static async_cookie_t __async_schedule(async_func_ptr *ptr, void *data, struct a
spin_lock_irqsave(&async_lock, flags);
newcookie = entry->cookie = next_cookie++;
list_add_tail(&entry->list, &async_pending);
+ if (running->registered && running->count++ == 0)
+ list_add_tail(&running->node, &async_domains);
atomic_inc(&entry_count);
spin_unlock_irqrestore(&async_lock, flags);

@@ -236,13 +242,43 @@ EXPORT_SYMBOL_GPL(async_schedule_domain);
*/
void async_synchronize_full(void)
{
+ mutex_lock(&async_register_mutex);
do {
- async_synchronize_cookie(next_cookie);
- } while (!list_empty(&async_running.domain) || !list_empty(&async_pending));
+ struct async_domain *domain = NULL;
+
+ spin_lock_irq(&async_lock);
+ if (!list_empty(&async_domains))
+ domain = list_first_entry(&async_domains, typeof(*domain), node);
+ spin_unlock_irq(&async_lock);
+
+ async_synchronize_cookie_domain(next_cookie, domain);
+ } while (!list_empty(&async_domains));
+ mutex_unlock(&async_register_mutex);
}
EXPORT_SYMBOL_GPL(async_synchronize_full);

/**
+ * async_unregister_domain - ensure no more anonymous waiters on this domain
+ * @domain: idle domain to flush out of any async_synchronize_full instances
+ *
+ * async_synchronize_{cookie|full}_domain() are not flushed since callers
+ * of these routines should know the lifetime of @domain
+ *
+ * Prefer ASYNC_DOMAIN_EXCLUSIVE() declarations over flushing
+ */
+void async_unregister_domain(struct async_domain *domain)
+{
+ mutex_lock(&async_register_mutex);
+ spin_lock_irq(&async_lock);
+ WARN_ON(!domain->registered || !list_empty(&domain->node) ||
+ !list_empty(&domain->domain));
+ domain->registered = 0;
+ spin_unlock_irq(&async_lock);
+ mutex_unlock(&async_register_mutex);
+}
+EXPORT_SYMBOL_GPL(async_unregister_domain);
+
+/**
* async_synchronize_full_domain - synchronize all asynchronous function within a certain domain
* @domain: running list to synchronize on
*
@@ -268,6 +304,9 @@ void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain
{
ktime_t uninitialized_var(starttime), delta, endtime;

+ if (!running)
+ return;
+
if (initcall_debug && system_state == SYSTEM_BOOTING) {
printk(KERN_DEBUG "async_waiting @ %i\n", task_pid_nr(current));
starttime = ktime_get();

2012-06-22 06:30:53

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 6/6] scsi: fix hot unplug vs async scan race

The following crash results from cases where the end_device has been
removed before scsi_sysfs_add_sdev has had a chance to run.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6
...
Call Trace:
[<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3
[<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff8125e641>] kobject_add_varg+0x41/0x50
[<ffffffff8125e70b>] kobject_add+0x64/0x66
[<ffffffff8131122b>] device_add+0x12d/0x63a
[<ffffffff814b65ea>] ? _raw_spin_unlock_irqrestore+0x47/0x56
[<ffffffff8107de15>] ? module_refcount+0x89/0xa0
[<ffffffff8132f348>] scsi_sysfs_add_sdev+0x4e/0x28a
[<ffffffff8132dcbb>] do_scan_async+0x9c/0x145

...teach scsi_sysfs_add_devices() to check for deleted devices() before
trying to add them, and teach scsi_remove_target() how to remove targets
that have not been added via device_add().

Cc: Mike Christie <[email protected]>
Cc: Robert Love <[email protected]>
Cc: Nagalakshmi Nandigama <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: <[email protected]>
Reported-by: Dariusz Majchrzak <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/scsi/scsi_scan.c | 3 +++
drivers/scsi/scsi_sysfs.c | 41 ++++++++++++++++++++++++++---------------
2 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 03ebb37..56a9379 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1702,6 +1702,9 @@ static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
{
struct scsi_device *sdev;
shost_for_each_device(sdev, shost) {
+ /* target removed before the device could be added */
+ if (sdev->sdev_state == SDEV_DEL)
+ continue;
if (!scsi_host_scan_allowed(shost) ||
scsi_sysfs_add_sdev(sdev) != 0)
__scsi_remove_device(sdev);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 04c2a27..f888aad 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1000,7 +1000,6 @@ static void __scsi_remove_target(struct scsi_target *starget)
struct scsi_device *sdev;

spin_lock_irqsave(shost->host_lock, flags);
- starget->reap_ref++;
restart:
list_for_each_entry(sdev, &shost->__devices, siblings) {
if (sdev->channel != starget->channel ||
@@ -1014,14 +1013,6 @@ static void __scsi_remove_target(struct scsi_target *starget)
goto restart;
}
spin_unlock_irqrestore(shost->host_lock, flags);
- scsi_target_reap(starget);
-}
-
-static int __remove_child (struct device * dev, void * data)
-{
- if (scsi_is_target_device(dev))
- __scsi_remove_target(to_scsi_target(dev));
- return 0;
}

/**
@@ -1034,14 +1025,34 @@ static int __remove_child (struct device * dev, void * data)
*/
void scsi_remove_target(struct device *dev)
{
- if (scsi_is_target_device(dev)) {
- __scsi_remove_target(to_scsi_target(dev));
- return;
+ struct Scsi_Host *shost = dev_to_shost(dev->parent);
+ struct scsi_target *starget, *found;
+ unsigned long flags;
+
+ restart:
+ found = NULL;
+ spin_lock_irqsave(shost->host_lock, flags);
+ list_for_each_entry(starget, &shost->__targets, siblings) {
+ if (starget->state == STARGET_DEL)
+ continue;
+ if (starget->dev.parent == dev || &starget->dev == dev) {
+ found = starget;
+ found->reap_ref++;
+ break;
+ }
}
+ spin_unlock_irqrestore(shost->host_lock, flags);

- get_device(dev);
- device_for_each_child(dev, NULL, __remove_child);
- put_device(dev);
+ if (found) {
+ __scsi_remove_target(found);
+ scsi_target_reap(found);
+ /* in the case where @dev has multiple starget children,
+ * continue removing.
+ *
+ * FIXME: does such a case exist?
+ */
+ goto restart;
+ }
}
EXPORT_SYMBOL(scsi_remove_target);

2012-06-22 06:30:47

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 5/6] Revert "[SCSI] fix async probe regression"

This reverts commit 43a8d39d0137612c336aa8bbb2cb886a79772ffb.

Commit 43a8d39d fixed the fact that wait_for_device_probe() was unable
to flush sd probe work. Now that sd probe work is once again flushable
via wait_for_device_probe() this workaround is no longer needed.

Cc: Meelis Roos <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/scsi/scsi_scan.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 2731c81..03ebb37 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -147,7 +147,7 @@ int scsi_complete_async_scans(void)

do {
if (list_empty(&scanning_hosts))
- goto out;
+ return 0;
/* If we can't get memory immediately, that's OK. Just
* sleep a little. Even if we never get memory, the async
* scans will finish eventually.
@@ -179,11 +179,8 @@ int scsi_complete_async_scans(void)
}
done:
spin_unlock(&async_scan_lock);
- kfree(data);
-
- out:
- async_synchronize_full_domain(&scsi_sd_probe_domain);

+ kfree(data);
return 0;
}

2012-06-22 06:31:32

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 3/6] scsi: queue async scan work to an async_schedule domain

This is preparation to enable async_synchronize_full() to be used as a
replacement for scsi_complete_async_scans(), i.e. to stop leaking scsi
internal details where they are not needed.

Cc: Arjan van de Ven <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: James Bottomley <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/scsi/scsi_scan.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 2e5fe58..a381fa2 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1842,14 +1842,13 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
}
}

-static int do_scan_async(void *_data)
+static void do_scan_async(void *_data, async_cookie_t c)
{
struct async_scan_data *data = _data;
struct Scsi_Host *shost = data->shost;

do_scsi_scan_host(shost);
scsi_finish_async_scan(data);
- return 0;
}

/**
@@ -1858,7 +1857,6 @@ static int do_scan_async(void *_data)
**/
void scsi_scan_host(struct Scsi_Host *shost)
{
- struct task_struct *p;
struct async_scan_data *data;

if (strncmp(scsi_scan_type, "none", 4) == 0)
@@ -1873,9 +1871,11 @@ void scsi_scan_host(struct Scsi_Host *shost)
return;
}

- p = kthread_run(do_scan_async, data, "scsi_scan_%d", shost->host_no);
- if (IS_ERR(p))
- do_scan_async(data);
+ /* register with the async subsystem so wait_for_device_probe()
+ * will flush this work
+ */
+ async_schedule(do_scan_async, data);
+
/* scsi_autopm_put_host(shost) is called in scsi_finish_async_scan() */
}
EXPORT_SYMBOL(scsi_scan_host);

2012-06-22 06:31:30

by Dan Williams

[permalink] [raw]
Subject: [set5 PATCH 4/6] scsi: cleanup usages of scsi_complete_async_scans

Now that scsi registers its async scan work with the async subsystem,
wait_for_device_probe() is sufficient for ensuring all scanning is
complete.

Cc: Arjan van de Ven <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: James Bottomley <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/scsi/scsi_scan.c | 12 ------------
drivers/scsi/scsi_wait_scan.c | 15 +++++----------
include/scsi/scsi_scan.h | 11 -----------
kernel/power/hibernate.c | 8 --------
kernel/power/user.c | 2 --
5 files changed, 5 insertions(+), 43 deletions(-)
delete mode 100644 include/scsi/scsi_scan.h

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index a381fa2..2731c81 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -187,18 +187,6 @@ int scsi_complete_async_scans(void)
return 0;
}

-/* Only exported for the benefit of scsi_wait_scan */
-EXPORT_SYMBOL_GPL(scsi_complete_async_scans);
-
-#ifndef MODULE
-/*
- * For async scanning we need to wait for all the scans to complete before
- * trying to mount the root fs. Otherwise non-modular drivers may not be ready
- * yet.
- */
-late_initcall(scsi_complete_async_scans);
-#endif
-
/**
* scsi_unlock_floptical - unlock device via a special MODE SENSE command
* @sdev: scsi device to send command to
diff --git a/drivers/scsi/scsi_wait_scan.c b/drivers/scsi/scsi_wait_scan.c
index ae78148..57de24a 100644
--- a/drivers/scsi/scsi_wait_scan.c
+++ b/drivers/scsi/scsi_wait_scan.c
@@ -12,21 +12,16 @@

#include <linux/module.h>
#include <linux/device.h>
-#include "scsi_priv.h"

static int __init wait_scan_init(void)
{
/*
- * First we need to wait for device probing to finish;
- * the drivers we just loaded might just still be probing
- * and might not yet have reached the scsi async scanning
+ * This will not return until all async work (system wide) is
+ * quiesced. Probing queues host-scanning work to the async
+ * queue which is why we don't need a separate call to
+ * scsi_complete_async_scans()
*/
wait_for_device_probe();
- /*
- * and then we wait for the actual asynchronous scsi scan
- * to finish.
- */
- scsi_complete_async_scans();
return 0;
}

@@ -38,5 +33,5 @@ MODULE_DESCRIPTION("SCSI wait for scans");
MODULE_AUTHOR("James Bottomley");
MODULE_LICENSE("GPL");

-late_initcall(wait_scan_init);
+module_init(wait_scan_init);
module_exit(wait_scan_exit);
diff --git a/include/scsi/scsi_scan.h b/include/scsi/scsi_scan.h
deleted file mode 100644
index 7889888..0000000
--- a/include/scsi/scsi_scan.h
+++ /dev/null
@@ -1,11 +0,0 @@
-#ifndef _SCSI_SCSI_SCAN_H
-#define _SCSI_SCSI_SCAN_H
-
-#ifdef CONFIG_SCSI
-/* drivers/scsi/scsi_scan.c */
-extern int scsi_complete_async_scans(void);
-#else
-static inline int scsi_complete_async_scans(void) { return 0; }
-#endif
-
-#endif /* _SCSI_SCSI_SCAN_H */
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 8b53db3..238025f 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -27,7 +27,6 @@
#include <linux/syscore_ops.h>
#include <linux/ctype.h>
#include <linux/genhd.h>
-#include <scsi/scsi_scan.h>

#include "power.h"

@@ -748,13 +747,6 @@ static int software_resume(void)
async_synchronize_full();
}

- /*
- * We can't depend on SCSI devices being available after loading
- * one of their modules until scsi_complete_async_scans() is
- * called and the resume device usually is a SCSI one.
- */
- scsi_complete_async_scans();
-
swsusp_resume_device = name_to_dev_t(resume_file);
if (!swsusp_resume_device) {
error = -ENODEV;
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 91b0fd0..4ed81e7 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -24,7 +24,6 @@
#include <linux/console.h>
#include <linux/cpu.h>
#include <linux/freezer.h>
-#include <scsi/scsi_scan.h>

#include <asm/uaccess.h>

@@ -84,7 +83,6 @@ static int snapshot_open(struct inode *inode, struct file *filp)
* appear.
*/
wait_for_device_probe();
- scsi_complete_async_scans();

data->swap = -1;
data->mode = O_WRONLY;

2012-06-22 21:40:34

by Eldad Zack

[permalink] [raw]
Subject: Re: [set5 PATCH 2/6] async: make async_synchronize_full() flush all work regardless of domain


On Thu, 21 Jun 2012, Dan Williams wrote:
> In response to an async related regression James noted:
>
> "My theory is that this is an init problem: The assumption in a lot of
> our code is that async_synchronize_full() waits for everything ... even
> the domain specific async schedules, which isn't true."
>
> ...so make this assumption true.

Hi Dan,

I've applied the all 6 patches of this set.
(HEAD 8874e812feb4926f4a51a82c4fca75c7daa05fc5 , Linus' tree)

I've tested it on the same setup and everything runs fine, and I've
seen no suspicious messages.

If you'd like, I can test only patch #2 in isolation.

Tested-by: Eldad Zack <[email protected]>

Best regards,
Eldad

2012-06-23 07:07:18

by Dan Williams

[permalink] [raw]
Subject: Re: [set5 PATCH 2/6] async: make async_synchronize_full() flush all work regardless of domain

On Fri, Jun 22, 2012 at 2:40 PM, Eldad Zack <[email protected]> wrote:
>
> On Thu, 21 Jun 2012, Dan Williams wrote:
>> In response to an async related regression James noted:
>>
>> ? "My theory is that this is an init problem: The assumption in a lot of
>> ? ?our code is that async_synchronize_full() waits for everything ... even
>> ? ?the domain specific async schedules, which isn't true."
>>
>> ...so make this assumption true.
>
> Hi Dan,
>
> I've applied the all 6 patches of this set.
> (HEAD 8874e812feb4926f4a51a82c4fca75c7daa05fc5 , Linus' tree)
>
> I've tested it on the same setup and everything runs fine, and I've
> seen no suspicious messages.
>
> If you'd like, I can test only patch #2 in isolation.

No, I think the testing you've done is enough.

> Tested-by: Eldad Zack <[email protected]>

Much appreciated.

--
Dan

2012-06-28 22:22:36

by Dan Williams

[permalink] [raw]
Subject: Re: [set5 PATCH 0/6] scsi, async: asynchronous probing rework / fixes

On Thu, Jun 21, 2012 at 11:46 PM, Dan Williams <[email protected]> wrote:
> Set5 of 5 patchsets to update scsi, libsas, and libata in
> support of the isci driver.
>
> Commit 43a8d39d "[SCSI] fix async probe regression" found that
> async_synchronize_full() was missing async work that was scheduled to
> its own domain. ?This led James to note:
>
> ? ? ?"My theory is that this is an init problem: The assumption in a lot of
> ? ? ? our code is that async_synchronize_full() waits for everything ... even
> ? ? ? the domain specific async schedules, which isn't true."
>
> ...and this set aims to make that assumption true, but also with the
> ability to opt-out for "private" async work.
>
> The other async probe fix is in the area of unplug events that occur in
> the scsi async scanning interval. ?Essentially scsi_remove_target() can
> now see semi-initialized scsi_targets that have yet to be added via
> device_add().
>
> If there are no objections I'll put these in -next. ?But I expect at
> least patch1 and patch2 will need an ack from Arjan before the set shows
> up in scsi.git.
>
> ---

James,

I caught a couple conflicts with scsi.git/for-next and was able to fix
them up without rebasing on top of your tree. I'll resend the series
after letting these sit in -next for a while, or I can prepare a tag
for you to pull with the whole pending set.

>
> Dan Williams (6):
> ? ? ?async: introduce 'async_domain' type

dropped the include of async.h

> ? ? ?async: make async_synchronize_full() flush all work regardless of domain
> ? ? ?scsi: queue async scan work to an async_schedule domain
> ? ? ?scsi: cleanup usages of scsi_complete_async_scans

dropped the changes to scsi_wait_scan.c since that file is deleted in
your for-next branch.

--
Dan

2012-06-29 09:06:42

by James Bottomley

[permalink] [raw]
Subject: Re: [set5 PATCH 0/6] scsi, async: asynchronous probing rework / fixes

On Thu, 2012-06-28 at 15:22 -0700, Dan Williams wrote:
> On Thu, Jun 21, 2012 at 11:46 PM, Dan Williams <[email protected]> wrote:
> > Set5 of 5 patchsets to update scsi, libsas, and libata in
> > support of the isci driver.
> >
> > Commit 43a8d39d "[SCSI] fix async probe regression" found that
> > async_synchronize_full() was missing async work that was scheduled to
> > its own domain. This led James to note:
> >
> > "My theory is that this is an init problem: The assumption in a lot of
> > our code is that async_synchronize_full() waits for everything ... even
> > the domain specific async schedules, which isn't true."
> >
> > ...and this set aims to make that assumption true, but also with the
> > ability to opt-out for "private" async work.
> >
> > The other async probe fix is in the area of unplug events that occur in
> > the scsi async scanning interval. Essentially scsi_remove_target() can
> > now see semi-initialized scsi_targets that have yet to be added via
> > device_add().
> >
> > If there are no objections I'll put these in -next. But I expect at
> > least patch1 and patch2 will need an ack from Arjan before the set shows
> > up in scsi.git.
> >
> > ---
>
> James,
>
> I caught a couple conflicts with scsi.git/for-next and was able to fix
> them up without rebasing on top of your tree. I'll resend the series
> after letting these sit in -next for a while, or I can prepare a tag
> for you to pull with the whole pending set.

Ideally, I'd like acks from Arjan as the original async author before we
commit it in stone (which is what effectively happens for a signed tag).

Could you resend the series now and I'll bother him for the two we need
his ack on.

Thanks,

James


> >
> > Dan Williams (6):
> > async: introduce 'async_domain' type
>
> dropped the include of async.h
>
> > async: make async_synchronize_full() flush all work regardless of domain
> > scsi: queue async scan work to an async_schedule domain
> > scsi: cleanup usages of scsi_complete_async_scans
>
> dropped the changes to scsi_wait_scan.c since that file is deleted in
> your for-next branch.