2007-09-20 07:27:23

by Tejun Heo

[permalink] [raw]
Subject: [PATCHSET 2/4] sysfs: allow suicide

Hello, all.

This is the second patchset of four sysfs update patchset series[1]
and to be applied on top of the first patchset[1].

Currently, sysfs files which want to kill themselves should ask
someone else (workqueue) to kill it, which is so inhumane. This
patchset updates sysfs file implementation such that sysfs files can
commit suicide peacefully.

Global sledgehammer module unload inhibition/delay mechanism is
implemented and used to prevent premature unload while suicide is in
progress. Suicide attempt is detected by scanning sysfs_buffers for
matching accessor. If suicide is detected, active references the
accessor were holding are dropped early such that the suiciding node
can be deactivated without deadlock.

As active references go away early, the module basing the code the
accessor is running can go away before it finishes. Global module
unload inhibition is used here to prevent that until accessor callback
is complete.

This patchset contains the following four patches.

0001-module-implement-module_inhibit_unload.patch
0002-sysfs-make-the-sysfs_addrm_cxt-removed-list-FIFO.patch
0003-sysfs-care-free-suicide-for-sysfs-files.patch
0004-sysfs-make-suicidal-nodes-just-do-it-directly.patch

0001 needs Rusty Russell's ack.

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/582105



2007-09-20 07:27:37

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 2/4] sysfs: make the sysfs_addrm_cxt->removed list FIFO

Add sysfs_addrm_cxt->removed_tail to make the ->removed list FIFO.
This will be used to implement care-free suicide.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 7 +++++--
fs/sysfs/sysfs.h | 2 +-
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ad3394a..c902bdc 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -361,6 +361,7 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,

memset(acxt, 0, sizeof(*acxt));
acxt->parent_sd = parent_sd;
+ acxt->removed_tail = &acxt->removed;

/* Lookup parent inode. inode initialization and I_NEW
* clearing are protected by sysfs_mutex. By grabbing it and
@@ -448,8 +449,10 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
sysfs_unlink_sibling(sd);

sd->s_flags |= SYSFS_FLAG_REMOVED;
- sd->s_sibling = acxt->removed;
- acxt->removed = sd;
+ *acxt->removed_tail = sd;
+ /* keep them in FIFO order, suicide check depends on it */
+ acxt->removed_tail = &sd->s_sibling;
+ *acxt->removed_tail = NULL;

if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
drop_nlink(acxt->parent_inode);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 269c845..be16947 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -71,7 +71,7 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
struct sysfs_addrm_cxt {
struct sysfs_dirent *parent_sd;
struct inode *parent_inode;
- struct sysfs_dirent *removed;
+ struct sysfs_dirent *removed, **removed_tail;
int cnt;
};

--
1.5.0.3


2007-09-20 07:27:54

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 3/4] sysfs: care-free suicide for sysfs files

Life can be weary and some sysfs files choose to commit suicide (kills
itself when written to). This is troublesome because while a sysfs
file is being accessed, the accessing task holds active references to
the node and its parent. Removing a sysfs node waits for active
references to be drained. The suicidal thread ends up waiting for its
own active reference and thus is sadly forced to live till the end of
the time.

Till now, this has been dealt with by requiring suicidal node to ask
someone else (workqueue) to kill it. In recognition of the
inhumanitrian nature of this solution, this patch implements care-free
suicide support.

Suicide attempt is automagically detected and the active references
the suiciding task is holding are put early to avoid the above
described deadlock. Module unload is inhibited till the sysfs file
access is complete to avoid freeing the code region too early.

This patch only implements care-free suicide support. The next patch
will convert the users.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 2 +
fs/sysfs/file.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
fs/sysfs/sysfs.h | 1 +
3 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index c902bdc..fe8270c 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -546,6 +546,8 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
sd->s_sibling = NULL;

sysfs_drop_dentry(sd);
+ if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+ sysfs_file_check_suicide(sd);
sysfs_deactivate(sd);
sysfs_put(sd);
}
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index c05f961..0460970 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -76,9 +76,62 @@ struct sysfs_buffer {
int needs_read_fill;
int event;
struct list_head list;
+ struct task_struct *accessor;
+ int committed_suicide;
};

/**
+ * sysfs_file_check_suicide - check whether a file is trying to kill itself
+ * @sd: sysfs_dirent of interest
+ *
+ * Check whether @sd is trying to commit suicide. If so, help it
+ * by putting active references early such that deactivation
+ * doesn't deadlock waiting for its own active references.
+ *
+ * This works because a leaf node is always removed before its
+ * parent. By the time deactivation is called on the parent, the
+ * suiciding node has already put the active references to itself
+ * and the parent.
+ *
+ * LOCKING:
+ * None.
+ */
+void sysfs_file_check_suicide(struct sysfs_dirent *sd)
+{
+ struct sysfs_open_dirent *od;
+ struct sysfs_buffer *buffer;
+
+ spin_lock(&sysfs_open_dirent_lock);
+
+ od = sd->s_attr.open;
+ if (od) {
+ list_for_each_entry(buffer, &od->buffers, list) {
+ if (buffer->accessor != current)
+ continue;
+
+ /* it's trying to commit suicide, help it */
+
+ /* Inhibit unload till the suiciding method is
+ * complete. This is to avoid premature
+ * unload of the owner of the suiciding
+ * method.
+ */
+ module_inhibit_unload();
+
+ /* Global unload inhibition in effect, safe to
+ * put active references.
+ */
+ sysfs_put_active_two(sd);
+ buffer->committed_suicide = 1;
+
+ break;
+ }
+ }
+
+ spin_unlock(&sysfs_open_dirent_lock);
+}
+
+/**
* fill_read_buffer - allocate and fill buffer from object.
* @dentry: dentry pointer.
* @buffer: data buffer for file.
@@ -107,9 +160,14 @@ static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer
return -ENODEV;

buffer->event = atomic_read(&attr_sd->s_attr.open->event);
+ buffer->accessor = current;
+
count = ops->show(kobj, attr_sd->s_attr.attr, buffer->page);

- sysfs_put_active_two(attr_sd);
+ if (buffer->committed_suicide)
+ module_allow_unload();
+ else
+ sysfs_put_active_two(attr_sd);

BUG_ON(count > (ssize_t)PAGE_SIZE);
if (count >= 0) {
@@ -215,9 +273,14 @@ flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t
if (!sysfs_get_active_two(attr_sd))
return -ENODEV;

+ buffer->accessor = current;
+
rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);

- sysfs_put_active_two(attr_sd);
+ if (buffer->committed_suicide)
+ module_allow_unload();
+ else
+ sysfs_put_active_two(attr_sd);

return rc;
}
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index be16947..58f517b 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -142,6 +142,7 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
*/
extern const struct file_operations sysfs_file_operations;

+void sysfs_file_check_suicide(struct sysfs_dirent *sd);
int sysfs_add_file(struct sysfs_dirent *dir_sd,
const struct attribute *attr, int type);

--
1.5.0.3


2007-09-20 07:28:19

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 1/4] module: implement module_inhibit_unload()

As its name suggests, module_inhibit_unload() inhibits all module
unloading till the matching module_allow_unload() is called. This
unload inhibition doesn't affect whether a module can be unloaded or
not. It just stalls the final module free till the inhibition is
lifted.

This sledgehammer mechanism is to be used briefly in obscure cases
where identifying or getting the module to prevent from unloading is
difficult or not worth the effort. Note that module unloading is
siberia-cold path. If the inhibion is relatively brief in human
scale, that is, upto a few secs at maximum, it should be fine. So, if
this sledgehammer simplifies API and fixes premature unload bugs which
unfortunately aren't too rare, there isn't much reason not to use it.

Even if something goes wrong with unload inhibition (e.g. someone
forgets to lift the inhibition), it doesn't prevent modules from being
loaded.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Rusty Russell <[email protected]>
---
include/linux/module.h | 17 +++++++++++++
kernel/module.c | 59 ++++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index b6a646c..a835659 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -418,6 +418,9 @@ static inline int try_module_get(struct module *module)

extern void module_put(struct module *module);

+void module_inhibit_unload(void);
+void module_allow_unload(void);
+
#else /*!CONFIG_MODULE_UNLOAD*/
static inline int try_module_get(struct module *module)
{
@@ -431,6 +434,12 @@ static inline void __module_get(struct module *module)
}
#define symbol_put(x) do { } while(0)
#define symbol_put_addr(p) do { } while(0)
+static inline void module_inhibit_unload(void)
+{
+}
+static inline void module_allow_unload(void)
+{
+}

#endif /* CONFIG_MODULE_UNLOAD */

@@ -516,6 +525,14 @@ static inline void module_put(struct module *module)
{
}

+static inline void module_inhibit_unload(void)
+{
+}
+
+static inline void module_allow_unload(void)
+{
+}
+
#define module_name(mod) "kernel"

#define __unsafe(mod)
diff --git a/kernel/module.c b/kernel/module.c
index db0ead0..8daff45 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -44,6 +44,7 @@
#include <asm/uaccess.h>
#include <asm/semaphore.h>
#include <asm/cacheflush.h>
+#include <asm/atomic.h>
#include <linux/license.h>

extern int module_sysfs_initialized;
@@ -65,6 +66,8 @@ extern int module_sysfs_initialized;
* (add/delete uses stop_machine). */
static DEFINE_MUTEX(module_mutex);
static LIST_HEAD(modules);
+static atomic_t module_unload_inhibit_cnt = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(module_unload_wait);

static BLOCKING_NOTIFIER_HEAD(module_notify_list);

@@ -656,6 +659,7 @@ static void wait_for_zero_refcount(struct module *mod)
asmlinkage long
sys_delete_module(const char __user *name_user, unsigned int flags)
{
+ DECLARE_WAITQUEUE(wait, current);
struct module *mod;
char name[MODULE_NAME_LEN];
int ret, forced = 0;
@@ -714,12 +718,22 @@ sys_delete_module(const char __user *name_user, unsigned int flags)
if (!forced && module_refcount(mod) != 0)
wait_for_zero_refcount(mod);

+ mutex_unlock(&module_mutex);
+
/* Final destruction now noone is using it. */
- if (mod->exit != NULL) {
- mutex_unlock(&module_mutex);
+ if (mod->exit != NULL)
mod->exit();
- mutex_lock(&module_mutex);
- }
+
+ /* Don't proceed till inhibition is lifted. */
+ add_wait_queue(&module_unload_wait, &wait);
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ if (atomic_read(&module_unload_inhibit_cnt))
+ schedule();
+ __set_current_state(TASK_RUNNING);
+ remove_wait_queue(&module_unload_wait, &wait);
+
+ mutex_lock(&module_mutex);
+
free_module(mod);

out:
@@ -805,6 +819,43 @@ void module_put(struct module *module)
}
EXPORT_SYMBOL(module_put);

+/**
+ * module_inhibit_unload - inhibit module unload
+ *
+ * Inhibit module unload until allowed again. All module unload
+ * operations which reach zero reference count after this call
+ * has returned are guaranteed to be stalled till inhibition is
+ * lifted.
+ *
+ * This is a simple mechanism to prevent premature unload while
+ * code on a to-be-unloaded module is still executing. Unload
+ * inhibitions must be finite and relatively short.
+ *
+ * LOCKING:
+ * None.
+ */
+void module_inhibit_unload(void)
+{
+ atomic_inc(&module_unload_inhibit_cnt);
+}
+
+/**
+ * module_allow_unload - allow module unload
+ *
+ * Allow module unload. Must be balanced with calls to
+ * module_inhibit_unload().
+ *
+ * LOCKING:
+ * None.
+ */
+void module_allow_unload(void)
+{
+ if (atomic_dec_and_test(&module_unload_inhibit_cnt))
+ wake_up_all(&module_unload_wait);
+
+ BUG_ON(atomic_read(&module_unload_inhibit_cnt) < 0);
+}
+
#else /* !CONFIG_MODULE_UNLOAD */
static void print_unload_info(struct seq_file *m, struct module *mod)
{
--
1.5.0.3


2007-09-20 07:28:38

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 4/4] sysfs: make suicidal nodes just do it directly

Sysfs now allows direct suicide. Make suicidial sysfs nodes to use
it and remove now unncessary callback mechanism.

Signed-off-by: Tejun Heo <[email protected]>
---
drivers/base/core.c | 33 ----------------------
drivers/s390/cio/ccwgroup.c | 25 +++-------------
drivers/scsi/scsi_sysfs.c | 13 +--------
fs/sysfs/file.c | 63 -------------------------------------------
include/linux/device.h | 6 ----
include/linux/sysfs.h | 9 ------
6 files changed, 6 insertions(+), 143 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index d290ceb..a4942f1 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -464,39 +464,6 @@ void device_remove_bin_file(struct device *dev, struct bin_attribute *attr)
}
EXPORT_SYMBOL_GPL(device_remove_bin_file);

-/**
- * device_schedule_callback_owner - helper to schedule a callback for a device
- * @dev: device.
- * @func: callback function to invoke later.
- * @owner: module owning the callback routine
- *
- * Attribute methods must not unregister themselves or their parent device
- * (which would amount to the same thing). Attempts to do so will deadlock,
- * since unregistration is mutually exclusive with driver callbacks.
- *
- * Instead methods can call this routine, which will attempt to allocate
- * and schedule a workqueue request to call back @func with @dev as its
- * argument in the workqueue's process context. @dev will be pinned until
- * @func returns.
- *
- * This routine is usually called via the inline device_schedule_callback(),
- * which automatically sets @owner to THIS_MODULE.
- *
- * Returns 0 if the request was submitted, -ENOMEM if storage could not
- * be allocated, -ENODEV if a reference to @owner isn't available.
- *
- * NOTE: This routine won't work if CONFIG_SYSFS isn't set! It uses an
- * underlying sysfs routine (since it is intended for use by attribute
- * methods), and if sysfs isn't available you'll get nothing but -ENOSYS.
- */
-int device_schedule_callback_owner(struct device *dev,
- void (*func)(struct device *), struct module *owner)
-{
- return sysfs_schedule_callback(&dev->kobj,
- (void (*)(void *)) func, dev, owner);
-}
-EXPORT_SYMBOL_GPL(device_schedule_callback_owner);
-
static void klist_children_get(struct klist_node *n)
{
struct device *dev = container_of(n, struct device, knode_parent);
diff --git a/drivers/s390/cio/ccwgroup.c b/drivers/s390/cio/ccwgroup.c
index 5d967c4..7085305 100644
--- a/drivers/s390/cio/ccwgroup.c
+++ b/drivers/s390/cio/ccwgroup.c
@@ -66,20 +66,6 @@ __ccwgroup_remove_symlinks(struct ccwgroup_device *gdev)

}

-/*
- * Provide an 'ungroup' attribute so the user can remove group devices no
- * longer needed or accidentially created. Saves memory :)
- */
-static void ccwgroup_ungroup_callback(struct device *dev)
-{
- struct ccwgroup_device *gdev = to_ccwgroupdev(dev);
-
- mutex_lock(&gdev->reg_mutex);
- __ccwgroup_remove_symlinks(gdev);
- device_unregister(dev);
- mutex_unlock(&gdev->reg_mutex);
-}
-
static ssize_t
ccwgroup_ungroup_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
{
@@ -91,12 +77,11 @@ ccwgroup_ungroup_store(struct device *dev, struct device_attribute *attr, const
if (gdev->state != CCWGROUP_OFFLINE)
return -EINVAL;

- /* Note that we cannot unregister the device from one of its
- * attribute methods, so we have to use this roundabout approach.
- */
- rc = device_schedule_callback(dev, ccwgroup_ungroup_callback);
- if (rc)
- count = rc;
+ mutex_lock(&gdev->reg_mutex);
+ __ccwgroup_remove_symlinks(gdev);
+ device_unregister(dev);
+ mutex_unlock(&gdev->reg_mutex);
+
return count;
}

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ede9986..fdac321 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -466,22 +466,11 @@ store_rescan_field (struct device *dev, struct device_attribute *attr, const cha
}
static DEVICE_ATTR(rescan, S_IWUSR, NULL, store_rescan_field);

-static void sdev_store_delete_callback(struct device *dev)
-{
- scsi_remove_device(to_scsi_device(dev));
-}
-
static ssize_t sdev_store_delete(struct device *dev, struct device_attribute *attr, const char *buf,
size_t count)
{
- int rc;
+ scsi_remove_device(to_scsi_device(dev));

- /* An attribute cannot be unregistered by one of its own methods,
- * so we have to use this roundabout approach.
- */
- rc = device_schedule_callback(dev, sdev_store_delete_callback);
- if (rc)
- count = rc;
return count;
};
static DEVICE_ATTR(delete, S_IWUSR, NULL, sdev_store_delete);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0460970..5d708ff 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -715,68 +715,5 @@ void sysfs_remove_file_from_group(struct kobject *kobj,
}
EXPORT_SYMBOL_GPL(sysfs_remove_file_from_group);

-struct sysfs_schedule_callback_struct {
- struct kobject *kobj;
- void (*func)(void *);
- void *data;
- struct module *owner;
- struct work_struct work;
-};
-
-static void sysfs_schedule_callback_work(struct work_struct *work)
-{
- struct sysfs_schedule_callback_struct *ss = container_of(work,
- struct sysfs_schedule_callback_struct, work);
-
- (ss->func)(ss->data);
- kobject_put(ss->kobj);
- module_put(ss->owner);
- kfree(ss);
-}
-
-/**
- * sysfs_schedule_callback - helper to schedule a callback for a kobject
- * @kobj: object we're acting for.
- * @func: callback function to invoke later.
- * @data: argument to pass to @func.
- * @owner: module owning the callback code
- *
- * sysfs attribute methods must not unregister themselves or their parent
- * kobject (which would amount to the same thing). Attempts to do so will
- * deadlock, since unregistration is mutually exclusive with driver
- * callbacks.
- *
- * Instead methods can call this routine, which will attempt to allocate
- * and schedule a workqueue request to call back @func with @data as its
- * argument in the workqueue's process context. @kobj will be pinned
- * until @func returns.
- *
- * Returns 0 if the request was submitted, -ENOMEM if storage could not
- * be allocated, -ENODEV if a reference to @owner isn't available.
- */
-int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
- void *data, struct module *owner)
-{
- struct sysfs_schedule_callback_struct *ss;
-
- if (!try_module_get(owner))
- return -ENODEV;
- ss = kmalloc(sizeof(*ss), GFP_KERNEL);
- if (!ss) {
- module_put(owner);
- return -ENOMEM;
- }
- kobject_get(kobj);
- ss->kobj = kobj;
- ss->func = func;
- ss->data = data;
- ss->owner = owner;
- INIT_WORK(&ss->work, sysfs_schedule_callback_work);
- schedule_work(&ss->work);
- return 0;
-}
-EXPORT_SYMBOL_GPL(sysfs_schedule_callback);
-
-
EXPORT_SYMBOL_GPL(sysfs_create_file);
EXPORT_SYMBOL_GPL(sysfs_remove_file);
diff --git a/include/linux/device.h b/include/linux/device.h
index 2e15822..61fc85d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -356,12 +356,6 @@ extern int __must_check device_create_bin_file(struct device *dev,
struct bin_attribute *attr);
extern void device_remove_bin_file(struct device *dev,
struct bin_attribute *attr);
-extern int device_schedule_callback_owner(struct device *dev,
- void (*func)(struct device *), struct module *owner);
-
-/* This is a macro to avoid include problems with THIS_MODULE */
-#define device_schedule_callback(dev, func) \
- device_schedule_callback_owner(dev, func, THIS_MODULE)

/* device resource management */
typedef void (*dr_release_t)(struct device *dev, void *res);
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index db5dd24..8af072e 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -76,9 +76,6 @@ struct sysfs_ops {

#ifdef CONFIG_SYSFS

-int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
- void *data, struct module *owner);
-
int __must_check sysfs_create_dir(struct kobject *kobj);
void sysfs_remove_dir(struct kobject *kobj);
int __must_check sysfs_rename_dir(struct kobject *kobj, const char *new_name);
@@ -114,12 +111,6 @@ extern int __must_check sysfs_init(void);

#else /* CONFIG_SYSFS */

-static inline int sysfs_schedule_callback(struct kobject *kobj,
- void (*func)(void *), void *data, struct module *owner)
-{
- return -ENOSYS;
-}
-
static inline int sysfs_create_dir(struct kobject *kobj)
{
return 0;
--
1.5.0.3


2007-09-20 09:25:27

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 4/4] sysfs: make suicidal nodes just do it directly

On Thu, 20 Sep 2007 16:26:15 +0900,
Tejun Heo <[email protected]> wrote:

> Sysfs now allows direct suicide. Make suicidial sysfs nodes to use
> it and remove now unncessary callback mechanism.
>
> Signed-off-by: Tejun Heo <[email protected]>
> ---
> drivers/base/core.c | 33 ----------------------
> drivers/s390/cio/ccwgroup.c | 25 +++-------------
> drivers/scsi/scsi_sysfs.c | 13 +--------
> fs/sysfs/file.c | 63 -------------------------------------------
> include/linux/device.h | 6 ----
> include/linux/sysfs.h | 9 ------
> 6 files changed, 6 insertions(+), 143 deletions(-)
>

Just browsing through this (I'll review and do some tests later), but
you missed drivers/s390/cio/device.c::ccw_device_remove_disconnected().

Following (not event compile tested) patch (which is basically a revert
of 59a8a6e227cf0bc42e5be741ebfea97c222ab9ef) should take care of it.

Signed-off-by: Cornelia Huck <[email protected]>

---
drivers/s390/cio/device.c | 51 +++++++++-------------------------------------
1 files changed, 10 insertions(+), 41 deletions(-)

--- linux-2.6.orig/drivers/s390/cio/device.c
+++ linux-2.6/drivers/s390/cio/device.c
@@ -288,62 +288,31 @@ static void ccw_device_unregister(struct
device_del(&cdev->dev);
}

-static void ccw_device_remove_orphan_cb(struct device *dev)
-{
- struct ccw_device *cdev = to_ccwdev(dev);
-
- ccw_device_unregister(cdev);
- put_device(&cdev->dev);
-}
-
-static void ccw_device_remove_sch_cb(struct device *dev)
-{
- struct subchannel *sch;
-
- sch = to_subchannel(dev);
- css_sch_device_unregister(sch);
- /* Reset intparm to zeroes. */
- sch->schib.pmcw.intparm = 0;
- cio_modify(sch);
- put_device(&sch->dev);
-}
-
static void
ccw_device_remove_disconnected(struct ccw_device *cdev)
{
+ struct subchannel *sch;
unsigned long flags;
- int rc;

/*
* Forced offline in disconnected state means
* 'throw away device'.
*/
if (ccw_device_is_orphan(cdev)) {
- /*
- * Deregister ccw device.
- * Unfortunately, we cannot do this directly from the
- * attribute method.
- */
+ /* Deregister ccw device. */
spin_lock_irqsave(cdev->ccwlock, flags);
cdev->private->state = DEV_STATE_NOT_OPER;
spin_unlock_irqrestore(cdev->ccwlock, flags);
- rc = device_schedule_callback(&cdev->dev,
- ccw_device_remove_orphan_cb);
- if (rc)
- CIO_MSG_EVENT(2, "Couldn't unregister orphan "
- "0.%x.%04x\n",
- cdev->private->dev_id.ssid,
- cdev->private->dev_id.devno);
+ ccw_device_unregister(cdev);
+ put_device(&cdev->dev);
return;
}
- /* Deregister subchannel, which will kill the ccw device. */
- rc = device_schedule_callback(cdev->dev.parent,
- ccw_device_remove_sch_cb);
- if (rc)
- CIO_MSG_EVENT(2, "Couldn't unregister disconnected device "
- "0.%x.%04x\n",
- cdev->private->dev_id.ssid,
- cdev->private->dev_id.devno);
+ sch = to_subchannel(cdev->dev.parent);
+ css_sch_device_unregister(sch);
+ /* Reset intparm to zeroes. */
+ sch->schib.pmcw.intparm = 0;
+ cio_modify(sch);
+ put_device(&sch->dev);
}

/**

2007-09-20 09:44:19

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 4/4] sysfs: make suicidal nodes just do it directly

Cornelia Huck wrote:
> On Thu, 20 Sep 2007 16:26:15 +0900,
> Tejun Heo <[email protected]> wrote:
>
>> Sysfs now allows direct suicide. Make suicidial sysfs nodes to use
>> it and remove now unncessary callback mechanism.
>>
>> Signed-off-by: Tejun Heo <[email protected]>
>> ---
>> drivers/base/core.c | 33 ----------------------
>> drivers/s390/cio/ccwgroup.c | 25 +++-------------
>> drivers/scsi/scsi_sysfs.c | 13 +--------
>> fs/sysfs/file.c | 63 -------------------------------------------
>> include/linux/device.h | 6 ----
>> include/linux/sysfs.h | 9 ------
>> 6 files changed, 6 insertions(+), 143 deletions(-)
>>
>
> Just browsing through this (I'll review and do some tests later), but
> you missed drivers/s390/cio/device.c::ccw_device_remove_disconnected().
>
> Following (not event compile tested) patch (which is basically a revert
> of 59a8a6e227cf0bc42e5be741ebfea97c222ab9ef) should take care of it.
>
> Signed-off-by: Cornelia Huck <[email protected]>

Thanks for spotting this. After converting ccwgroup, I thought that was
it for s390.

--
tejun

2007-09-24 22:00:20

by Jonathan Corbet

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Hi, Tejun,

I was just looking over these changes...

> + /* Don't proceed till inhibition is lifted. */
> + add_wait_queue(&module_unload_wait, &wait);
> + set_current_state(TASK_UNINTERRUPTIBLE);
> + if (atomic_read(&module_unload_inhibit_cnt))
> + schedule();
> + __set_current_state(TASK_RUNNING);
> + remove_wait_queue(&module_unload_wait, &wait);
> +
> + mutex_lock(&module_mutex);

Maybe I'm missing something, but this looks racy to me. There's no
check after schedule() to see if module_unload_inhibit_cnt is really
zero, and nothing to keep somebody else from slipping in and raising it
again afterward.

Given your description of this tool as a "sledgehammer," might it not be
easier to just take and hold module_mutex for the duration of the unload
block?

jon

2007-09-24 23:19:57

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Jonathan Corbet wrote:
> Hi, Tejun,
>
> I was just looking over these changes...
>
>> + /* Don't proceed till inhibition is lifted. */
>> + add_wait_queue(&module_unload_wait, &wait);
>> + set_current_state(TASK_UNINTERRUPTIBLE);
>> + if (atomic_read(&module_unload_inhibit_cnt))
>> + schedule();
>> + __set_current_state(TASK_RUNNING);
>> + remove_wait_queue(&module_unload_wait, &wait);
>> +
>> + mutex_lock(&module_mutex);
>
> Maybe I'm missing something, but this looks racy to me. There's no
> check after schedule() to see if module_unload_inhibit_cnt is really
> zero, and nothing to keep somebody else from slipping in and raising it
> again afterward.

The unloading can proceed once module_unload_inhibit_cnt reaches zero.
An unloading thread only has to care about inhibition put in effect
before unloading has started, so there's no need to check again.

> Given your description of this tool as a "sledgehammer," might it not be
> easier to just take and hold module_mutex for the duration of the unload
> block?

That would be easier but...

* It would serialize users of the sledgehammer.
* It would block loading modules (which is often more important than
unloading them) when things go south.

--
tejun

2007-09-24 23:43:16

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 2007-09-25 at 08:18 +0900, Tejun Heo wrote:
> > Given your description of this tool as a "sledgehammer," might it not be
> > easier to just take and hold module_mutex for the duration of the unload
> > block?
>
> That would be easier but...
>
> * It would serialize users of the sledgehammer.
> * It would block loading modules (which is often more important than
> unloading them) when things go south.

My concern is that you're dropping the module mutex around ->exit now.
I don't *think* this should matter, but it's worth considering.

I really wonder if an explicit "kill_this_attribute()" is a better way
to go than this...

Rusty.

2007-09-25 01:42:43

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Rusty Russell wrote:
> On Tue, 2007-09-25 at 08:18 +0900, Tejun Heo wrote:
>>> Given your description of this tool as a "sledgehammer," might it not be
>>> easier to just take and hold module_mutex for the duration of the unload
>>> block?
>> That would be easier but...
>>
>> * It would serialize users of the sledgehammer.
>> * It would block loading modules (which is often more important than
>> unloading them) when things go south.
>
> My concern is that you're dropping the module mutex around ->exit now.
> I don't *think* this should matter, but it's worth considering.

We always did that. Before the patch the code segment looked like the
following.

/* Final destruction now noone is using it. */
if (mod->exit != NULL) {
mutex_unlock(&module_mutex);
mod->exit();
mutex_lock(&module_mutex);
}

> I really wonder if an explicit "kill_this_attribute()" is a better way
> to go than this...

I think this sort of temporary unload blocking would be useful for other
cases like this.

Thanks.

--
tejun

2007-09-25 02:13:45

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 2007-09-25 at 10:40 +0900, Tejun Heo wrote:
> Rusty Russell wrote:
> > My concern is that you're dropping the module mutex around ->exit now.
> > I don't *think* this should matter, but it's worth considering.
>
> We always did that. Before the patch the code segment looked like the
> following.

Hi Tejun,

Thanks, misread patch.

> > I really wonder if an explicit "kill_this_attribute()" is a better way
> > to go than this...
>
> I think this sort of temporary unload blocking would be useful for other
> cases like this.

I hope not: this doesn't work in general. Calling into a module after
->exit has called assumes that the exit function doesn't free up or
overwrite stuff the other functions need.

Cheers,
Rusty.

2007-09-25 02:41:16

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Rusty Russell wrote:
>>> I really wonder if an explicit "kill_this_attribute()" is a better way
>>> to go than this...
>> I think this sort of temporary unload blocking would be useful for other
>> cases like this.
>
> I hope not: this doesn't work in general. Calling into a module after
> ->exit has called assumes that the exit function doesn't free up or
> overwrite stuff the other functions need.

Right, the sole purpose the unload inhibition is to hold onto the 'code'
section from going away. The rest of object lifetime management should
be implemented using separate mechanisms anyway. I was talking about
similar cases where the 'code' should be protected for a short time.

Thanks.

--
tejun

2007-09-25 03:22:09

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 2007-09-25 at 11:39 +0900, Tejun Heo wrote:
> Rusty Russell wrote:
> >>> I really wonder if an explicit "kill_this_attribute()" is a better way
> >>> to go than this...
> >> I think this sort of temporary unload blocking would be useful for other
> >> cases like this.
> >
> > I hope not: this doesn't work in general. Calling into a module after
> > ->exit has called assumes that the exit function doesn't free up or
> > overwrite stuff the other functions need.
>
> Right, the sole purpose the unload inhibition is to hold onto the 'code'
> section from going away. The rest of object lifetime management should
> be implemented using separate mechanisms anyway. I was talking about
> similar cases where the 'code' should be protected for a short time.

As stated you cannot protect arbitrary code this way, as you are trying
to do. I do not think you've broken any of the current code, but I
cannot tell. You're certainly going to surprise unsuspecting future
authors.

Can you really not figure out the module owner of the sysfs entry to inc
its use count during this procedure? (__module_get()).

Rusty.

2007-09-25 03:37:57

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Rusty Russell wrote:
> As stated you cannot protect arbitrary code this way, as you are trying
> to do. I do not think you've broken any of the current code, but I
> cannot tell. You're certainly going to surprise unsuspecting future
> authors.

Can you elaborate a bit? Why can't it protect the code?

> Can you really not figure out the module owner of the sysfs entry to inc
> its use count during this procedure? (__module_get()).

I can but I don't think it's worth the effort. It will involve passing
@owner parameter down through kobject to sysfs but the path is pretty
obscure and thus difficult to test. I think it's too much work for the
users of the API and it will be easy to pass the wrong @owner and go
unnoticed.

Thanks.

--
tejun

2007-09-25 04:39:17

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 2007-09-25 at 12:36 +0900, Tejun Heo wrote:
> Rusty Russell wrote:
> > As stated you cannot protect arbitrary code this way, as you are trying
> > to do. I do not think you've broken any of the current code, but I
> > cannot tell. You're certainly going to surprise unsuspecting future
> > authors.
>
> Can you elaborate a bit? Why can't it protect the code?

Because you don't know what that code does. After all, it's assumed
that module code doesn't get called after exit and you're deliberately
violating that assumption.

> > Can you really not figure out the module owner of the sysfs entry to inc
> > its use count during this procedure? (__module_get()).
>
> I can but I don't think it's worth the effort. It will involve passing
> @owner parameter down through kobject to sysfs but the path is pretty
> obscure and thus difficult to test.

Have you tested that *this* path works? Let's take your first change as
an example:

+ mutex_lock(&gdev->reg_mutex);
+ __ccwgroup_remove_symlinks(gdev);
+ device_unregister(dev);
+ mutex_unlock(&gdev->reg_mutex);

Now, are you sure that calling cleanup_ccwgroup just after
device_unregister() works?

static void __exit
cleanup_ccwgroup (void)
{
bus_unregister (&ccwgroup_bus_type);
}

> I think it's too much work for the
> users of the API and it will be easy to pass the wrong @owner and go
> unnoticed.

But your shortcut insists that all module authors be aware that
functions can be running after exit() is called. That's a recipe for
instability and disaster.

Rusty.

2007-09-25 08:01:51

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 25 Sep 2007 14:38:38 +1000,
Rusty Russell <[email protected]> wrote:

> Have you tested that *this* path works? Let's take your first change as
> an example:
>
> + mutex_lock(&gdev->reg_mutex);
> + __ccwgroup_remove_symlinks(gdev);
> + device_unregister(dev);
> + mutex_unlock(&gdev->reg_mutex);
>
> Now, are you sure that calling cleanup_ccwgroup just after
> device_unregister() works?
>
> static void __exit
> cleanup_ccwgroup (void)
> {
> bus_unregister (&ccwgroup_bus_type);
> }
>

ccwgroup is a bit special. The ccwgroup drivers (say, qeth) will
unregister their ccwgroup_driver in their exit function. ccwgroup will
then unregister all devices bound to this driver (a ccwgroup device
without a driver does not make sense, since they are artifically
created by writing to a 'group' attribute provided by the driver). This
makes sure that ccwgroup cannot be unloaded while there are still
devices on the bus, so your example won't hit.

> > I think it's too much work for the
> > users of the API and it will be easy to pass the wrong @owner and go
> > unnoticed.
>
> But your shortcut insists that all module authors be aware that
> functions can be running after exit() is called. That's a recipe for
> instability and disaster.

There are already problems like this with ->release().

<And no, I still haven't gotten around to testing and reviewing all
those patchsets, sorry>

2007-09-25 08:27:04

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Rusty Russell wrote:
> On Tue, 2007-09-25 at 12:36 +0900, Tejun Heo wrote:
>> Rusty Russell wrote:
>>> As stated you cannot protect arbitrary code this way, as you are trying
>>> to do. I do not think you've broken any of the current code, but I
>>> cannot tell. You're certainly going to surprise unsuspecting future
>>> authors.
>> Can you elaborate a bit? Why can't it protect the code?
>
> Because you don't know what that code does. After all, it's assumed
> that module code doesn't get called after exit and you're deliberately
> violating that assumption.

What I meant by protecting 'code' was the 'code' itself. Those pages
containing instructions that cpu executes. It of course can't protect
against all the things they do.

>>> Can you really not figure out the module owner of the sysfs entry to inc
>>> its use count during this procedure? (__module_get()).
>> I can but I don't think it's worth the effort. It will involve passing
>> @owner parameter down through kobject to sysfs but the path is pretty
>> obscure and thus difficult to test.
>
> Have you tested that *this* path works? Let's take your first change as
> an example:
>
> + mutex_lock(&gdev->reg_mutex);
> + __ccwgroup_remove_symlinks(gdev);
> + device_unregister(dev);
> + mutex_unlock(&gdev->reg_mutex);
>
> Now, are you sure that calling cleanup_ccwgroup just after
> device_unregister() works?
>
> static void __exit
> cleanup_ccwgroup (void)
> {
> bus_unregister (&ccwgroup_bus_type);
> }

It should. After ->exit() is called, there can't be any object left
behind. If a module is hosting objects which can't be destroyed from
->exit(), its module ref count shouldn't be zero. So, either 1.
refcount != 0 or 2. ->exit() can destroy all objects. As Cornelia
explains, for ccwgroup, it's #1. Note that unload inhibition doesn't
change anything about this.

>> I think it's too much work for the
>> users of the API and it will be easy to pass the wrong @owner and go
>> unnoticed.
>
> But your shortcut insists that all module authors be aware that
> functions can be running after exit() is called. That's a recipe for
> instability and disaster.

No, it doesn't change that at all. All unload inhibition does is
postponing removal of code (and data too of course) section a bit so
that a module can host code which issues unloading of itself. Object
synchronization rules remain exactly the same. Formerly broken code is
still broken and I don't even think unload inhibition would mask them
too much either.

I think the naming is too ambiguous. Maybe it should be named something
like "hold_module_for_suicide".

Thanks.

--
tejun

2007-09-25 08:38:42

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Tejun Heo wrote:
> Rusty Russell wrote:
>> Now, are you sure that calling cleanup_ccwgroup just after
>> device_unregister() works?
>>
>> static void __exit
>> cleanup_ccwgroup (void)
>> {
>> bus_unregister (&ccwgroup_bus_type);
>> }
>
> It should. After ->exit() is called, there can't be any object left
> behind. If a module is hosting objects which can't be destroyed from
> ->exit(), its module ref count shouldn't be zero. So, either 1.
> refcount != 0 or 2. ->exit() can destroy all objects. As Cornelia
> explains, for ccwgroup, it's #1. Note that unload inhibition doesn't
> change anything about this.

Hmmm.... There doesn't seem to any reason why the blocking should be
after calling ->exit(). And, yeah, it would be more useful and
intuitive if blocking happens before ->exit(). What do you think?

Thanks.

--
tejun

2007-09-25 08:51:20

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 2007-09-25 at 17:36 +0900, Tejun Heo wrote:
> Hmmm.... There doesn't seem to any reason why the blocking should be
> after calling ->exit(). And, yeah, it would be more useful and
> intuitive if blocking happens before ->exit(). What do you think?

*That* I have no problem with.

I was going to say "just grab a reference to every module" except if a
new module is loaded you don't know about it.

If you move your blocking, it seems fine.

Thanks!
Rusty.

2007-09-25 14:07:23

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Rusty Russell wrote:
> On Tue, 2007-09-25 at 17:36 +0900, Tejun Heo wrote:
>> Hmmm.... There doesn't seem to any reason why the blocking should be
>> after calling ->exit(). And, yeah, it would be more useful and
>> intuitive if blocking happens before ->exit(). What do you think?
>
> *That* I have no problem with.
>
> I was going to say "just grab a reference to every module" except if a
> new module is loaded you don't know about it.
>
> If you move your blocking, it seems fine.

Great, I think I was too occupied with the sysfs case when I was writing
it. I'll update the patch. Thanks a lot.

--
tejun

2007-09-25 14:24:19

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 25 Sep 2007, Tejun Heo wrote:

> Jonathan Corbet wrote:
> > Hi, Tejun,
> >
> > I was just looking over these changes...
> >
> >> + /* Don't proceed till inhibition is lifted. */
> >> + add_wait_queue(&module_unload_wait, &wait);
> >> + set_current_state(TASK_UNINTERRUPTIBLE);
> >> + if (atomic_read(&module_unload_inhibit_cnt))
> >> + schedule();
> >> + __set_current_state(TASK_RUNNING);
> >> + remove_wait_queue(&module_unload_wait, &wait);
> >> +
> >> + mutex_lock(&module_mutex);
> >
> > Maybe I'm missing something, but this looks racy to me. There's no
> > check after schedule() to see if module_unload_inhibit_cnt is really
> > zero, and nothing to keep somebody else from slipping in and raising it
> > again afterward.
>
> The unloading can proceed once module_unload_inhibit_cnt reaches zero.
> An unloading thread only has to care about inhibition put in effect
> before unloading has started, so there's no need to check again.

You haven't fully answered Jon's question. Suppose
module_unload_inhibit_cnt is nonzero, so the task adds itself to the
module_unload_wait queue, changes to TASK_UNINTERRUPTIBLE, and calls
schedule. There's nothing to prevent somebody else from waking the
task back up before the original inhibition has been lifted.

Alan Stern

2007-09-25 14:32:05

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Alan Stern wrote:
>> The unloading can proceed once module_unload_inhibit_cnt reaches zero.
>> An unloading thread only has to care about inhibition put in effect
>> before unloading has started, so there's no need to check again.
>
> You haven't fully answered Jon's question. Suppose
> module_unload_inhibit_cnt is nonzero, so the task adds itself to the
> module_unload_wait queue, changes to TASK_UNINTERRUPTIBLE, and calls
> schedule. There's nothing to prevent somebody else from waking the
> task back up before the original inhibition has been lifted.

Hmmm... I might be missing something here. Who else can wake up a
thread in uninterruptible sleep?

Thanks.

--
tejun

2007-09-25 15:09:19

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Tue, 25 Sep 2007, Tejun Heo wrote:

> Alan Stern wrote:
> >> The unloading can proceed once module_unload_inhibit_cnt reaches zero.
> >> An unloading thread only has to care about inhibition put in effect
> >> before unloading has started, so there's no need to check again.
> >
> > You haven't fully answered Jon's question. Suppose
> > module_unload_inhibit_cnt is nonzero, so the task adds itself to the
> > module_unload_wait queue, changes to TASK_UNINTERRUPTIBLE, and calls
> > schedule. There's nothing to prevent somebody else from waking the
> > task back up before the original inhibition has been lifted.
>
> Hmmm... I might be missing something here. Who else can wake up a
> thread in uninterruptible sleep?

In principle, anything can. There has never been any guarantee in the
kernel that a task sleeping on a waitqueue will remain asleep until
the waitqueue is signalled. That's part of the reason why things like
__wait_event() are coded as loops.

Alan Stern

2007-09-25 22:12:56

by Greg KH

[permalink] [raw]
Subject: Re: [PATCHSET 2/4] sysfs: allow suicide

On Thu, Sep 20, 2007 at 04:26:15PM +0900, Tejun Heo wrote:
> Hello, all.
>
> This is the second patchset of four sysfs update patchset series[1]
> and to be applied on top of the first patchset[1].

I have no specific objection to this patchset, but it seems that others
still do, so I'll hold off on applying them to my tree until it all gets
sorted out :)

thanks,

greg k-h

2007-09-25 23:17:43

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Alan Stern wrote:
> On Tue, 25 Sep 2007, Tejun Heo wrote:
>
>> Alan Stern wrote:
>>>> The unloading can proceed once module_unload_inhibit_cnt reaches zero.
>>>> An unloading thread only has to care about inhibition put in effect
>>>> before unloading has started, so there's no need to check again.
>>> You haven't fully answered Jon's question. Suppose
>>> module_unload_inhibit_cnt is nonzero, so the task adds itself to the
>>> module_unload_wait queue, changes to TASK_UNINTERRUPTIBLE, and calls
>>> schedule. There's nothing to prevent somebody else from waking the
>>> task back up before the original inhibition has been lifted.
>> Hmmm... I might be missing something here. Who else can wake up a
>> thread in uninterruptible sleep?
>
> In principle, anything can. There has never been any guarantee in the
> kernel that a task sleeping on a waitqueue will remain asleep until
> the waitqueue is signalled. That's part of the reason why things like
> __wait_event() are coded as loops.

Hmmm... I always thought the queue was because the condition can change
inbetween waking up and actually running. For example, if the condition
is !(queue empty), another task can enter the critical section and
consume the element which triggered wake up before the woken up task do.

I have no problem with changing the condition check to loop but it would
be great if someone can point me to a code where this unexpected wake up
is used.

Thanks.

--
tejun

2007-09-25 23:42:19

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Wed, 2007-09-26 at 08:15 +0900, Tejun Heo wrote:
> I have no problem with changing the condition check to loop but it would
> be great if someone can point me to a code where this unexpected wake up
> is used.

This is one of those areas where we're conservative. Historically there
have been random wakes, and noone is quite sure that signal code or the
freezer or whatever won't do it under some circumstances.

Thus it's always seen as better to wait on a specific condition.

Cheers,
Rusty.

2007-09-26 01:44:25

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

Rusty Russell wrote:
> On Wed, 2007-09-26 at 08:15 +0900, Tejun Heo wrote:
>> I have no problem with changing the condition check to loop but it would
>> be great if someone can point me to a code where this unexpected wake up
>> is used.
>
> This is one of those areas where we're conservative. Historically there
> have been random wakes, and noone is quite sure that signal code or the
> freezer or whatever won't do it under some circumstances.
>
> Thus it's always seen as better to wait on a specific condition.

I see. I'll update the code then. Thanks.

--
tejun

2007-09-26 14:39:20

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH 1/4] module: implement module_inhibit_unload()

On Wed, 26 Sep 2007, Tejun Heo wrote:

> >> Hmmm... I might be missing something here. Who else can wake up a
> >> thread in uninterruptible sleep?
> >
> > In principle, anything can. There has never been any guarantee in the
> > kernel that a task sleeping on a waitqueue will remain asleep until
> > the waitqueue is signalled. That's part of the reason why things like
> > __wait_event() are coded as loops.
>
> Hmmm... I always thought the queue was because the condition can change
> inbetween waking up and actually running. For example, if the condition
> is !(queue empty), another task can enter the critical section and
> consume the element which triggered wake up before the woken up task do.

That's the other part of the reason for using a loop. :-)

Alan Stern

2007-09-28 13:55:57

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH 4/4] sysfs: make suicidal nodes just do it directly

On Thu, 20 Sep 2007 16:26:15 +0900,
Tejun Heo <[email protected]> wrote:

> Sysfs now allows direct suicide. Make suicidial sysfs nodes to use
> it and remove now unncessary callback mechanism.
>
> Signed-off-by: Tejun Heo <[email protected]>
> ---
> drivers/base/core.c | 33 ----------------------
> drivers/s390/cio/ccwgroup.c | 25 +++-------------
> drivers/scsi/scsi_sysfs.c | 13 +--------
> fs/sysfs/file.c | 63 -------------------------------------------
> include/linux/device.h | 6 ----
> include/linux/sysfs.h | 9 ------
> 6 files changed, 6 insertions(+), 143 deletions(-)

ccwgroup was implicitly relying on a held reference, fix that.

Signed-off-by: Cornelia Huck <[email protected]>

---
drivers/s390/cio/ccwgroup.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.orig/drivers/s390/cio/ccwgroup.c
+++ linux-2.6/drivers/s390/cio/ccwgroup.c
@@ -72,7 +72,7 @@ ccwgroup_ungroup_store(struct device *de
struct ccwgroup_device *gdev;
int rc;

- gdev = to_ccwgroupdev(dev);
+ gdev = to_ccwgroupdev(get_device(dev));

if (gdev->state != CCWGROUP_OFFLINE)
return -EINVAL;
@@ -81,6 +81,7 @@ ccwgroup_ungroup_store(struct device *de
__ccwgroup_remove_symlinks(gdev);
device_unregister(dev);
mutex_unlock(&gdev->reg_mutex);
+ put_device(dev);

return count;
}

2007-09-28 14:29:13

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 4/4] sysfs: make suicidal nodes just do it directly

Cornelia Huck wrote:
> On Thu, 20 Sep 2007 16:26:15 +0900,
> Tejun Heo <[email protected]> wrote:
>
>> Sysfs now allows direct suicide. Make suicidial sysfs nodes to use
>> it and remove now unncessary callback mechanism.
>>
>> Signed-off-by: Tejun Heo <[email protected]>
>> ---
>> drivers/base/core.c | 33 ----------------------
>> drivers/s390/cio/ccwgroup.c | 25 +++-------------
>> drivers/scsi/scsi_sysfs.c | 13 +--------
>> fs/sysfs/file.c | 63 -------------------------------------------
>> include/linux/device.h | 6 ----
>> include/linux/sysfs.h | 9 ------
>> 6 files changed, 6 insertions(+), 143 deletions(-)
>
> ccwgroup was implicitly relying on a held reference, fix that.
>
> Signed-off-by: Cornelia Huck <[email protected]>

Thanks. Will incorporate into the next series.

--
tejun