2007-09-20 08:06:07

by Tejun Heo

Subject: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Hello, all.

This is the third of four patchsets in the sysfs update series[1] and
is to be applied on top of the second patchset[2].

Currently, the sysfs interface is based on kobj. This made more sense
before, because the lifetime of sysfs nodes was tracked using kobj
reference counts. However, this is no longer true. sysfs nodes are
represented with a sysfs_dirent and the external reference is severed
immediately on node removal. The internal implementation reflects
that too and mostly handles sysfs_dirents.

This patchset divorces sysfs from kobject and driver model by
implementing a sysfs_dirent based interface. This has the following
advantages.

* sysfs becomes a separate module and driver model becomes a user of
sysfs. The two are not entangled anymore. Things are easier to
understand and test this way.

* Non-driver model users of sysfs (modules, blkdev, etc...) don't have
to jump through hoops to use sysfs. The kobj based interface requires
attribute wrapping and is awkward to use directly. Also, the user
is required to create a dummy kobj which serves little purpose
other than being a token for sysfs reference. The new sysfs_dirent
based interface is a straightforward procfs-like interface and should
be easier and more intuitive for those users.

* As kobj didn't really represent what actually populates sysfs, the
interface was a bit messy - there was no way to reference a leaf
node other than using its textual name, and directories which aren't
associated with a kobj needed a separate interface, which in turn
made adding new features difficult. The new interface is leaner and
more flexible.

The kobject based interface is reimplemented as wrapper functions on
top of the new sysfs_dirent based interface. The long term plan is to
update kobject based users one-by-one and deprecate the kobject based
interface.

This change doesn't intend to replace the driver-model based interface
or encourage random additions to the sysfs hierarchy. Driver model
internals may change but interfaces to drivers and userland will stay
the same. The goals of this patchset are to 1. clean up sysfs and
driver model internals and 2. make life easier for sysfs users which
aren't drivers or are having difficulties integrating into the current
driver model.

This patchset contains the following 22 patches.

0001-sysfs-make-sysfs_root-a-pointer.patch
0002-sysfs-separate-out-sysfs-kobject.h-and-fs-sysfs-kob.patch
0003-sysfs-make-sysfs_new_dirent-normalize-mode-and-d.patch
0004-sysfs-make-SYSFS_COPY_NAME-a-flag.patch
0005-sysfs-implement-sysfs_find_child.patch
0006-sysfs-restructure-addrm-helpers.patch
0007-sysfs-implement-sysfs_dirent-based-remove-interface.patch
0008-sysfs-implement-sysfs_dirent-based-directory-interf.patch
0009-sysfs-rename-internal-function-sysfs_add_file.patch
0010-sysfs-drop-kobj-and-attr-from-file-related-symbols.patch
0011-sysfs-implement-sysfs_dirent-based-file-interface.patch
0012-sysfs-drop-kobj-and-attr-from-bin-related-symbols.patch
0013-sysfs-implement-sysfs_dirent-based-bin-interface.patch
0014-sysfs-s-symlink-link-g.patch
0015-sysfs-implement-sysfs_dirent-based-link-interface.patch
0016-sysfs-convert-group-implementation-to-use-sd-based.patch
0017-sysfs-s-sysfs_rename_mutex-sysfs_op_mutex-and-prot.patch
0018-kobject-implement-__kobject_set_name.patch
0019-sysfs-implement-sysfs_dirent-based-rename-sysfs_r.patch
0020-sysfs-kill-now-unused-__sysfs_add_file.patch
0021-sysfs-kill-sysfs_hash_and_remove.patch
0022-sysfs-move-sysfs_assoc_lock-into-fs-sysfs-kobject.c.patch

0001-0003 are preparations. 0004-0006 implement new features needed
for the sysfs_dirent interface. 0007-0016 implement sysfs_dirent based
add and remove interfaces and reimplement the kobj-based ones in terms
of them.

0017-0018 prepare for the sysfs_dirent based rename interface.
sysfs_rename_mutex is renamed to sysfs_op_mutex and protects all tree
modifying operations. 0019 implements the sysfs_dirent based rename
interface. This is one mighty rename interface which can do both
moving and renaming. The implementation is over-done to later
accommodate symlink auto renaming and hopefully help shadow renaming.

0020-0022 clean up now unused stuff.

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/582105
[2] http://thread.gmane.org/gmane.linux.kernel/582130



2007-09-20 08:06:34

by Tejun Heo

Subject: [PATCH 04/22] sysfs: make SYSFS_COPY_NAME a flag

Currently the name is implicitly copied for directory and symlink
nodes. Make the behavior flexible by making it a flag. If the
SYSFS_COPY_NAME bit is specified in @mode when calling
sysfs_new_dirent(), the name is copied.

SYSFS_COPY_NAME is defined as a bit inside S_IFMT so that it can be
specified together with @mode bits.
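
To illustrate the encoding, here is a minimal user-space sketch, not
the kernel code. The SYSFS_COPY_NAME value and the preprocessor check
are taken from the patch; the DEMO_* constants and demo_*() helpers
are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Value taken from the patch: one bit inside S_IFMT. */
#define SYSFS_COPY_NAME   010000
#define SYSFS_MODE_FLAGS  SYSFS_COPY_NAME

/* Hypothetical stand-ins for the kernel's S_IFMT and S_IALLUGO. */
#define DEMO_S_IFMT       0170000
#define DEMO_S_IALLUGO    0007777

/* Same build-time check as in the patch: flags must stay inside S_IFMT. */
#if (SYSFS_MODE_FLAGS & ~DEMO_S_IFMT)
#error SYSFS mode flags out of S_IFMT
#endif

/* Does the caller ask for @name to be duplicated? */
static bool demo_wants_name_copy(unsigned int mode)
{
	return (mode & SYSFS_COPY_NAME) != 0;
}

/* Strip the flag bits and keep only the real permission bits. */
static unsigned int demo_perm_bits(unsigned int mode)
{
	unsigned int perm = mode & DEMO_S_IALLUGO;

	/* sanity check: no S_IFMT bit survives the mask */
	assert((perm & DEMO_S_IFMT) == 0);
	return perm;
}
```

Because the flag bit does not overlap the permission bits, a caller
can pass something like 0755 | SYSFS_COPY_NAME in a single @mode
argument and both parts can be recovered independently.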

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 21 +++++++++++++++------
fs/sysfs/symlink.c | 3 ++-
fs/sysfs/sysfs.h | 2 +-
include/linux/sysfs.h | 10 ++++++++++
4 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 02918d3..584f17c 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -14,6 +14,11 @@
#include <linux/mutex.h>
#include "sysfs.h"

+/* verify all mode flags are inside S_IFMT */
+#if (SYSFS_MODE_FLAGS & ~S_IFMT)
+#error SYSFS mode flags out of S_IFMT
+#endif
+
DEFINE_MUTEX(sysfs_mutex);
DEFINE_MUTEX(sysfs_rename_mutex);
spinlock_t sysfs_assoc_lock = SPIN_LOCK_UNLOCKED;
@@ -275,7 +280,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)

if (sysfs_type(sd) == SYSFS_KOBJ_LINK)
sysfs_put(sd->s_symlink.target_sd);
- if (sysfs_type(sd) & SYSFS_COPY_NAME)
+ if (sd->s_flags & SYSFS_FLAG_NAME_COPIED)
kfree(sd->s_name);
kfree(sd->s_iattr);
sysfs_free_ino(sd->s_ino);
@@ -301,11 +306,12 @@ static struct dentry_operations sysfs_dentry_ops = {
/**
* sysfs_new_dirent - allocate and initialize sysfs_dirent
* @name: name for the new sysfs_dirent
- * @mode: mask of bits from S_IRWXUGO
+ * @mode: mask of bits from S_IRWXUGO | SYSFS_COPY_NAME
* @type: one of SYSFS_{DIR|FILE|BIN|LINK}
*
* Allocate and initialize a sysfs_dirent with the specified
- * parameters.
+ * parameters. If SYSFS_COPY_NAME is specified in @mode, @name
+ * is duplicated.
*
* LOCKING:
* Kernel thread context (may sleep).
@@ -316,14 +322,17 @@ static struct dentry_operations sysfs_dentry_ops = {
*/
struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
{
+ unsigned int flags = type;
char *dup_name = NULL;
struct sysfs_dirent *sd;

/* need to copy name? */
- if (type & SYSFS_COPY_NAME) {
+ if (mode & SYSFS_COPY_NAME) {
name = dup_name = kstrdup(name, GFP_KERNEL);
if (!name)
return NULL;
+
+ flags |= SYSFS_FLAG_NAME_COPIED;
}

/* normalize mode */
@@ -355,7 +364,7 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)

sd->s_name = name;
sd->s_mode = mode;
- sd->s_flags = type;
+ sd->s_flags = flags;

return sd;

@@ -647,7 +656,7 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
int rc;

/* allocate */
- sd = sysfs_new_dirent(name, mode, SYSFS_DIR);
+ sd = sysfs_new_dirent(name, mode | SYSFS_COPY_NAME, SYSFS_DIR);
if (!sd)
return -ENOMEM;
sd->s_dir.kobj = kobj;
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index bf96bcd..982085c 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -82,7 +82,8 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
goto out_put;

error = -ENOMEM;
- sd = sysfs_new_dirent(name, S_IRWXUGO, SYSFS_KOBJ_LINK);
+ sd = sysfs_new_dirent(name, S_IRWXUGO | SYSFS_COPY_NAME,
+ SYSFS_KOBJ_LINK);
if (!sd)
goto out_put;

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 9180e2c..db1a433 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -55,10 +55,10 @@ struct sysfs_dirent {
#define SYSFS_KOBJ_ATTR 0x0002
#define SYSFS_KOBJ_BIN_ATTR 0x0004
#define SYSFS_KOBJ_LINK 0x0008
-#define SYSFS_COPY_NAME (SYSFS_DIR | SYSFS_KOBJ_LINK)

#define SYSFS_FLAG_MASK ~SYSFS_TYPE_MASK
#define SYSFS_FLAG_REMOVED 0x0200
+#define SYSFS_FLAG_NAME_COPIED 0x0400

static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
{
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 38b73f9..5646e56 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -20,6 +20,16 @@

struct vm_area_struct;

+/*
+ * @mode bits for sysfs_add_*() functions. Only S_IALLUGO bits are
+ * valid as real mode bits. Bits in S_IFMT are used to set sysfs
+ * specific flags.
+ */
+#define SYSFS_COPY_NAME 010000 /* copy passed @name */
+
+/* collection of all flags for verification */
+#define SYSFS_MODE_FLAGS SYSFS_COPY_NAME
+
#ifdef CONFIG_SYSFS

int __must_check sysfs_init(void);
--
1.5.0.3


2007-09-20 08:06:57

by Tejun Heo

Subject: [PATCH 03/22] sysfs: make sysfs_new_dirent() normalize @mode and determine file type

Currently sysfs_new_dirent() takes @mode and @type. @mode contains
the S_IF* mask, which duplicates what @type already indicates. This
patch makes sysfs_new_dirent() determine the S_IF* mask from @type and
normalize @mode such that only bits from S_IALLUGO are taken as @mode.

This is to allow later use of unused @mode bits as special flags.
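
The normalization can be sketched in user space roughly as follows.
This is an illustration only: the DEMO_* constants and
demo_normalize_mode() are hypothetical names, the S_IF* values are the
usual Linux ones, and the value used for the directory type is an
assumption.

```c
#include <assert.h>

/* Usual Linux values, redefined locally to keep the sketch portable. */
#define DEMO_S_IFMT     0170000
#define DEMO_S_IFDIR    0040000
#define DEMO_S_IFREG    0100000
#define DEMO_S_IFLNK    0120000
#define DEMO_S_IALLUGO  0007777

/* Hypothetical node types mirroring SYSFS_{DIR|KOBJ_ATTR|...}. */
enum demo_type {
	DEMO_DIR      = 0x0001,	/* assumed value */
	DEMO_ATTR     = 0x0002,
	DEMO_BIN_ATTR = 0x0004,
	DEMO_LINK     = 0x0008,
};

/* Drop whatever S_IF* bits the caller passed and derive them from @type. */
static unsigned int demo_normalize_mode(unsigned int mode, int type)
{
	mode &= DEMO_S_IALLUGO;
	/* sanity check: file type bits are gone after masking */
	assert((mode & DEMO_S_IFMT) == 0);

	switch (type) {
	case DEMO_DIR:
		mode |= DEMO_S_IFDIR;
		break;
	case DEMO_ATTR:
	case DEMO_BIN_ATTR:
		mode |= DEMO_S_IFREG;
		break;
	case DEMO_LINK:
		mode |= DEMO_S_IFLNK;
		break;
	}
	return mode;
}
```

With this shape, callers no longer need to pass S_IFDIR, S_IFREG or
S_IFLNK themselves, which is what frees the S_IFMT bits of @mode for
other uses.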

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 36 +++++++++++++++++++++++++++++++++++-
fs/sysfs/file.c | 2 +-
fs/sysfs/symlink.c | 2 +-
3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ba631eb..02918d3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -298,17 +298,51 @@ static struct dentry_operations sysfs_dentry_ops = {
.d_iput = sysfs_d_iput,
};

+/**
+ * sysfs_new_dirent - allocate and initialize sysfs_dirent
+ * @name: name for the new sysfs_dirent
+ * @mode: mask of bits from S_IRWXUGO
+ * @type: one of SYSFS_{DIR|FILE|BIN|LINK}
+ *
+ * Allocate and initialize a sysfs_dirent with the specified
+ * parameters.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * Pointer to the new sysfs_dirent on success, NULL on allocation
+ * failure.
+ */
struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
{
char *dup_name = NULL;
struct sysfs_dirent *sd;

+ /* need to copy name? */
if (type & SYSFS_COPY_NAME) {
name = dup_name = kstrdup(name, GFP_KERNEL);
if (!name)
return NULL;
}

+ /* normalize mode */
+ mode &= S_IALLUGO;
+
+ switch (type) {
+ case SYSFS_DIR:
+ mode |= S_IFDIR;
+ break;
+ case SYSFS_KOBJ_ATTR:
+ case SYSFS_KOBJ_BIN_ATTR:
+ mode |= S_IFREG;
+ break;
+ case SYSFS_KOBJ_LINK:
+ mode |= S_IFLNK;
+ break;
+ }
+
+ /* allocate and initialize */
sd = kmem_cache_zalloc(sysfs_dir_cachep, GFP_KERNEL);
if (!sd)
goto err_out1;
@@ -607,7 +641,7 @@ struct sysfs_dirent *sysfs_get_dirent(struct sysfs_dirent *parent_sd,
static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
const char *name, struct sysfs_dirent **p_sd)
{
- umode_t mode = S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO;
+ umode_t mode = S_IRWXU | S_IRUGO | S_IXUGO;
struct sysfs_addrm_cxt acxt;
struct sysfs_dirent *sd;
int rc;
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 5d708ff..8d8e1ee 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -571,7 +571,7 @@ const struct file_operations sysfs_file_operations = {
int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
int type)
{
- umode_t mode = (attr->mode & S_IALLUGO) | S_IFREG;
+ umode_t mode = attr->mode & S_IALLUGO;
struct sysfs_addrm_cxt acxt;
struct sysfs_dirent *sd;
int rc;
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 6b3358e..bf96bcd 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -82,7 +82,7 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
goto out_put;

error = -ENOMEM;
- sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
+ sd = sysfs_new_dirent(name, S_IRWXUGO, SYSFS_KOBJ_LINK);
if (!sd)
goto out_put;

--
1.5.0.3


2007-09-20 08:07:28

by Tejun Heo

Subject: [PATCH 01/22] sysfs: make sysfs_root a pointer

In the upcoming new sysfs interface, sysfs_root will be exported. To
ease usage and make dummy declaration easier, make sysfs_root a
pointer.
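
As a rough user-space illustration of the pattern (not the kernel
code), the object lives in private static storage and only a constant
pointer to it is visible to users. struct demo_dirent and the demo_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct sysfs_dirent. */
struct demo_dirent {
	const char *name;
	unsigned int flags;
};

/* The storage stays private to this file ... */
static struct demo_dirent demo_root_storage = {
	.name  = "",
	.flags = 0x0001,
};

/* ... and users only ever see a constant pointer to it. */
struct demo_dirent * const demo_root = &demo_root_storage;

/* Callers pass pointers around uniformly; no '&' at each use site. */
static struct demo_dirent *demo_parent_or_root(struct demo_dirent *parent)
{
	/* sanity check: the root pointer is always valid in this sketch */
	assert(demo_root != NULL);
	return parent ? parent : demo_root;
}
```

One possible reading of "dummy declaration" is that a build without
sysfs could provide a trivial placeholder for the pointer instead of a
full object; that interpretation is an assumption, not something the
patch itself shows.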

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 4 ++--
fs/sysfs/mount.c | 8 +++++---
fs/sysfs/symlink.c | 2 +-
fs/sysfs/sysfs.h | 2 +-
4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index fe8270c..ba631eb 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -651,7 +651,7 @@ int sysfs_create_dir(struct kobject * kobj)
if (kobj->parent)
parent_sd = kobj->parent->sd;
else
- parent_sd = &sysfs_root;
+ parent_sd = sysfs_root;

error = create_dir(kobj, parent_sd, kobject_name(kobj), &sd);
if (!error)
@@ -832,7 +832,7 @@ int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)

mutex_lock(&sysfs_rename_mutex);
BUG_ON(!sd->s_parent);
- new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
+ new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : sysfs_root;

error = 0;
if (sd->s_parent == new_parent_sd)
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 465902c..d00d4b9 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -23,7 +23,7 @@ static const struct super_operations sysfs_ops = {
.drop_inode = generic_delete_inode,
};

-struct sysfs_dirent sysfs_root = {
+static struct sysfs_dirent sysfs_root_storage = {
.s_name = "",
.s_count = ATOMIC_INIT(1),
.s_flags = SYSFS_DIR,
@@ -31,6 +31,8 @@ struct sysfs_dirent sysfs_root = {
.s_ino = 1,
};

+struct sysfs_dirent * const sysfs_root = &sysfs_root_storage;
+
static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
{
struct inode *inode;
@@ -44,7 +46,7 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
sysfs_sb = sb;

/* get root inode, initialize and unlock it */
- inode = sysfs_get_inode(&sysfs_root);
+ inode = sysfs_get_inode(sysfs_root);
if (!inode) {
pr_debug("sysfs: could not get root inode\n");
return -ENOMEM;
@@ -57,7 +59,7 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
iput(inode);
return -ENOMEM;
}
- root->d_fsdata = &sysfs_root;
+ root->d_fsdata = sysfs_root;
sb->s_root = root;
return 0;
}
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index ffa82e9..6b3358e 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -61,7 +61,7 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
BUG_ON(!name);

if (!kobj)
- parent_sd = &sysfs_root;
+ parent_sd = sysfs_root;
else
parent_sd = kobj->sd;

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 58f517b..9180e2c 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -78,7 +78,7 @@ struct sysfs_addrm_cxt {
/*
* mount.c
*/
-extern struct sysfs_dirent sysfs_root;
+extern struct sysfs_dirent * const sysfs_root;
extern struct super_block *sysfs_sb;
extern struct kmem_cache *sysfs_dir_cachep;

--
1.5.0.3


2007-09-20 08:07:56

by Tejun Heo

Subject: [PATCH 02/22] sysfs: separate out sysfs-kobject.h and fs/sysfs/kobject.c

Sysfs is about to get a new interface which is independent from the
driver model and kobject. Create include/linux/sysfs-kobject.h and
move all kobject based interfaces into it. Also, create
fs/sysfs/kobject.c, which is currently empty but will host
compatibility interface functions based on the new interface.

sysfs-kobject.h is automatically included from sysfs.h for
compatibility for now.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/Makefile | 2 +-
fs/sysfs/kobject.c | 15 +++
include/linux/sysfs-kobject.h | 200 +++++++++++++++++++++++++++++++++++++++++
include/linux/sysfs.h | 197 +++-------------------------------------
4 files changed, 230 insertions(+), 184 deletions(-)
create mode 100644 fs/sysfs/kobject.c
create mode 100644 include/linux/sysfs-kobject.h

diff --git a/fs/sysfs/Makefile b/fs/sysfs/Makefile
index 7a1ceb9..f58bce9 100644
--- a/fs/sysfs/Makefile
+++ b/fs/sysfs/Makefile
@@ -3,4 +3,4 @@
#

obj-y := inode.o file.o dir.o symlink.o mount.o bin.o \
- group.o
+ group.o kobject.o
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
new file mode 100644
index 0000000..5ebd755
--- /dev/null
+++ b/fs/sysfs/kobject.c
@@ -0,0 +1,15 @@
+/*
+ * sysfs/kobject.c - compatibility sysfs interface based on kobject
+ *
+ * Copyright (c) 2001,2002 Patrick Mochel
+ * Copyright (c) 2004 Silicon Graphics, Inc.
+ *
+ * This is compatibility interface which wraps the primary interface
+ * defined in linux/sysfs.h to remain compatible with the original
+ * kobject based interface.
+ *
+ * Please see Documentation/filesystems/sysfs.txt for more information.
+ */
+
+#include <linux/sysfs.h>
+#include <linux/sysfs-kobject.h>
diff --git a/include/linux/sysfs-kobject.h b/include/linux/sysfs-kobject.h
new file mode 100644
index 0000000..4c821c2
--- /dev/null
+++ b/include/linux/sysfs-kobject.h
@@ -0,0 +1,200 @@
+/*
+ * sysfs-kobject.h - compatibility sysfs interface based on kobject
+ *
+ * Copyright (c) 2001,2002 Patrick Mochel
+ * Copyright (c) 2004 Silicon Graphics, Inc.
+ *
+ * This is compatibility interface which wraps the primary interface
+ * defined in linux/sysfs.h to remain compatible with the original
+ * kobject based interface. Please don't use in new codes.
+ *
+ * Please see Documentation/filesystems/sysfs.txt for more information.
+ */
+
+#ifndef _SYSFS_KOBJECT_H
+#define _SYSFS_KOBJECT_H
+
+#include <linux/sysfs.h>
+
+struct kobject;
+struct module;
+
+/* FIXME
+ * The *owner field is no longer used, but leave around
+ * until the tree gets cleaned up fully.
+ */
+struct attribute {
+ const char *name;
+ struct module *owner;
+ mode_t mode;
+};
+
+struct attribute_group {
+ const char *name;
+ struct attribute **attrs;
+};
+
+/**
+ * Use these macros to make defining attributes easier. See include/linux/device.h
+ * for examples..
+ */
+
+#define __ATTR(_name,_mode,_show,_store) { \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .show = _show, \
+ .store = _store, \
+}
+
+#define __ATTR_RO(_name) { \
+ .attr = { .name = __stringify(_name), .mode = 0444 }, \
+ .show = _name##_show, \
+}
+
+#define __ATTR_NULL { .attr = { .name = NULL } }
+
+#define attr_name(_attr) (_attr).attr.name
+
+struct bin_attribute {
+ struct attribute attr;
+ size_t size;
+ void *private;
+ ssize_t (*read)(struct kobject *, struct bin_attribute *,
+ char *, loff_t, size_t);
+ ssize_t (*write)(struct kobject *, struct bin_attribute *,
+ char *, loff_t, size_t);
+ int (*mmap)(struct kobject *, struct bin_attribute *attr,
+ struct vm_area_struct *vma);
+};
+
+struct sysfs_ops {
+ ssize_t (*show)(struct kobject *, struct attribute *,char *);
+ ssize_t (*store)(struct kobject *,struct attribute *,const char *, size_t);
+};
+
+#ifdef CONFIG_SYSFS
+
+int __must_check sysfs_create_dir(struct kobject *kobj);
+void sysfs_remove_dir(struct kobject *kobj);
+int __must_check sysfs_rename_dir(struct kobject *kobj, const char *new_name);
+int __must_check sysfs_move_dir(struct kobject *kobj,
+ struct kobject *new_parent_kobj);
+
+int __must_check sysfs_create_file(struct kobject *kobj,
+ const struct attribute *attr);
+int __must_check sysfs_chmod_file(struct kobject *kobj, struct attribute *attr,
+ mode_t mode);
+void sysfs_remove_file(struct kobject *kobj, const struct attribute *attr);
+
+int __must_check sysfs_create_bin_file(struct kobject *kobj,
+ struct bin_attribute *attr);
+void sysfs_remove_bin_file(struct kobject *kobj, struct bin_attribute *attr);
+
+int __must_check sysfs_create_link(struct kobject *kobj, struct kobject *target,
+ const char *name);
+void sysfs_remove_link(struct kobject *kobj, const char *name);
+
+int __must_check sysfs_create_group(struct kobject *kobj,
+ const struct attribute_group *grp);
+void sysfs_remove_group(struct kobject *kobj,
+ const struct attribute_group *grp);
+int sysfs_add_file_to_group(struct kobject *kobj,
+ const struct attribute *attr, const char *group);
+void sysfs_remove_file_from_group(struct kobject *kobj,
+ const struct attribute *attr, const char *group);
+
+void sysfs_notify(struct kobject *kobj, char *dir, char *attr);
+
+#else /* CONFIG_SYSFS */
+
+static inline int sysfs_create_dir(struct kobject *kobj)
+{
+ return 0;
+}
+
+static inline void sysfs_remove_dir(struct kobject *kobj)
+{
+ ;
+}
+
+static inline int sysfs_rename_dir(struct kobject *kobj, const char *new_name)
+{
+ return 0;
+}
+
+static inline int sysfs_move_dir(struct kobject *kobj,
+ struct kobject *new_parent_kobj)
+{
+ return 0;
+}
+
+static inline int sysfs_create_file(struct kobject *kobj,
+ const struct attribute *attr)
+{
+ return 0;
+}
+
+static inline int sysfs_chmod_file(struct kobject *kobj,
+ struct attribute *attr, mode_t mode)
+{
+ return 0;
+}
+
+static inline void sysfs_remove_file(struct kobject *kobj,
+ const struct attribute *attr)
+{
+ ;
+}
+
+static inline int sysfs_create_bin_file(struct kobject *kobj,
+ struct bin_attribute *attr)
+{
+ return 0;
+}
+
+static inline int sysfs_remove_bin_file(struct kobject *kobj,
+ struct bin_attribute *attr)
+{
+ return 0;
+}
+
+static inline int sysfs_create_link(struct kobject *kobj,
+ struct kobject *target, const char *name)
+{
+ return 0;
+}
+
+static inline void sysfs_remove_link(struct kobject *kobj, const char *name)
+{
+ ;
+}
+
+static inline int sysfs_create_group(struct kobject *kobj,
+ const struct attribute_group *grp)
+{
+ return 0;
+}
+
+static inline void sysfs_remove_group(struct kobject *kobj,
+ const struct attribute_group *grp)
+{
+ ;
+}
+
+static inline int sysfs_add_file_to_group(struct kobject *kobj,
+ const struct attribute *attr, const char *group)
+{
+ return 0;
+}
+
+static inline void sysfs_remove_file_from_group(struct kobject *kobj,
+ const struct attribute *attr, const char *group)
+{
+}
+
+static inline void sysfs_notify(struct kobject *kobj, char *dir, char *attr)
+{
+}
+
+#endif /* CONFIG_SYSFS */
+
+#endif /* _SYSFS_KOBJECT_H */
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 8af072e..38b73f9 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -1,8 +1,13 @@
/*
* sysfs.h - definitions for the device driver filesystem
*
- * Copyright (c) 2001,2002 Patrick Mochel
- * Copyright (c) 2004 Silicon Graphics, Inc.
+ * Primary sysfs interface based on sysfs_dirent
+ *
+ * If you're using sysfs directly instead of via driver model, please
+ * use this interface. It's independent from the driver model and
+ * kobject, cleaner and includes more features such as plugging of
+ * subtree while building it. Kobject-based compatibility interface
+ * is defined in linux/sysfs-kobject.h
*
* Please see Documentation/filesystems/sysfs.txt for more information.
*/
@@ -11,195 +16,16 @@
#define _SYSFS_H_

#include <linux/compiler.h>
-#include <linux/errno.h>
-#include <linux/list.h>
-#include <asm/atomic.h>
-
-struct kobject;
-struct module;
-
-/* FIXME
- * The *owner field is no longer used, but leave around
- * until the tree gets cleaned up fully.
- */
-struct attribute {
- const char *name;
- struct module *owner;
- mode_t mode;
-};
-
-struct attribute_group {
- const char *name;
- struct attribute **attrs;
-};
-
-
-
-/**
- * Use these macros to make defining attributes easier. See include/linux/device.h
- * for examples..
- */
-
-#define __ATTR(_name,_mode,_show,_store) { \
- .attr = {.name = __stringify(_name), .mode = _mode }, \
- .show = _show, \
- .store = _store, \
-}
-
-#define __ATTR_RO(_name) { \
- .attr = { .name = __stringify(_name), .mode = 0444 }, \
- .show = _name##_show, \
-}
-
-#define __ATTR_NULL { .attr = { .name = NULL } }
-
-#define attr_name(_attr) (_attr).attr.name
+#include <linux/types.h>

struct vm_area_struct;

-struct bin_attribute {
- struct attribute attr;
- size_t size;
- void *private;
- ssize_t (*read)(struct kobject *, struct bin_attribute *,
- char *, loff_t, size_t);
- ssize_t (*write)(struct kobject *, struct bin_attribute *,
- char *, loff_t, size_t);
- int (*mmap)(struct kobject *, struct bin_attribute *attr,
- struct vm_area_struct *vma);
-};
-
-struct sysfs_ops {
- ssize_t (*show)(struct kobject *, struct attribute *,char *);
- ssize_t (*store)(struct kobject *,struct attribute *,const char *, size_t);
-};
-
#ifdef CONFIG_SYSFS

-int __must_check sysfs_create_dir(struct kobject *kobj);
-void sysfs_remove_dir(struct kobject *kobj);
-int __must_check sysfs_rename_dir(struct kobject *kobj, const char *new_name);
-int __must_check sysfs_move_dir(struct kobject *kobj,
- struct kobject *new_parent_kobj);
-
-int __must_check sysfs_create_file(struct kobject *kobj,
- const struct attribute *attr);
-int __must_check sysfs_chmod_file(struct kobject *kobj, struct attribute *attr,
- mode_t mode);
-void sysfs_remove_file(struct kobject *kobj, const struct attribute *attr);
-
-int __must_check sysfs_create_bin_file(struct kobject *kobj,
- struct bin_attribute *attr);
-void sysfs_remove_bin_file(struct kobject *kobj, struct bin_attribute *attr);
-
-int __must_check sysfs_create_link(struct kobject *kobj, struct kobject *target,
- const char *name);
-void sysfs_remove_link(struct kobject *kobj, const char *name);
-
-int __must_check sysfs_create_group(struct kobject *kobj,
- const struct attribute_group *grp);
-void sysfs_remove_group(struct kobject *kobj,
- const struct attribute_group *grp);
-int sysfs_add_file_to_group(struct kobject *kobj,
- const struct attribute *attr, const char *group);
-void sysfs_remove_file_from_group(struct kobject *kobj,
- const struct attribute *attr, const char *group);
-
-void sysfs_notify(struct kobject *kobj, char *dir, char *attr);
-
-extern int __must_check sysfs_init(void);
+int __must_check sysfs_init(void);

#else /* CONFIG_SYSFS */

-static inline int sysfs_create_dir(struct kobject *kobj)
-{
- return 0;
-}
-
-static inline void sysfs_remove_dir(struct kobject *kobj)
-{
- ;
-}
-
-static inline int sysfs_rename_dir(struct kobject *kobj, const char *new_name)
-{
- return 0;
-}
-
-static inline int sysfs_move_dir(struct kobject *kobj,
- struct kobject *new_parent_kobj)
-{
- return 0;
-}
-
-static inline int sysfs_create_file(struct kobject *kobj,
- const struct attribute *attr)
-{
- return 0;
-}
-
-static inline int sysfs_chmod_file(struct kobject *kobj,
- struct attribute *attr, mode_t mode)
-{
- return 0;
-}
-
-static inline void sysfs_remove_file(struct kobject *kobj,
- const struct attribute *attr)
-{
- ;
-}
-
-static inline int sysfs_create_bin_file(struct kobject *kobj,
- struct bin_attribute *attr)
-{
- return 0;
-}
-
-static inline int sysfs_remove_bin_file(struct kobject *kobj,
- struct bin_attribute *attr)
-{
- return 0;
-}
-
-static inline int sysfs_create_link(struct kobject *kobj,
- struct kobject *target, const char *name)
-{
- return 0;
-}
-
-static inline void sysfs_remove_link(struct kobject *kobj, const char *name)
-{
- ;
-}
-
-static inline int sysfs_create_group(struct kobject *kobj,
- const struct attribute_group *grp)
-{
- return 0;
-}
-
-static inline void sysfs_remove_group(struct kobject *kobj,
- const struct attribute_group *grp)
-{
- ;
-}
-
-static inline int sysfs_add_file_to_group(struct kobject *kobj,
- const struct attribute *attr, const char *group)
-{
- return 0;
-}
-
-static inline void sysfs_remove_file_from_group(struct kobject *kobj,
- const struct attribute *attr, const char *group)
-{
-}
-
-static inline void sysfs_notify(struct kobject *kobj, char *dir, char *attr)
-{
-}
-
static inline int __must_check sysfs_init(void)
{
return 0;
@@ -207,4 +33,9 @@ static inline int __must_check sysfs_init(void)

#endif /* CONFIG_SYSFS */

+/*
+ * Implicitly include kobject based compatibility interface for now
+ */
+#include <linux/sysfs-kobject.h>
+
#endif /* _SYSFS_H_ */
--
1.5.0.3


2007-09-20 08:08:20

by Tejun Heo

Subject: [PATCH 06/22] sysfs: restructure addrm helpers

Restructure the addrm helpers such that they can remove nodes under
different parents in one go.

sysfs_addrm_start() now doesn't take @sd. It now only initializes
@acxt and locks sysfs_mutex. Parent inode lookup now lives in
sysfs_addrm_get_parent_inode() and sysfs_add/remove_one() call it
directly. The function unlocks i_mutex of the previous inode and
grabs and locks the inode for the specified @parent_sd. This allows
sysfs_add/remove_one() to be called for any node.

This will be used to recursively remove sysfs nodes.
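
The parent switching idea can be sketched in user space as below.
This is a simplified model under stated assumptions: a plain integer
stands in for i_mutex, the case where the parent inode is not
available is ignored, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parent node; 'locked' stands in for its inode's i_mutex. */
struct demo_parent {
	int locked;
};

/* Hypothetical context; remembers which parent is currently held. */
struct demo_cxt {
	struct demo_parent *parent;
};

/* Like the new sysfs_addrm_start(): only initialize the context. */
static void demo_start(struct demo_cxt *cxt)
{
	memset(cxt, 0, sizeof(*cxt));
}

/*
 * Release the currently held parent, if any, and take @new_parent.
 * Passing NULL just releases the current one.
 */
static struct demo_parent *demo_get_parent(struct demo_cxt *cxt,
					   struct demo_parent *new_parent)
{
	/* nothing to do if new equals current */
	if (new_parent == cxt->parent)
		return cxt->parent;

	if (cxt->parent)
		cxt->parent->locked = 0;

	cxt->parent = new_parent;
	if (new_parent)
		new_parent->locked = 1;

	/* sanity check: at most the current parent is held */
	assert(!cxt->parent || cxt->parent->locked);
	return cxt->parent;
}
```

Since each add or remove call names its own parent, one start/finish
pair can cover operations under several parents, with the held parent
swapped only when it actually changes.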

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 144 +++++++++++++++++++++++++++++++++------------------
fs/sysfs/file.c | 4 +-
fs/sysfs/inode.c | 2 +-
fs/sysfs/symlink.c | 4 +-
fs/sysfs/sysfs.h | 10 ++--
5 files changed, 103 insertions(+), 61 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 7a6be9a..b1cf090 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -375,6 +375,27 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
return NULL;
}

+/**
+ * sysfs_addrm_start - prepare for sysfs_dirent add/remove
+ * @acxt: pointer to sysfs_addrm_cxt to be used
+ *
+ * This function is called when the caller is about to add or
+ * remove sysfs_dirents. This function initializes @acxt and
+ * acquires sysfs_mutex. @acxt is used to keep and pass context
+ * to other addrm functions.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep). sysfs_mutex is locked on
+ * return.
+ */
+void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
+{
+ memset(acxt, 0, sizeof(*acxt));
+ acxt->removed_tail = &acxt->removed;
+
+ mutex_lock(&sysfs_mutex);
+}
+
static int sysfs_ilookup_test(struct inode *inode, void *arg)
{
struct sysfs_dirent *sd = arg;
@@ -382,39 +403,53 @@ static int sysfs_ilookup_test(struct inode *inode, void *arg)
}

/**
- * sysfs_addrm_start - prepare for sysfs_dirent add/remove
- * @acxt: pointer to sysfs_addrm_cxt to be used
- * @parent_sd: parent sysfs_dirent
+ * sysfs_addrm_get_parent_inode - get parent inode for add/rm
+ * @acxt: addrm context to prepare parent inode for
+ * @new_parent: parent to prepare (can be NULL)
*
- * This function is called when the caller is about to add or
- * remove sysfs_dirent under @parent_sd. This function acquires
- * sysfs_mutex, grabs inode for @parent_sd if available and lock
- * i_mutex of it. @acxt is used to keep and pass context to
- * other addrm functions.
+ * Release inode of the current @acxt->parent_sd and prepare
+ * inode of @new_parent. If @new_parent is NULL, the current
+ * parent inode is released.
*
* LOCKING:
- * Kernel thread context (may sleep). sysfs_mutex is locked on
- * return. i_mutex of parent inode is locked on return if
- * available.
+ * mutex_lock(sysfs_mutex). Returns with i_mutex of @new_parent
+ * locked if available. sysfs_mutex might be released and
+ * regrabbed.
+ *
+ * RETURNS:
+ * inode of @new_parent if available, NULL otherwise.
*/
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
- struct sysfs_dirent *parent_sd)
+static struct inode *sysfs_addrm_get_parent_inode(struct sysfs_addrm_cxt *acxt,
+ struct sysfs_dirent *new_parent)
{
struct inode *inode;

- memset(acxt, 0, sizeof(*acxt));
- acxt->parent_sd = parent_sd;
- acxt->removed_tail = &acxt->removed;
+ /* nothing to do if new equals current */
+ if (new_parent == acxt->parent_sd)
+ goto out;
+
+ /* if we're holding inode of the current parent, release it */
+ if ((inode = acxt->parent_inode)) {
+ mutex_unlock(&inode->i_mutex);
+ iput(inode);
+ }
+
+ acxt->parent_sd = NULL;
+ acxt->parent_inode = NULL;
+
+ if (!new_parent)
+ goto out;
+
+ /* make @new_parent the current one */
+ acxt->parent_sd = new_parent;

/* Lookup parent inode. inode initialization and I_NEW
- * clearing are protected by sysfs_mutex. By grabbing it and
- * looking up with _nowait variant, inode state can be
+ * clearing are protected by sysfs_mutex. By looking up with
+ * _nowait variant while holding it, inode state can be
* determined reliably.
*/
- mutex_lock(&sysfs_mutex);
-
- inode = ilookup5_nowait(sysfs_sb, parent_sd->s_ino, sysfs_ilookup_test,
- parent_sd);
+ inode = ilookup5_nowait(sysfs_sb, new_parent->s_ino, sysfs_ilookup_test,
+ new_parent);

if (inode && !(inode->i_state & I_NEW)) {
/* parent inode available */
@@ -431,16 +466,19 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
}
} else
iput(inode);
+ out:
+ return acxt->parent_inode;
}

/**
* sysfs_add_one - add sysfs_dirent to parent
* @acxt: addrm context to use
+ * @parent: parent sysfs_dirent to add sysfs_dirent under
* @sd: sysfs_dirent to be added
*
- * Get @acxt->parent_sd and set sd->s_parent to it and increment
- * nlink of parent inode if @sd is a directory and link into the
- * children list of the parent.
+ * Get @parent and set sd->s_parent to it and increment nlink of
+ * parent inode if @sd is a directory and link into the children
+ * list of the parent.
*
* This function should be called between calls to
* sysfs_addrm_start() and sysfs_addrm_finish() and should be
@@ -453,20 +491,26 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
* 0 on success, -EEXIST if entry with the given name already
* exists.
*/
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *parent,
+ struct sysfs_dirent *sd)
{
- if (sysfs_find_dirent(acxt->parent_sd, sd->s_name))
- return -EEXIST;
+ struct inode *parent_inode;

- sd->s_parent = sysfs_get(acxt->parent_sd);
+ if (sysfs_find_dirent(parent, sd->s_name))
+ return -EEXIST;

- if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
- inc_nlink(acxt->parent_inode);
+ parent_inode = sysfs_addrm_get_parent_inode(acxt, parent);

- acxt->cnt++;
+ sd->s_parent = sysfs_get(parent);

sysfs_link_sibling(sd);

+ if (parent_inode) {
+ parent_inode->i_ctime = parent_inode->i_mtime = CURRENT_TIME;
+ if (sysfs_type(sd) == SYSFS_DIR)
+ inc_nlink(parent_inode);
+ }
+
return 0;
}

@@ -487,8 +531,12 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
*/
void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
{
+ struct inode *parent_inode;
+
BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);

+ parent_inode = sysfs_addrm_get_parent_inode(acxt, sd->s_parent);
+
sysfs_unlink_sibling(sd);

sd->s_flags |= SYSFS_FLAG_REMOVED;
@@ -497,10 +545,11 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
acxt->removed_tail = &sd->s_sibling;
*acxt->removed_tail = NULL;

- if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
- drop_nlink(acxt->parent_inode);
-
- acxt->cnt++;
+ if (parent_inode) {
+ parent_inode->i_ctime = parent_inode->i_mtime = CURRENT_TIME;
+ if (sysfs_type(sd) == SYSFS_DIR)
+ drop_nlink(parent_inode);
+ }
}

/**
@@ -568,18 +617,11 @@ repeat:
*/
void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
{
- /* release resources acquired by sysfs_addrm_start() */
+ /* Release resources acquired by
+ * sysfs_addrm_get_parent_inode() and sysfs_addrm_start().
+ */
+ sysfs_addrm_get_parent_inode(acxt, NULL);
mutex_unlock(&sysfs_mutex);
- if (acxt->parent_inode) {
- struct inode *inode = acxt->parent_inode;
-
- /* if added/removed, update timestamps on the parent */
- if (acxt->cnt)
- inode->i_ctime = inode->i_mtime = CURRENT_TIME;
-
- mutex_unlock(&inode->i_mutex);
- iput(inode);
- }

/* kill removed sysfs_dirents */
while (acxt->removed) {
@@ -695,8 +737,8 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
sd->s_dir.kobj = kobj;

/* link in */
- sysfs_addrm_start(&acxt, parent_sd);
- rc = sysfs_add_one(&acxt, sd);
+ sysfs_addrm_start(&acxt);
+ rc = sysfs_add_one(&acxt, parent_sd, sd);
sysfs_addrm_finish(&acxt);

if (rc == 0)
@@ -778,7 +820,7 @@ static void remove_dir(struct sysfs_dirent *sd)
{
struct sysfs_addrm_cxt acxt;

- sysfs_addrm_start(&acxt, sd->s_parent);
+ sysfs_addrm_start(&acxt);
sysfs_remove_one(&acxt, sd);
sysfs_addrm_finish(&acxt);
}
@@ -798,7 +840,7 @@ static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
return;

pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
- sysfs_addrm_start(&acxt, dir_sd);
+ sysfs_addrm_start(&acxt);
pos = &dir_sd->s_dir.children;
while (*pos) {
struct sysfs_dirent *sd = *pos;
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 8d8e1ee..fb77d54 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -581,8 +581,8 @@ int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
return -ENOMEM;
sd->s_attr.attr = (void *)attr;

- sysfs_addrm_start(&acxt, dir_sd);
- rc = sysfs_add_one(&acxt, sd);
+ sysfs_addrm_start(&acxt);
+ rc = sysfs_add_one(&acxt, dir_sd, sd);
sysfs_addrm_finish(&acxt);

if (rc)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 2210cf0..95bde29 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -214,7 +214,7 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
if (!dir_sd)
return -ENOENT;

- sysfs_addrm_start(&acxt, dir_sd);
+ sysfs_addrm_start(&acxt);

sd = sysfs_find_dirent(dir_sd, name);
if (sd)
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 982085c..897cc2f 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -90,8 +90,8 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
sd->s_symlink.target_sd = target_sd;
target_sd = NULL; /* reference is now owned by the symlink */

- sysfs_addrm_start(&acxt, parent_sd);
- error = sysfs_add_one(&acxt, sd);
+ sysfs_addrm_start(&acxt);
+ error = sysfs_add_one(&acxt, parent_sd, sd);
sysfs_addrm_finish(&acxt);

if (error)
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index db1a433..1f3915f 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -66,13 +66,13 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
}

/*
- * Context structure to be used while adding/removing nodes.
+ * Context structure to be used while adding/removing nodes. Don't
+ * dereference outside of addrm functions.
*/
struct sysfs_addrm_cxt {
struct sysfs_dirent *parent_sd;
struct inode *parent_inode;
struct sysfs_dirent *removed, **removed_tail;
- int cnt;
};

/*
@@ -97,9 +97,9 @@ struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
void sysfs_put_active(struct sysfs_dirent *sd);
struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
void sysfs_put_active_two(struct sysfs_dirent *sd);
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
- struct sysfs_dirent *parent_sd);
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
+void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *parent,
+ struct sysfs_dirent *sd);
void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);

--
1.5.0.3


2007-09-20 08:08:50

by Tejun Heo

Subject: [PATCH 08/22] sysfs: implement sysfs_dirent based directory interface

Convert sd->s_dir.kobj to sd->s_dir.data and make it void *, and
implement the sd-based directory creation function sysfs_add_dir().
Using this function, the caller can create a directory node with
arbitrary user data. Also, name copying is no longer implicit. The
name is copied iff SYSFS_COPY_NAME is specified in @mode.

* sysfs_root is exported. To be used as @parent when creating a node
below the sysfs root.

* Kobject-based sysfs_create_dir() is reimplemented in terms of
sysfs_add_dir() and moved to fs/sysfs/kobject.c.

* Users of sysfs_create_dir() in sysfs are converted to use
sysfs_add_dir().

* attr and bin_attr can still cope only with a kobject in
sd->s_dir.data, so sysfs_add_dir() can't be used with an arbitrary
pointer yet. This will be changed by the following patches.

This patch doesn't introduce any behavior change to the original API.
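
For readers unfamiliar with the calling convention described above, here
is a minimal userspace sketch, not the kernel code. It models two things
from this patch: the ERR_PTR()-style return value of sysfs_add_dir() and
the rule that the name is copied only when a copy flag is set in @mode.
All demo_* names and DEMO_COPY_NAME are hypothetical stand-ins, the
ERR_PTR helpers are simplified re-implementations, and the duplicate
check is done before allocation for brevity (the patch does it in
sysfs_add_one() after allocation).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical flag modelled on SYSFS_COPY_NAME. */
#define DEMO_COPY_NAME 010000

/* Hypothetical, much reduced model of a directory sysfs_dirent. */
struct demo_dirent {
	const char *name;
	unsigned int mode;
	void *data;			/* models sd->s_dir.data */
	struct demo_dirent *parent, *children, *sibling;
};

static struct demo_dirent *demo_find(struct demo_dirent *parent,
				     const char *name)
{
	struct demo_dirent *sd;

	for (sd = parent->children; sd; sd = sd->sibling)
		if (!strcmp(sd->name, name))
			return sd;
	return NULL;
}

/* Returns the new node on success, an ERR_PTR() value on failure. */
static struct demo_dirent *demo_add_dir(struct demo_dirent *parent,
					const char *name, unsigned int mode,
					void *data)
{
	struct demo_dirent *sd;

	assert(parent && name);

	if (demo_find(parent, name))
		return ERR_PTR(-EEXIST);

	sd = calloc(1, sizeof(*sd));
	if (!sd)
		return ERR_PTR(-ENOMEM);

	/* the name is copied only if the caller asked for it */
	if (mode & DEMO_COPY_NAME) {
		char *copy = malloc(strlen(name) + 1);

		if (!copy) {
			free(sd);
			return ERR_PTR(-ENOMEM);
		}
		strcpy(copy, name);
		sd->name = copy;
	} else
		sd->name = name;

	sd->mode = mode;
	sd->data = data;
	sd->parent = parent;

	/* link in at the head of the parent's children list */
	sd->sibling = parent->children;
	parent->children = sd;

	return sd;
}
```

A caller checks the result with IS_ERR() and extracts the error code
with PTR_ERR(), which is the same pattern sysfs_create_group() and
sysfs_create_dir() use in the diff below.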

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/bin.c | 6 +-
fs/sysfs/dir.c | 108 +++++++++++++++++++++++++++----------------------
fs/sysfs/file.c | 6 +-
fs/sysfs/group.c | 8 ++--
fs/sysfs/kobject.c | 26 ++++++++++++
fs/sysfs/mount.c | 2 +
fs/sysfs/sysfs.h | 8 +--
include/linux/sysfs.h | 16 +++++++
8 files changed, 117 insertions(+), 63 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 1c12cf0..91e573f 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -31,7 +31,7 @@ fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
{
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
- struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.data;
int rc;

/* need attr_sd for attr, its parent for kobj */
@@ -88,7 +88,7 @@ flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
{
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
- struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.data;
int rc;

/* need attr_sd for attr, its parent for kobj */
@@ -141,7 +141,7 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
struct bin_buffer *bb = file->private_data;
struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
- struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.data;
int rc;

mutex_lock(&bb->mutex);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 7cb3f1e..b15ade7 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -7,7 +7,6 @@
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/module.h>
-#include <linux/kobject.h>
#include <linux/namei.h>
#include <linux/idr.h>
#include <linux/completion.h>
@@ -639,6 +638,40 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
}

/**
+ * sysfs_insert_one - shortcut function to add single sysfs_dirent
+ * @parent: parent to add sysfs_dirent under
+ * @sd: sysfs_dirent to add
+ *
+ * A shortcut function to perform sysfs_addrm_start, add_one,
+ * addrm_finish sequence and error handling for a single
+ * sysfs_dirent.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * Pointer to the new sysfs_dirent on success, ERR_PTR() value on
+ * failure.
+ */
+struct sysfs_dirent *sysfs_insert_one(struct sysfs_dirent *parent,
+ struct sysfs_dirent *sd)
+{
+ struct sysfs_addrm_cxt acxt;
+ int rc;
+
+ sysfs_addrm_start(&acxt);
+ rc = sysfs_add_one(&acxt, parent, sd);
+ sysfs_addrm_finish(&acxt);
+
+ if (rc) {
+ sysfs_put(sd);
+ return ERR_PTR(rc);
+ }
+
+ return sd;
+}
+
+/**
* sysfs_find_dirent - find sysfs_dirent with the given name
* @parent_sd: sysfs_dirent to search under
* @name: name to look for
@@ -722,60 +755,39 @@ struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
}
EXPORT_SYMBOL_GPL(sysfs_find_child);

-static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
- const char *name, struct sysfs_dirent **p_sd)
-{
- umode_t mode = S_IRWXU | S_IRUGO | S_IXUGO;
- struct sysfs_addrm_cxt acxt;
- struct sysfs_dirent *sd;
- int rc;
-
- /* allocate */
- sd = sysfs_new_dirent(name, mode | SYSFS_COPY_NAME, SYSFS_DIR);
- if (!sd)
- return -ENOMEM;
- sd->s_dir.kobj = kobj;
-
- /* link in */
- sysfs_addrm_start(&acxt);
- rc = sysfs_add_one(&acxt, parent_sd, sd);
- sysfs_addrm_finish(&acxt);
-
- if (rc == 0)
- *p_sd = sd;
- else
- sysfs_put(sd);
-
- return rc;
-}
-
-int sysfs_create_subdir(struct kobject *kobj, const char *name,
- struct sysfs_dirent **p_sd)
-{
- return create_dir(kobj, kobj->sd, name, p_sd);
-}
-
/**
- * sysfs_create_dir - create a directory for an object.
- * @kobj: object we're creating directory for.
+ * sysfs_add_dir - add a new sysfs directory
+ * @parent: sysfs_dirent to add the directory under
+ * @name: name of the new directory
+ * @mode: mode and SYSFS_* flags for the new directory
+ * @data: s_dir.data for the new directory
+ *
+ * Add a new directory under @parent with the specified
+ * parameters. If SYSFS_COPY_NAME is set in @mode, @name is
+ * copied before being used.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * Pointer to the new sysfs_dirent on success, ERR_PTR() value on
+ * error.
*/
-int sysfs_create_dir(struct kobject * kobj)
+struct sysfs_dirent *sysfs_add_dir(struct sysfs_dirent *parent,
+ const char *name, mode_t mode, void *data)
{
- struct sysfs_dirent *parent_sd, *sd;
- int error = 0;
+ struct sysfs_dirent *sd;

- BUG_ON(!kobj);
+ /* allocate and initialize */
+ sd = sysfs_new_dirent(name, mode, SYSFS_DIR);
+ if (!sd)
+ return ERR_PTR(-ENOMEM);

- if (kobj->parent)
- parent_sd = kobj->parent->sd;
- else
- parent_sd = sysfs_root;
+ sd->s_dir.data = data;

- error = create_dir(kobj, parent_sd, kobject_name(kobj), &sd);
- if (!error)
- kobj->sd = sd;
- return error;
+ return sysfs_insert_one(parent, sd);
}
+EXPORT_SYMBOL_GPL(sysfs_add_dir);

static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
struct nameidata *nd)
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 1faba5f..8154fbb 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -145,7 +145,7 @@ void sysfs_file_check_suicide(struct sysfs_dirent *sd)
static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer)
{
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
- struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.data;
struct sysfs_ops * ops = buffer->ops;
int ret = 0;
ssize_t count;
@@ -265,7 +265,7 @@ static int
flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
{
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
- struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.data;
struct sysfs_ops * ops = buffer->ops;
int rc;

@@ -404,7 +404,7 @@ static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
static int sysfs_open_file(struct inode *inode, struct file *file)
{
struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
- struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.data;
struct sysfs_buffer * buffer;
struct sysfs_ops * ops = NULL;
int error;
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index cef0376..e4993fd 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -39,7 +39,7 @@ static int create_files(struct sysfs_dirent *dir_sd,
}


-int sysfs_create_group(struct kobject * kobj,
+int sysfs_create_group(struct kobject * kobj,
const struct attribute_group * grp)
{
struct sysfs_dirent *sd;
@@ -48,9 +48,9 @@ int sysfs_create_group(struct kobject * kobj,
BUG_ON(!kobj || !kobj->sd);

if (grp->name) {
- error = sysfs_create_subdir(kobj, grp->name, &sd);
- if (error)
- return error;
+ sd = sysfs_add_dir(kobj->sd, grp->name, SYSFS_DIR_MODE, kobj);
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
} else
sd = kobj->sd;
sysfs_get(sd);
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 9415f18..8e4677c 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -30,6 +30,32 @@ static int sysfs_remove_child(struct sysfs_dirent *parent, const char *name)
}

/**
+ * sysfs_create_dir - create a directory for an object.
+ * @kobj: object we're creating directory for.
+ */
+int sysfs_create_dir(struct kobject *kobj)
+{
+ struct sysfs_dirent *parent, *sd;
+
+ BUG_ON(!kobj);
+
+ if (kobj->parent) {
+ parent = kobj->parent->sd;
+ if (!parent)
+ return -EFAULT;
+ } else
+ parent = sysfs_root;
+
+ sd = sysfs_add_dir(parent, kobject_name(kobj),
+ SYSFS_DIR_MODE | SYSFS_COPY_NAME, kobj);
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
+
+ kobj->sd = sd;
+ return 0;
+}
+
+/**
* sysfs_remove_dir - remove an object's directory.
* @kobj: object.
*
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index d00d4b9..d61eb08 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -8,6 +8,7 @@
#include <linux/mount.h>
#include <linux/pagemap.h>
#include <linux/init.h>
+#include <linux/module.h>

#include "sysfs.h"

@@ -32,6 +33,7 @@ static struct sysfs_dirent sysfs_root_storage = {
};

struct sysfs_dirent * const sysfs_root = &sysfs_root_storage;
+EXPORT_SYMBOL_GPL(sysfs_root);

static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
{
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 9494f3d..82ade38 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -2,7 +2,7 @@ struct sysfs_open_dirent;

/* type-specific structures for sysfs_dirent->s_* union members */
struct sysfs_elem_dir {
- struct kobject *kobj;
+ void *data;
/* children list starts here and goes through sd->s_sibling */
struct sysfs_dirent *children;
};
@@ -78,7 +78,6 @@ struct sysfs_addrm_cxt {
/*
* mount.c
*/
-extern struct sysfs_dirent * const sysfs_root;
extern struct super_block *sysfs_sb;
extern struct kmem_cache *sysfs_dir_cachep;

@@ -102,6 +101,8 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *parent,
struct sysfs_dirent *sd);
void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
+struct sysfs_dirent *sysfs_insert_one(struct sysfs_dirent *parent,
+ struct sysfs_dirent *sd);

struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
const unsigned char *name);
@@ -113,9 +114,6 @@ void __sysfs_remove(struct sysfs_dirent *sd, int recurse);

void release_sysfs_dirent(struct sysfs_dirent *sd);

-int sysfs_create_subdir(struct kobject *kobj, const char *name,
- struct sysfs_dirent **p_sd);
-
static inline struct sysfs_dirent *sysfs_get(struct sysfs_dirent *sd)
{
if (sd) {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 4ad2874..3c64b4a 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -17,6 +17,8 @@

#include <linux/compiler.h>
#include <linux/types.h>
+#include <linux/stat.h>
+#include <linux/err.h>

struct sysfs_dirent;
struct vm_area_struct;
@@ -26,6 +28,7 @@ struct vm_area_struct;
* valid as real mode bits. Bits in S_IFMT are used to set sysfs
* specific flags.
*/
+#define SYSFS_DIR_MODE (S_IRWXU | S_IRUGO | S_IXUGO)
#define SYSFS_COPY_NAME 010000 /* copy passed @name */

/* collection of all flags for verification */
@@ -33,6 +36,11 @@ struct vm_area_struct;

#ifdef CONFIG_SYSFS

+extern struct sysfs_dirent * const sysfs_root;
+
+struct sysfs_dirent *sysfs_add_dir(struct sysfs_dirent *parent,
+ const char *name, mode_t mode, void *data);
+
struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name);
void sysfs_remove(struct sysfs_dirent *sd);
@@ -41,6 +49,14 @@ int __must_check sysfs_init(void);

#else /* CONFIG_SYSFS */

+#define sysfs_root ((struct sysfs_dirent *)NULL)
+
+static inline struct sysfs_dirent *sysfs_add_dir(struct sysfs_dirent *parent,
+ const char *name, mode_t mode, void *data)
+{
+ return NULL;
+}
+
static inline struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name)
{
--
1.5.0.3


2007-09-20 08:09:21

by Tejun Heo

Subject: [PATCH 10/22] sysfs: drop kobj and attr from file related symbols

Currently, regular sysfs files are called (kobj) attrs or files. This
is a bit confusing, and the sd-based interface won't be bound to
attributes. Use 'file' consistently and drop all kobj and attr from
symbol names.

This patch doesn't introduce any logic change.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 4 +-
fs/sysfs/file.c | 140 +++++++++++++++++++++++++++---------------------------
fs/sysfs/group.c | 2 +-
fs/sysfs/inode.c | 2 +-
fs/sysfs/sysfs.h | 6 +-
5 files changed, 77 insertions(+), 77 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b15ade7..170749d 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -341,7 +341,7 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
case SYSFS_DIR:
mode |= S_IFDIR;
break;
- case SYSFS_KOBJ_ATTR:
+ case SYSFS_FILE:
case SYSFS_KOBJ_BIN_ATTR:
mode |= S_IFREG;
break;
@@ -630,7 +630,7 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
sd->s_sibling = NULL;

sysfs_drop_dentry(sd);
- if (sysfs_type(sd) == SYSFS_KOBJ_ATTR)
+ if (sysfs_type(sd) == SYSFS_FILE)
sysfs_file_check_suicide(sd);
sysfs_deactivate(sd);
sysfs_put(sd);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 3b74912..ae4d7cb 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -55,7 +55,7 @@ static struct sysfs_ops subsys_sysfs_ops = {
* files.
*
* filp->private_data points to sysfs_buffer and
- * sysfs_dirent->s_attr.open points to sysfs_open_dirent. s_attr.open
+ * sysfs_dirent->s_file.open points to sysfs_open_dirent. s_file.open
* is protected by sysfs_open_dirent_lock.
*/
static spinlock_t sysfs_open_dirent_lock = SPIN_LOCK_UNLOCKED;
@@ -103,7 +103,7 @@ void sysfs_file_check_suicide(struct sysfs_dirent *sd)

spin_lock(&sysfs_open_dirent_lock);

- od = sd->s_attr.open;
+ od = sd->s_file.open;
if (od) {
list_for_each_entry(buffer, &od->buffers, list) {
if (buffer->accessor != current)
@@ -144,8 +144,8 @@ void sysfs_file_check_suicide(struct sysfs_dirent *sd)
*/
static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer)
{
- struct sysfs_dirent *attr_sd = dentry->d_fsdata;
- struct kobject *kobj = attr_sd->s_parent->s_dir.data;
+ struct sysfs_dirent *sd = dentry->d_fsdata;
+ struct kobject *kobj = sd->s_parent->s_dir.data;
struct sysfs_ops * ops = buffer->ops;
int ret = 0;
ssize_t count;
@@ -155,19 +155,19 @@ static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer
if (!buffer->page)
return -ENOMEM;

- /* need attr_sd for attr and ops, its parent for kobj */
- if (!sysfs_get_active_two(attr_sd))
+ /* need sd for attr and ops, its parent for kobj */
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

- buffer->event = atomic_read(&attr_sd->s_attr.open->event);
+ buffer->event = atomic_read(&sd->s_file.open->event);
buffer->accessor = current;

- count = ops->show(kobj, attr_sd->s_attr.attr, buffer->page);
+ count = ops->show(kobj, sd->s_file.attr, buffer->page);

if (buffer->committed_suicide)
module_allow_unload();
else
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);

BUG_ON(count > (ssize_t)PAGE_SIZE);
if (count >= 0) {
@@ -180,24 +180,22 @@ static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer
}

/**
- * sysfs_read_file - read an attribute.
+ * sysfs_read_file - read from a sysfs file
* @file: file pointer.
* @buf: buffer to fill.
* @count: number of bytes to read.
* @ppos: starting offset in file.
*
- * Userspace wants to read an attribute file. The attribute descriptor
- * is in the file's ->d_fsdata. The target object is in the directory's
- * ->d_fsdata.
+ * Userspace wants to read a file. The sysfs_dirent is in the
+ * file's ->d_fsdata.
*
- * We call fill_read_buffer() to allocate and fill the buffer from the
- * object's show() method exactly once (if the read is happening from
- * the beginning of the file). That should fill the entire buffer with
- * all the data the object has to offer for that attribute.
- * We then call flush_read_buffer() to copy the buffer to userspace
- * in the increments specified.
+ * We call fill_read_buffer() to allocate and fill the buffer
+ * from the object's show() method exactly once (if the read is
+ * happening from the beginning of the file). That should fill
+ * the entire buffer with all the data the object has to offer
+ * for that file. We then call flush_read_buffer() to copy the
+ * buffer to userspace in the increments specified.
*/
-
static ssize_t
sysfs_read_file(struct file *file, char __user *buf, size_t count, loff_t *ppos)
{
@@ -264,43 +262,45 @@ fill_write_buffer(struct sysfs_buffer * buffer, const char __user * buf, size_t
static int
flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
{
- struct sysfs_dirent *attr_sd = dentry->d_fsdata;
- struct kobject *kobj = attr_sd->s_parent->s_dir.data;
+ struct sysfs_dirent *sd = dentry->d_fsdata;
+ struct kobject *kobj = sd->s_parent->s_dir.data;
struct sysfs_ops * ops = buffer->ops;
int rc;

/* need attr_sd for attr and ops, its parent for kobj */
- if (!sysfs_get_active_two(attr_sd))
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

buffer->accessor = current;

- rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);
+ rc = ops->store(kobj, sd->s_file.attr, buffer->page, count);

if (buffer->committed_suicide)
module_allow_unload();
else
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);

return rc;
}


/**
- * sysfs_write_file - write an attribute.
+ * sysfs_write_file - write to a sysfs file
* @file: file pointer
* @buf: data to write
* @count: number of bytes
* @ppos: starting offset
*
- * Similar to sysfs_read_file(), though working in the opposite direction.
- * We allocate and fill the data from the user in fill_write_buffer(),
- * then push it to the kobject in flush_write_buffer().
- * There is no easy way for us to know if userspace is only doing a partial
- * write, so we don't support them. We expect the entire buffer to come
- * on the first write.
- * Hint: if you're writing a value, first read the file, modify only the
- * the value you're changing, then write entire buffer back.
+ * Similar to sysfs_read_file(), though working in the opposite
+ * direction. We allocate and fill the data from the user in
+ * fill_write_buffer(), then push it to the sysfs_dirent in
+ * flush_write_buffer(). There is no easy way for us to know if
+ * userspace is only doing a partial write, so we don't support
+ * them. We expect the entire buffer to come on the first write.
+ * Hint: if you're writing a value, first read the file, modify
+ * only the value you're changing, then write entire buffer
+ * back.
+ *
*/

static ssize_t
@@ -324,7 +324,7 @@ sysfs_write_file(struct file *file, const char __user *buf, size_t count, loff_t
* @sd: target sysfs_dirent
* @buffer: sysfs_buffer for this instance of open
*
- * If @sd->s_attr.open exists, increment its reference count;
+ * If @sd->s_file.open exists, increment its reference count;
* otherwise, create one. @buffer is chained to the buffers
* list.
*
@@ -342,12 +342,12 @@ static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
retry:
spin_lock(&sysfs_open_dirent_lock);

- if (!sd->s_attr.open && new_od) {
- sd->s_attr.open = new_od;
+ if (!sd->s_file.open && new_od) {
+ sd->s_file.open = new_od;
new_od = NULL;
}

- od = sd->s_attr.open;
+ od = sd->s_file.open;
if (od) {
atomic_inc(&od->refcnt);
list_add_tail(&buffer->list, &od->buffers);
@@ -377,7 +377,7 @@ static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
* @sd: target sysfs_dirent
* @buffer: associated sysfs_buffer
*
- * Put @sd->s_attr.open and unlink @buffer from the buffers list.
+ * Put @sd->s_file.open and unlink @buffer from the buffers list.
* If reference count reaches zero, disassociate and free it.
*
* LOCKING:
@@ -386,13 +386,13 @@ static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
struct sysfs_buffer *buffer)
{
- struct sysfs_open_dirent *od = sd->s_attr.open;
+ struct sysfs_open_dirent *od = sd->s_file.open;

spin_lock(&sysfs_open_dirent_lock);

list_del(&buffer->list);
if (atomic_dec_and_test(&od->refcnt))
- sd->s_attr.open = NULL;
+ sd->s_file.open = NULL;
else
od = NULL;

@@ -403,14 +403,14 @@ static void sysfs_put_open_dirent(struct sysfs_dirent *sd,

static int sysfs_open_file(struct inode *inode, struct file *file)
{
- struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
- struct kobject *kobj = attr_sd->s_parent->s_dir.data;
+ struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
+ struct kobject *kobj = sd->s_parent->s_dir.data;
struct sysfs_buffer * buffer;
struct sysfs_ops * ops = NULL;
int error;

- /* need attr_sd for attr and ops, its parent for kobj */
- if (!sysfs_get_active_two(attr_sd))
+ /* need sd for attr and ops, its parent for kobj */
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

/* if the kobject has no ktype, then we assume that it is a subsystem
@@ -463,18 +463,18 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
file->private_data = buffer;

/* make sure we have open dirent struct */
- error = sysfs_get_open_dirent(attr_sd, buffer);
+ error = sysfs_get_open_dirent(sd, buffer);
if (error)
goto err_free;

/* open succeeded, put active references */
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);
return 0;

err_free:
kfree(buffer);
err_out:
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);
return error;
}

@@ -492,33 +492,33 @@ static int sysfs_release(struct inode *inode, struct file *filp)
return 0;
}

-/* Sysfs attribute files are pollable. The idea is that you read
- * the content and then you use 'poll' or 'select' to wait for
- * the content to change. When the content changes (assuming the
- * manager for the kobject supports notification), poll will
- * return POLLERR|POLLPRI, and select will return the fd whether
- * it is waiting for read, write, or exceptions.
- * Once poll/select indicates that the value has changed, you
- * need to close and re-open the file, as simply seeking and reading
- * again will not get new data, or reset the state of 'poll'.
- * Reminder: this only works for attributes which actively support
- * it, and it is not possible to test an attribute from userspace
- * to see if it supports poll (Neither 'poll' nor 'select' return
- * an appropriate error code). When in doubt, set a suitable timeout value.
+/* Sysfs files are pollable. The idea is that you read the content
+ * and then you use 'poll' or 'select' to wait for the content to
+ * change. When the content changes (assuming the manager for the
+ * sysfs_dirent supports notification), poll will return
+ * POLLERR|POLLPRI, and select will return the fd whether it is
+ * waiting for read, write, or exceptions. Once poll/select indicates
+ * that the value has changed, you need to close and re-open the file,
+ * as simply seeking and reading again will not get new data, or reset
+ * the state of 'poll'. Reminder: this only works for files which
+ * actively support it, and it is not possible to test a file from
+ * userspace to see if it supports poll (Neither 'poll' nor 'select'
+ * return an appropriate error code). When in doubt, set a suitable
+ * timeout value.
*/
static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
{
- struct sysfs_buffer * buffer = filp->private_data;
- struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
- struct sysfs_open_dirent *od = attr_sd->s_attr.open;
+ struct sysfs_buffer *buffer = filp->private_data;
+ struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
+ struct sysfs_open_dirent *od = sd->s_file.open;

/* need parent for the kobj, grab both */
- if (!sysfs_get_active_two(attr_sd))
+ if (!sysfs_get_active_two(sd))
goto trigger;

poll_wait(filp, &od->poll, wait);

- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);

if (buffer->event != atomic_read(&od->event))
goto trigger;
@@ -545,7 +545,7 @@ void sysfs_notify(struct kobject *k, char *dir, char *attr)

spin_lock(&sysfs_open_dirent_lock);

- od = sd->s_attr.open;
+ od = sd->s_file.open;
if (od) {
atomic_inc(&od->event);
wake_up_interruptible(&od->poll);
@@ -579,7 +579,7 @@ int __sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
sd = sysfs_new_dirent(attr->name, mode, type);
if (!sd)
return -ENOMEM;
- sd->s_attr.attr = (void *)attr;
+ sd->s_file.attr = (void *)attr;

sysfs_addrm_start(&acxt);
rc = sysfs_add_one(&acxt, dir_sd, sd);
@@ -602,7 +602,7 @@ int sysfs_create_file(struct kobject * kobj, const struct attribute * attr)
{
BUG_ON(!kobj || !kobj->sd || !attr);

- return __sysfs_add_file(kobj->sd, attr, SYSFS_KOBJ_ATTR);
+ return __sysfs_add_file(kobj->sd, attr, SYSFS_FILE);

}

@@ -623,7 +623,7 @@ int sysfs_add_file_to_group(struct kobject *kobj,
if (!dir_sd)
return -ENOENT;

- error = __sysfs_add_file(dir_sd, attr, SYSFS_KOBJ_ATTR);
+ error = __sysfs_add_file(dir_sd, attr, SYSFS_FILE);
sysfs_put(dir_sd);

return error;
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 5c5aab0..bb94f6a 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -32,7 +32,7 @@ static int create_files(struct sysfs_dirent *dir_sd,
int error = 0;

for (attr = grp->attrs; *attr && !error; attr++)
- error = __sysfs_add_file(dir_sd, *attr, SYSFS_KOBJ_ATTR);
+ error = __sysfs_add_file(dir_sd, *attr, SYSFS_FILE);
if (error)
remove_files(dir_sd, grp);
return error;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 95bde29..318e5d5 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -162,7 +162,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
inode->i_fop = &sysfs_dir_operations;
inode->i_nlink = sysfs_count_nlink(sd);
break;
- case SYSFS_KOBJ_ATTR:
+ case SYSFS_FILE:
inode->i_size = PAGE_SIZE;
inode->i_fop = &sysfs_file_operations;
break;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index c57b2dc..4e032eb 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -11,7 +11,7 @@ struct sysfs_elem_symlink {
struct sysfs_dirent *target_sd;
};

-struct sysfs_elem_attr {
+struct sysfs_elem_file {
struct attribute *attr;
struct sysfs_open_dirent *open;
};
@@ -38,7 +38,7 @@ struct sysfs_dirent {
union {
struct sysfs_elem_dir s_dir;
struct sysfs_elem_symlink s_symlink;
- struct sysfs_elem_attr s_attr;
+ struct sysfs_elem_file s_file;
struct sysfs_elem_bin_attr s_bin_attr;
};

@@ -52,7 +52,7 @@ struct sysfs_dirent {

#define SYSFS_TYPE_MASK 0x00ff
#define SYSFS_DIR 0x0001
-#define SYSFS_KOBJ_ATTR 0x0002
+#define SYSFS_FILE 0x0002
#define SYSFS_KOBJ_BIN_ATTR 0x0004
#define SYSFS_KOBJ_LINK 0x0008

--
1.5.0.3


2007-09-20 08:09:49

by Tejun Heo

Subject: [PATCH 07/22] sysfs: implement sysfs_dirent based remove interface sysfs_remove()

This patch implements the new sysfs_remove(), which takes @sd as its
target argument and can be used on any type of node. It also recurses
into subdirectories.

* sysfs_remove_file(), sysfs_remove_file_from_group(),
sysfs_remove_bin_file() are reimplemented using sysfs_remove() in
fs/sysfs/kobject.c.

* To keep backward compatibility, sysfs_remove_dir() is reimplemented
using the internal function __sysfs_remove() with the @recurse argument
set to zero so that it doesn't descend into subdirectories.

This patch doesn't introduce any behavior change to the original API.
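
The descendant-first walk that sysfs_remove() relies on can be tried
outside the kernel. What follows is only a userspace sketch, not the
patch's code: struct toy_node, deepest(), walk_next() and collect_walk()
are hypothetical names, and the toy treats any node with children as a
directory instead of checking sysfs_type(). It shows why the next node
is looked up before the current one is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct sysfs_dirent; field names are illustrative. */
struct toy_node {
	const char *name;
	struct toy_node *parent;
	struct toy_node *children;	/* first child, NULL for leaves */
	struct toy_node *sibling;	/* next child of the same parent */
};

/* Descend to the leftmost, deepest descendant of @n. */
static struct toy_node *deepest(struct toy_node *n)
{
	while (n->children)
		n = n->children;
	return n;
}

/*
 * Descendant-first, left-to-right walk of the subtree under @top.
 * Pass cur == NULL to start; returns NULL once @top has been visited.
 */
static struct toy_node *walk_next(struct toy_node *top, struct toy_node *cur)
{
	if (cur == NULL)
		return deepest(top);
	if (cur == top)
		return NULL;
	if (cur->sibling)
		return deepest(cur->sibling);
	return cur->parent;
}

/* Record the visit order as a space-separated string in @buf. */
static void collect_walk(struct toy_node *top, char *buf, size_t len)
{
	struct toy_node *cur, *next;

	buf[0] = '\0';
	for (cur = walk_next(top, NULL); cur; cur = next) {
		/* fetch @next first, as a remover must before dropping @cur */
		next = walk_next(top, cur);
		strncat(buf, cur->name, len - strlen(buf) - 1);
		if (next)
			strncat(buf, " ", len - strlen(buf) - 1);
	}
}
```

For a tree where top holds directory a (with leaves a1 and a2) followed
by leaf b, the walk visits a1, a2, a, b and finally top, so every node
is reached only after all of its descendants.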

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/bin.c | 13 -----
fs/sysfs/dir.c | 125 ++++++++++++++++++++++++++++++++++---------------
fs/sysfs/file.c | 35 --------------
fs/sysfs/group.c | 4 +-
fs/sysfs/kobject.c | 92 ++++++++++++++++++++++++++++++++++++
fs/sysfs/symlink.c | 13 -----
fs/sysfs/sysfs.h | 3 +-
include/linux/sysfs.h | 6 ++
8 files changed, 189 insertions(+), 102 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 247ea19..1c12cf0 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -237,17 +237,4 @@ int sysfs_create_bin_file(struct kobject * kobj, struct bin_attribute * attr)
return sysfs_add_file(kobj->sd, &attr->attr, SYSFS_KOBJ_BIN_ATTR);
}

-
-/**
- * sysfs_remove_bin_file - remove binary file for object.
- * @kobj: object.
- * @attr: attribute descriptor.
- */
-
-void sysfs_remove_bin_file(struct kobject * kobj, struct bin_attribute * attr)
-{
- sysfs_hash_and_remove(kobj->sd, attr->attr.name);
-}
-
EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
-EXPORT_SYMBOL_GPL(sysfs_remove_bin_file);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b1cf090..7cb3f1e 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -816,64 +816,113 @@ const struct inode_operations sysfs_dir_inode_operations = {
.setattr = sysfs_setattr,
};

-static void remove_dir(struct sysfs_dirent *sd)
+/**
+ * sysfs_tree_walk_next - walk sysfs subtree
+ * @top: top of the subtree to walk
+ * @cur: the current sysfs_dirent (NULL to begin walk)
+ *
+ * Walk subtree of @top. Each call to this function returns the
+ * next node to visit. The walk is descendant first,
+ * left-to-right. The last node naturally is @top. The walk
+ * order is to allow removing the previous node while walking the
+ * tree.
+ *
+ * LOCKING:
+ * mutex_lock(sysfs_mutex)
+ *
+ * RETURNS:
+ * Next sysfs_dirent to visit, NULL if the walk is complete.
+ */
+static struct sysfs_dirent *sysfs_tree_walk_next(struct sysfs_dirent *top,
+ struct sysfs_dirent *cur)
{
- struct sysfs_addrm_cxt acxt;
+ if (cur == NULL) {
+ /* we're beginning a walk, go to the deepest child */
+ cur = top;

- sysfs_addrm_start(&acxt);
- sysfs_remove_one(&acxt, sd);
- sysfs_addrm_finish(&acxt);
-}
+ while (sysfs_type(cur) == SYSFS_DIR && cur->s_dir.children)
+ cur = cur->s_dir.children;

-void sysfs_remove_subdir(struct sysfs_dirent *sd)
-{
- remove_dir(sd);
-}
+ return cur;
+ } else if (cur == top) {
+ /* walk ends at @top */
+ return NULL;
+ }

+ /* continue walking */
+ if (cur->s_sibling) {
+ /* if possible, go right and deep */
+ cur = cur->s_sibling;
+
+ /* go to the deepest child */
+ while (sysfs_type(cur) == SYSFS_DIR && cur->s_dir.children)
+ cur = cur->s_dir.children;
+ } else {
+ /* this subtree is done, go up */
+ cur = cur->s_parent;
+ }
+
+ return cur;
+}

-static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
+/**
+ * __sysfs_remove - remove a sysfs_dirent
+ * @sd: target sysfs_dirent to be removed
+ * @recurse: recurse into subdirectories
+ *
+ * Remove @sd. If @sd is a directory, all its leaf children are
+ * removed. If @recurse is not zero, all the directory children
+ * are recursively removed too.
+ *
+ * This is an internal function to be used to implement
+ * sysfs_remove() and sysfs_remove_dir(). @recurse is necessary
+ * to support the original sysfs_remove_dir() semantics which
+ * didn't remove subdirectories.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ */
+void __sysfs_remove(struct sysfs_dirent *sd, int recurse)
{
struct sysfs_addrm_cxt acxt;
- struct sysfs_dirent **pos;
+ struct sysfs_dirent *cur, *next;

- if (!dir_sd)
+ /* noop on NULL */
+ if (!sd)
return;

- pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
sysfs_addrm_start(&acxt);
- pos = &dir_sd->s_dir.children;
- while (*pos) {
- struct sysfs_dirent *sd = *pos;

- if (sysfs_type(sd) != SYSFS_DIR)
- sysfs_remove_one(&acxt, sd);
- else
- pos = &(*pos)->s_sibling;
- }
- sysfs_addrm_finish(&acxt);
+ cur = sysfs_tree_walk_next(sd, NULL); /* prime walk */
+ do {
+ /* find out @next before removing @cur */
+ next = sysfs_tree_walk_next(sd, cur);

- remove_dir(dir_sd);
+ /* if ! recursing, delete only direct leaf children and self */
+ if (!recurse && cur != sd &&
+ (sysfs_type(cur) == SYSFS_DIR || cur->s_parent != sd))
+ continue;
+
+ sysfs_remove_one(&acxt, cur);
+ } while ((cur = next));
+
+ sysfs_addrm_finish(&acxt);
}

/**
- * sysfs_remove_dir - remove an object's directory.
- * @kobj: object.
+ * sysfs_remove - remove a sysfs_dirent and all its children recursively
+ * @sd: target sysfs_dirent to be removed
*
- * The only thing special about this is that we remove any files in
- * the directory before we remove the directory, and we've inlined
- * what used to be sysfs_rmdir() below, instead of calling separately.
+ * Remove @sd and all its children recursively.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
*/
-
-void sysfs_remove_dir(struct kobject * kobj)
+void sysfs_remove(struct sysfs_dirent *sd)
{
- struct sysfs_dirent *sd = kobj->sd;
-
- spin_lock(&sysfs_assoc_lock);
- kobj->sd = NULL;
- spin_unlock(&sysfs_assoc_lock);
-
- __sysfs_remove_dir(sd);
+ __sysfs_remove(sd, 1);
}
+EXPORT_SYMBOL_GPL(sysfs_remove);

int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
{
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index fb77d54..1faba5f 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -681,39 +681,4 @@ int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
}
EXPORT_SYMBOL_GPL(sysfs_chmod_file);

-
-/**
- * sysfs_remove_file - remove an object attribute.
- * @kobj: object we're acting for.
- * @attr: attribute descriptor.
- *
- * Hash the attribute name and kill the victim.
- */
-
-void sysfs_remove_file(struct kobject * kobj, const struct attribute * attr)
-{
- sysfs_hash_and_remove(kobj->sd, attr->name);
-}
-
-
-/**
- * sysfs_remove_file_from_group - remove an attribute file from a group.
- * @kobj: object we're acting for.
- * @attr: attribute descriptor.
- * @group: group name.
- */
-void sysfs_remove_file_from_group(struct kobject *kobj,
- const struct attribute *attr, const char *group)
-{
- struct sysfs_dirent *dir_sd;
-
- dir_sd = sysfs_get_dirent(kobj->sd, group);
- if (dir_sd) {
- sysfs_hash_and_remove(dir_sd, attr->name);
- sysfs_put(dir_sd);
- }
-}
-EXPORT_SYMBOL_GPL(sysfs_remove_file_from_group);
-
EXPORT_SYMBOL_GPL(sysfs_create_file);
-EXPORT_SYMBOL_GPL(sysfs_remove_file);
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index d197237..cef0376 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -57,7 +57,7 @@ int sysfs_create_group(struct kobject * kobj,
error = create_files(sd, grp);
if (error) {
if (grp->name)
- sysfs_remove_subdir(sd);
+ sysfs_remove(sd);
}
sysfs_put(sd);
return error;
@@ -77,7 +77,7 @@ void sysfs_remove_group(struct kobject * kobj,

remove_files(sd, grp);
if (grp->name)
- sysfs_remove_subdir(sd);
+ sysfs_remove(sd);

sysfs_put(sd);
}
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 5ebd755..9415f18 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -13,3 +13,95 @@

#include <linux/sysfs.h>
#include <linux/sysfs-kobject.h>
+#include <linux/kobject.h>
+#include <linux/module.h>
+#include "sysfs.h"
+
+static int sysfs_remove_child(struct sysfs_dirent *parent, const char *name)
+{
+ struct sysfs_dirent *sd;
+
+ sd = sysfs_find_child(parent, name);
+ if (!sd)
+ return -ENOENT;
+
+ sysfs_remove(sd);
+ return 0;
+}
+
+/**
+ * sysfs_remove_dir - remove an object's directory.
+ * @kobj: object.
+ *
+ * Remove sysfs directory associated with @kobj. All children
+ * which are leaf are removed but subdirectories are left alone.
+ */
+void sysfs_remove_dir(struct kobject * kobj)
+{
+ struct sysfs_dirent *sd = kobj->sd;
+
+ spin_lock(&sysfs_assoc_lock);
+ kobj->sd = NULL;
+ spin_unlock(&sysfs_assoc_lock);
+
+ __sysfs_remove(sd, 0);
+}
+
+/**
+ * sysfs_remove_file - remove an object attribute.
+ * @kobj: object we're acting for.
+ * @attr: attribute descriptor.
+ *
+ * Hash the attribute name and kill the victim.
+ */
+void sysfs_remove_file(struct kobject *kobj, const struct attribute *attr)
+{
+ sysfs_remove_child(kobj->sd, attr->name);
+}
+EXPORT_SYMBOL_GPL(sysfs_remove_file);
+
+/**
+ * sysfs_remove_bin_file - remove binary file for object.
+ * @kobj: object.
+ * @attr: attribute descriptor.
+ */
+void sysfs_remove_bin_file(struct kobject * kobj, struct bin_attribute * attr)
+{
+ if (sysfs_remove_child(kobj->sd, attr->attr.name) < 0) {
+ printk(KERN_ERR "%s: "
+ "bad dentry or inode or no such file: \"%s\"\n",
+ __FUNCTION__, attr->attr.name);
+ dump_stack();
+ }
+}
+EXPORT_SYMBOL_GPL(sysfs_remove_bin_file);
+
+/**
+ * sysfs_remove_link - remove symlink in object's directory.
+ * @kobj: object we're acting for.
+ * @name: name of the symlink to remove.
+ */
+void sysfs_remove_link(struct kobject * kobj, const char * name)
+{
+ sysfs_remove_child(kobj->sd, name);
+}
+EXPORT_SYMBOL_GPL(sysfs_remove_link);
+
+/**
+ * sysfs_remove_file_from_group - remove an attribute file from a group.
+ * @kobj: object we're acting for.
+ * @attr: attribute descriptor.
+ * @group: group name.
+ */
+void sysfs_remove_file_from_group(struct kobject *kobj,
+ const struct attribute *attr, const char *group)
+{
+ struct sysfs_dirent *dir_sd;
+
+ dir_sd = sysfs_get_dirent(kobj->sd, group);
+ if (dir_sd) {
+ sysfs_remove_child(dir_sd, attr->name);
+ sysfs_put(dir_sd);
+ }
+}
+EXPORT_SYMBOL_GPL(sysfs_remove_file_from_group);
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 897cc2f..9c15a32 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -105,18 +105,6 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
return error;
}

-
-/**
- * sysfs_remove_link - remove symlink in object's directory.
- * @kobj: object we're acting for.
- * @name: name of the symlink to remove.
- */
-
-void sysfs_remove_link(struct kobject * kobj, const char * name)
-{
- sysfs_hash_and_remove(kobj->sd, name);
-}
-
static int sysfs_get_target_path(struct sysfs_dirent * parent_sd,
struct sysfs_dirent * target_sd, char *path)
{
@@ -178,4 +166,3 @@ const struct inode_operations sysfs_symlink_inode_operations = {


EXPORT_SYMBOL_GPL(sysfs_create_link);
-EXPORT_SYMBOL_GPL(sysfs_remove_link);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 1f3915f..9494f3d 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -109,11 +109,12 @@ struct sysfs_dirent *sysfs_get_dirent(struct sysfs_dirent *parent_sd,
const unsigned char *name);
struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);

+void __sysfs_remove(struct sysfs_dirent *sd, int recurse);
+
void release_sysfs_dirent(struct sysfs_dirent *sd);

int sysfs_create_subdir(struct kobject *kobj, const char *name,
struct sysfs_dirent **p_sd);
-void sysfs_remove_subdir(struct sysfs_dirent *sd);

static inline struct sysfs_dirent *sysfs_get(struct sysfs_dirent *sd)
{
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index f030dc6..4ad2874 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -18,6 +18,7 @@
#include <linux/compiler.h>
#include <linux/types.h>

+struct sysfs_dirent;
struct vm_area_struct;

/*
@@ -34,6 +35,7 @@ struct vm_area_struct;

struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name);
+void sysfs_remove(struct sysfs_dirent *sd);

int __must_check sysfs_init(void);

@@ -45,6 +47,10 @@ static inline struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
return NULL;
}

+static inline void sysfs_remove(struct sysfs_dirent *sd)
+{
+}
+
static inline int __must_check sysfs_init(void)
{
return 0;
--
1.5.0.3


2007-09-20 08:10:26

by Tejun Heo

Subject: [PATCH 14/22] sysfs: s/symlink/link/g

Currently, sysfs symlinks are called either symlinks or links. Rename
everything to link and drop the attr and _sd suffixes, as the new
interface won't be bound to kobj.

This patch doesn't introduce any logic change.
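
For readers unfamiliar with how the renamed SYSFS_* type constants and
the s_link member fit together, here is a small userspace sketch. It is
an illustration under assumptions, not the kernel code: the toy_ names
are hypothetical, the union is given a name (u) for portability, and
only the constant values are taken from sysfs.h as shown in this series.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the SYSFS_* constants in fs/sysfs/sysfs.h. */
#define TOY_TYPE_MASK		0x00ff
#define TOY_DIR			0x0001
#define TOY_FILE		0x0002
#define TOY_BIN			0x0004
#define TOY_LINK		0x0008
#define TOY_FLAG_REMOVED	0x0200

/* Toy dirent: the low byte of @flags selects the valid union member. */
struct toy_dirent {
	unsigned int flags;
	union {
		struct { struct toy_dirent *children; } dir;
		struct { struct toy_dirent *target; } link;
	} u;
};

/* Extract the node type, ignoring flag bits such as TOY_FLAG_REMOVED. */
static unsigned int toy_type(const struct toy_dirent *sd)
{
	return sd->flags & TOY_TYPE_MASK;
}

/* Return the link target, or NULL when @sd is not a link. */
static struct toy_dirent *toy_link_target(struct toy_dirent *sd)
{
	return toy_type(sd) == TOY_LINK ? sd->u.link.target : NULL;
}
```

Because the type lives in the masked low bits, flag bits can be set or
cleared without disturbing which union member is considered valid.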

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 6 +++---
fs/sysfs/inode.c | 4 ++--
fs/sysfs/symlink.c | 9 ++++-----
fs/sysfs/sysfs.h | 16 ++++++++--------
4 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 263d346..a20beff 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -277,8 +277,8 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
*/
parent_sd = sd->s_parent;

- if (sysfs_type(sd) == SYSFS_KOBJ_LINK)
- sysfs_put(sd->s_symlink.target_sd);
+ if (sysfs_type(sd) == SYSFS_LINK)
+ sysfs_put(sd->s_link.target);
if (sd->s_flags & SYSFS_FLAG_NAME_COPIED)
kfree(sd->s_name);
kfree(sd->s_iattr);
@@ -345,7 +345,7 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
case SYSFS_BIN:
mode |= S_IFREG;
break;
- case SYSFS_KOBJ_LINK:
+ case SYSFS_LINK:
mode |= S_IFLNK;
break;
}
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index d303e62..8df357e 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -168,8 +168,8 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
inode->i_size = sd->s_bin.size;
inode->i_fop = &sysfs_bin_file_operations;
break;
- case SYSFS_KOBJ_LINK:
- inode->i_op = &sysfs_symlink_inode_operations;
+ case SYSFS_LINK:
+ inode->i_op = &sysfs_link_inode_operations;
break;
default:
BUG();
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 9c15a32..2c3e4f7 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -82,12 +82,11 @@ int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char
goto out_put;

error = -ENOMEM;
- sd = sysfs_new_dirent(name, S_IRWXUGO | SYSFS_COPY_NAME,
- SYSFS_KOBJ_LINK);
+ sd = sysfs_new_dirent(name, S_IRWXUGO | SYSFS_COPY_NAME, SYSFS_LINK);
if (!sd)
goto out_put;

- sd->s_symlink.target_sd = target_sd;
+ sd->s_link.target = target_sd;
target_sd = NULL; /* reference is now owned by the symlink */

sysfs_addrm_start(&acxt);
@@ -131,7 +130,7 @@ static int sysfs_getlink(struct dentry *dentry, char * path)
{
struct sysfs_dirent *sd = dentry->d_fsdata;
struct sysfs_dirent *parent_sd = sd->s_parent;
- struct sysfs_dirent *target_sd = sd->s_symlink.target_sd;
+ struct sysfs_dirent *target_sd = sd->s_link.target;
int error;

mutex_lock(&sysfs_mutex);
@@ -158,7 +157,7 @@ static void sysfs_put_link(struct dentry *dentry, struct nameidata *nd, void *co
free_page((unsigned long)page);
}

-const struct inode_operations sysfs_symlink_inode_operations = {
+const struct inode_operations sysfs_link_inode_operations = {
.readlink = generic_readlink,
.follow_link = sysfs_follow_link,
.put_link = sysfs_put_link,
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index d8c61c5..c5593f9 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -7,8 +7,8 @@ struct sysfs_elem_dir {
struct sysfs_dirent *children;
};

-struct sysfs_elem_symlink {
- struct sysfs_dirent *target_sd;
+struct sysfs_elem_link {
+ struct sysfs_dirent *target;
};

struct sysfs_elem_file {
@@ -39,10 +39,10 @@ struct sysfs_dirent {
const char *s_name;

union {
- struct sysfs_elem_dir s_dir;
- struct sysfs_elem_symlink s_symlink;
- struct sysfs_elem_file s_file;
- struct sysfs_elem_bin s_bin;
+ struct sysfs_elem_dir s_dir;
+ struct sysfs_elem_link s_link;
+ struct sysfs_elem_file s_file;
+ struct sysfs_elem_bin s_bin;
};

unsigned int s_flags;
@@ -57,7 +57,7 @@ struct sysfs_dirent {
#define SYSFS_DIR 0x0001
#define SYSFS_FILE 0x0002
#define SYSFS_BIN 0x0004
-#define SYSFS_KOBJ_LINK 0x0008
+#define SYSFS_LINK 0x0008

#define SYSFS_FLAG_MASK ~SYSFS_TYPE_MASK
#define SYSFS_FLAG_REMOVED 0x0200
@@ -156,4 +156,4 @@ extern const struct file_operations sysfs_bin_file_operations;
/*
* symlink.c
*/
-extern const struct inode_operations sysfs_symlink_inode_operations;
+extern const struct inode_operations sysfs_link_inode_operations;
--
1.5.0.3


2007-09-20 08:10:58

by Tejun Heo

Subject: [PATCH 12/22] sysfs: drop kobj and attr from bin related symbols

Currently, binary sysfs files are called (kobj) bin attrs. The sd-based
interface won't be bound to bin_attribute. Use 'bin' as the name and
drop all kobj and attr from symbol names.

This patch doesn't introduce any logic change.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/bin.c | 54 +++++++++++++++++++++++++++---------------------------
fs/sysfs/dir.c | 2 +-
fs/sysfs/inode.c | 6 +++---
fs/sysfs/sysfs.h | 8 ++++----
4 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 20c2e78..dad891c 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -29,20 +29,20 @@ struct bin_buffer {
static int
fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
{
- struct sysfs_dirent *attr_sd = dentry->d_fsdata;
- struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
- struct kobject *kobj = attr_sd->s_parent->s_dir.data;
+ struct sysfs_dirent *sd = dentry->d_fsdata;
+ struct bin_attribute *attr = sd->s_bin.bin_attr;
+ struct kobject *kobj = sd->s_parent->s_dir.data;
int rc;

- /* need attr_sd for attr, its parent for kobj */
- if (!sysfs_get_active_two(attr_sd))
+ /* need sd for attr, its parent for kobj */
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

rc = -EIO;
if (attr->read)
rc = attr->read(kobj, attr, buffer, off, count);

- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);

return rc;
}
@@ -86,20 +86,20 @@ read(struct file *file, char __user *userbuf, size_t bytes, loff_t *off)
static int
flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
{
- struct sysfs_dirent *attr_sd = dentry->d_fsdata;
- struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
- struct kobject *kobj = attr_sd->s_parent->s_dir.data;
+ struct sysfs_dirent *sd = dentry->d_fsdata;
+ struct bin_attribute *attr = sd->s_bin.bin_attr;
+ struct kobject *kobj = sd->s_parent->s_dir.data;
int rc;

- /* need attr_sd for attr, its parent for kobj */
- if (!sysfs_get_active_two(attr_sd))
+ /* need sd for attr, its parent for kobj */
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

rc = -EIO;
if (attr->write)
rc = attr->write(kobj, attr, buffer, offset, count);

- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);

return rc;
}
@@ -139,15 +139,15 @@ static ssize_t write(struct file *file, const char __user *userbuf,
static int mmap(struct file *file, struct vm_area_struct *vma)
{
struct bin_buffer *bb = file->private_data;
- struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
- struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
- struct kobject *kobj = attr_sd->s_parent->s_dir.data;
+ struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
+ struct bin_attribute *attr = sd->s_bin.bin_attr;
+ struct kobject *kobj = sd->s_parent->s_dir.data;
int rc;

mutex_lock(&bb->mutex);

- /* need attr_sd for attr, its parent for kobj */
- if (!sysfs_get_active_two(attr_sd))
+ /* need sd for attr, its parent for kobj */
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

rc = -EINVAL;
@@ -157,7 +157,7 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
if (rc == 0 && !bb->mmapped)
bb->mmapped = 1;
else
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);

mutex_unlock(&bb->mutex);

@@ -166,13 +166,13 @@ static int mmap(struct file *file, struct vm_area_struct *vma)

static int open(struct inode * inode, struct file * file)
{
- struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
- struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
+ struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
+ struct bin_attribute *attr = sd->s_bin.bin_attr;
struct bin_buffer *bb = NULL;
int error;

/* binary file operations requires both @sd and its parent */
- if (!sysfs_get_active_two(attr_sd))
+ if (!sysfs_get_active_two(sd))
return -ENODEV;

error = -EACCES;
@@ -194,28 +194,28 @@ static int open(struct inode * inode, struct file * file)
file->private_data = bb;

/* open succeeded, put active references */
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);
return 0;

err_out:
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);
kfree(bb);
return error;
}

static int release(struct inode * inode, struct file * file)
{
- struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+ struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
struct bin_buffer *bb = file->private_data;

if (bb->mmapped)
- sysfs_put_active_two(attr_sd);
+ sysfs_put_active_two(sd);
kfree(bb->buffer);
kfree(bb);
return 0;
}

-const struct file_operations bin_fops = {
+const struct file_operations sysfs_bin_file_operations = {
.read = read,
.write = write,
.mmap = mmap,
@@ -234,7 +234,7 @@ int sysfs_create_bin_file(struct kobject * kobj, struct bin_attribute * attr)
{
BUG_ON(!kobj || !kobj->sd || !attr);

- return __sysfs_add_file(kobj->sd, &attr->attr, SYSFS_KOBJ_BIN_ATTR);
+ return __sysfs_add_file(kobj->sd, &attr->attr, SYSFS_BIN);
}

EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index f4a6f2f..263d346 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -342,7 +342,7 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
mode |= S_IFDIR;
break;
case SYSFS_FILE:
- case SYSFS_KOBJ_BIN_ATTR:
+ case SYSFS_BIN:
mode |= S_IFREG;
break;
case SYSFS_KOBJ_LINK:
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 318e5d5..593b1da 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -166,10 +166,10 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
inode->i_size = PAGE_SIZE;
inode->i_fop = &sysfs_file_operations;
break;
- case SYSFS_KOBJ_BIN_ATTR:
- bin_attr = sd->s_bin_attr.bin_attr;
+ case SYSFS_BIN:
+ bin_attr = sd->s_bin.bin_attr;
inode->i_size = bin_attr->size;
- inode->i_fop = &bin_fops;
+ inode->i_fop = &sysfs_bin_file_operations;
break;
case SYSFS_KOBJ_LINK:
inode->i_op = &sysfs_symlink_inode_operations;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index ca02276..3f505d7 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -17,7 +17,7 @@ struct sysfs_elem_file {
struct sysfs_open_dirent *open;
};

-struct sysfs_elem_bin_attr {
+struct sysfs_elem_bin {
struct bin_attribute *bin_attr;
};

@@ -40,7 +40,7 @@ struct sysfs_dirent {
struct sysfs_elem_dir s_dir;
struct sysfs_elem_symlink s_symlink;
struct sysfs_elem_file s_file;
- struct sysfs_elem_bin_attr s_bin_attr;
+ struct sysfs_elem_bin s_bin;
};

unsigned int s_flags;
@@ -54,7 +54,7 @@ struct sysfs_dirent {
#define SYSFS_TYPE_MASK 0x00ff
#define SYSFS_DIR 0x0001
#define SYSFS_FILE 0x0002
-#define SYSFS_KOBJ_BIN_ATTR 0x0004
+#define SYSFS_BIN 0x0004
#define SYSFS_KOBJ_LINK 0x0008

#define SYSFS_FLAG_MASK ~SYSFS_TYPE_MASK
@@ -149,7 +149,7 @@ int __sysfs_add_file(struct sysfs_dirent *dir_sd,
/*
* bin.c
*/
-extern const struct file_operations bin_fops;
+extern const struct file_operations sysfs_bin_file_operations;

/*
* symlink.c
--
1.5.0.3


2007-09-20 08:11:25

by Tejun Heo

Subject: [PATCH 05/22] sysfs: implement sysfs_find_child()

Implement sysfs_find_child(), which finds a child of a sysfs_dirent by
name. This function does not grab a reference to the found child; the
caller is expected to already hold one if the child exists. This
function is useful for callers which own the sysfs_dirent to be looked
up but don't want to keep a pointer to it.
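
The borrowed-pointer lookup described above can be sketched in
userspace as follows. This is a simplified illustration, assuming a
hypothetical struct toy_ent with an explicit refcnt field; unlike
sysfs_find_child() it takes no lock, so it only demonstrates that the
lookup leaves the reference count alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy directory entry; refcnt is an illustrative reference count. */
struct toy_ent {
	const char *name;
	int refcnt;
	struct toy_ent *children;	/* first child */
	struct toy_ent *sibling;	/* next child of the same parent */
};

/*
 * Return the child of @parent called @name, or NULL if there is none.
 * The child's refcnt is deliberately left untouched: the caller is
 * assumed to own a reference already.
 */
static struct toy_ent *toy_find_child(struct toy_ent *parent, const char *name)
{
	struct toy_ent *pos;

	for (pos = parent->children; pos; pos = pos->sibling)
		if (strcmp(pos->name, name) == 0)
			return pos;
	return NULL;
}
```

A caller that created the child earlier can use such a lookup to get
the pointer back at removal time instead of storing it.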

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 33 +++++++++++++++++++++++++++++++++
include/linux/sysfs.h | 9 +++++++++
2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 584f17c..7a6be9a 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -647,6 +647,39 @@ struct sysfs_dirent *sysfs_get_dirent(struct sysfs_dirent *parent_sd,
return sd;
}

+/**
+ * sysfs_find_child - find sysfs_dirent with the given name
+ * @parent: sysfs_dirent to search under
+ * @name: name to look for
+ *
+ * This is exported version of sysfs_find_dirent(). This
+ * function doesn't grab reference to the found dirent. The
+ * caller must already have a reference to it. This function is
+ * useful for callers which own the sysfs_dirent to be looked up
+ * but don't wanna keep a pointer to it.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * Pointer to the looked up sysfs_dirent on success, NULL if
+ * there's no such entry, ERR_PTR(-EBADF) if @parent is ERR_PTR()
+ * value.
+ *
+ */
+struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
+ const char *name)
+{
+ struct sysfs_dirent *sd;
+
+ mutex_lock(&sysfs_mutex);
+ sd = sysfs_find_dirent(parent, name);
+ mutex_unlock(&sysfs_mutex);
+
+ return sd;
+}
+EXPORT_SYMBOL_GPL(sysfs_find_child);
+
static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
const char *name, struct sysfs_dirent **p_sd)
{
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 5646e56..f030dc6 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -32,10 +32,19 @@ struct vm_area_struct;

#ifdef CONFIG_SYSFS

+struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
+ const char *name);
+
int __must_check sysfs_init(void);

#else /* CONFIG_SYSFS */

+static inline struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
+ const char *name)
+{
+ return NULL;
+}
+
static inline int __must_check sysfs_init(void)
{
return 0;
--
1.5.0.3


2007-09-20 08:11:49

by Tejun Heo

Subject: [PATCH 09/22] sysfs: rename internal function sysfs_add_file()

The function name sysfs_add_file() will be used for the sd-based file
interface. Rename the internal function to __sysfs_add_file(). Note
that the internal function will be removed once the new interface is
in place, so a double-underscore prefix should do for the time being.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/bin.c | 2 +-
fs/sysfs/file.c | 6 +++---
fs/sysfs/group.c | 2 +-
fs/sysfs/sysfs.h | 4 ++--
4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 91e573f..20c2e78 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -234,7 +234,7 @@ int sysfs_create_bin_file(struct kobject * kobj, struct bin_attribute * attr)
{
BUG_ON(!kobj || !kobj->sd || !attr);

- return sysfs_add_file(kobj->sd, &attr->attr, SYSFS_KOBJ_BIN_ATTR);
+ return __sysfs_add_file(kobj->sd, &attr->attr, SYSFS_KOBJ_BIN_ATTR);
}

EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 8154fbb..3b74912 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -568,7 +568,7 @@ const struct file_operations sysfs_file_operations = {
};


-int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
+int __sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
int type)
{
umode_t mode = attr->mode & S_IALLUGO;
@@ -602,7 +602,7 @@ int sysfs_create_file(struct kobject * kobj, const struct attribute * attr)
{
BUG_ON(!kobj || !kobj->sd || !attr);

- return sysfs_add_file(kobj->sd, attr, SYSFS_KOBJ_ATTR);
+ return __sysfs_add_file(kobj->sd, attr, SYSFS_KOBJ_ATTR);

}

@@ -623,7 +623,7 @@ int sysfs_add_file_to_group(struct kobject *kobj,
if (!dir_sd)
return -ENOENT;

- error = sysfs_add_file(dir_sd, attr, SYSFS_KOBJ_ATTR);
+ error = __sysfs_add_file(dir_sd, attr, SYSFS_KOBJ_ATTR);
sysfs_put(dir_sd);

return error;
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index e4993fd..5c5aab0 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -32,7 +32,7 @@ static int create_files(struct sysfs_dirent *dir_sd,
int error = 0;

for (attr = grp->attrs; *attr && !error; attr++)
- error = sysfs_add_file(dir_sd, *attr, SYSFS_KOBJ_ATTR);
+ error = __sysfs_add_file(dir_sd, *attr, SYSFS_KOBJ_ATTR);
if (error)
remove_files(dir_sd, grp);
return error;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 82ade38..c57b2dc 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -142,8 +142,8 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
extern const struct file_operations sysfs_file_operations;

void sysfs_file_check_suicide(struct sysfs_dirent *sd);
-int sysfs_add_file(struct sysfs_dirent *dir_sd,
- const struct attribute *attr, int type);
+int __sysfs_add_file(struct sysfs_dirent *dir_sd,
+ const struct attribute *attr, int type);

/*
* bin.c
--
1.5.0.3


2007-09-20 08:12:19

by Tejun Heo

Subject: [PATCH 15/22] sysfs: implement sysfs_dirent based link interface

sysfs_add_link() takes the parent sd, name, mode and the target sd and
creates a link accordingly. Currently, sysfs links can only point to
directories, but this limitation is artificial, meant to avoid inflating
the sysfs_dirent structure by one pointer, and can be easily removed in
future changes.

Kobject based sysfs_create_link() is reimplemented using
sysfs_add_link().

This patch doesn't introduce any behavior change to the original API.
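
The split between a pointer-returning sysfs_add_link() and an
int-returning sysfs_create_link() rests on the kernel's ERR_PTR()
convention. The sketch below re-creates that convention in userspace
for illustration only: every toy_ name is hypothetical, the helpers are
stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(), and the link object is
a placeholder rather than a sysfs_dirent.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode the errno value stored by toy_err_ptr(). */
static long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if @p lies in the top TOY_MAX_ERRNO addresses, i.e. is an error. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-TOY_MAX_ERRNO);
}

struct toy_link {
	const char *name;
};

/* New-style call: returns the object itself or an encoded error. */
static struct toy_link *toy_add_link(struct toy_link *slot, const char *name)
{
	if (!name)
		return toy_err_ptr(-EINVAL);
	slot->name = name;
	return slot;
}

/* Old-style wrapper: collapses the result into 0 or a negative errno. */
static int toy_create_link(struct toy_link *slot, const char *name)
{
	struct toy_link *link = toy_add_link(slot, name);

	if (toy_is_err(link))
		return (int)toy_ptr_err(link);
	return 0;
}
```

The wrapper keeps the old integer return value for existing callers,
while new callers get the created object back directly.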

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/kobject.c | 47 +++++++++++++++++++++++++++
fs/sysfs/symlink.c | 83 +++++++++++++++++-------------------------------
include/linux/sysfs.h | 10 ++++++
3 files changed, 87 insertions(+), 53 deletions(-)

diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index a34c54e..7400575 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -355,6 +355,53 @@ void sysfs_remove_bin_file(struct kobject * kobj, struct bin_attribute * attr)
}
EXPORT_SYMBOL_GPL(sysfs_remove_bin_file);

+/*
+ * kobject link interface
+ */
+
+/**
+ * sysfs_create_link - create symlink between two objects.
+ * @kobj: object whose directory we're creating the link in.
+ * @target: object we're pointing to.
+ * @name: name of the symlink.
+ */
+int sysfs_create_link(struct kobject *kobj, struct kobject *target,
+ const char *name)
+{
+ struct sysfs_dirent *parent_sd, *target_sd, *sd;
+
+ BUG_ON(!name);
+
+ if (!kobj)
+ parent_sd = sysfs_root;
+ else
+ parent_sd = kobj->sd;
+
+ if (!parent_sd)
+ return -EFAULT;
+
+ /* target->sd can go away beneath us but is protected with
+ * sysfs_assoc_lock. Fetch target_sd from it.
+ */
+ spin_lock(&sysfs_assoc_lock);
+ target_sd = NULL;
+ if (target->sd)
+ target_sd = sysfs_get(target->sd);
+ spin_unlock(&sysfs_assoc_lock);
+
+ if (!target_sd)
+ return -ENOENT;
+
+ sd = sysfs_add_link(parent_sd, name, SYSFS_COPY_NAME, target_sd);
+
+ sysfs_put(target_sd);
+
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(sysfs_create_link);
+
/**
* sysfs_remove_link - remove symlink in object's directory.
* @kobj: object we're acting for.
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 2c3e4f7..42ecb69 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -5,7 +5,6 @@
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/module.h>
-#include <linux/kobject.h>
#include <linux/namei.h>
#include <linux/mutex.h>

@@ -45,64 +44,45 @@ static void fill_object_path(struct sysfs_dirent *sd, char *buffer, int length)
}

/**
- * sysfs_create_link - create symlink between two objects.
- * @kobj: object whose directory we're creating the link in.
- * @target: object we're pointing to.
- * @name: name of the symlink.
+ * sysfs_add_link - add a new sysfs symlink
+ * @parent: sysfs_dirent to add symlink under
+ * @name: name of the symlink
+ * @mode: SYSFS_* flags for the new symlink
+ * @target: target of the symlink
+ *
+ * Add a new symlink which points to @target under @parent with
+ * the specified parameters.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * Pointer to the new sysfs_dirent on success, ERR_PTR() value on
+ * error.
*/
-int sysfs_create_link(struct kobject * kobj, struct kobject * target, const char * name)
+struct sysfs_dirent *sysfs_add_link(struct sysfs_dirent *parent,
+ const char *name, mode_t mode,
+ struct sysfs_dirent *target)
{
- struct sysfs_dirent *parent_sd = NULL;
- struct sysfs_dirent *target_sd = NULL;
- struct sysfs_dirent *sd = NULL;
- struct sysfs_addrm_cxt acxt;
- int error;
-
- BUG_ON(!name);
-
- if (!kobj)
- parent_sd = sysfs_root;
- else
- parent_sd = kobj->sd;
+ struct sysfs_dirent *sd;

- error = -EFAULT;
- if (!parent_sd)
- goto out_put;
-
- /* target->sd can go away beneath us but is protected with
- * sysfs_assoc_lock. Fetch target_sd from it.
+ /* Only symlinks to directories are allowed. This is an
+ * artificial limitation. If ever needed, allowing symlinks
+ * to point to other types of sysfs nodes isn't difficult.
*/
- spin_lock(&sysfs_assoc_lock);
- if (target->sd)
- target_sd = sysfs_get(target->sd);
- spin_unlock(&sysfs_assoc_lock);
-
- error = -ENOENT;
- if (!target_sd)
- goto out_put;
+ if (sysfs_type(target) != SYSFS_DIR)
+ return ERR_PTR(-EINVAL);

- error = -ENOMEM;
- sd = sysfs_new_dirent(name, S_IRWXUGO | SYSFS_COPY_NAME, SYSFS_LINK);
+ /* allocate & initialize */
+ sd = sysfs_new_dirent(name, mode | S_IRWXUGO, SYSFS_LINK);
if (!sd)
- goto out_put;
-
- sd->s_link.target = target_sd;
- target_sd = NULL; /* reference is now owned by the symlink */
+ return ERR_PTR(-ENOMEM);

- sysfs_addrm_start(&acxt);
- error = sysfs_add_one(&acxt, parent_sd, sd);
- sysfs_addrm_finish(&acxt);
+ sd->s_link.target = sysfs_get(target);

- if (error)
- goto out_put;
-
- return 0;
-
- out_put:
- sysfs_put(target_sd);
- sysfs_put(sd);
- return error;
+ return sysfs_insert_one(parent, sd);
}
+EXPORT_SYMBOL_GPL(sysfs_add_link);

static int sysfs_get_target_path(struct sysfs_dirent * parent_sd,
struct sysfs_dirent * target_sd, char *path)
@@ -162,6 +142,3 @@ const struct inode_operations sysfs_link_inode_operations = {
.follow_link = sysfs_follow_link,
.put_link = sysfs_put_link,
};
-
-
-EXPORT_SYMBOL_GPL(sysfs_create_link);
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index dfb6bd7..08ed1b0 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -60,6 +60,9 @@ struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
struct sysfs_dirent *sysfs_add_bin(struct sysfs_dirent *parent,
const char *name, mode_t mode, size_t size,
const struct sysfs_bin_ops *bops, void *data);
+struct sysfs_dirent *sysfs_add_link(struct sysfs_dirent *parent,
+ const char *name, mode_t mode,
+ struct sysfs_dirent *target);

struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name);
@@ -94,6 +97,13 @@ static inline struct sysfs_dirent *sysfs_add_bin(struct sysfs_dirent *parent,
return NULL;
}

+static inline struct sysfs_dirent *sysfs_add_link(struct sysfs_dirent *parent,
+ const char *name, mode_t mode,
+ struct sysfs_dirent *target)
+{
+ return NULL;
+}
+
static inline struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name)
{
--
1.5.0.3


2007-09-20 08:12:46

by Tejun Heo

Subject: [PATCH 18/22] kobject: implement __kobject_set_name()

Implement __kobject_set_name() which takes pre-allocated @new_name and
renames the kobject without failing.
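
The point of a setter that cannot fail is that the caller can do every
allocation up front and commit only after the fallible work has
succeeded. The following is a minimal userspace sketch of that
pattern; struct obj, dup_string(), backend_rename() and obj_rename()
are hypothetical names for this example, and backend_rename() merely
stands in for whatever fallible step precedes the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object that owns a heap-allocated name. */
struct obj {
	char *name;
};

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Cannot fail: takes ownership of @new_name and frees the old name. */
static void obj_set_name_prealloc(struct obj *o, char *new_name)
{
	free(o->name);
	o->name = new_name;
}

/* Stand-in for a fallible step; fails on request. */
static int backend_rename(const char *new_name, int should_fail)
{
	(void)new_name;
	return should_fail ? -EEXIST : 0;
}

static int obj_rename(struct obj *o, const char *new_name, int backend_fails)
{
	char *dup;
	int rc;

	/* allocate first, while backing out is still trivial */
	dup = dup_string(new_name);
	if (!dup)
		return -ENOMEM;

	rc = backend_rename(new_name, backend_fails);
	if (rc) {
		free(dup);
		return rc;
	}

	/* commit: nothing can fail from here on */
	obj_set_name_prealloc(o, dup);
	return 0;
}
```

With this ordering the object never ends up with a new backend name
but an old in-memory name, since the only step after the fallible one
is a pointer swap.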

Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/kobject.h | 1 +
lib/kobject.c | 13 +++++++++++++
2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/include/linux/kobject.h b/include/linux/kobject.h
index a8a84fc..f7a7734 100644
--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -68,6 +68,7 @@ struct kobject {
struct sysfs_dirent * sd;
};

+extern void __kobject_set_name(struct kobject *kobj, const char *new_name);
extern int kobject_set_name(struct kobject *, const char *, ...)
__attribute__((format(printf,2,3)));

diff --git a/lib/kobject.c b/lib/kobject.c
index a280c62..1623125 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -230,6 +230,19 @@ int kobject_register(struct kobject * kobj)
return error;
}

+/**
+ * __kobject_set_name - Set the name of an object to preallocated string
+ * @kobj: object.
+ * @new_name: pointer to pre-allocated string for the new name
+ *
+ * Set the name of @kobj to @new_name. @new_name is used
+ * directly and will be freed using kfree() on @kobj release.
+ */
+void __kobject_set_name(struct kobject *kobj, const char *new_name)
+{
+ kfree(kobj->k_name);
+ kobj->k_name = new_name;
+}

/**
* kobject_set_name - Set the name of an object
--
1.5.0.3


2007-09-20 08:13:13

by Tejun Heo

Subject: [PATCH 19/22] sysfs: implement sysfs_dirent based rename - sysfs_rename()

sysfs_rename() takes target @sd, @new_parent and @new_name, renames
@sd to @new_name and moves it under @new_parent. @new_parent and/or
@new_name can be NULL if the specific operation is not needed.
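
The NULL handling can be pictured with a small userspace model. This
is only a sketch: struct node, the flat nodes[] table, find_child()
and node_rename() are made up for this example, names are borrowed
pointers rather than copies, and no locking is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 8

/* Hypothetical node; the real sysfs_dirent carries much more. */
struct node {
	struct node *parent;
	const char *name;
};

/* Flat table standing in for the sysfs tree. */
static struct node *nodes[MAX_NODES];
static size_t nr_nodes;

static struct node *find_child(struct node *parent, const char *name)
{
	size_t i;

	for (i = 0; i < nr_nodes; i++)
		if (nodes[i]->parent == parent && !strcmp(nodes[i]->name, name))
			return nodes[i];
	return NULL;
}

static int node_rename(struct node *n, struct node *new_parent,
		       const char *new_name)
{
	/* NULL means "keep the current value" */
	if (!new_parent)
		new_parent = n->parent;
	if (!new_name)
		new_name = n->name;

	/* nothing to rename or move */
	if (n->parent == new_parent && !strcmp(n->name, new_name))
		return 0;

	/* refuse to create a duplicate name under the destination */
	if (find_child(new_parent, new_name))
		return -EEXIST;

	n->parent = new_parent;
	n->name = new_name;
	return 0;
}
```

A pure rename passes NULL for the parent, a pure move passes NULL for
the name, and passing both does the rename and the move in one step.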

To handle both move and rename, and to prepare for multiple renames in
one shot (for easier symlink handling and shadow dirents),
sysfs_rename() is implemented so that it can handle an arbitrary
number of rename/move operations.

During sysfs_prep_rename(), it acquires all the resources it will need
during the operations including dentries and copied names. After all
are acquired, all needed i_mutexes are locked. Deadlock is avoided by
using trylock. If any lock acquisition fails, it releases all
i_mutexes and retries after 1ms. Because i_mutexes are used very
lightly in sysfs, almost like spinlocks just to satisfy VFS locking
rules, I don't think there will be any starvation issues.
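
The trylock-and-back-off scheme can be sketched in userspace with
pthreads. Everything here (struct lock_set, lock_set_add(),
lock_set_lock_all(), lock_set_unlock_all(), the fixed-size array) is
an assumption of this example and not code from the patch, which
chains kmalloc'd entries on a list and sleeps with msleep(1).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

#define MAX_LOCKS 8

/* Hypothetical set of mutexes that must all be held together. */
struct lock_set {
	pthread_mutex_t *mutex[MAX_LOCKS];
	int locked[MAX_LOCKS];
	size_t nr;
};

/* Record @m once; the same mutex may be needed by several operations. */
static int lock_set_add(struct lock_set *ls, pthread_mutex_t *m)
{
	size_t i;

	for (i = 0; i < ls->nr; i++)
		if (ls->mutex[i] == m)
			return 0;
	if (ls->nr == MAX_LOCKS)
		return -1;
	ls->mutex[ls->nr] = m;
	ls->locked[ls->nr] = 0;
	ls->nr++;
	return 0;
}

static void lock_set_unlock_all(struct lock_set *ls)
{
	size_t i;

	for (i = 0; i < ls->nr; i++) {
		if (ls->locked[i]) {
			pthread_mutex_unlock(ls->mutex[i]);
			ls->locked[i] = 0;
		}
	}
}

/* Take every mutex or none. Returns the number of retries needed. */
static int lock_set_lock_all(struct lock_set *ls)
{
	struct timespec backoff = { 0, 1000000 };	/* 1ms */
	int retries = 0;
	size_t i;

retry:
	for (i = 0; i < ls->nr; i++) {
		if (pthread_mutex_trylock(ls->mutex[i]) != 0) {
			/* never block while holding locks: drop all, wait */
			lock_set_unlock_all(ls);
			nanosleep(&backoff, NULL);
			retries++;
			goto retry;
		}
		ls->locked[i] = 1;
	}
	return retries;
}
```

Since no thread ever blocks while holding part of the set, the
acquisition order does not matter and two threads cannot deadlock on
each other. The price is possible livelock under heavy contention,
which the short sleep is meant to make unlikely.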

This makes rename a heavy operation, but sysfs_rename() is allowed to
fail and it is a shady-side-of-the-moon cold path where programming
convenience outweighs performance by all measures.

sysfs_rename() can be called on any type of sysfs_dirent and always
copies @new_name.

Kobject based sysfs_rename_dir() and sysfs_move_dir() are
reimplemented using sysfs_rename().

This patch doesn't introduce any behavior change to the original API.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 433 +++++++++++++++++++++++++++++++++----------------
fs/sysfs/kobject.c | 31 ++++
include/linux/sysfs.h | 9 +
3 files changed, 337 insertions(+), 136 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 986718c..d0eb9bf 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -11,6 +11,7 @@
#include <linux/idr.h>
#include <linux/completion.h>
#include <linux/mutex.h>
+#include <linux/delay.h>
#include "sysfs.h"

/* verify all mode flags are inside S_IFMT */
@@ -948,142 +949,6 @@ void sysfs_remove(struct sysfs_dirent *sd)
}
EXPORT_SYMBOL_GPL(sysfs_remove);

-int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
-{
- struct sysfs_dirent *sd = kobj->sd;
- struct dentry *parent = NULL;
- struct dentry *old_dentry = NULL, *new_dentry = NULL;
- const char *dup_name = NULL;
- int error;
-
- mutex_lock(&sysfs_op_mutex);
-
- error = 0;
- if (strcmp(sd->s_name, new_name) == 0)
- goto out; /* nothing to rename */
-
- /* get the original dentry */
- old_dentry = sysfs_get_dentry(sd);
- if (IS_ERR(old_dentry)) {
- error = PTR_ERR(old_dentry);
- goto out;
- }
-
- parent = old_dentry->d_parent;
-
- /* lock parent and get dentry for new name */
- mutex_lock(&parent->d_inode->i_mutex);
- mutex_lock(&sysfs_mutex);
-
- error = -EEXIST;
- if (sysfs_find_dirent(sd->s_parent, new_name))
- goto out_unlock;
-
- error = -ENOMEM;
- new_dentry = d_alloc_name(parent, new_name);
- if (!new_dentry)
- goto out_unlock;
-
- /* rename kobject and sysfs_dirent */
- error = -ENOMEM;
- new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
- if (!new_name)
- goto out_unlock;
-
- error = kobject_set_name(kobj, "%s", new_name);
- if (error)
- goto out_unlock;
-
- dup_name = sd->s_name;
- sd->s_name = new_name;
-
- /* rename */
- d_add(new_dentry, NULL);
- d_move(old_dentry, new_dentry);
-
- error = 0;
- out_unlock:
- mutex_unlock(&sysfs_mutex);
- mutex_unlock(&parent->d_inode->i_mutex);
- kfree(dup_name);
- dput(old_dentry);
- dput(new_dentry);
- out:
- mutex_unlock(&sysfs_op_mutex);
- return error;
-}
-
-int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
-{
- struct sysfs_dirent *sd = kobj->sd;
- struct sysfs_dirent *new_parent_sd;
- struct dentry *old_parent, *new_parent = NULL;
- struct dentry *old_dentry = NULL, *new_dentry = NULL;
- int error;
-
- mutex_lock(&sysfs_op_mutex);
- BUG_ON(!sd->s_parent);
- new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : sysfs_root;
-
- error = 0;
- if (sd->s_parent == new_parent_sd)
- goto out; /* nothing to move */
-
- /* get dentries */
- old_dentry = sysfs_get_dentry(sd);
- if (IS_ERR(old_dentry)) {
- error = PTR_ERR(old_dentry);
- goto out;
- }
- old_parent = old_dentry->d_parent;
-
- new_parent = sysfs_get_dentry(new_parent_sd);
- if (IS_ERR(new_parent)) {
- error = PTR_ERR(new_parent);
- goto out;
- }
-
-again:
- mutex_lock(&old_parent->d_inode->i_mutex);
- if (!mutex_trylock(&new_parent->d_inode->i_mutex)) {
- mutex_unlock(&old_parent->d_inode->i_mutex);
- goto again;
- }
- mutex_lock(&sysfs_mutex);
-
- error = -EEXIST;
- if (sysfs_find_dirent(new_parent_sd, sd->s_name))
- goto out_unlock;
-
- error = -ENOMEM;
- new_dentry = d_alloc_name(new_parent, sd->s_name);
- if (!new_dentry)
- goto out_unlock;
-
- error = 0;
- d_add(new_dentry, NULL);
- d_move(old_dentry, new_dentry);
- dput(new_dentry);
-
- /* Remove from old parent's list and insert into new parent's list. */
- sysfs_unlink_sibling(sd);
- sysfs_get(new_parent_sd);
- sysfs_put(sd->s_parent);
- sd->s_parent = new_parent_sd;
- sysfs_link_sibling(sd);
-
- out_unlock:
- mutex_unlock(&sysfs_mutex);
- mutex_unlock(&new_parent->d_inode->i_mutex);
- mutex_unlock(&old_parent->d_inode->i_mutex);
- out:
- dput(new_parent);
- dput(old_dentry);
- dput(new_dentry);
- mutex_unlock(&sysfs_op_mutex);
- return error;
-}
-
/* Relationship between s_mode and the DT_xxx types */
static inline unsigned char dt_type(struct sysfs_dirent *sd)
{
@@ -1143,6 +1008,302 @@ const struct file_operations sysfs_dir_operations = {
.readdir = sysfs_readdir,
};

+/*
+ * Renaming a single node can result in renames of multiple nodes as
+ * symlinks pointing to the node are renamed together. To rename
+ * multiple nodes together atomically, all resources including
+ * i_mutexes and dentries are grabbed before committing the operation.
+ *
+ * i_mutexes are recorded using sysfs_rcxt_mutex_ent as they might not
+ * correspond one to one to renames (e.g. two symlinks to the same
+ * target in the same directory). All other resources are recorded in
+ * sysfs_rcxt_rename_ent. Both are chained to sysfs_rename_context to
+ * be used later.
+ */
+struct sysfs_rcxt_rename_ent {
+ struct list_head list;
+ struct sysfs_dirent *sd;
+ struct sysfs_dirent *new_parent;
+ const char *old_name;
+ const char *new_name;
+ int old_name_copied;
+ int new_name_copied;
+ struct dentry *old_dentry;
+ struct dentry *new_dentry;
+};
+
+struct sysfs_rcxt_mutex_ent {
+ struct list_head list;
+ struct mutex *mutex;
+ int locked;
+};
+
+struct sysfs_rename_context {
+ struct list_head mutexes;
+ struct list_head renames;
+};
+
+static struct sysfs_rcxt_rename_ent *
+sysfs_rcxt_add(struct sysfs_rename_context *rcxt, struct sysfs_dirent *sd,
+ struct sysfs_dirent *new_parent)
+{
+ struct sysfs_rcxt_rename_ent *rent;
+
+ rent = kzalloc(sizeof(*rent), GFP_KERNEL);
+ if (!rent)
+ return NULL;
+
+ rent->sd = sysfs_get(sd);
+ rent->new_parent = sysfs_get(new_parent);
+ rent->old_name = sd->s_name;
+ rent->old_name_copied = !!(sd->s_flags & SYSFS_FLAG_NAME_COPIED);
+
+ list_add_tail(&rent->list, &rcxt->renames);
+
+ return rent;
+}
+
+static int sysfs_rcxt_add_mutex(struct sysfs_rename_context *rcxt,
+ struct mutex *mutex)
+{
+ struct sysfs_rcxt_mutex_ent *ment;
+
+ list_for_each_entry(ment, &rcxt->mutexes, list)
+ if (ment->mutex == mutex)
+ return 0;
+
+ ment = kzalloc(sizeof(*ment), GFP_KERNEL);
+ if (!ment)
+ return -ENOMEM;
+
+ ment->mutex = mutex;
+
+ list_add_tail(&ment->list, &rcxt->mutexes);
+
+ return 0;
+}
+
+static int sysfs_rcxt_get_dentries(struct sysfs_rename_context *rcxt,
+ struct sysfs_rcxt_rename_ent *rent)
+{
+ struct dentry *old_dentry, *new_parent_dentry, *new_dentry;
+ int rc;
+
+ /* get old dentry */
+ old_dentry = sysfs_get_dentry(rent->sd);
+ if (IS_ERR(old_dentry))
+ return PTR_ERR(old_dentry);
+ rent->old_dentry = old_dentry;
+
+ /* allocate new dentry */
+ new_parent_dentry = sysfs_get_dentry(rent->new_parent);
+ if (IS_ERR(new_parent_dentry))
+ return PTR_ERR(new_parent_dentry);
+
+ new_dentry = d_alloc_name(new_parent_dentry, rent->new_name);
+ dput(new_parent_dentry);
+ if (!new_dentry)
+ return -ENOMEM;
+ rent->new_dentry = new_dentry;
+
+ /* add i_mutexes to mutex list */
+ rc = sysfs_rcxt_add_mutex(rcxt, &old_dentry->d_parent->d_inode->i_mutex);
+ if (rc)
+ return rc;
+
+ rc = sysfs_rcxt_add_mutex(rcxt, &new_dentry->d_parent->d_inode->i_mutex);
+ if (rc)
+ return rc;
+
+ return 0;
+}
+
+static void sysfs_post_rename(struct sysfs_rename_context *rcxt, int error)
+{
+ struct sysfs_rcxt_mutex_ent *ment, *next_ment;
+ struct sysfs_rcxt_rename_ent *rent, *next_rent;
+
+ /* release all mutexes */
+ list_for_each_entry_safe(ment, next_ment, &rcxt->mutexes, list) {
+ if (ment->locked)
+ mutex_unlock(ment->mutex);
+
+ list_del(&ment->list);
+ kfree(ment);
+ }
+
+ /* release all renames */
+ list_for_each_entry_safe(rent, next_rent, &rcxt->renames, list) {
+ /* If rename succeeded, old name is unused; otherwise,
+ * new name is unused. Free accordingly.
+ */
+ if (!error) {
+ if (rent->old_name_copied)
+ kfree(rent->old_name);
+ } else {
+ if (rent->new_name_copied)
+ kfree(rent->new_name);
+ }
+
+ dput(rent->old_dentry);
+ dput(rent->new_dentry);
+ sysfs_put(rent->sd);
+ sysfs_put(rent->new_parent);
+
+ list_del(&rent->list);
+ kfree(rent);
+ }
+}
+
+static int sysfs_prep_rename(struct sysfs_rename_context *rcxt,
+ struct sysfs_dirent *sd,
+ struct sysfs_dirent *new_parent,
+ const char *new_name)
+{
+ struct sysfs_rcxt_rename_ent *rent;
+ struct sysfs_rcxt_mutex_ent *ment;
+ int rc;
+
+ INIT_LIST_HEAD(&rcxt->mutexes);
+ INIT_LIST_HEAD(&rcxt->renames);
+
+ /*
+ * prep @sd
+ */
+ rc = -ENOMEM;
+ rent = sysfs_rcxt_add(rcxt, sd, new_parent);
+ if (!rent)
+ goto err;
+
+ rc = -ENOMEM;
+ rent->new_name = kstrdup(new_name, GFP_KERNEL);
+ if (!rent->new_name)
+ goto err;
+ rent->new_name_copied = 1;
+
+ rc = sysfs_rcxt_get_dentries(rcxt, rent);
+ if (rc)
+ goto err;
+
+ /*
+ * lock all i_mutexes
+ */
+ try_lock:
+ list_for_each_entry(ment, &rcxt->mutexes, list) {
+ if (!mutex_trylock(ment->mutex)) {
+ /* unlock all and retry */
+ list_for_each_entry(ment, &rcxt->mutexes, list) {
+ if (ment->locked) {
+ mutex_unlock(ment->mutex);
+ ment->locked = 0;
+ }
+ }
+
+ /* No need to be over-anxious, let's take it
+ * slow. Sysfs i_mutexes are lightly loaded
+ * and starvation is highly unlikely.
+ */
+ msleep(1);
+ goto try_lock;
+ }
+
+ ment->locked = 1;
+ }
+
+ return 0;
+
+ err:
+ sysfs_post_rename(rcxt, rc);
+ return rc;
+}
+
+/**
+ * sysfs_rename - rename and/or move sysfs node
+ * @sd: sysfs_dirent to rename
+ * @new_parent: new parent to move @sd under (NULL if only renaming)
+ * @new_name: new name to rename @sd to (NULL if only moving)
+ *
+ * Rename and/or move @sd. If both @new_parent and @new_name are
+ * specified, @sd is renamed to @new_name and moved under
+ * @new_parent atomically. If only one of the two is specified,
+ * only the specified operation is performed.
+ *
+ * Renaming and/or moving a sysfs node which is pointed to by
+ * symlinks causes the symlinks to be renamed according to their
+ * name formats.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
+ const char *new_name)
+{
+ struct sysfs_rename_context rcxt;
+ struct sysfs_rcxt_rename_ent *rent;
+ int error;
+
+ mutex_lock(&sysfs_op_mutex);
+
+ if (!new_parent)
+ new_parent = sd->s_parent;
+ if (!new_name)
+ new_name = sd->s_name;
+
+ error = 0;
+ if (sd->s_parent == new_parent && !strcmp(sd->s_name, new_name))
+ goto out; /* nothing to rename */
+
+ error = sysfs_prep_rename(&rcxt, sd, new_parent, new_name);
+ if (error)
+ goto out;
+
+ /* check whether there are duplicate names */
+ error = -EEXIST;
+ list_for_each_entry(rent, &rcxt.renames, list)
+ if (sysfs_find_dirent(rent->new_parent, rent->new_name))
+ goto out_post;
+
+ mutex_lock(&sysfs_mutex);
+
+ /* rename sysfs_dirents and dentries */
+ list_for_each_entry(rent, &rcxt.renames, list) {
+ /* rename sd */
+ rent->sd->s_name = rent->new_name;
+ rent->sd->s_flags &= ~SYSFS_FLAG_NAME_COPIED;
+ if (rent->new_name_copied)
+ rent->sd->s_flags |= SYSFS_FLAG_NAME_COPIED;
+
+ /* move sd */
+ if (rent->sd->s_parent != rent->new_parent) {
+ sysfs_unlink_sibling(rent->sd);
+ sysfs_put(rent->sd->s_parent);
+ rent->sd->s_parent = sysfs_get(rent->new_parent);
+ sysfs_link_sibling(rent->sd);
+ }
+
+ /* update dcache and inode accordingly */
+ if (sysfs_type(rent->sd) == SYSFS_DIR) {
+ drop_nlink(rent->old_dentry->d_parent->d_inode);
+ inc_nlink(rent->new_dentry->d_parent->d_inode);
+ }
+ d_add(rent->new_dentry, NULL);
+ d_move(rent->old_dentry, rent->new_dentry);
+ }
+
+ mutex_unlock(&sysfs_mutex);
+ error = 0;
+ /* fall through */
+ out_post:
+ sysfs_post_rename(&rcxt, error);
+ out:
+ mutex_unlock(&sysfs_op_mutex);
+ return error;
+}
+EXPORT_SYMBOL_GPL(sysfs_rename);
+
/**
* sysfs_chmod - chmod a sysfs_dirent
* @sd: sysfs_dirent to chmod
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 0a0d583..55b884b 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -75,6 +75,37 @@ void sysfs_remove_dir(struct kobject * kobj)
__sysfs_remove(sd, 0);
}

+int sysfs_rename_dir(struct kobject *kobj, const char *new_name)
+{
+ const char *dup_name;
+ int rc;
+
+ dup_name = kstrdup(new_name, GFP_KERNEL);
+ if (!dup_name)
+ return -ENOMEM;
+
+ rc = sysfs_rename(kobj->sd, NULL, new_name);
+ if (rc) {
+ kfree(dup_name);
+ return rc;
+ }
+
+ __kobject_set_name(kobj, dup_name);
+
+ return 0;
+}
+
+int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+{
+ struct sysfs_dirent *sd = kobj->sd;
+ struct sysfs_dirent *new_parent_sd = new_parent_kobj->sd;
+
+ if (!new_parent_sd)
+ new_parent_sd = sysfs_root;
+
+ return sysfs_rename(sd, new_parent_sd, NULL);
+}
+
/*
* Subsystem file operations. These operations allow subsystems to
* have files that can be read/written.
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 08ed1b0..f0279a7 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -67,6 +67,8 @@ struct sysfs_dirent *sysfs_add_link(struct sysfs_dirent *parent,
struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name);
void sysfs_remove(struct sysfs_dirent *sd);
+int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
+ const char *new_name);

void sysfs_notify_file(struct sysfs_dirent *sd);
int sysfs_chmod(struct sysfs_dirent *sd, mode_t mode);
@@ -114,6 +116,13 @@ static inline void sysfs_remove(struct sysfs_dirent *sd)
{
}

+static inline int sysfs_rename(struct sysfs_dirent *sd,
+ struct sysfs_dirent *new_parent,
+ const char *new_name)
+{
+ return 0;
+}
+
static inline void sysfs_notify_file(struct sysfs_dirent *sd)
{
}
--
1.5.0.3


2007-09-20 08:13:43

by Tejun Heo

Subject: [PATCH 16/22] sysfs: convert group implementation to use sd-based interface

Reimplement group interface in terms of sd-based interface in
fs/sysfs/kobject.c.
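
The group helpers follow an add-all-or-roll-back shape. A minimal
userspace sketch of that shape is below; struct dir, dir_add(),
dir_remove() and group_create() are hypothetical names for this
example. Note one deliberate difference: on failure this sketch only
undoes the entries it added itself, whereas the patch's remove_files()
walks the whole group.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 4

/* Hypothetical directory holding borrowed name pointers. */
struct dir {
	const char *names[MAX_FILES];
	size_t nr;
};

static int dir_add(struct dir *d, const char *name)
{
	size_t i;

	for (i = 0; i < d->nr; i++)
		if (!strcmp(d->names[i], name))
			return -EEXIST;
	if (d->nr == MAX_FILES)
		return -ENOSPC;
	d->names[d->nr++] = name;
	return 0;
}

static void dir_remove(struct dir *d, const char *name)
{
	size_t i;

	for (i = 0; i < d->nr; i++) {
		if (!strcmp(d->names[i], name)) {
			/* fill the hole with the last entry */
			d->names[i] = d->names[--d->nr];
			return;
		}
	}
}

/* @attrs is NULL-terminated, like attribute_group->attrs. */
static int group_create(struct dir *d, const char *const *attrs)
{
	const char *const *a;
	const char *const *undo;
	int rc;

	for (a = attrs; *a; a++) {
		rc = dir_add(d, *a);
		if (rc) {
			/* roll back what this call added so far */
			for (undo = attrs; undo != a; undo++)
				dir_remove(d, *undo);
			return rc;
		}
	}
	return 0;
}
```

Either every attribute of the group becomes visible or none does, so
callers never have to clean up a half-created group.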

This patch doesn't introduce any behavior change to the original API.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/Makefile | 3 +-
fs/sysfs/group.c | 87 ----------------------------------------------------
fs/sysfs/kobject.c | 72 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 73 insertions(+), 89 deletions(-)
delete mode 100644 fs/sysfs/group.c

diff --git a/fs/sysfs/Makefile b/fs/sysfs/Makefile
index f58bce9..b25ed0d 100644
--- a/fs/sysfs/Makefile
+++ b/fs/sysfs/Makefile
@@ -2,5 +2,4 @@
# Makefile for the sysfs virtual filesystem
#

-obj-y := inode.o file.o dir.o symlink.o mount.o bin.o \
- group.o kobject.o
+obj-y := inode.o file.o dir.o symlink.o mount.o bin.o kobject.o
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
deleted file mode 100644
index bb94f6a..0000000
--- a/fs/sysfs/group.c
+++ /dev/null
@@ -1,87 +0,0 @@
-/*
- * fs/sysfs/group.c - Operations for adding/removing multiple files at once.
- *
- * Copyright (c) 2003 Patrick Mochel
- * Copyright (c) 2003 Open Source Development Lab
- *
- * This file is released undert the GPL v2.
- *
- */
-
-#include <linux/kobject.h>
-#include <linux/module.h>
-#include <linux/dcache.h>
-#include <linux/namei.h>
-#include <linux/err.h>
-#include "sysfs.h"
-
-
-static void remove_files(struct sysfs_dirent *dir_sd,
- const struct attribute_group *grp)
-{
- struct attribute *const* attr;
-
- for (attr = grp->attrs; *attr; attr++)
- sysfs_hash_and_remove(dir_sd, (*attr)->name);
-}
-
-static int create_files(struct sysfs_dirent *dir_sd,
- const struct attribute_group *grp)
-{
- struct attribute *const* attr;
- int error = 0;
-
- for (attr = grp->attrs; *attr && !error; attr++)
- error = __sysfs_add_file(dir_sd, *attr, SYSFS_FILE);
- if (error)
- remove_files(dir_sd, grp);
- return error;
-}
-
-
-int sysfs_create_group(struct kobject * kobj,
- const struct attribute_group * grp)
-{
- struct sysfs_dirent *sd;
- int error;
-
- BUG_ON(!kobj || !kobj->sd);
-
- if (grp->name) {
- sd = sysfs_add_dir(kobj->sd, grp->name, SYSFS_DIR_MODE, kobj);
- if (IS_ERR(sd))
- return PTR_ERR(sd);
- } else
- sd = kobj->sd;
- sysfs_get(sd);
- error = create_files(sd, grp);
- if (error) {
- if (grp->name)
- sysfs_remove(sd);
- }
- sysfs_put(sd);
- return error;
-}
-
-void sysfs_remove_group(struct kobject * kobj,
- const struct attribute_group * grp)
-{
- struct sysfs_dirent *dir_sd = kobj->sd;
- struct sysfs_dirent *sd;
-
- if (grp->name) {
- sd = sysfs_get_dirent(dir_sd, grp->name);
- BUG_ON(!sd);
- } else
- sd = sysfs_get(dir_sd);
-
- remove_files(sd, grp);
- if (grp->name)
- sysfs_remove(sd);
-
- sysfs_put(sd);
-}
-
-
-EXPORT_SYMBOL_GPL(sysfs_create_group);
-EXPORT_SYMBOL_GPL(sysfs_remove_group);
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 7400575..0a0d583 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -413,6 +413,78 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
}
EXPORT_SYMBOL_GPL(sysfs_remove_link);

+/*
+ * group interface
+ */
+static void remove_files(struct sysfs_dirent *dir_sd,
+ const struct attribute_group *grp)
+{
+ struct attribute *const* attr;
+
+ for (attr = grp->attrs; *attr; attr++)
+ sysfs_remove_child(dir_sd, (*attr)->name);
+}
+
+static int create_files(struct kobject *kobj, struct sysfs_dirent *dir_sd,
+ const struct attribute_group *grp)
+{
+ struct attribute *const* attr;
+ struct sysfs_dirent *sd;
+
+ for (attr = grp->attrs; *attr; attr++) {
+ sd = sysfs_add_file(dir_sd, (*attr)->name, (*attr)->mode,
+ sysfs_attr_fops(kobj), (void *)*attr);
+ if (IS_ERR(sd)) {
+ remove_files(dir_sd, grp);
+ return PTR_ERR(sd);
+ }
+ }
+
+ return 0;
+}
+
+int sysfs_create_group(struct kobject *kobj, const struct attribute_group *grp)
+{
+ struct sysfs_dirent *sd;
+ int error;
+
+ BUG_ON(!kobj || !kobj->sd);
+
+ if (grp->name) {
+ sd = sysfs_add_dir(kobj->sd, grp->name,
+ SYSFS_DIR_MODE | SYSFS_COPY_NAME, kobj);
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
+ } else
+ sd = kobj->sd;
+ sysfs_get(sd);
+ error = create_files(kobj, sd, grp);
+ if (error && grp->name)
+ sysfs_remove(sd);
+ sysfs_put(sd);
+ return error;
+}
+EXPORT_SYMBOL_GPL(sysfs_create_group);
+
+void sysfs_remove_group(struct kobject *kobj, const struct attribute_group *grp)
+{
+ struct sysfs_dirent *dir_sd = kobj->sd;
+ struct sysfs_dirent *sd;
+
+ if (grp->name) {
+ sd = sysfs_get_dirent(dir_sd, grp->name);
+ BUG_ON(!sd);
+ } else
+ sd = sysfs_get(dir_sd);
+
+ remove_files(sd, grp);
+ if (grp->name)
+ sysfs_remove(sd);
+
+ sysfs_put(sd);
+}
+EXPORT_SYMBOL_GPL(sysfs_remove_group);
+
/**
* sysfs_add_file_to_group - add an attribute file to a pre-existing group.
* @kobj: object we're acting for.
--
1.5.0.3


2007-09-20 08:14:15

by Tejun Heo

Subject: [PATCH 21/22] sysfs: kill sysfs_hash_and_remove()

Kill now unused sysfs_hash_and_remove().

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/inode.c | 22 ----------------------
fs/sysfs/sysfs.h | 1 -
2 files changed, 0 insertions(+), 23 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 8df357e..0aeea4b 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -202,25 +202,3 @@ struct inode * sysfs_get_inode(struct sysfs_dirent *sd)

return inode;
}
-
-int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
-{
- struct sysfs_addrm_cxt acxt;
- struct sysfs_dirent *sd;
-
- if (!dir_sd)
- return -ENOENT;
-
- sysfs_addrm_start(&acxt);
-
- sd = sysfs_find_dirent(dir_sd, name);
- if (sd)
- sysfs_remove_one(&acxt, sd);
-
- sysfs_addrm_finish(&acxt);
-
- if (sd)
- return 0;
- else
- return -ENOENT;
-}
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 62239e3..69e1451 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -137,7 +137,6 @@ static inline void sysfs_put(struct sysfs_dirent *sd)
*/
struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
-int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);

/*
* file.c
--
1.5.0.3


2007-09-20 08:14:42

by Tejun Heo

Subject: [PATCH 20/22] sysfs: kill now unused __sysfs_add_file()

Kill now unused __sysfs_add_file().

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/file.c | 23 -----------------------
fs/sysfs/sysfs.h | 2 --
2 files changed, 0 insertions(+), 25 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 4c0e29f..1d93940 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -556,26 +556,3 @@ struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
return sysfs_insert_one(parent, sd);
}
EXPORT_SYMBOL(sysfs_add_file);
-
-int __sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
- int type)
-{
- umode_t mode = attr->mode & S_IALLUGO;
- struct sysfs_addrm_cxt acxt;
- struct sysfs_dirent *sd;
- int rc;
-
- sd = sysfs_new_dirent(attr->name, mode, type);
- if (!sd)
- return -ENOMEM;
- sd->s_file.data = (void *)attr;
-
- sysfs_addrm_start(&acxt);
- rc = sysfs_add_one(&acxt, dir_sd, sd);
- sysfs_addrm_finish(&acxt);
-
- if (rc)
- sysfs_put(sd);
-
- return rc;
-}
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 16ecd6a..62239e3 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -145,8 +145,6 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
extern const struct file_operations sysfs_file_operations;

void sysfs_file_check_suicide(struct sysfs_dirent *sd);
-int __sysfs_add_file(struct sysfs_dirent *dir_sd,
- const struct attribute *attr, int type);

/*
* bin.c
--
1.5.0.3


2007-09-20 08:15:13

by Tejun Heo

Subject: [PATCH 17/22] sysfs: s/sysfs_rename_mutex/sysfs_op_mutex/ and protect all tree modifying ops

Rename sysfs_rename_mutex to sysfs_op_mutex and protect operations
which modify the tree with it, i.e.:

sysfs_op_mutex : above i_mutexes in the lock hierarchy and
guarantees exclusion against all tree
modifications.

sysfs_mutex : under i_mutexes in the lock hierarchy and protects
vfs tree walking from actual tree modification.

So, when one wants to modify the tree structure, one should first grab
sysfs_op_mutex, at which point the tree structure is guaranteed not to
change beneath it, and then grab sysfs_mutex when actually modifying
the tree.
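
The resulting lock order can be sketched in userspace with three
pthread mutexes. The names op_mutex, dir_mutex (standing in for a
single i_mutex) and tree_mutex, the tree_generation counter, and both
functions are assumptions of this example; only the ordering is taken
from the description above.

```c
#include <assert.h>
#include <pthread.h>

/* Lock order, outermost first: op_mutex -> dir_mutex -> tree_mutex. */
static pthread_mutex_t op_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dir_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t tree_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Bumped on every structural change; stands in for the tree itself. */
static int tree_generation;

/* Modifier: inspect under op_mutex, commit under tree_mutex. */
static int inspect_then_modify(void)
{
	int seen, after;

	pthread_mutex_lock(&op_mutex);
	/* no other modifier can run, so the structure is stable here */
	seen = tree_generation;

	pthread_mutex_lock(&dir_mutex);
	pthread_mutex_lock(&tree_mutex);
	/* nothing changed between inspection and commit */
	assert(seen == tree_generation);
	after = ++tree_generation;
	pthread_mutex_unlock(&tree_mutex);
	pthread_mutex_unlock(&dir_mutex);

	pthread_mutex_unlock(&op_mutex);
	return after;
}

/* Walker: only excludes the short commit section, never op_mutex. */
static int walk(void)
{
	int seen;

	pthread_mutex_lock(&tree_mutex);
	seen = tree_generation;
	pthread_mutex_unlock(&tree_mutex);
	return seen;
}
```

Walkers contend only with the brief commit section, while the longer
preparation work (dentry lookups, allocation) runs under the outer
mutex without blocking them.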

This widened op mutex will be used to make using symlinks easier and
the extended mutex coverage won't add any noticeable contention (only
one extra locking and unlocking around sysfs_mutex in add/remove
paths), not that it would matter even if it actually does.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 38 +++++++++++++++++++++++++-------------
fs/sysfs/sysfs.h | 2 +-
2 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index a20beff..986718c 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -18,8 +18,18 @@
#error SYSFS mode flags out of S_IFMT
#endif

+/* sysfs_op_mutex is above i_mutexes in the lock hierarchy and
+ * guarantees exclusion against operations which might change tree
+ * structure (add, remove and rename). sysfs_mutex provides exclusion
+ * between tree modifying operations and vfs tree walking and is below
+ * i_mutexes in the lock hierarchy.
+ *
+ * If a thread is holding sysfs_op_mutex, no one else will change the
+ * tree structure beneath it. When the thread actually wants to
+ * change the tree structure, it needs to grab sysfs_mutex too.
+ */
+DEFINE_MUTEX(sysfs_op_mutex);
DEFINE_MUTEX(sysfs_mutex);
-DEFINE_MUTEX(sysfs_rename_mutex);
spinlock_t sysfs_assoc_lock = SPIN_LOCK_UNLOCKED;

static spinlock_t sysfs_ino_lock = SPIN_LOCK_UNLOCKED;
@@ -87,7 +97,7 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
* dentry for each step.
*
* LOCKING:
- * mutex_lock(sysfs_rename_mutex)
+ * mutex_lock(sysfs_op_mutex)
*
* RETURNS:
* Pointer to found dentry on success, ERR_PTR() value on error.
@@ -380,18 +390,19 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
*
* This function is called when the caller is about to add or
* remove sysfs_dirents. This function initializes @acxt and
- * acquires sysfs_mutex. @acxt is used to keep and pass context
- * to other addrm functions.
+ * acquires sysfs_op_mutex and sysfs_mutex. @acxt is used to
+ * keep and pass context to other addrm functions.
*
* LOCKING:
- * Kernel thread context (may sleep). sysfs_mutex is locked on
- * return.
+ * Kernel thread context (may sleep). sysfs_op_mutex and
+ * sysfs_mutex are locked on return.
*/
void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
{
memset(acxt, 0, sizeof(*acxt));
acxt->removed_tail = &acxt->removed;

+ mutex_lock(&sysfs_op_mutex);
mutex_lock(&sysfs_mutex);
}

@@ -621,6 +632,7 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
*/
sysfs_addrm_get_parent_inode(acxt, NULL);
mutex_unlock(&sysfs_mutex);
+ mutex_unlock(&sysfs_op_mutex);

/* kill removed sysfs_dirents */
while (acxt->removed) {
@@ -679,7 +691,7 @@ struct sysfs_dirent *sysfs_insert_one(struct sysfs_dirent *parent,
* Look for sysfs_dirent with name @name under @parent_sd.
*
* LOCKING:
- * mutex_lock(sysfs_mutex)
+ * mutex_lock(sysfs_op_mutex) and/or mutex_lock(sysfs_mutex)
*
* RETURNS:
* Pointer to sysfs_dirent if found, NULL if not.
@@ -944,7 +956,7 @@ int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
const char *dup_name = NULL;
int error;

- mutex_lock(&sysfs_rename_mutex);
+ mutex_lock(&sysfs_op_mutex);

error = 0;
if (strcmp(sd->s_name, new_name) == 0)
@@ -997,7 +1009,7 @@ int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
dput(old_dentry);
dput(new_dentry);
out:
- mutex_unlock(&sysfs_rename_mutex);
+ mutex_unlock(&sysfs_op_mutex);
return error;
}

@@ -1009,7 +1021,7 @@ int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
struct dentry *old_dentry = NULL, *new_dentry = NULL;
int error;

- mutex_lock(&sysfs_rename_mutex);
+ mutex_lock(&sysfs_op_mutex);
BUG_ON(!sd->s_parent);
new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : sysfs_root;

@@ -1068,7 +1080,7 @@ again:
dput(new_parent);
dput(old_dentry);
dput(new_dentry);
- mutex_unlock(&sysfs_rename_mutex);
+ mutex_unlock(&sysfs_op_mutex);
return error;
}

@@ -1151,9 +1163,9 @@ int sysfs_chmod(struct sysfs_dirent *sd, mode_t mode)
struct iattr newattrs;
int rc;

- mutex_lock(&sysfs_rename_mutex);
+ mutex_lock(&sysfs_op_mutex);
dentry = sysfs_get_dentry(sd);
- mutex_unlock(&sysfs_rename_mutex);
+ mutex_unlock(&sysfs_op_mutex);
if (IS_ERR(dentry))
return PTR_ERR(dentry);

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index c5593f9..16ecd6a 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -87,8 +87,8 @@ extern struct kmem_cache *sysfs_dir_cachep;
/*
* dir.c
*/
+extern struct mutex sysfs_op_mutex;
extern struct mutex sysfs_mutex;
-extern struct mutex sysfs_rename_mutex;
extern spinlock_t sysfs_assoc_lock;

extern const struct file_operations sysfs_dir_operations;
--
1.5.0.3


2007-09-20 08:15:47

by Tejun Heo

Subject: [PATCH 22/22] sysfs: move sysfs_assoc_lock into fs/sysfs/kobject.c and make it static

sysfs_assoc_lock, which protects the kobj <-> sd association, is now
only used in fs/sysfs/kobject.c. Move it there and make it static.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 1 -
fs/sysfs/kobject.c | 2 ++
fs/sysfs/sysfs.h | 1 -
3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index d0eb9bf..a74ca4a 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -31,7 +31,6 @@
*/
DEFINE_MUTEX(sysfs_op_mutex);
DEFINE_MUTEX(sysfs_mutex);
-spinlock_t sysfs_assoc_lock = SPIN_LOCK_UNLOCKED;

static spinlock_t sysfs_ino_lock = SPIN_LOCK_UNLOCKED;
static DEFINE_IDA(sysfs_ino_ida);
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 55b884b..16e10de 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -19,6 +19,8 @@

#define to_sattr(a) container_of(a,struct subsys_attribute, attr)

+static spinlock_t sysfs_assoc_lock = SPIN_LOCK_UNLOCKED;
+
static int sysfs_remove_child(struct sysfs_dirent *parent, const char *name)
{
struct sysfs_dirent *sd;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 69e1451..732b292 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -89,7 +89,6 @@ extern struct kmem_cache *sysfs_dir_cachep;
*/
extern struct mutex sysfs_op_mutex;
extern struct mutex sysfs_mutex;
-extern spinlock_t sysfs_assoc_lock;

extern const struct file_operations sysfs_dir_operations;
extern const struct inode_operations sysfs_dir_inode_operations;
--
1.5.0.3


2007-09-20 08:16:25

by Tejun Heo

Subject: [PATCH 11/22] sysfs: implement sysfs_dirent based file interface

sysfs_add_file() creates a text sysfs file, the equivalent of an
attribute file in the original API. Each sysfs file has its own ops
containing show and store, and can carry arbitrary private data. Both
show and store methods are given two private data arguments: the
file's own and its parent's. This is primarily to support the
original kobject based API, where private data was only available for
directory nodes, but can be used for other purposes too. As with
sysfs_add_dir(), the SYSFS_COPY_NAME flag can also be used.
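
As a rough illustration of the callback shape described above, here is a
user-space sketch, not kernel code. The struct mirrors the
sysfs_file_ops added to include/linux/sysfs.h by this patch; everything
prefixed demo_ or DEMO_ is hypothetical and exists only for the example.

```c
/* User-space model of the show/store callback shape; not kernel code. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Same members as the struct sysfs_file_ops introduced by the patch. */
struct sysfs_file_ops {
	ssize_t (*show)(char *page, void *data, void *parent_data);
	ssize_t (*store)(const char *page, size_t size, void *data,
			 void *parent_data);
};

/* Hypothetical private data: one struct per directory, one per file. */
struct demo_dev  { const char *name; };
struct demo_knob { int value; };

#define DEMO_PAGE_SIZE 4096	/* stands in for the kernel's PAGE_SIZE */

/* show: format the file's own data, using the parent's data for context. */
static ssize_t demo_show(char *page, void *data, void *parent_data)
{
	struct demo_knob *knob = data;
	struct demo_dev *dev = parent_data;

	return snprintf(page, DEMO_PAGE_SIZE, "%s:%d\n", dev->name,
			knob->value);
}

/* store: parse the written text into the file's own data. */
static ssize_t demo_store(const char *page, size_t size, void *data,
			  void *parent_data)
{
	struct demo_knob *knob = data;

	(void)parent_data;	/* not needed by this particular file */
	knob->value = (int)strtol(page, NULL, 10);
	return (ssize_t)size;
}

/* One ops table per file; no per-directory method selection involved. */
static const struct sysfs_file_ops demo_fops = {
	.show	= demo_show,
	.store	= demo_store,
};
```

In the kernel, a table like demo_fops and a pointer such as &knob would
be passed as the @fops and @data arguments of sysfs_add_file(); the
parent's data comes from the directory the file is created under.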

sysfs_notify_file() is equivalent to sysfs_notify() of the original
API and can be called only on file dirents.

sysfs_chmod() is an extended version of sysfs_chmod_file(). It lives
in fs/sysfs/dir.c and can be called on any type of node.
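
The mode computation that sysfs_chmod() inherits from sysfs_chmod_file()
can be tried out in user space. The sketch below only models the bit
arithmetic from the diff. DEMO_S_IALLUGO and demo_merge_mode() are
made-up names, and the octal constant is an assumption standing in for
the kernel-internal S_IALLUGO mask.

```c
/* User-space model of the ia_mode computation; not kernel code. */
#include <assert.h>
#include <sys/types.h>	/* mode_t */

/*
 * Assumed stand-in for the kernel's S_IALLUGO: the setuid, setgid and
 * sticky bits plus rwx for user, group and other.
 */
#define DEMO_S_IALLUGO 07777

/*
 * Take the permission bits from @requested and keep everything else
 * (notably the file type bits) from @old_mode, as in
 *   (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO)
 */
static mode_t demo_merge_mode(mode_t old_mode, mode_t requested)
{
	return (requested & DEMO_S_IALLUGO) | (old_mode & ~DEMO_S_IALLUGO);
}
```

The point of the masking is that a caller cannot change the node type
through chmod: any type bits in the requested mode are dropped.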

The new interface makes sysfs files very straightforward compared to
the original, overly and awkwardly objectified API. A sysfs file lives
under a sysfs directory and has two operations, show and store, each
of which takes a buffer plus the file's own and its parent's private
data. There is no hidden method selection and no enforcing of the same
methods for all attributes under a directory; thus, no extra wrapping
or forced de-multiplexing in show/store methods is necessary.

The kobject based interface works by setting parent_data to kobj and
data to attr. All kobject-based file related API functions are
reimplemented using the above functions in fs/sysfs/kobject.c.
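
The compatibility wrapping can be pictured with a small user-space
model. All model_ types and functions below are hypothetical stand-ins,
and the ktype/kset lookup done by sysfs_attr_ops() in the patch is
reduced to a single ops pointer, so this is a sketch of the idea rather
than the actual implementation.

```c
/* User-space model of the kobject compatibility wrapper; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Stand-ins for the kernel types; only the fields the sketch needs. */
struct model_attribute { const char *name; };
struct model_kobject;

struct model_sysfs_ops {
	ssize_t (*show)(struct model_kobject *kobj,
			struct model_attribute *attr, char *page);
};

struct model_kobject {
	const char *name;
	const struct model_sysfs_ops *ops;	/* replaces the ktype lookup */
};

/* Old-style method: receives typed kobject and attribute pointers. */
static ssize_t model_kobj_show(struct model_kobject *kobj,
			       struct model_attribute *attr, char *page)
{
	return sprintf(page, "%s/%s\n", kobj->name, attr->name);
}

/*
 * New-style callback: parent_data carries the kobject and data carries
 * the attribute, so the wrapper only unpacks the two pointers and
 * forwards the call to the old-style method.
 */
static ssize_t model_attr_show(char *page, void *data, void *parent_data)
{
	struct model_kobject *kobj = parent_data;
	struct model_attribute *attr = data;

	if (!kobj->ops || !kobj->ops->show)
		return -EACCES;
	return kobj->ops->show(kobj, attr, page);
}
```

Because the ops are looked up inside the callback, existing kobject
users keep their show/store signatures and do not need to change.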

This patch doesn't introduce any behavior change to the original API
except that sysfs_ops is now determined on each read/write operation
instead of on each open. This shouldn't cause any noticeable
difference.

This patch increases the size of sysfs_dirent by one pointer.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/dir.c | 47 +++++++
fs/sysfs/file.c | 335 +++++++++++++++++--------------------------------
fs/sysfs/kobject.c | 192 ++++++++++++++++++++++++++++
fs/sysfs/sysfs.h | 3 +-
include/linux/sysfs.h | 28 ++++
5 files changed, 385 insertions(+), 220 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 170749d..f4a6f2f 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -1130,3 +1130,50 @@ const struct file_operations sysfs_dir_operations = {
.read = generic_read_dir,
.readdir = sysfs_readdir,
};
+
+/**
+ * sysfs_chmod - chmod a sysfs_dirent
+ * @sd: sysfs_dirent to chmod
+ * @mode: target mode
+ *
+ * chmod @sd to @mode.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+int sysfs_chmod(struct sysfs_dirent *sd, mode_t mode)
+{
+ struct dentry *dentry;
+ struct inode *inode;
+ struct iattr newattrs;
+ int rc;
+
+ mutex_lock(&sysfs_rename_mutex);
+ dentry = sysfs_get_dentry(sd);
+ mutex_unlock(&sysfs_rename_mutex);
+ if (IS_ERR(dentry))
+ return PTR_ERR(dentry);
+
+ inode = dentry->d_inode;
+
+ mutex_lock(&inode->i_mutex);
+
+ newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
+ newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
+ rc = notify_change(dentry, &newattrs);
+
+ if (rc == 0) {
+ mutex_lock(&sysfs_mutex);
+ sd->s_mode = newattrs.ia_mode;
+ mutex_unlock(&sysfs_mutex);
+ }
+
+ mutex_unlock(&inode->i_mutex);
+
+ dput(dentry);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(sysfs_chmod);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index ae4d7cb..4c0e29f 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -3,8 +3,6 @@
*/

#include <linux/module.h>
-#include <linux/kobject.h>
-#include <linux/namei.h>
#include <linux/poll.h>
#include <linux/list.h>
#include <linux/mutex.h>
@@ -12,43 +10,6 @@

#include "sysfs.h"

-#define to_sattr(a) container_of(a,struct subsys_attribute, attr)
-
-/*
- * Subsystem file operations.
- * These operations allow subsystems to have files that can be
- * read/written.
- */
-static ssize_t
-subsys_attr_show(struct kobject * kobj, struct attribute * attr, char * page)
-{
- struct kset *kset = to_kset(kobj);
- struct subsys_attribute * sattr = to_sattr(attr);
- ssize_t ret = -EIO;
-
- if (sattr->show)
- ret = sattr->show(kset, page);
- return ret;
-}
-
-static ssize_t
-subsys_attr_store(struct kobject * kobj, struct attribute * attr,
- const char * page, size_t count)
-{
- struct kset *kset = to_kset(kobj);
- struct subsys_attribute * sattr = to_sattr(attr);
- ssize_t ret = -EIO;
-
- if (sattr->store)
- ret = sattr->store(kset, page, count);
- return ret;
-}
-
-static struct sysfs_ops subsys_sysfs_ops = {
- .show = subsys_attr_show,
- .store = subsys_attr_store,
-};
-
/*
* There's one sysfs_buffer for each open file and one
* sysfs_open_dirent for each sysfs_dirent with one or more open
@@ -70,8 +31,7 @@ struct sysfs_open_dirent {
struct sysfs_buffer {
size_t count;
loff_t pos;
- char * page;
- struct sysfs_ops * ops;
+ char *page;
struct mutex mutex;
int needs_read_fill;
int event;
@@ -136,47 +96,44 @@ void sysfs_file_check_suicide(struct sysfs_dirent *sd)
* @dentry: dentry pointer.
* @buffer: data buffer for file.
*
- * Allocate @buffer->page, if it hasn't been already, then call the
- * kobject's show() method to fill the buffer with this attribute's
- * data.
- * This is called only once, on the file's first read unless an error
- * is returned.
+ * Allocate @buffer->page, if it hasn't been already, then call
+ * the fops->show() method to fill the buffer with this file's
+ * data. This is called only once, on the file's first read
+ * unless an error is returned.
*/
-static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer)
+static int fill_read_buffer(struct dentry *dentry, struct sysfs_buffer *buffer)
{
struct sysfs_dirent *sd = dentry->d_fsdata;
- struct kobject *kobj = sd->s_parent->s_dir.data;
- struct sysfs_ops * ops = buffer->ops;
- int ret = 0;
- ssize_t count;
+ const struct sysfs_file_ops *fops = sd->s_file.ops;
+ int rc;

if (!buffer->page)
buffer->page = (char *) get_zeroed_page(GFP_KERNEL);
if (!buffer->page)
return -ENOMEM;

- /* need sd for attr and ops, its parent for kobj */
+ /* need sd for fops and data, its parent for parent_data */
if (!sysfs_get_active_two(sd))
return -ENODEV;

buffer->event = atomic_read(&sd->s_file.open->event);
buffer->accessor = current;

- count = ops->show(kobj, sd->s_file.attr, buffer->page);
+ rc = fops->show(buffer->page, sd->s_file.data,
+ sd->s_parent->s_dir.data);

if (buffer->committed_suicide)
module_allow_unload();
else
sysfs_put_active_two(sd);

- BUG_ON(count > (ssize_t)PAGE_SIZE);
- if (count >= 0) {
+ BUG_ON(rc > (ssize_t)PAGE_SIZE);
+ if (rc >= 0) {
buffer->needs_read_fill = 0;
- buffer->count = count;
- } else {
- ret = count;
+ buffer->count = rc;
+ rc = 0;
}
- return ret;
+ return rc;
}

/**
@@ -247,33 +204,30 @@ fill_write_buffer(struct sysfs_buffer * buffer, const char __user * buf, size_t
return error ? -EFAULT : count;
}

-
/**
- * flush_write_buffer - push buffer to kobject.
- * @dentry: dentry to the attribute
+ * flush_write_buffer - push buffer to sysfs_dirent
+ * @dentry: dentry to the file
* @buffer: data buffer for file.
* @count: number of bytes
*
- * Get the correct pointers for the kobject and the attribute we're
- * dealing with, then call the store() method for the attribute,
+ * Call the sysfs_dirent's fops->store() method for the file,
* passing the buffer that we acquired in fill_write_buffer().
*/
-
-static int
-flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
+static int flush_write_buffer(struct dentry *dentry,
+ struct sysfs_buffer *buffer, size_t count)
{
struct sysfs_dirent *sd = dentry->d_fsdata;
- struct kobject *kobj = sd->s_parent->s_dir.data;
- struct sysfs_ops * ops = buffer->ops;
+ const struct sysfs_file_ops *fops = sd->s_file.ops;
int rc;

- /* need attr_sd for attr and ops, its parent for kobj */
+ /* need sd for fops and data, its parent for parent_data */
if (!sysfs_get_active_two(sd))
return -ENODEV;

buffer->accessor = current;

- rc = ops->store(kobj, sd->s_file.attr, buffer->page, count);
+ rc = fops->store(buffer->page, count, sd->s_file.data,
+ sd->s_parent->s_dir.data);

if (buffer->committed_suicide)
module_allow_unload();
@@ -401,56 +355,56 @@ static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
kfree(od);
}

+/**
+ * sysfs_open_file - open method for file sysfs_dirents
+ * @inode: inode to open, passed from VFS layer
+ * @file: file struct to be opened, passed from VFS layer
+ *
+ * Open a file sysfs_dirent. This function performs several
+ * checks including sysfs_dirent and parent aliveness and access
+ * permissions. If that passes, it allocates or gets data
+ * structures needed for file operations and grants the open
+ * request.
+ *
+ * LOCKING:
+ * Determined by VFS layer.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
static int sysfs_open_file(struct inode *inode, struct file *file)
{
struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
- struct kobject *kobj = sd->s_parent->s_dir.data;
- struct sysfs_buffer * buffer;
- struct sysfs_ops * ops = NULL;
+ const struct sysfs_file_ops *fops = sd->s_file.ops;
+ struct sysfs_buffer *buffer;
int error;

- /* need sd for attr and ops, its parent for kobj */
+ /* file operations require both @sd and its parent */
if (!sysfs_get_active_two(sd))
return -ENODEV;

- /* if the kobject has no ktype, then we assume that it is a subsystem
- * itself, and use ops for it.
- */
- if (kobj->kset && kobj->kset->ktype)
- ops = kobj->kset->ktype->sysfs_ops;
- else if (kobj->ktype)
- ops = kobj->ktype->sysfs_ops;
- else
- ops = &subsys_sysfs_ops;
-
error = -EACCES;

- /* No sysfs operations, either from having no subsystem,
- * or the subsystem have no operations.
- */
- if (!ops)
+ /* can't open a file without fops */
+ if (!fops)
goto err_out;

- /* File needs write support.
- * The inode's perms must say it's ok,
+ /* File needs write support. The sd's perms must say it's ok,
* and we must have a store method.
*/
- if (file->f_mode & FMODE_WRITE) {
- if (!(inode->i_mode & S_IWUGO) || !ops->store)
- goto err_out;
- }
+ if ((file->f_mode & FMODE_WRITE) &&
+ (!(sd->s_mode & S_IWUGO) || !fops->store))
+ goto err_out;

- /* File needs read support.
- * The inode's perms must say it's ok, and we there
- * must be a show method for it.
+ /* File needs read support. The sd's perms must say it's ok,
+ * and we must have a show method for it.
*/
- if (file->f_mode & FMODE_READ) {
- if (!(inode->i_mode & S_IRUGO) || !ops->show)
- goto err_out;
- }
+ if ((file->f_mode & FMODE_READ) &&
+ (!(inode->i_mode & S_IRUGO) || !fops->show))
+ goto err_out;

- /* No error? Great, allocate a buffer for the file, and store it
- * it in file->private_data for easy access.
+ /* Allocate a buffer for the file, and store it in
+ * file->private_data for easy access.
*/
error = -ENOMEM;
buffer = kzalloc(sizeof(struct sysfs_buffer), GFP_KERNEL);
@@ -459,7 +413,6 @@ static int sysfs_open_file(struct inode *inode, struct file *file)

mutex_init(&buffer->mutex);
buffer->needs_read_fill = 1;
- buffer->ops = ops;
file->private_data = buffer;

/* make sure we have open dirent struct */
@@ -512,7 +465,7 @@ static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
struct sysfs_open_dirent *od = sd->s_file.open;

- /* need parent for the kobj, grab both */
+ /* file operations require both @sd and its parent */
if (!sysfs_get_active_two(sd))
goto trigger;

@@ -530,33 +483,34 @@ static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
return POLLERR|POLLPRI;
}

-void sysfs_notify(struct kobject *k, char *dir, char *attr)
+/**
+ * sysfs_notify_file - notify file update
+ * @sd: target sysfs_dirent
+ *
+ * To be called on sysfs files which support polling after the
+ * file is updated. This function marks @sd changed and wakes up
+ * polling waiters.
+ *
+ * LOCKING:
+ * None.
+ */
+void sysfs_notify_file(struct sysfs_dirent *sd)
{
- struct sysfs_dirent *sd = k->sd;
-
- mutex_lock(&sysfs_mutex);
-
- if (sd && dir)
- sd = sysfs_find_dirent(sd, dir);
- if (sd && attr)
- sd = sysfs_find_dirent(sd, attr);
- if (sd) {
- struct sysfs_open_dirent *od;
+ struct sysfs_open_dirent *od;

- spin_lock(&sysfs_open_dirent_lock);
+ BUG_ON(sysfs_type(sd) != SYSFS_FILE);

- od = sd->s_file.open;
- if (od) {
- atomic_inc(&od->event);
- wake_up_interruptible(&od->poll);
- }
+ spin_lock(&sysfs_open_dirent_lock);

- spin_unlock(&sysfs_open_dirent_lock);
+ od = sd->s_file.open;
+ if (od) {
+ atomic_inc(&od->event);
+ wake_up_interruptible(&od->poll);
}

- mutex_unlock(&sysfs_mutex);
+ spin_unlock(&sysfs_open_dirent_lock);
}
-EXPORT_SYMBOL_GPL(sysfs_notify);
+EXPORT_SYMBOL_GPL(sysfs_notify_file);

const struct file_operations sysfs_file_operations = {
.read = sysfs_read_file,
@@ -567,6 +521,41 @@ const struct file_operations sysfs_file_operations = {
.poll = sysfs_poll,
};

+/**
+ * sysfs_add_file - add a sysfs file
+ * @parent: parent to add the sysfs file under
+ * @name: name of new sysfs file
+ * @mode: mode of new sysfs file
+ * @fops: s_file.ops for new sysfs file
+ * @data: s_file.data for new sysfs file
+ *
+ * Add a new sysfs file with the specified parameters under
+ * @parent.
+ *
+ * LOCKING:
+ * Kernel thread context (may sleep).
+ *
+ * RETURNS:
+ * Pointer to the new sysfs_dirent on success, ERR_PTR() value on
+ * failure.
+ */
+struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
+ const char *name, mode_t mode,
+ const struct sysfs_file_ops *fops, void *data)
+{
+ struct sysfs_dirent *sd;
+
+ /* allocate and initialize */
+ sd = sysfs_new_dirent(name, mode, SYSFS_FILE);
+ if (!sd)
+ return ERR_PTR(-ENOMEM);
+
+ sd->s_file.ops = fops;
+ sd->s_file.data = data;
+
+ return sysfs_insert_one(parent, sd);
+}
+EXPORT_SYMBOL(sysfs_add_file);

int __sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
int type)
@@ -579,7 +568,7 @@ int __sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
sd = sysfs_new_dirent(attr->name, mode, type);
if (!sd)
return -ENOMEM;
- sd->s_file.attr = (void *)attr;
+ sd->s_file.data = (void *)attr;

sysfs_addrm_start(&acxt);
rc = sysfs_add_one(&acxt, dir_sd, sd);
@@ -590,95 +579,3 @@ int __sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,

return rc;
}
-
-
-/**
- * sysfs_create_file - create an attribute file for an object.
- * @kobj: object we're creating for.
- * @attr: atrribute descriptor.
- */
-
-int sysfs_create_file(struct kobject * kobj, const struct attribute * attr)
-{
- BUG_ON(!kobj || !kobj->sd || !attr);
-
- return __sysfs_add_file(kobj->sd, attr, SYSFS_FILE);
-
-}
-
-
-/**
- * sysfs_add_file_to_group - add an attribute file to a pre-existing group.
- * @kobj: object we're acting for.
- * @attr: attribute descriptor.
- * @group: group name.
- */
-int sysfs_add_file_to_group(struct kobject *kobj,
- const struct attribute *attr, const char *group)
-{
- struct sysfs_dirent *dir_sd;
- int error;
-
- dir_sd = sysfs_get_dirent(kobj->sd, group);
- if (!dir_sd)
- return -ENOENT;
-
- error = __sysfs_add_file(dir_sd, attr, SYSFS_FILE);
- sysfs_put(dir_sd);
-
- return error;
-}
-EXPORT_SYMBOL_GPL(sysfs_add_file_to_group);
-
-/**
- * sysfs_chmod_file - update the modified mode value on an object attribute.
- * @kobj: object we're acting for.
- * @attr: attribute descriptor.
- * @mode: file permissions.
- *
- */
-int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
-{
- struct sysfs_dirent *victim_sd = NULL;
- struct dentry *victim = NULL;
- struct inode * inode;
- struct iattr newattrs;
- int rc;
-
- rc = -ENOENT;
- victim_sd = sysfs_get_dirent(kobj->sd, attr->name);
- if (!victim_sd)
- goto out;
-
- mutex_lock(&sysfs_rename_mutex);
- victim = sysfs_get_dentry(victim_sd);
- mutex_unlock(&sysfs_rename_mutex);
- if (IS_ERR(victim)) {
- rc = PTR_ERR(victim);
- victim = NULL;
- goto out;
- }
-
- inode = victim->d_inode;
-
- mutex_lock(&inode->i_mutex);
-
- newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
- newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
- rc = notify_change(victim, &newattrs);
-
- if (rc == 0) {
- mutex_lock(&sysfs_mutex);
- victim_sd->s_mode = newattrs.ia_mode;
- mutex_unlock(&sysfs_mutex);
- }
-
- mutex_unlock(&inode->i_mutex);
- out:
- dput(victim);
- sysfs_put(victim_sd);
- return rc;
-}
-EXPORT_SYMBOL_GPL(sysfs_chmod_file);
-
-EXPORT_SYMBOL_GPL(sysfs_create_file);
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 8e4677c..56ec704 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -17,6 +17,8 @@
#include <linux/module.h>
#include "sysfs.h"

+#define to_sattr(a) container_of(a,struct subsys_attribute, attr)
+
static int sysfs_remove_child(struct sysfs_dirent *parent, const char *name)
{
struct sysfs_dirent *sd;
@@ -73,6 +75,128 @@ void sysfs_remove_dir(struct kobject * kobj)
__sysfs_remove(sd, 0);
}

+/*
+ * Subsystem file operations. These operations allow subsystems to
+ * have files that can be read/written.
+ */
+static ssize_t
+subsys_attr_show(struct kobject *kobj, struct attribute *attr, char *page)
+{
+ struct kset *kset = to_kset(kobj);
+ struct subsys_attribute *sattr = to_sattr(attr);
+ ssize_t ret = -EIO;
+
+ if (sattr->show)
+ ret = sattr->show(kset, page);
+ return ret;
+}
+
+static ssize_t
+subsys_attr_store(struct kobject *kobj, struct attribute *attr,
+ const char *page, size_t count)
+{
+ struct kset *kset = to_kset(kobj);
+ struct subsys_attribute *sattr = to_sattr(attr);
+ ssize_t ret = -EIO;
+
+ if (sattr->store)
+ ret = sattr->store(kset, page, count);
+ return ret;
+}
+
+static struct sysfs_ops subsys_sysfs_ops = {
+ .show = subsys_attr_show,
+ .store = subsys_attr_store,
+};
+
+/*
+ * kobject file show/store
+ */
+static struct sysfs_ops *sysfs_attr_ops(struct kobject *kobj)
+{
+ /* If the kobject has no ktype, then we assume that it is a
+ * subsystem itself, and use ops for it.
+ */
+ if (kobj->kset && kobj->kset->ktype)
+ return kobj->kset->ktype->sysfs_ops;
+ else if (kobj->ktype)
+ return kobj->ktype->sysfs_ops;
+ else
+ return &subsys_sysfs_ops;
+}
+
+static ssize_t sysfs_attr_show(char *page, void *data, void *parent_data)
+{
+ struct kobject *kobj = parent_data;
+ struct sysfs_ops *attr_ops = sysfs_attr_ops(kobj);
+ struct attribute *attr = data;
+
+ if (!attr_ops->show)
+ return -EACCES;
+
+ return attr_ops->show(kobj, attr, page);
+}
+
+static ssize_t sysfs_attr_store(const char *page, size_t size, void *data,
+ void *parent_data)
+{
+ struct kobject *kobj = parent_data;
+ struct sysfs_ops *attr_ops = sysfs_attr_ops(kobj);
+ struct attribute *attr = data;
+
+ if (!attr_ops->store)
+ return -EACCES;
+
+ return attr_ops->store(kobj, attr, page, size);
+}
+
+static const struct sysfs_file_ops sysfs_attr_rw_fops = {
+ .show = sysfs_attr_show,
+ .store = sysfs_attr_store,
+};
+
+static const struct sysfs_file_ops sysfs_attr_ro_fops = {
+ .show = sysfs_attr_show,
+};
+
+static const struct sysfs_file_ops sysfs_attr_wo_fops = {
+ .store = sysfs_attr_store,
+};
+
+static const struct sysfs_file_ops *sysfs_attr_fops(struct kobject *kobj)
+{
+ struct sysfs_ops *attr_ops = sysfs_attr_ops(kobj);
+
+ if (attr_ops) {
+ if (attr_ops->show && attr_ops->store)
+ return &sysfs_attr_rw_fops;
+ else if (attr_ops->show)
+ return &sysfs_attr_ro_fops;
+ else if (attr_ops->store)
+ return &sysfs_attr_wo_fops;
+ }
+ return NULL;
+}
+
+/**
+ * sysfs_create_file - create an attribute file for an object.
+ * @kobj: object we're creating for.
+ * @attr: attribute descriptor.
+ */
+int sysfs_create_file(struct kobject *kobj, const struct attribute *attr)
+{
+ struct sysfs_dirent *sd;
+
+ BUG_ON(!kobj->sd || !attr);
+
+ sd = sysfs_add_file(kobj->sd, attr->name, attr->mode,
+ sysfs_attr_fops(kobj), (void *)attr);
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(sysfs_create_file);
+
/**
* sysfs_remove_file - remove an object attribute.
* @kobj: object we're acting for.
@@ -86,6 +210,48 @@ void sysfs_remove_file(struct kobject *kobj, const struct attribute *attr)
}
EXPORT_SYMBOL_GPL(sysfs_remove_file);

+void sysfs_notify(struct kobject *k, char *dir, char *attr)
+{
+ struct sysfs_dirent *sd = k->sd;
+
+ mutex_lock(&sysfs_mutex);
+
+ if (sd && dir)
+ sd = sysfs_find_dirent(sd, dir);
+ if (sd && attr)
+ sd = sysfs_find_dirent(sd, attr);
+
+ sysfs_get(sd);
+
+ mutex_unlock(&sysfs_mutex);
+
+ if (sd)
+ sysfs_notify_file(sd);
+
+ sysfs_put(sd);
+}
+EXPORT_SYMBOL_GPL(sysfs_notify);
+
+/**
+ * sysfs_chmod_file - update the modified mode value on an object attribute.
+ * @kobj: object we're acting for.
+ * @attr: attribute descriptor.
+ * @mode: file permissions.
+ *
+ */
+int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
+{
+ struct sysfs_dirent *sd;
+ int rc = -ENOENT;
+
+ sd = sysfs_get_dirent(kobj->sd, attr->name);
+ if (sd)
+ rc = sysfs_chmod(sd, mode);
+ sysfs_put(sd);
+ return rc;
+}
+EXPORT_SYMBOL_GPL(sysfs_chmod_file);
+
/**
* sysfs_remove_bin_file - remove binary file for object.
* @kobj: object.
@@ -114,6 +280,32 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
EXPORT_SYMBOL_GPL(sysfs_remove_link);

/**
+ * sysfs_add_file_to_group - add an attribute file to a pre-existing group.
+ * @kobj: object we're acting for.
+ * @attr: attribute descriptor.
+ * @group: group name.
+ */
+int sysfs_add_file_to_group(struct kobject *kobj,
+ const struct attribute *attr, const char *group)
+{
+ struct sysfs_dirent *dir_sd;
+ struct sysfs_dirent *sd;
+
+ dir_sd = sysfs_get_dirent(kobj->sd, group);
+ if (!dir_sd)
+ return -ENOENT;
+
+ sd = sysfs_add_file(dir_sd, attr->name, attr->mode,
+ sysfs_attr_fops(kobj), (void *)attr);
+ sysfs_put(dir_sd);
+
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(sysfs_add_file_to_group);
+
+/**
* sysfs_remove_file_from_group - remove an attribute file from a group.
* @kobj: object we're acting for.
* @attr: attribute descriptor.
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 4e032eb..ca02276 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -12,7 +12,8 @@ struct sysfs_elem_symlink {
};

struct sysfs_elem_file {
- struct attribute *attr;
+ const struct sysfs_file_ops *ops;
+ void *data;
struct sysfs_open_dirent *open;
};

diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 3c64b4a..9304db7 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -34,17 +34,29 @@ struct vm_area_struct;
/* collection of all flags for verification */
#define SYSFS_MODE_FLAGS SYSFS_COPY_NAME

+struct sysfs_file_ops {
+ ssize_t (*show)(char *page, void *data, void *parent_data);
+ ssize_t (*store)(const char *page, size_t size, void *data,
+ void *parent_data);
+};
+
#ifdef CONFIG_SYSFS

extern struct sysfs_dirent * const sysfs_root;

struct sysfs_dirent *sysfs_add_dir(struct sysfs_dirent *parent,
const char *name, mode_t mode, void *data);
+struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
+ const char *name, mode_t mode,
+ const struct sysfs_file_ops *fops, void *data);

struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name);
void sysfs_remove(struct sysfs_dirent *sd);

+void sysfs_notify_file(struct sysfs_dirent *sd);
+int sysfs_chmod(struct sysfs_dirent *sd, mode_t mode);
+
int __must_check sysfs_init(void);

#else /* CONFIG_SYSFS */
@@ -57,6 +69,13 @@ static inline struct sysfs_dirent *sysfs_add_dir(struct sysfs_dirent *parent,
return NULL;
}

+static inline struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
+ const char *name, mode_t mode,
+ const struct sysfs_file_ops *fops, void *data)
+{
+ return NULL;
+}
+
static inline struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name)
{
@@ -67,6 +86,15 @@ static inline void sysfs_remove(struct sysfs_dirent *sd)
{
}

+static inline void sysfs_notify_file(struct sysfs_dirent *sd)
+{
+}
+
+static inline int sysfs_chmod(struct sysfs_dirent *sd, mode_t mode)
+{
+ return 0;
+}
+
static inline int __must_check sysfs_init(void)
{
return 0;
--
1.5.0.3


2007-09-20 08:16:53

by Tejun Heo

Subject: [PATCH 13/22] sysfs: implement sysfs_dirent based bin interface

sysfs_add_bin() creates a binary sysfs file, the equivalent of a
binary attribute file in the original API. As with files, bin ops take
the private data of both the node itself and its parent to support the
kobject based API; the kobject based interface works by setting
parent_data to kobj and data to battr.
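
For illustration, a bin read method in this style might look like the
user-space sketch below. The argument order (buffer, offset, count,
data, parent_data) is inferred from the bops->read() call sites in the
diff. struct model_bin_ops, struct model_blob and model_blob_read() are
hypothetical names, and off_t stands in for the kernel's loff_t.

```c
/* User-space model of an offset-based bin read method; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t, off_t */

/* Only the read member is modelled here. */
struct model_bin_ops {
	ssize_t (*read)(char *buf, off_t off, size_t count, void *data,
			void *parent_data);
};

/* Hypothetical private data: a fixed in-memory blob. */
struct model_blob {
	const char *bytes;
	size_t size;
};

/* Copy at most @count bytes starting at @off, clamped to the blob size. */
static ssize_t model_blob_read(char *buf, off_t off, size_t count,
			       void *data, void *parent_data)
{
	struct model_blob *blob = data;

	(void)parent_data;	/* unused by this particular node */
	if (off < 0)
		return -EINVAL;
	if ((size_t)off >= blob->size)
		return 0;	/* reading at or past the end yields EOF */
	if (count > blob->size - (size_t)off)
		count = blob->size - (size_t)off;
	memcpy(buf, blob->bytes + off, count);
	return (ssize_t)count;
}
```

As with text files, the method gets its context through the two opaque
pointers and does not need a kobject or a bin_attribute.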

Other than the interface change, the internal implementation is mostly
identical.

This patch doesn't introduce any behavior change to the original API.

Signed-off-by: Tejun Heo <[email protected]>
---
fs/sysfs/bin.c | 77 +++++++++++++++++++++++--------------------
fs/sysfs/inode.c | 5 +--
fs/sysfs/kobject.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++
fs/sysfs/sysfs.h | 4 ++-
include/linux/sysfs.h | 18 ++++++++++
5 files changed, 150 insertions(+), 41 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index dad891c..d6cc7b3 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -11,7 +11,6 @@
#include <linux/errno.h>
#include <linux/fs.h>
#include <linux/kernel.h>
-#include <linux/kobject.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/mutex.h>
@@ -26,21 +25,21 @@ struct bin_buffer {
int mmapped;
};

-static int
-fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
+static int fill_read(struct dentry *dentry, char *buffer, loff_t off,
+ size_t count)
{
struct sysfs_dirent *sd = dentry->d_fsdata;
- struct bin_attribute *attr = sd->s_bin.bin_attr;
- struct kobject *kobj = sd->s_parent->s_dir.data;
+ const struct sysfs_bin_ops *bops = sd->s_bin.ops;
int rc;

- /* need sd for attr, its parent for kobj */
+ /* need sd for bops and data, its parent for parent_data */
if (!sysfs_get_active_two(sd))
return -ENODEV;

- rc = -EIO;
- if (attr->read)
- rc = attr->read(kobj, attr, buffer, off, count);
+ rc = -EACCES;
+ if (bops->read)
+ rc = bops->read(buffer, off, count, sd->s_bin.data,
+ sd->s_parent->s_dir.data);

sysfs_put_active_two(sd);

@@ -83,21 +82,21 @@ read(struct file *file, char __user *userbuf, size_t bytes, loff_t *off)
return count;
}

-static int
-flush_write(struct dentry *dentry, char *buffer, loff_t offset, size_t count)
+static int flush_write(struct dentry *dentry, char *buffer, loff_t offset,
+ size_t count)
{
struct sysfs_dirent *sd = dentry->d_fsdata;
- struct bin_attribute *attr = sd->s_bin.bin_attr;
- struct kobject *kobj = sd->s_parent->s_dir.data;
+ const struct sysfs_bin_ops *bops = sd->s_bin.ops;
int rc;

- /* need sd for attr, its parent for kobj */
+ /* need sd for bops and data, its parent for parent_data */
if (!sysfs_get_active_two(sd))
return -ENODEV;

- rc = -EIO;
- if (attr->write)
- rc = attr->write(kobj, attr, buffer, offset, count);
+ rc = -EACCES;
+ if (bops->write)
+ rc = bops->write(buffer, offset, count, sd->s_bin.data,
+ sd->s_parent->s_dir.data);

sysfs_put_active_two(sd);

@@ -140,19 +139,19 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
{
struct bin_buffer *bb = file->private_data;
struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
- struct bin_attribute *attr = sd->s_bin.bin_attr;
- struct kobject *kobj = sd->s_parent->s_dir.data;
+ const struct sysfs_bin_ops *bops = sd->s_bin.ops;
int rc;

mutex_lock(&bb->mutex);

- /* need sd for attr, its parent for kobj */
+ /* need sd for bops and data, its parent for parent_data */
if (!sysfs_get_active_two(sd))
return -ENODEV;

rc = -EINVAL;
- if (attr->mmap)
- rc = attr->mmap(kobj, attr, vma);
+ if (bops->mmap)
+ rc = bops->mmap(vma, sd->s_bin.data,
+ sd->s_parent->s_dir.data);

if (rc == 0 && !bb->mmapped)
bb->mmapped = 1;
@@ -167,7 +166,7 @@ static int mmap(struct file *file, struct vm_area_struct *vma)
static int open(struct inode * inode, struct file * file)
{
struct sysfs_dirent *sd = file->f_path.dentry->d_fsdata;
- struct bin_attribute *attr = sd->s_bin.bin_attr;
+ const struct sysfs_bin_ops *bops = sd->s_bin.ops;
struct bin_buffer *bb = NULL;
int error;

@@ -176,9 +175,11 @@ static int open(struct inode * inode, struct file * file)
return -ENODEV;

error = -EACCES;
- if ((file->f_mode & FMODE_WRITE) && !(attr->write || attr->mmap))
+ if (!bops)
goto err_out;
- if ((file->f_mode & FMODE_READ) && !(attr->read || attr->mmap))
+ if ((file->f_mode & FMODE_WRITE) && !(bops->write || bops->mmap))
+ goto err_out;
+ if ((file->f_mode & FMODE_READ) && !(bops->read || bops->mmap))
goto err_out;

error = -ENOMEM;
@@ -224,17 +225,21 @@ const struct file_operations sysfs_bin_file_operations = {
.release = release,
};

-/**
- * sysfs_create_bin_file - create binary file for object.
- * @kobj: object.
- * @attr: attribute descriptor.
- */
-
-int sysfs_create_bin_file(struct kobject * kobj, struct bin_attribute * attr)
+struct sysfs_dirent *sysfs_add_bin(struct sysfs_dirent *parent,
+ const char *name, mode_t mode, size_t size,
+ const struct sysfs_bin_ops *bops, void *data)
{
- BUG_ON(!kobj || !kobj->sd || !attr);
+ struct sysfs_dirent *sd;

- return __sysfs_add_file(kobj->sd, &attr->attr, SYSFS_BIN);
-}
+ /* allocate and initialize */
+ sd = sysfs_new_dirent(name, mode, SYSFS_BIN);
+ if (!sd)
+ return ERR_PTR(-ENOMEM);
+
+ sd->s_bin.ops = bops;
+ sd->s_bin.size = size;
+ sd->s_bin.data = data;

-EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
+ return sysfs_insert_one(parent, sd);
+}
+EXPORT_SYMBOL(sysfs_add_bin);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 593b1da..d303e62 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -136,8 +136,6 @@ static int sysfs_count_nlink(struct sysfs_dirent *sd)

static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
{
- struct bin_attribute *bin_attr;
-
inode->i_blocks = 0;
inode->i_mapping->a_ops = &sysfs_aops;
inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
@@ -167,8 +165,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
inode->i_fop = &sysfs_file_operations;
break;
case SYSFS_BIN:
- bin_attr = sd->s_bin.bin_attr;
- inode->i_size = bin_attr->size;
+ inode->i_size = sd->s_bin.size;
inode->i_fop = &sysfs_bin_file_operations;
break;
case SYSFS_KOBJ_LINK:
diff --git a/fs/sysfs/kobject.c b/fs/sysfs/kobject.c
index 56ec704..a34c54e 100644
--- a/fs/sysfs/kobject.c
+++ b/fs/sysfs/kobject.c
@@ -252,6 +252,93 @@ int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
}
EXPORT_SYMBOL_GPL(sysfs_chmod_file);

+/*
+ * kobject bin_attribute interface
+ */
+static ssize_t sysfs_battr_read(char *buf, loff_t off, size_t size,
+ void *data, void *parent_data)
+{
+ struct kobject *kobj = parent_data;
+ struct bin_attribute *battr = data;
+
+ if (!battr->read)
+ return -EACCES;
+
+ return battr->read(kobj, battr, buf, off, size);
+}
+
+static ssize_t sysfs_battr_write(char *buf, loff_t off, size_t size,
+ void *data, void *parent_data)
+{
+ struct kobject *kobj = parent_data;
+ struct bin_attribute *battr = data;
+
+ if (!battr->write)
+ return -EACCES;
+
+ return battr->write(kobj, battr, buf, off, size);
+}
+
+static int sysfs_battr_mmap(struct vm_area_struct *vma, void *data,
+ void *parent_data)
+{
+ struct kobject *kobj = parent_data;
+ struct bin_attribute *battr = data;
+
+ if (!battr->mmap)
+ return -EINVAL;
+
+ return battr->mmap(kobj, battr, vma);
+}
+
+static const struct sysfs_bin_ops sysfs_battr_rwm_bops = {
+ .read = sysfs_battr_read,
+ .write = sysfs_battr_write,
+ .mmap = sysfs_battr_mmap,
+};
+
+static const struct sysfs_bin_ops sysfs_battr_rw_bops = {
+ .read = sysfs_battr_read,
+ .write = sysfs_battr_write,
+};
+
+static const struct sysfs_bin_ops sysfs_battr_ro_bops = {
+ .read = sysfs_battr_read,
+};
+
+static const struct sysfs_bin_ops sysfs_battr_wo_bops = {
+ .write = sysfs_battr_write,
+};
+
+/**
+ * sysfs_create_bin_file - create binary file for object.
+ * @kobj: object.
+ * @battr: attribute descriptor.
+ */
+int sysfs_create_bin_file(struct kobject *kobj, struct bin_attribute *battr)
+{
+ struct sysfs_dirent *sd;
+ const struct sysfs_bin_ops *bops = NULL;
+
+ BUG_ON(!kobj || !kobj->sd || !battr);
+
+ if (battr->mmap)
+ bops = &sysfs_battr_rwm_bops;
+ else if (battr->read && battr->write)
+ bops = &sysfs_battr_rw_bops;
+ else if (battr->read)
+ bops = &sysfs_battr_ro_bops;
+ else if (battr->write)
+ bops = &sysfs_battr_wo_bops;
+
+ sd = sysfs_add_bin(kobj->sd, battr->attr.name, battr->attr.mode,
+ battr->size, bops, battr);
+ if (IS_ERR(sd))
+ return PTR_ERR(sd);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
+
/**
* sysfs_remove_bin_file - remove binary file for object.
* @kobj: object.
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3f505d7..d8c61c5 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -18,7 +18,9 @@ struct sysfs_elem_file {
};

struct sysfs_elem_bin {
- struct bin_attribute *bin_attr;
+ const struct sysfs_bin_ops *ops;
+ void *data;
+ size_t size;
};

/*
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9304db7..dfb6bd7 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -40,6 +40,14 @@ struct sysfs_file_ops {
void *parent_data);
};

+struct sysfs_bin_ops {
+ ssize_t (*read)(char *buf, loff_t off, size_t size, void *data,
+ void *parent_data);
+ ssize_t (*write)(char *buf, loff_t off, size_t size, void *data,
+ void *parent_data);
+ int (*mmap)(struct vm_area_struct *vma, void *data, void *parent_data);
+};
+
#ifdef CONFIG_SYSFS

extern struct sysfs_dirent * const sysfs_root;
@@ -49,6 +57,9 @@ struct sysfs_dirent *sysfs_add_dir(struct sysfs_dirent *parent,
struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
const char *name, mode_t mode,
const struct sysfs_file_ops *fops, void *data);
+struct sysfs_dirent *sysfs_add_bin(struct sysfs_dirent *parent,
+ const char *name, mode_t mode, size_t size,
+ const struct sysfs_bin_ops *bops, void *data);

struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name);
@@ -76,6 +87,13 @@ static inline struct sysfs_dirent *sysfs_add_file(struct sysfs_dirent *parent,
return NULL;
}

+static inline struct sysfs_dirent *sysfs_add_bin(struct sysfs_dirent *parent,
+ const char *name, mode_t mode, size_t size,
+ const struct sysfs_bin_ops *bops, void *data)
+{
+ return NULL;
+}
+
static inline struct sysfs_dirent *sysfs_find_child(struct sysfs_dirent *parent,
const char *name)
{
--
1.5.0.3
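
The ops-table selection and the open-time permission check in the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct bin_ops`, `struct legacy_attr`, `pick_ops()`, `open_check()` and `demo_read()` are hypothetical stand-ins, not kernel symbols, and mmap is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Userspace stand-in for the patch's struct sysfs_bin_ops (mmap left out). */
struct bin_ops {
	ssize_t (*read)(char *buf, off_t off, size_t size,
			void *data, void *parent_data);
	ssize_t (*write)(char *buf, off_t off, size_t size,
			 void *data, void *parent_data);
};

/* Hypothetical legacy attribute with (owner, attr, ...) style callbacks. */
struct legacy_attr {
	ssize_t (*read)(void *owner, struct legacy_attr *attr,
			char *buf, off_t off, size_t size);
	ssize_t (*write)(void *owner, struct legacy_attr *attr,
			 char *buf, off_t off, size_t size);
};

/* Adapters: data carries the attribute, parent_data carries its owner. */
static ssize_t legacy_read(char *buf, off_t off, size_t size,
			   void *data, void *parent_data)
{
	struct legacy_attr *attr = data;

	if (!attr->read)
		return -EACCES;
	return attr->read(parent_data, attr, buf, off, size);
}

static ssize_t legacy_write(char *buf, off_t off, size_t size,
			    void *data, void *parent_data)
{
	struct legacy_attr *attr = data;

	if (!attr->write)
		return -EACCES;
	return attr->write(parent_data, attr, buf, off, size);
}

static const struct bin_ops legacy_rw_ops = {
	.read = legacy_read,
	.write = legacy_write,
};
static const struct bin_ops legacy_ro_ops = { .read = legacy_read };
static const struct bin_ops legacy_wo_ops = { .write = legacy_write };

/* Pick the table that matches what the attribute implements, or NULL. */
const struct bin_ops *pick_ops(const struct legacy_attr *attr)
{
	assert(attr != NULL);	/* plays the role of BUG_ON() in the patch */

	if (attr->read && attr->write)
		return &legacy_rw_ops;
	if (attr->read)
		return &legacy_ro_ops;
	if (attr->write)
		return &legacy_wo_ops;
	return NULL;
}

/* Open-time check: refuse access modes the chosen table cannot serve. */
int open_check(const struct bin_ops *ops, int want_read, int want_write)
{
	if (!ops)
		return -EACCES;
	if (want_write && !ops->write)
		return -EACCES;
	if (want_read && !ops->read)
		return -EACCES;
	return 0;
}

/* Hypothetical read callback, present only to exercise the sketch. */
ssize_t demo_read(void *owner, struct legacy_attr *attr,
		  char *buf, off_t off, size_t size)
{
	static const char payload[] = "hello";
	size_t len = sizeof(payload) - 1;

	(void)owner;
	(void)attr;
	if (off < 0 || (size_t)off >= len)
		return 0;
	if (size > len - (size_t)off)
		size = len - (size_t)off;
	memcpy(buf, payload + off, size);
	return (ssize_t)size;
}
```

Because the table is chosen once at creation time, the open path only has to look at which slots are filled; an attribute with no callbacks at all gets a NULL table and every open is refused.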


2007-09-25 22:17:53

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Thu, Sep 20, 2007 at 05:05:39PM +0900, Tejun Heo wrote:
> Subject: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model
>
> Hello, all.
>
> This is the third patchset of four sysfs update patchset series[1] and
> to be applied on top of the second patchset[2].
>
> Currently, sysfs interface is based on kobj. This made more sense
> before because lifetime of sysfs nodes were tracked using kobj
> reference counts. However, this is no longer true. sysfs nodes are
> represented with a sysfs_dirent and external reference is severed
> immediately on node removal. The internal implementation reflects
> that too and mostly handles sysfs_dirents.
>
> This patchset divorces sysfs from kobject and driver model by
> implementing sysfs_dirent based interface. This has the following
> advantages.
>
> * sysfs becomes a separate module and driver model becomes a user of
> sysfs. Those two are not entangled anymore. Things are easier to
> understand and test this way.

This is good, I like this.

> * Non-driver model users of sysfs (modules, blkdev, etc...) don't have
> to jump through hoops to use sysfs. kobj based interface requires
> attribute wrapping and is awkward to use directly. Also, the user
> is required to create a dummy kobj which doesn't serve much purpose
> than being a token for sysfs reference. New sysfs-dirent based
> interface is straight forward proc-fs like interface and should be
> easier and more intuitive for those users.

This is not good, I don't like this :(

As we spoke a few weeks ago, the non-driver model users of sysfs are ok.
Yes, it's not trivial to use sysfs in this manner, and it should be made
easier, but we still need to keep our tree of objects. Using a kobject
for sysfs access is a good thing as it provides a tiny grasp on keeping
the usage of sysfs under control.

So while I like the separation of sysfs and kobjects from an
architectural and conceptual level for testing and understanding, I do
not want to allow the use of sysfs without creating a backing kobject
like we do today.

I'm all for making the "raw" kobject access easier, cleaning up the
attribute "mess" that you need to go through. The cleanups that Kay and
I have been doing in the kset and subsystem area are steps in that
direction and I have more I want to do there to help make it easier to
use and understand.

So, I'll try to pick and choose from this patchset what I feel is ok for
now.

Or does it depend on the second set of patches that are yet to be
applied due to disagreements about module lifetimes?

thanks,

greg k-h

2007-09-27 18:18:21

by Tejun Heo

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Hello, Greg.

Sorry about the late reply. I'm sandwiched between several release
dates (I bet you know) and sudden burst of family/personal events (all
kinds of them - good, annual and bad).

Greg KH wrote:
>> * sysfs becomes a separate module and driver model becomes a user of
>> sysfs. Those two are not entangled anymore. Things are easier to
>> understand and test this way.
>
> This is good, I like this.

Great. :-)

>> * Non-driver model users of sysfs (modules, blkdev, etc...) don't have
>> to jump through hoops to use sysfs. kobj based interface requires
>> attribute wrapping and is awkward to use directly. Also, the user
>> is required to create a dummy kobj which doesn't serve much purpose
>> than being a token for sysfs reference. New sysfs-dirent based
>> interface is straight forward proc-fs like interface and should be
>> easier and more intuitive for those users.
>
> This is not good, I don't like this :(
>
> As we spoke a few weeks ago, the non-driver model users of sysfs are ok.
> Yes, it's not trivial to use sysfs in this manner, and it should be made
> easier, but we still need to keep our tree of objects. Using a kobject
> for sysfs access is a good thing as it provides a tiny grasp on keeping
> the usage of sysfs under control.
>
> So while I like the separation of sysfs and kobjects from an
> architectural and conceptual level for testing and understanding, I do
> not want to allow the use of sysfs without creating a backing kobject
> like we do today.
>
> I'm all for making the "raw" kobject access easier, cleaning up the
> attribute "mess" that you need to go through. The cleanups that Kay and
> I have been doing in the kset and subsystem area are steps in that
> direction and I have more I want to do there to help make it easier to
> use and understand.

I suppose I failed to sell new sysfs_dirent based interface idea
face-to-face. I'll try it one more time on-line. I tend to do these
things better on-line, especially in English. So, please spare some
more time on the subject.

IMHO, removal of kobject from sysfs interface is a logical and necessary
step toward easier driver model, not an unnecessary because-we-can
modification. I need to go back to what a "kobject" is to explain this.

1. What is a kobject?

If I understood it correctly, kobject was separated out from device
driver model to allow entities outside of driver model to use sysfs, so
it's a part of device driver object which is necessary to interact with
sysfs.

Originally, driver model objects and their sysfs representation were
tightly coupled. This is what made kobject a "kobject" not
"sysfs_something". Driver model and sysfs shared the same object to
represent kernel and sysfs-side. kobject was a base class of all driver
model objects, interaction with sysfs was through this base object and
implementation of sysfs also depended on kobject.

The functionality served by kobject can be broken down into the
following two.

F-a. To serve as an entity both subsystems can share lifespan
management. ie. both subsystems reference count on kobject.

F-b. To serve as an entity both subsystems can base their internal
representation on. (base object in OO term).

2. Implementation of immediate detach of sysfs nodes

Unfortunately, this tight coupling caused several problems. One of the
most annoying problems was that userland was allowed to interfere
directly with lifespan management of kernel objects which formed basis
of driver model, causing quite some number of problems directly and
indirectly and unfortunately the problem couldn't be contained inside
driver model. Mid or low level driver implementation was affected too.

As a response, immediate detach of sysfs nodes was implemented. When a
sysfs node is removed, it immediately disconnects from the associated
kobject. This way, the burden of lifespan management (at least sysfs
related part of it) is contained inside sysfs proper where we can afford
more effort, testing and thus complexity. On an unrelated note, I think
this is the beginning step toward a bigger change, that is, shielding
drivers from the complexity of object lifespan management.

Anyways, so, now that immediate disconnect is in place, sysfs is no
longer involved in lifespan management of driver model objects. It
attaches and detaches when it's told to do so. Naturally, most of
internal implementation changed to use independent objects
(sysfs_dirent) instead of kobject in the process.
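
The effect of immediate detach can be sketched in a few lines of userspace C. This is a single-threaded illustration only: `struct backing_object`, `struct node`, `node_read()` and `node_remove()` are hypothetical names, and the real code additionally has to drain in-flight accesses (the sysfs_get_active_two()/sysfs_put_active_two() pairs visible in the patches), which is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver-side object whose lifetime the node must not extend. */
struct backing_object {
	int value;
};

/* Hypothetical filesystem-side node; it may outlive the backing object. */
struct node {
	struct backing_object *obj;	/* NULL once the node has been removed */
};

/* Access goes through the node and fails cleanly after removal. */
int node_read(const struct node *n, int *out)
{
	assert(n != NULL && out != NULL);

	if (!n->obj)
		return -ENODEV;
	*out = n->obj->value;
	return 0;
}

/* Removal severs the link at once; the owner may then free its object. */
void node_remove(struct node *n)
{
	assert(n != NULL);
	n->obj = NULL;
}
```

After node_remove() the node can linger for as long as userland keeps a file open, but nothing reaches the backing object through it any more, so the object's lifetime is decided entirely by its owner.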

3. Where does that leave kobject?

If you combine #1 and #2, both functionalities become questionable.

F-a. sysfs no longer plays role in lifespan management of driver model
object. This functionality is exactly what's killed by #2.

F-b. In the process of #2, the internal representation naturally moved
over to sysfs_dirent. The interface remained the same but after
dereferencing kobject->sd, kobject itself is mostly irrelevant to sysfs
and where kobj is still used, the code is either difficult to read or
outright buggy. This is expected. Lifespans of sysfs and driver model
objects are managed completely independently. Dereferencing objects on
the other domain is inherently cumbersome.

With both F-a and F-b nullified, the purposes kobject still serves are
the following.

L-a. Serve as opaque token in sysfs interface but with all the reasons
to do so removed, this is at best cumbersome. It's an opaque token but
with a lot of unnecessary baggage. This role can be _much_ better
served by sysfs_dirent.

L-b. Serve as something a subsystem can use to count references which
also can be used to access sysfs if wanted. To me, this feels like a
flashlight which can also be used to spread butter.

IMHO, both L-a and L-b contribute only to obfuscation of the driver
model and sysfs.

4. So?

From #3, as kobject no longer serves any valid purpose to sysfs, the
natural conclusion is to try to remove kobject from sysfs, which of course
brings up the question of conversion cost.

95+% of sysfs users use it through driver model which wraps sysfs
interface and exports it as a part of driver model. For these,
conversion only needs to happen inside the driver model, so we
definitely can do that.

The rest isn't great in number and, much more importantly, many of those
suffer from the current interface which is painful to use independently.
For example, kernel/module.c does all the kobject dances including
defining a subsystem just to ignore everything else and use it as an
opaque token to sysfs (kset_find_obj doesn't count, a generic map or
sysfs with sysfs_dirent interface can do that just as well).

In addition, as done in the current patchset series, the kobject
interface can be maintained while the conversion is in progress, so IMHO
converting to sysfs_dirent interface is the right thing to do (tm).

And, as we need to design a new interface anyway, this is a good
opportunity to improve the sysfs API so that it can be used more easily.
The requirements for sysfs differ from those of other filesystems: it is
the userland-visible presentation of kernel objects (most of them being
driver model objects). For example, a symlink is completely dependent on
the target it points to, so that's where all the new functionalities came
from. I'll try to explain each of them in respective replies.

5. Wouldn't that allow manifestation of random hierarchy all over sysfs?

I really don't know whether it will or not but I don't agree interface
obfuscation is the right way to prevent that. IMHO, if we need better
policing under /sys than regular review process can provide, we should
force it by clearly defined policies and documentation not by
obfuscation, which, BTW, can't really prevent anything.

For example...

* I don't worry too much about hierarchy under /sys/devices. Most use
and will continue to use interface provided by the driver model which
forces the current structure. If this is not enough, we can, for
example, disallow a sysfs node representing a device from having subdirs
deeper than one level or any subdirs which can generate uevent.

* For the other currently existing top directories, I think they're
already too specific for anyone to add random hierarchy under. If
random top-level directories are a worry, we can add a central list of
allowed top directories in fs/sysfs/mount.c so that no one can sneak one in.

These are just examples but both can be implemented and documented
easily and IMHO will usually lead to a better end result.
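
The central-list idea could look roughly like the following userspace C sketch. Everything here is an assumption made for illustration: the function name `check_top_dir()`, the choice of -EPERM as the error, and the directory names in the table, which are examples rather than an authoritative list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list only; a real list would be maintained in one place. */
static const char *const allowed_top_dirs[] = {
	"block", "bus", "class", "devices", "firmware",
	"kernel", "module", "power",
};

/* Return 0 if a top-level directory name is on the list, -EPERM if not. */
int check_top_dir(const char *name)
{
	size_t i;
	size_t n = sizeof(allowed_top_dirs) / sizeof(allowed_top_dirs[0]);

	assert(name != NULL);

	for (i = 0; i < n; i++)
		if (strcmp(name, allowed_top_dirs[i]) == 0)
			return 0;
	return -EPERM;
}
```

With a check like this on the path that creates children of the root, adding a new top-level directory would require a visible change to the table, which is easy to spot in review.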

So, no, I don't agree to keeping kobject based interface to keep sysfs
hierarchy tidy.

6. Conclusion

I think I said enough about why kobject based interface isn't such a
good idea anymore. I'll try to cover why it's a good idea to move over
to new sysfs_dirent based interface.

* It's a clean up and a big one at that. It makes sysfs code and
interface much more straight-forward and its users will benefit too when
they are converted over to new interface.

* It removes unnecessary API-visible use of kobject. I think driver
model should head toward moving object lifespan management and other
complexities to higher level - ie. driver model, block layer, etc - and
export simple interface to drivers. kobject is too much of
implementation detail to export to drivers. Removing kobject from sysfs
interface is a step in that direction.

* sysfs_dirent interface is native to what sysfs does and thus can be
much more flexible. This will help improving the driver model and
adding new features.

> So, I'll try to pick and choose from this patchset what I feel is ok for
> now.
>
> Or does it depend on the second set of patches that are yet to be
> applied due to disagreements about module lifetimes?

The whole series is centered around ideas illustrated above. I think it
would be better to establish consensus before starting to merge patches
(the first patchset was misc cleanup, so that's okay). We might end up
heading in a different direction.

Hmmm... That was a rather long write-up. If you're still reading, I
appreciate your time and hope I didn't waste it for nothing. :-)

Thanks a lot.

--
tejun

2007-09-27 19:26:32

by Eric W. Biederman

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model


I still need to look at the code in detail but I have some concerns
I want to inject into this conversation of future sysfs architecture.

- If we want to keep sysfs from going too wild, code review
is clearly not enough. We need some technological measures to
assist us, as the experience with sysctl has shown.

I discovered that something like 10% of the sysctl entries were
buggy and had been for years when I added basic runtime sanity
checks.

I had also found one instance in the kernel and had one instance
from outside the kernel where people had created files under
/proc/sys not as sysctls but as using the infrastructure from
proc_generic.c because it happened to work.

So while it very well may be that we don't want to use the kobject
interface anymore, I expect that we want to have the sysfs_dirent
interface not exported to modules, and only allow direct
access from code compiled into the kernel.

Mostly I am thinking that any non-object model users should have
their own dedicated wrapper layer. To help keep things consistent
and to make it hard enough to abuse the system that people will
find that it is usually easier to do it the right way.

- The network namespace work scheduled to be merged in 2.6.24
currently has a dependency in Kconfig that is "&& !SYSFS"
because sysfs is currently very much a moving target.

Does it look like we can resolve Tejun's work for 2.6.24?
If not does it make sense to push my patches that allow
multiple mounts of sysfs for 2.6.24? So I can allow
network namespaces in the presence of sysfs.

Outside of sysfs and the device model I'm only talking about maybe 30
lines of code... So I could easily merge that patch later in the
merge window after the other pieces have gone in.

- Farther down the road we have the device namespace.
The bounding requirements are:
- We want to restrict which set of devices a subset of processes
can access.
- When we migrate an application we want to preserve the device
numbers of all devices that show up in the new location.
So filesystems whose block devices reside on a SAN, ramdisks,
ttys, etc.
Other devices that really are different we can handle with
hotplug remove and add events, during the migration.

So while there is lower hanging fruit the requirements for a
device namespace are becoming clear, and don't look like something
we will ultimately be able to dodge.

For sysfs the implication is that we will need to filter the
hotplug events based upon the device namespace of the recipient, and
we will need to restrict the set of devices that show up in sysfs
based on who mounts it (as the prototype patches with the network
namespace are doing).

Also fun is that the dev file implementation needs to be able to
report different major:minor numbers based on which mount of
sysfs we are dealing with.

Eric

2007-09-29 22:08:18

by Tejun Heo

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Hello, Eric.

Eric W. Biederman wrote:
> Mostly I am thinking that any non-object model users should have
> their own dedicated wrapper layer. To help keep things consistent
> and to make it hard enough to abuse the system that people will
> find that it is usually easier to do it the right way.

Hmmm... I think most current non-driver-model sysfs users are deep in
kernel anyway, but I think not exporting sysfs interface at all might be
a bit too restrictive. I think we need to examine the current
non-driver-model sysfs users thoroughly to determine what to do about
this. But, yes, I do agree that we need to put restrictions one way or
the other.

> Does it look like we can resolve Tejun's work for 2.6.24?
> If not does it make sense to push my patches that allow
> multiple mounts of sysfs for 2.6.24? So I can allow
> network namespaces in the presence of sysfs.
>
> Outside of sysfs and the device model I'm only talking about maybe 30
> lines of code... So I could easily merge that patch later in the
> merge window after the other pieces have gone in.

I think it would be better if namespace comes after interface update and
other new features, especially symlink renaming, but, under the current
circumstance, it might delay namespace unnecessarily and I have no
problem with your patches going in first. My concerns are...

* Do you think you can use new rename implementation contained in this
series? It borrows basic ideas from the implementation you used for
namespace but is more generic. It would be great if you can use it
without too much modification.

* I'm still against using callbacks to determine namespace tags because
callbacks need to be coupled with sysfs internals more tightly and are
more difficult to grasp interface-wise.

> - Farther down the road we have the device namespace.
> The bounding requirements are:
> - We want to restrict which set of devices a subset of processes
> can access.
> - When we migrate an application we want to preserve the device
> numbers of all devices that show up in the new location.
> So filesystems whose block devices reside on a SAN, ramdisks,
> ttys, etc.
> Other devices that really are different we can handle with
> hotplug remove and add events, during the migration.
>
> So while there is lower hanging fruit the requirements for a
> device namespace are becoming clear, and don't look like something
> we will ultimately be able to dodge.
>
> For sysfs the implication is that we will need to filter the
> hotplug events based upon the device namespace of the recipient, and
> we will need to restrict the set of devices that show up in sysfs
> based on who mounts it (as the prototype patches with the network
> namespace are doing).
>
> Also fun is that the dev file implementation needs to be able to
> report different major:minor numbers based on which mount of
> sysfs we are dealing with.

Ah... The coming few months will be fun, won't they? :-)

--
tejun

2007-10-05 06:19:57

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Thu, Sep 27, 2007 at 04:35:07AM -0700, Tejun Heo wrote:
> Hello, Greg.
>
> Sorry about the late reply. I'm sandwiched between several release
> dates (I bet you know) and sudden burst of family/personal events (all
> kinds of them - good, annual and bad).

Same here, I'm swamped with stuff, and now am on vacation for a few
days...

> I suppose I failed to sell new sysfs_dirent based interface idea
> face-to-face. I'll try it one more time on-line. I tend to do these
> things better on-line, especially in English. So, please spare some
> more time on the subject.
>
> IMHO, removal of kobject from sysfs interface is a logical and necessary
> step toward easier driver model, not an unnecessary because-we-can
> modification. I need to go back to what a "kobject" is to explain this.
>
> 1. What is a kobject?
>
> If I understood it correctly, kobject was separated out from device
> driver model to allow entities outside of driver model to use sysfs, so
> it's a part of device driver object which is necessary to interact with
> sysfs.

Yes, it was done so that we could have /sys/block for block devices. Al
Viro did that work.

> Originally, driver model objects and their sysfs representation were
> tightly coupled. This is what made kobject a "kobject" not
> "sysfs_something". Driver model and sysfs shared the same object to
> represent kernel and sysfs-side. kobject was a base class of all driver
> model objects, interaction with sysfs was through this base object and
> implementation of sysfs also depended on kobject.
>
> The functionality served by kobject can be broken down into the
> following two.
>
> F-a. To serve as an entity both subsystems can share lifespan
> management. ie. both subsystems reference count on kobject.
>
> F-b. To serve as an entity both subsystems can base their internal
> representation on. (base object in OO term).

Yes, those are two functions, I can agree with.

> 2. Implementation of immediate detach of sysfs nodes
>
> Unfortunately, this tight coupling caused several problems. One of the
> most annoying problems was that userland was allowed to interfere
> directly with lifespan management of kernel objects which formed basis
> of driver model, causing quite some number of problems directly and
> indirectly and unfortunately the problem couldn't be contained inside
> driver model. Mid or low level driver implementation was affected too.
>
> As a response, immediate detach of sysfs nodes was implemented. When a
> sysfs node is removed, it immediately disconnects from the associated
> kobject. This way, the burden of lifespan management (at least sysfs
> related part of it) is contained inside sysfs proper where we can afford
> more effort, testing and thus complexity. On an unrelated note, I think
> this is the beginning step toward a bigger change, that is, shielding
> drivers from the complexity of object lifespan management.

I agree. Your work in this area has been great and helped out a lot.

> Anyways, so, now that immediate disconnect is in place, sysfs is no
> longer involved in lifespan management of driver model objects. It
> attaches and detaches when it's told to do so. Naturally, most of
> internal implementation changed to use independent objects
> (sysfs_dirent) instead of kobject in the process.
>
> 3. Where does that leave kobject?
>
> If you combine #1 and #2, both functionalities become questionable.
>
> F-a. sysfs no longer plays role in lifespan management of driver model
> object. This functionality is exactly what's killed by #2.

Yes, but a kobject is still needed internally for the lifespan
management.

> F-b. In the process of #2, the internal representation naturally moved
> over to sysfs_dirent. The interface remained the same but after
> dereferencing kobject->sd, kobject itself is mostly irrelevant to sysfs
> and where kobj is still used, the code is either difficult to read or
> outright buggy. This is expected. Lifespans of sysfs and driver model
> objects are managed completely independently. Dereferencing objects on
> the other domain is inherently cumbersome.
>
> With both F-a and F-b nullified, the purposes kobject still serves are
> the following.
>
> L-a. Serve as opaque token in sysfs interface but with all the reasons
> to do so removed, this is at best cumbersome. It's an opaque token but
> with a lot of unnecessary baggage. This role can be _much_ better
> served by sysfs_dirent.

Possibly, I'm still not sold on this.

> L-b. Serve as something a subsystem can use to count references which
> also can be used to access sysfs if wanted. To me, this feels like a
> flashlight which can also be used to spread butter.

Heh, that's a good analogy :)

> IMHO, both L-a and L-b contribute only to obfuscation of the driver
> model and sysfs.

No. I think you are missing a number of things that kobjects provide
and allow:
- a structural hierarchy of devices. Combine kobjects with
ksets and ktypes, and you have a very powerful system to
categorize objects and their representation to each other.
- a consistent and easy interface to userspace through
uevents/hotplug of the creation and removal of these objects.
This keeps the different parts of the kernel that need this
interface from having to create it every time on their own.
- a way to easily create and export attributes in sysfs
automatically.
- a way to provide working reference counting for a variety of
different objects.

All of those are still needed for the kernel.

> 4. So?
>
> From #3, as kobject no longer serves any valid purpose to sysfs, it's
> natural conclusion to try to remove kobject from sysfs, which of course
> brings up the question of conversion cost.

I don't mind the removal of kobjects from sysfs in order to make sysfs
and kobjects work better/simpler. However the majority of the patches
you created to do this end up with more code overall, and are of no
benefit to the current users of sysfs and kobjects in the kernel.

> 95+% of sysfs users use it through driver model which wraps sysfs
> interface and exports it as a part of driver model. For these,
> conversion only needs to happen inside the driver model, so we
> definitely can do that.

But how would that benefit the driver model?

> The rest isn't great in number and, much more importantly, many of those
> suffer from the current interface which is painful to use independently.
> For example, kernel/module.c does all the kobject dances including
> defining a subsystem just to ignore everything else and use it as an
> opaque token to sysfs (kset_find_obj doesn't count, a generic map or
> sysfs with sysfs_dirent interface can do that just as well).

I will not deny that the current use of kobjects/ksets/ktypes (subsystem
is now gone) is difficult and extremely painful. I am currently working
to fix this issue. But don't think that because this is hard to use
it should be abolished altogether.

Rather, it means that this interface to using kobjects needs to be fixed
and made easier, not circumvented.

> 5. Wouldn't that allow manifestation of random hierarchy all over sysfs?
>
> I really don't know whether it will or not but I don't agree interface
> obfuscation is the right way to prevent that. IMHO, if we need better
> policing under /sys than regular review process can provide, we should
> force it by clearly defined policies and documentation not by
> obfuscation, which, BTW, can't really prevent anything.

No, I think that we have been lucky so far that it is so hard to get
sysfs representation working properly for "raw" kobjects. It has made
people really think why they want to add things there, and usually just
give up and go and put things into the proper place in the /sys/devices/
tree.

Also, not everything that people keep wanting to put in /sys should go
there. The perfect example of that is the recent BDI stuff. It belongs
in the driver tree, not in a new /sys/bdi/ location. If sysfs were
"easier" to use, then it would be abused this way.

The end goal for sysfs is to present a hierarchy of devices that are in
the kernel today. It is not a replacement for everything that people
feel they need to export to userspace in whatever form they want to.
There are rules that need to be followed in the exportation of data, as
userspace programs expect this. The current kobject interface tries
very hard to enforce those rules, and it needs to stay combined with
sysfs that way.

> For example...
>
> * I don't worry too much about hierarchy under /sys/devices. Most use
> and will continue to use interface provided by the driver model which
> forces the current structure. If this is not enough, we can, for
> example, disallow a sysfs node representing a device from having subdirs
> deeper than one level or any subdirs which can generate uevent.

No, I don't think that is necessary as the driver model kind of enforces
that already today.

> * For the other currently existing top directories, I think they're
> already too specific for anyone to add random hierarchy under. If
> random top level directory is worried, we can add a central list of
> allowed top directories in fs/sysfs/mount.c so that no one can sneak behind.

That might be a good idea to implement today :)
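
As an illustration only (this is not code from the thread or from fs/sysfs/mount.c), the "central list of allowed top directories" idea could be sketched as a simple allowlist check. The function name `top_dir_allowed` and the contents of the table are assumptions made for the example; a real list would be whatever the driver model maintainers agree on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allowlist of top-level sysfs directories (example values). */
static const char *const allowed_top_dirs[] = {
	"block", "bus", "class", "devices", "firmware",
	"fs", "kernel", "module", "power",
};

/* Return true if a top-level directory with this name may be created. */
bool top_dir_allowed(const char *name)
{
	size_t i;
	size_t nr = sizeof(allowed_top_dirs) / sizeof(allowed_top_dirs[0]);

	if (name == NULL)
		return false;
	for (i = 0; i < nr; i++) {
		/* Sanity check on the table itself. */
		assert(allowed_top_dirs[i] != NULL);
		if (strcmp(name, allowed_top_dirs[i]) == 0)
			return true;
	}
	return false;
}
```

With such a table, adding something like a new /sys/bdi/ directory would require an explicit, reviewable change to the list rather than slipping in through an individual driver.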

> These are just examples but both can be implemented and documented
> easily and IMHO will usually result in better end result.
>
> So, no, I don't agree to keeping kobject based interface to keep sysfs
> hierarchy tidy.

I strongly object here.

I think that if your current patches are accepted, we will see a lot of
new users of sysfs in ways that are not "standard" to how it is used
today. Things that rely on "close" happening to the sysfs file, or
trees created that do not emit uevents.

A good example for why we need to keep things the same way today is the
SLUB code. It exports data through sysfs and automatically started
exporting things through uevents. People realized this and can now
easily write tools that watch for those events to show things happening
in the slab allocator.

> 6. Conclusion
>
> I think I said enough about why kobject based interface isn't such a
> good idea anymore. I'll try to cover why it's a good idea to move over
> to new sysfs_dirent based interface.
>
> * It's a clean up and a big one at that. It makes sysfs code and
> interface much more straight-forward and its users will benefit too when
> they are converted over to new interface.

I don't object to a clean up. What I object to is the use by other code
of sysfs by not using kobjects. I feel that if you really want to do
that, then go write a new filesystem for that kind of thing. We have
already done this with debugfs and securityfs. I really want to enforce
the kobject interface to the users of sysfs.

Now if we can keep that enforcement of sysfs, then I have no objections
to cleaning up the internal interface between sysfs and kobjects, and
the overall fixing of the kobject/kset/ktype code. That is all good
things overall.

> * It removes unnecessary API-visible use of kobject. I think driver
> model should head toward moving object lifespan management and other
> complexities to higher level - ie. driver model, block layer, etc - and
> export simple interface to drivers. kobject is too much of
> implementation detail to export to drivers. Removing kobject from sysfs
> interface is a step toward that direction.

The driver model never shows kobjects to users. Ok, in places it does
sneak through, but I'll be glad to take patches to fix that up wherever
needed.

> * sysfs_dirent interface is native to what sysfs does and thus can be
> much more flexible. This will help improving the driver model and
> adding new features.

I'm all for helping the driver model and adding new features to it.
Just don't take away the enforcement of kobjects and sysfs at the same
time please.

thanks,

greg k-h

2007-10-05 06:23:08

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Thu, Sep 27, 2007 at 01:25:48PM -0600, Eric W. Biederman wrote:
>
> I still need to look at the code in detail but I have some concerns
> I want to inject into this conversation of future sysfs architecture.
>
> - If we want to carefully limit sysfs from going too wild, code review
> is clearly not enough. We need some technological measures to
> assist us, as the experience with sysctl has shown.

I totally agree. You should see the ways that people have tried to
circumvent the current kobject/sysfs code over the past years. It's so
scary it's not even funny...

> - The network namespace work scheduled to be merged in 2.6.24
> currently has a dependency in Kconfig that is "&& !SYSFS"
> because sysfs is currently very much a moving target.
>
> Does it look like we can resolve Tejun's work for 2.6.24?
> If not does it make sense to push my patches that allow
> multiple mounts of sysfs for 2.6.24? So I can allow
> network namespaces in the presence of sysfs.
>
> Outside of sysfs and the device model I'm only talking about maybe 30 lines
> of code... So I could easily merge that patch later in the
> merge window after the other pieces have gone in.

I would be interested in seeing what your patches look like. I don't
think that we should take any more sysfs changes for 2.6.24 as we do
have a lot of them right now, and I don't think that Tejun and I agree
on the future direction of the outstanding ones just yet.

But I don't think that your multiple-mount patches could make it into
.24, unless .23 is still weeks away.

> - Farther down the road we have the device namespace.
> The bounding requirements are:
> - We want to restrict which set of devices a subset of processes
> can access.

That's reasonable.

> - When we migrate an application we want to preserve the device
> numbers of all devices that show up in the new location.
> So filesystems whose block devices reside on a SAN, ramdisks,
> ttys, etc.
> Other devices that really are different we can handle with
> hotplug remove and add events, during the migration.
>
> So while there is lower hanging fruit the requirements for a
> device namespace are becoming clear, and don't look like something
> we will ultimately be able to dodge.
>
> For sysfs the implication is that we will need to filter the
> hotplug events based upon the device namespace of the recipient, and
> we will need to restrict the set of devices that show up in sysfs
> based on who mounts it (as the prototype patches with the network
> namespace are doing).

It is going to be interesting to see how you come up with a way to do
that.

> Also fun is that the dev file implementation needs to be able to
> report different major:minor numbers based on which mount of
> sysfs we are dealing with.

Um, no, that's not going to happen. /dev/sda will _always_ have the
same major:minor number, as defined by the LSB spec. You can not break
that at all. So while you might not want to show
/sys/devices/block/sda/ in all mounts, the ones that do show it will all
have the LSB defined major:minor number assigned to it.

thanks,

greg k-h

2007-10-05 08:01:12

by Tejun Heo

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Hello, Greg.

I think this definitely needs more discussion, so here we go...

Greg KH wrote:
>> 1. What is a kobject?
[--snip--]
>> The functionality served by kobject can be broken down into the
>> following two.
>>
>> F-a. To serve as an entity both subsystems can share lifespan
>> management. ie. both subsystems reference count on kobject.
>>
>> F-b. To serve as an entity both subsystems can base their internal
>> representation on. (base object in OO term).
>
> Yes, those are two functions, I can agree with.
>
[--snip--]
>> 3. Where does that leave kobject?
>>
>> If you combine #1 and #2, both functionalities become questionable.
>>
>> F-a. sysfs no longer plays role in lifespan management of driver model
>> object. This functionality is exactly what's killed by #2.
>
> Yes, but a kobject is still needed internally for the lifespan
> management.

Yes, exactly - "internally" to the driver model (or drivers which ride
along). To sysfs, it has no function other than being an opaque token.

[--snip--]
>> IMHO, both L-a and L-b contribute only to obfuscation of the driver
>> model and sysfs.
>
> No. I think you are missing a number of things that kobjects provide
> and allow:
> - a structural hierarchy of devices. Combine kobjects with
> ksets and ktypes, and you have a very powerful system to
> categorize objects and their representation to each other.

Yes, which only needs to be used _inside_ the driver model
implementation proper.

> - a consistent and easy interface to userspace through
> uevents/hotplug of the creation and removal of these objects.
> This keeps the different parts of the kernel that need this
> interface from having to create it every time on their own.

Things can be much easier than now. Also, the above paragraph is
inconsistent with the rest of your argument or am I misunderstanding
what you mean by the above paragraph?

> - a way to easily create and export attributes in sysfs
> automatically.

This is and should be the function of the driver model not kobjects.
Removing kobject from the interface doesn't change anything about this.

> - a way to provide working reference counting for a variety of
> different objects.

To me, this just feels wrong and does more harm than it helps. I really
think we shouldn't have multi-role flash light at the core of our driver
model (inside driver model proper, no problem, but not as exported
interface).

> All of those are still needed for the kernel.

For #1 and #3, I agree if you limit the scope to driver model proper but
I am not arguing kobject and all its friends should be abolished. I'm
arguing that there is no reason to export it as API because it doesn't
add any value exported.

For #4, I don't know. This can be a matter of taste but I don't think
#4 alone stands as justification for kobject as external API.

For #2, I might be misunderstanding. Please elaborate if I am. For
uevent issue, I'll talk about more later.

So, I honestly don't think the above four arguments successfully counter
the original arguments. If I'm missing something, feel free to hammer
me into enlightenment. :-)

>> 4. So?
>>
>> From #3, as kobject no longer serves any valid purpose to sysfs, it's
>> natural conclusion to try to remove kobject from sysfs, which of course
>> brings up the question of conversion cost.
>
> I don't mind the removal of kobjects from sysfs in order to make sysfs
> and kobjects work better/simpler. However the majority of the patches
> you created to do this end up with more code overall, and are of no
> benefit to the current users of sysfs and kobjects in the kernel.
>
>> 95+% of sysfs users use it through driver model which wraps sysfs
>> interface and exports it as a part of driver model. For these,
>> conversion only needs to happen inside the driver model, so we
>> definitely can do that.
>
> But how would that benefit the driver model?

There is no code reduction or functionality improvement yet because all
of them are still using the compatibility interface. Properly
converted, sysfs handling code all over the kernel can be _much_
simplified and more robust. I bet there are numerous bugs in sysfs
creation failure handling path all over mid/low level drivers. New
interface makes those bugs much less likely.

>> The rest isn't great in number and, much more importantly, many of those
>> suffer from the current interface which is painful to use independently.
>> For example, kernel/module.c does all the kobject dances including
>> defining a subsystem just to ignore everything else and use it as an
>> opaque token to sysfs (kset_find_obj doesn't count, a generic map or
>> sysfs with sysfs_dirent interface can do that just as well).
>
> I will not deny that the current use of kobjects/ksets/ktypes (subsystem
> is now gone) is difficult and extremely painful. I am currently working
> to fix this issue. But don't think that the fact that this is hard to use
> means that it should be abolished altogether.
>
> Rather, it means that this interface to using kobjects needs to be fixed
> and made easier, not circumvented.

The thing is that functionality-wise, kobject and its friends don't
serve anything anymore outside of driver model implementation proper
(I'll talk about uevent later) and thus there is no reason to use it
outside of driver model implementation anymore in the long term.

If something is needed but bypassed, it's circumvented but that isn't
the case here. kobject and its friends no longer have any essential
functionality in the exported API. It's just a dead weight. (Any entity
in the driver model can and should use what the driver model exports, so
that part is irrelevant here.)

>> 5. Wouldn't that allow manifestation of random hierarchy all over sysfs?
>>
>> I really don't know whether it will or not but I don't agree interface
>> obfuscation is the right way to prevent that. IMHO, if we need better
>> policing under /sys than regular review process can provide, we should
>> force it by clearly defined policies and documentation not by
>> obfuscation, which, BTW, can't really prevent anything.
>
> No, I think that we have been lucky so far that it is so hard to get
> sysfs representation working properly for "raw" kobjects. It has made
> people really think why they want to add things there, and usually just
> give up and go and put things into the proper place in the /sys/devices/
> tree.

I can agree to that. Unfortunately, it also sometimes distorts driver
implementation because representing the proper picture is so difficult.
libata attributes are under constant pressure to escape to SCSI and
block nodes (nothing bad has actually happened yet tho), new features
are being delayed and/or pushed to use different userland interface
(module parameter being the most common). I know libata is a corner
case at the moment but a bit of flexibility would have been very helpful
for both developers and users.

> Also, not everything that people keep wanting to put in /sys should go
> there. The perfect example of that is the recent BDI stuff. It belongs
> in the driver tree, not in a new /sys/bdi/ location. If sysfs were
> "easier" to use, then it would be abused this way.
>
> The end goal for sysfs is to present a hierarchy of devices that are in
> the kernel today. It is not a replacement for everything that people
> feel they need to export to userspace in whatever form they want to.
> There are rules that need to be followed in the exportation of data, as
> userspace programs expect this. The current kobject interface tries
> very hard to enforce those rules, and it needs to stay combined with
> sysfs that way.

Yes, fully agreed. What I'm trying to argue is that obfuscation isn't
the optimal way to achieve that. We can do it in saner and less painful
way.

[--snip--]
>> So, no, I don't agree to keeping kobject based interface to keep sysfs
>> hierarchy tidy.
>
> I strongly object here.
>
> I think that if your current patches are accepted, we will see a lot of
> new users of sysfs in ways that are not "standard" to how it is used
> today. Things that rely on "close" happening to the sysfs file, or
> trees created that do not emit uevents.

Adding policies that prevent such usages to an easy interface is the
right thing to do. Currently, we don't even have defined policies for
sysfs outside of the driver model. The only deterrent is that the
interface is difficult to understand and painful to use.

I just don't really get how it's okay to keep kobject based interface
just to make things more difficult and solely depend on that artificial
difficulty for keeping the tree tidy.

We can enforce stronger rules with easier interface. Just lemme know
the rules. I'll enforce those rules in the sysfs core such that
changing those rules will have to go through driver model review chain.
Wouldn't that be much better?

> A good example for why we need to keep things the same way today is the
> SLUB code. It exports data through sysfs and automatically started
> exporting things through uevents. People realized this and can now
> easily write tools that watch for those events to show things happening
> in the slab allocator.

Yes and sysfs restructuring when it's finished won't change that at all.
Things will be better toward the same goal. Remember that I said the
next step was moving uevent over to sysfs? Uevent belongs to sysfs
because it's by design bound to userland visible representation of
kernel objects. The current placement is awkward - kobject carries
uevent related fields whether it's needed or not, uevent suppression is
in struct device not in kobject and sysfs creation / uevent
synchronization is done in awkward way.

>> 6. Conclusion
>>
>> I think I said enough about why kobject based interface isn't such a
>> good idea anymore. I'll try to cover why it's a good idea to move over
>> to new sysfs_dirent based interface.
>>
>> * It's a clean up and a big one at that. It makes sysfs code and
>> interface much more straight-forward and its users will benefit too when
>> they are converted over to new interface.
>
> I don't object to a clean up. What I object to is the use by other code
> of sysfs by not using kobjects. I feel that if you really want to do
> that, then go write a new filesystem for that kind of thing. We have
> already done this with debugfs and securityfs. I really want to enforce
> the kobject interface to the users of sysfs.
>
> Now if we can keep that enforcement of sysfs, then I have no objections
> to cleaning up the internal interface between sysfs and kobjects, and
> the overall fixing of the kobject/kset/ktype code. That is all good
> things overall.

I think what's missing here is why we need to enforce kobject interface.
It certainly isn't for kobject itself's sake, right? Originally, it
served a valid purpose for interaction with sysfs. Also, by the virtue
of being difficult to use, it limited the usage of sysfs.

My arguments here are...

1. kobject no longer has such valid purpose as far as sysfs is
concerned, which was its biggest out-of-driver-model functionality.
And, in the long term, I don't see any reason why kobject needs to be
visible outside of driver model.

2. I'm all for keeping the tree tidy but I think it's better and more
cleanly done by well defined policies clearly stated in the
documentation and enforced by code such that changing sysfs hierarchy
always goes through driver model review chain.

3. In this series, all that happened was implementing new interface and
features and reimplementing the original interface in terms of them. As such,
there is no code clean up out of sysfs. In fact, sysfs gained
considerable amount of code. Considering wide spread use of sysfs, I'm
pretty sure the net code amount and complexity will drop considerably
with future API user conversions.

Hopefully, I stated things clearer this time. If you disagree, please
try to convince me. I'm listening and I think we really need to
establish consensus on this subject.

Thanks a lot.

--
tejun

2007-10-05 12:14:18

by Eric W. Biederman

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Greg KH <[email protected]> writes:
>
>> Also fun is that the dev file implementation needs to be able to
>> report different major:minor numbers based on which mount of
>> sysfs we are dealing with.
>
> Um, no, that's not going to happen. /dev/sda will _always_ have the
> same major:minor number, as defined by the LSB spec. You can not break
> that at all. So while you might not want to show
> /sys/devices/block/sda/ in all mounts, the ones that do show it will all
> have the LSB defined major:minor number assigned to it.

Hmm. If that is in the LSB it must come from
Documentation/devices.txt. I'm not after changing the user
visible major/minor assignments.

Let me see if a concrete example will help. Suppose I
have a SAN with two disks: disk-1 and disk-2. I have
two machines A and B. On machine A I get the mapping:
sda -> disk-1, sdb ->disk-2. On machine B I wind up with
a different probe order so I get the mapping: sda -> disk-2
sdb ->disk-1.

To be very clear by sda I mean the block device with major 8 and
minor 0, and by sdb I mean the block device with major 8 and minor
16.

So I decide I want an environment on machine B that looks just
like the environment on machine A, so I can transfer over
a running program or whatever. So I run around looking at UUID
labels and what not and I discover that machine B knows disk-1 as
sdb and that machine A knows disk-1 as sda. So I want to say:
/sys/devices/block/sdb shows up in this other device namespace as
/sys/devices/block/sda.

In that instance a running program won't notice the difference.
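
To make the SAN example concrete, here is a minimal userspace sketch of a per-namespace device number table. It is not taken from any posted patch: `struct dev_namespace`, `devns_to_real` and `devns_to_seen` are hypothetical names, and the kernel would use a packed dev_t rather than a two-field struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical major:minor pair (the kernel packs these into one dev_t). */
struct devnum {
	unsigned int maj;
	unsigned int min;
};

/* One remapping rule for a device namespace. */
struct devmap_entry {
	struct devnum seen;	/* number reported inside the namespace */
	struct devnum real;	/* number the host kernel assigned */
};

/* Hypothetical per-namespace table, filled in when the environment is set up. */
struct dev_namespace {
	const struct devmap_entry *map;
	size_t nr;
};

static int devnum_eq(struct devnum a, struct devnum b)
{
	return a.maj == b.maj && a.min == b.min;
}

/* Namespace-visible number -> host number. Returns 0, or -1 if not visible. */
int devns_to_real(const struct dev_namespace *ns, struct devnum seen,
		  struct devnum *real)
{
	size_t i;

	assert(ns != NULL && real != NULL);
	for (i = 0; i < ns->nr; i++) {
		if (devnum_eq(ns->map[i].seen, seen)) {
			*real = ns->map[i].real;
			return 0;
		}
	}
	return -1;
}

/* Host number -> number to report inside the namespace. Returns 0 or -1. */
int devns_to_seen(const struct dev_namespace *ns, struct devnum real,
		  struct devnum *seen)
{
	size_t i;

	assert(ns != NULL && seen != NULL);
	for (i = 0; i < ns->nr; i++) {
		if (devnum_eq(ns->map[i].real, real)) {
			*seen = ns->map[i].seen;
			return 0;
		}
	}
	return -1;
}
```

On machine B the table would hold two entries, 8:0 -> 8:16 and 8:16 -> 8:0, so that disk-1 keeps appearing as 8:0 (sda) inside the migrated environment even though the host probed it as sdb.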

Eric

2007-10-05 12:44:53

by Eric W. Biederman

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Greg KH <[email protected]> writes:

> I would be interested in seeing what your patches look like.

Sure.

> I don't
> think that we should take any more sysfs changes for 2.6.24 as we do
> have a lot of them right now, and I don't think that Tejun and I agree
> on the future direction of the outstanding ones just yet.

Sounds reasonable.

> But I don't think that your multiple-mount patches could make it into
> .24, unless .23 is still weeks away.

Well I have posted them all earlier. At this point I think it makes most
sense to wait until after the big merge happens and everyone rebases on
top of that. Then everyone will have network namespace support and
it is easier to look through all of the patches. Especially since
it looks like the merge window will open any day now.

I will quickly recap the essence of what I am looking at:
On directories of interest I tag all of their directory
entries with which namespace they belong to. On a mount
of sysfs I remember which namespace we were in when we
mounted sysfs. Then I filter readdir and lookup based upon
the namespace I captured at mount time. I do my best
to generalize it so that the logic can work for different
namespaces.
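
The recap above can be modeled in a few lines of userspace C. This is only a sketch of the idea, not the sysfs implementation: `struct tagged_dirent`, `struct tagged_mount` and the helper names are hypothetical, and treating an untagged (NULL) entry as visible everywhere is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A directory entry carrying the namespace tag it belongs to. */
struct tagged_dirent {
	const char *name;
	const void *tag;	/* NULL: untagged, assumed visible everywhere */
};

/* A mount remembers the namespace tag captured at mount time. */
struct tagged_mount {
	const void *tag;
};

int dirent_visible(const struct tagged_mount *mnt,
		   const struct tagged_dirent *de)
{
	return de->tag == NULL || de->tag == mnt->tag;
}

/* lookup: only entries visible through this mount can be found. */
const struct tagged_dirent *tagged_lookup(const struct tagged_mount *mnt,
					  const struct tagged_dirent *dir,
					  size_t nr, const char *name)
{
	size_t i;

	assert(mnt != NULL && dir != NULL && name != NULL);
	for (i = 0; i < nr; i++)
		if (dirent_visible(mnt, &dir[i]) &&
		    strcmp(dir[i].name, name) == 0)
			return &dir[i];
	return NULL;
}

/* readdir: copy up to max visible names, return how many were stored. */
size_t tagged_readdir(const struct tagged_mount *mnt,
		      const struct tagged_dirent *dir, size_t nr,
		      const char **names, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < nr && n < max; i++)
		if (dirent_visible(mnt, &dir[i]))
			names[n++] = dir[i].name;
	return n;
}
```

Two entries may share a name (for example "lo" in two network namespaces) as long as their tags differ; each mount then sees exactly one of them.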

Currently the heart of the patch from the network namespace
is below (I snipped the part that does the capture at mount time).
Basically the interface to users of this functionality is just
providing some way to go from a super block or a kobject to
the tag sysfs is using to filter things.

So I get one sysfs_dirent tree, but each super_block has its
own tree of dcache entries.

Everything else is pretty much details in checking and propagating
the tags into the appropriate places.

Eric

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 5adfdc2..a300f6e 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -435,6 +437,23 @@ static void netdev_release(struct device *d)
 	kfree((char *)dev - dev->padded);
 }

+static const void *net_sb_tag(struct sysfs_tag_info *info)
+{
+	return info->net_ns;
+}
+
+static const void *net_kobject_tag(struct kobject *kobj)
+{
+	struct net_device *dev;
+	dev = container_of(kobj, struct net_device, dev.kobj);
+	return dev->nd_net;
+}
+
+static const struct sysfs_tagged_dir_operations net_tagged_dir_operations = {
+	.sb_tag = net_sb_tag,
+	.kobject_tag = net_kobject_tag,
+};
+
 static struct class net_class = {
 	.name = "net",
 	.dev_release = netdev_release,
@@ -444,6 +463,7 @@ static struct class net_class = {
 #ifdef CONFIG_HOTPLUG
 	.dev_uevent = netdev_uevent,
 #endif
+	.tag_ops = &net_tagged_dir_operations,
 };

 /* Delete sysfs entries but hold kobject reference until after all
--
1.5.3.rc6.17.g1911

2007-10-05 13:00:28

by Kirill Korotaev

Subject: Re: [Devel] Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Eric W. Biederman wrote:
> Greg KH <[email protected]> writes:
>
>>> Also fun is that the dev file implementation needs to be able to
>>> report different major:minor numbers based on which mount of
>>> sysfs we are dealing with.
>>
>>Um, no, that's not going to happen. /dev/sda will _always_ have the
>>same major:minor number, as defined by the LSB spec. You can not break
>>that at all. So while you might not want to show
>>/sys/devices/block/sda/ in all mounts, the ones that do show it will all
>>have the LSB defined major:minor number assigned to it.
>
>
> Hmm. If that is in the LSB it must come from
> Documentation/devices.txt. I'm not after changing the user
> visible major/minor assignments.
>
> Let me see if a concrete example will help. Suppose I
> have a SAN with two disks: disk-1 and disk-2. I have
> two machines A and B. On machine A I get the mapping:
> sda -> disk-1, sdb ->disk-2. On machine B I wind up with
> a different probe order so I get the mapping: sda -> disk-2
> sdb ->disk-1.
>
> To be very clear by sda I mean the block device with major 8 and
> minor 0, and by sdb I mean the block device with major 8 and minor
> 16.
>
> So I decide I want an environment on machine B that looks just
> like the environment on machine A, so I can transfer over
> a running program or whatever. So I run around looking at UUID
> labels and what not and I discover that machine B knows disk-1 as
> sdb and that machine A knows disk-1 as sda. So I want to say:
> /sys/devices/block/sdb shows up in this other device namespace as
> /sys/devices/block/sda.

Imho environments to be migratable should have no direct access to the devices.
You can use any of stacked virtual filesystems to hide real
device from container.
You will have problems much bigger than this one otherwise
(imagine access to video, sound etc.)

Kirill

2007-10-05 13:27:16

by Eric W. Biederman

Subject: Re: [Devel] Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Kirill Korotaev <[email protected]> writes:

> Imho environments to be migratable should have no direct access to the devices.
> You can use any of stacked virtual filesystems to hide real
> device from container.
> You will have problems much bigger than this one otherwise
> (imagine access to video, sound etc.)

What I am primarily concerned about is the case where you can argue that
the hardware we are talking about is present before and after the migration.

When you are directly accessing a device, in the cases where it makes
sense to directly access hardware in a container (think InfiniBand
OS-bypass NICs), we need to tell user space that the device was
unplugged and another one was plugged in. If user space can cope with
that, things should continue to work.

There are some very specific cases that we can support:
- Stateless devices like /dev/zero and /dev/random.
- Virtual devices like ttys, ramdisks, loop devices
- Remote block devices like SCSI disks on a SAN, iSCSI, nbd, ATAoE.
- Local pseudo block devices like the backing devices for virtual
filesystems.

There are very specific limits in which this can work and be useable,
and I don't claim to have looked at all of the details, but for the
block device case in particular we export the block device number
to user space in stat. There are some common applications which do
memorize stat data for files, things like git, incremental backup
software, and intrusion detection software.

Frankly the times when this matters are rare enough that I don't put a
big priority on getting this done quickly. However a strong case
has been made for all of this filtering so we can run things like
udev in a container. Initially I only expect stateless character
devices and ttys to be allowed in a namespace, and I don't have
a clue what device number we will use in st_dev for stat in the
case our block device isn't in the user space interface. I just know
that it sounds like where we want to be eventually and thinking
about it now isn't a problem.

Eric

2007-10-09 09:29:36

by Cornelia Huck

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Fri, 05 Oct 2007 17:00:48 +0900,
Tejun Heo <[email protected]> wrote:

> I think this definitely needs more discussion, so here we go...

I agree, so I'll give my 0.02 € here...
>
> Greg KH wrote:
> >> 1. What is a kobject?
> [--snip--]
> >> The functionality served by kobject can be broken down into the
> >> following two.
> >>
> >> F-a. To serve as an entity both subsystems can share lifespan
> >> management. ie. both subsystems reference count on kobject.
> >>
> >> F-b. To serve as an entity both subsystems can base their internal
> >> representation on. (base object in OO term).
> >
> > Yes, those are two functions, I can agree with.

I think that's the heart of the question: We first need to agree what
use the different components should have.

(a) The driver model

The driver model serves as a unified layer for all devices managed by
the kernel, organized in trees, and the drivers handling them. This
includes busses, matching of devices and drivers, attributes and so on.
Userspace expects to see these devices, drivers, busses and attributes
by looking under /sys/devices/. /sys/class/ and /sys/bus/ provide
additional views on this data.

(b) kobjects, ktypes, ksets

kobjects provide a mechanism to arrange kernel objects into a tree-like
structure. ktypes and ksets are mechanisms to further order these
objects. Changes in the kobject hierarchy are communicated to userspace
via uevents.
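
As a rough userspace model of this description (not the kobject API), a tree of named objects with parent pointers is enough to derive a path and to notify a listener when the hierarchy changes. All names below (`struct node`, `node_attach`, `log_event`) are hypothetical, and the "action@path" string is only loosely modeled on how uevents identify an object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* A named object in a tree; the root has parent == NULL. */
struct node {
	const char *name;
	struct node *parent;
};

/* Listener called when the hierarchy changes. */
typedef void (*event_fn)(const char *action, const char *path, void *ctx);

/* Write "/root/.../n" into buf. Returns its length, or -1 if it won't fit. */
int node_path(const struct node *n, char *buf, size_t size)
{
	int len = 0;
	int ret;

	assert(n != NULL && buf != NULL);
	if (n->parent) {
		len = node_path(n->parent, buf, size);
		if (len < 0)
			return -1;
	}
	ret = snprintf(buf + len, size - (size_t)len, "/%s", n->name);
	if (ret < 0 || (size_t)ret >= size - (size_t)len)
		return -1;
	return len + ret;
}

/* Link child under parent and report the change to the listener. */
int node_attach(struct node *child, struct node *parent,
		event_fn notify, void *ctx)
{
	char path[256];

	child->parent = parent;
	if (node_path(child, path, sizeof(path)) < 0)
		return -1;
	if (notify)
		notify("add", path, ctx);
	return 0;
}

/* Example listener: remembers the last event and counts events. */
struct event_log {
	char last[320];
	int count;
};

void log_event(const char *action, const char *path, void *ctx)
{
	struct event_log *log = ctx;

	snprintf(log->last, sizeof(log->last), "%s@%s", action, path);
	log->count++;
}
```

The point of the sketch is the layering question raised in this thread: the tree and its notifications can exist without any filesystem, and the path is just one possible userspace-visible rendering of the hierarchy.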

The driver core is implemented using this infrastructure.

(c) krefs

krefs provide a generic reference counting mechanism.

The kobject infrastructure uses krefs for its reference counting needs.
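
For readers unfamiliar with the pattern, here is a minimal, non-atomic userspace sketch of kref-style reference counting. It mirrors the shape of the kernel interface (initialize to one, get, put with a release callback) but is not the kernel code; `struct ref`, `struct widget` and the function names are made up for the example, and a real implementation would use atomic operations.

```c
#include <assert.h>
#include <stddef.h>

/* Embedded reference counter (non-atomic; illustration only). */
struct ref {
	unsigned int count;
};

void ref_init(struct ref *r)
{
	r->count = 1;	/* the creator holds the first reference */
}

void ref_get(struct ref *r)
{
	assert(r->count > 0);	/* taking a reference on a dead object is a bug */
	r->count++;
}

/* Drop one reference; call release() on the last one. Returns 1 if released. */
int ref_put(struct ref *r, void (*release)(struct ref *r))
{
	assert(r->count > 0 && release != NULL);
	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

/* Hypothetical object embedding the counter, as a kobject embeds a kref. */
struct widget {
	int id;
	struct ref ref;
	int released;	/* set by the release callback instead of freeing */
};

void widget_release(struct ref *r)
{
	/* Recover the containing object from the embedded member. */
	struct widget *w = (struct widget *)((char *)r - offsetof(struct widget, ref));

	w->released = 1;
}
```

The counter knows nothing about hierarchy, naming or sysfs, which is why it can be used on its own wherever only lifetime management is needed.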

(d) sysfs

sysfs is a virtual filesystem. It exports information on kernel objects
to user space. (IMO, that's the key: sysfs is userspace representation.)

That said, it is logical that kobjects are made visible to userspace
via sysfs. If someone is trying to make things show up in sysfs and has
to jump through hoops to cook up kobjects, they're probably using the
wrong infrastructure.

There are two big problems with the tight coupling of sysfs and kobjects:

- lifetime rules; but this fortunately hugely improved with the
previous patches :)

- relaying implementation details to userspace so that they cannot be
easily changed. To help there, we would need to allow kobjects that do
not show up in sysfs, and make symlinks a sysfs facility that does not
rely on kobjects.

> >
> [--snip--]
> >> 3. Where does that leave kobject?
> >>
> >> If you combine #1 and #2, both functionalities become questionable.
> >>
> >> F-a. sysfs no longer plays role in lifespan management of driver model
> >> object. This functionality is exactly what's killed by #2.
> >
> > Yes, but a kobject is still needed internally for the lifespan
> > management.
>
> Yes, exactly - "internally" to the driver model (or drivers which ride
> along). To sysfs, it has no function other than being an opaque token.

But krefs are used for kobject reference counting, or am I
misunderstanding here?

>
> [--snip--]
> >> IMHO, both L-a and L-b contribute only to obfuscation of the driver
> >> model and sysfs.
> >
> > No. I think you are missing a number of things that kobjects provide
> > and allow:
> > > - a structural hierarchy of devices. Combine kobjects with
> > ksets and ktypes, and you have a very powerful system to
> > categorize objects and their representation to each other.
>
> Yes, which only needs to be used _inside_ the driver model
> implementation proper.

There are use cases outside of the driver model proper where you may
want to use kobject for hierarchy.

>
> > - a consistent and easy interface to userspace through
> > uevents/hotplug of the creation and removal of these objects.
> > This keeps the different parts of the kernel that need this
> > interface from having to create it every time on their own.
>
> Things can be much easier than now. Also, the above paragraph is
> inconsistent with the rest of your argument or am I misunderstanding
> what you mean by the above paragraph?

I see uevents as a notifier for changes in the kobject hierarchy, so
they belong to that layer. However, the layering between kobjects and
sysfs (path names etc.) could probably be made cleaner.

>
> > - a way to easily create and export attributes in sysfs
> > automatically.
>
> This is and should be the function of the driver model not kobjects.

I agree, attributes should belong to the driver model.

> Removing kobject from the interface doesn't change anything about this.

Hm. Currently you have a file<->attribute correlation. This would
change if you allow non-attribute files.

>
> > - a way to provide working reference counting for a variety of
> > different objects.
>
> To me, this just feels wrong and does more harm than it helps. I really
> think we shouldn't have multi-role flash light at the core of our driver
> model (inside driver model proper, no problem, but not as exported
> interface).

And I still think that this is the purpose of krefs :)

>
> > All of those are still needed for the kernel.
>
> For #1 and #3, I agree if you limit the scope to driver model proper but
> I am not arguing kobject and all its friends should be abolished. I'm
> arguing that there is no reason to export it as API because it doesn't
> add any value exported.

I see the value for those code paths that want to provide a hierarchy
of kernel objects outside the driver model proper.

> >> The rest isn't great in number and, much more importantly, many of those
> >> suffer from the current interface which is painful to use independently.
> >> For example, kernel/module.c does all the kobject dances including
> >> defining a subsystem just to ignore everything else and use it as an
> >> opaque token to sysfs (kset_find_obj doesn't count, a generic map or
> >> sysfs with sysfs_dirent interface can do that just as well).
> >
> > I will not deny that the current use of kobjects/ksets/ktypes (subsystem
> > is now gone) is difficult and extremely painful. I am currently working
> > to fix this issue. But don't think that the reason this is hard to use
> > means that it should be abolished altogether.
> >
> > Rather, it means that this interface to using kobjects needs to be fixed
> > and made easier, not circumvented.

Yes, an easier-to-use interface to the kobject stuff would be helpful
for everyone :)

> The thing is that functionality-wise, kobject and its friends don't
> serve anything anymore outside of driver model implementation proper
> (I'll talk about uevent later) and thus there is no reason to use it
> outside of driver model implementation anymore in the long term.

I disagree. A hierarchy of kernel objects has uses beyond the driver
model.

> I can agree to that. Unfortunately, it also sometimes distorts driver
> implementation because representing the proper picture is so difficult.

Yes, stuff visible to userspace is hard :(

> libata attributes are under constant pressure to escape to SCSI and
> block nodes (nothing bad has actually happened yet tho), new features
> are being delayed and/or pushed to use different userland interface
> (module parameter being the most common). I know libata is a corner
> case at the moment but a bit of flexibility would have been very helpful
> for both developers and users.

I'm not familiar with the libata attributes, but shouldn't these
features really be part of the respective block devices or module
parameters?

> >> So, no, I don't agree to keeping kobject based interface to keep sysfs
> >> hierarchy tidy.
> >
> > I strongly object here.
> >
> > I think that if your current patches are accepted, we will see a lot of
> > new users of sysfs in ways that are not "standard" to how it is used
> > today. Things that rely on "close" happening to the sysfs file, or
> > trees created that do not emit uevents.
>
> Adding policies to prevent such usages to easy interface is the right
> thing to do. Currently, we don't even have defined policies for sysfs
> outside of driver model. The only thing is that it's difficult to
> understand and painful to use.

s/outside of driver model/outside of kobjects/

And I don't think something that is not part of the kobject hierarchy
really wants to have a sysfs representation.

> Yes and sysfs restructuring when it's finished won't change that at all.
> Things will be better toward the same goal. Remember that I said the
> next step was moving uevent over to sysfs? Uevent belongs to sysfs
> because it's by design bound to userland visible representation of
> kernel objects. The current placement is awkward - kobject carries
> uevent related fields whether it's needed or not, uevent suppression is
> in struct device not in kobject and sysfs creation / uevent
> synchronization is done in awkward way.

uevent suppression does not belong to devices, yes. Rather to
kobjects :) And I think uevents as a way to notify users of changes in
the kobject hierarchy is useful outside of sysfs.

> > Now if we can keep that enforcement of sysfs, then I have no objections
> > to cleaning up the internal interface between sysfs and kobjects, and
> > the overall fixing of the kobject/kset/ktype code. That is all good
> > things overall.

*nod* Cleaning up the layering and the interfaces is a good thing in
itself.

> I think what's missing here is why we need to enforce kobject interface.
> It certainly isn't for kobject itself's sake, right? Originally, it
> served a valid purpose for interaction with sysfs. Also, by the virtue
> of being difficult to use, it limited the usage of sysfs.
>
> My arguments here are...
>
> 1. kobject no longer has such valid purpose as far as sysfs is
> concerned, which was its biggest out-of-driver-model functionality.
> And, in the long term, I don't see any reason why kobject needs to be
> visible outside of driver model.
>
> 2. I'm all for keeping the tree tidy but I think it's better and more
> cleanly done by well defined policies clearly stated in the
> documentation and enforced by code such that changing sysfs hierarchy
> always goes through driver model review chain.
>
> 3. In this series, all that happened was implementing new interface and
> features and reimplement original interface in terms of them. As such,
> there is no code clean up out of sysfs. In fact, sysfs gained
> considerable amount of code. Considering wide spread use of sysfs, I'm
> pretty sure the net code amount and complexity will drop considerably
> with future API user conversions.
>
> Hopefully, I stated things clearer this time. If you disagree, please
> try to convince me. I'm listening and I think we really need to
> establish consensus on this subject.

I hope I have made my own view clear :) Thanks for reading through this
longish mail.

2007-10-09 22:58:47

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Tue, Oct 09, 2007 at 11:29:01AM +0200, Cornelia Huck wrote:
> On Fri, 05 Oct 2007 17:00:48 +0900,
> Tejun Heo wrote:
>
> > I think this definitely needs more discussion, so here we go...
>
> I agree, so I'll give my 0.02 € here...
> >
> > Greg KH wrote:
> > >> 1. What is a kobject?
> > [--snip--]
> > >> The functionality served by kobject can be broken down into the
> > >> following two.
> > >>
> > >> F-a. To serve as an entity both subsystems can share lifespan
> > >> management. ie. both subsystems reference count on kobject.
> > >>
> > >> F-b. To serve as an entity both subsystems can base their internal
> > >> representation on. (base object in OO term).
> > >
> > > Yes, those are two functions, I can agree with.
>
> I think that's the heart of the question: We first need to agree what
> use the different components should have.
>
> (a) The driver model
>
> The driver model serves as a unified layer for all devices managed by
> the kernel, organized in trees, and the drivers handling them. This
> includes busses, matching of devices and drivers, attributes and so on.
> Userspace expects to see these devices, drivers, busses and attributes
> by looking under /sys/devices/. /sys/class/ and /sys/bus/ provide
> additional views on this data.
>
> (b) kobjects, ktypes, ksets
>
> kobjects provide a mechanism to arrange kernel objects into a tree-like
> structure. ktypes and ksets are mechanisms to further order these
> objects. Changes in the kobject hierarchy are communicated to userspace
> via uevents.
>
> The driver core is implemented using this infrastructure.
>
> (c) krefs
>
> krefs provide a generic reference counting mechanism.
>
> The kobject infrastructure uses krefs for its reference counting needs.
>
> (d) sysfs
>
> sysfs is a virtual filesystem. It exports information on kernel objects
> to user space. (IMO, that's the key: sysfs is userspace representation.)

Ok, I agree with all of the above :)

> That said, it is logical that kobjects are made visible to userspace
> via sysfs. If someone is trying to make things show up in sysfs and has
> to jump through hoops to cook up kobjects, they're probably using the
> wrong infrastructure.

I agree.

> There are two big problems with the tight coupling of sysfs and kobjects:
>
> - lifetime rules; but this fortunately hugely improved with the
> previous patches :)

Yes, I think that's pretty much fixed now.

> - relaying implementation details to userspace so that they cannot be
> easily changed. We would need to allow kobjects not showing up in sysfs
> and making symlinks a sysfs facility not relying on kobjects to help
> there.

Huh? Why would you want a kobject to not show up in sysfs?

And yes, I agree we could use some more "automatic" help in regards to
symlinks in sysfs when we change kobjects around. That would make
things simpler for the kobject core.

> > [--snip--]
> > >> 3. Where does that leave kobject?
> > >>
> > >> If you combine #1 and #2, both functionalities become questionable.
> > >>
> > >> F-a. sysfs no longer plays role in lifespan management of driver model
> > >> object. This functionality is exactly what's killed by #2.
> > >
> > > Yes, but a kobject is still needed internally for the lifespan
> > > management.
> >
> > Yes, exactly - "internally" to the driver model (or drivers which ride
> > along). To sysfs, it has no function other than being an opaque token.
>
> But krefs are used for kobject reference counting, or am I
> misunderstanding here?
>
> >
> > [--snip--]
> > >> IMHO, both L-a and L-b contribute only to obfuscation of the driver
> > >> model and sysfs.
> > >
> > > No. I think you are missing a number of things that kobjects provide
> > > and allow:
> > > > - a structural hierarchy of devices. Combine kobjects with
> > > ksets and ktypes, and you have a very powerful system to
> > > categorize objects and their representation to each other.
> >
> > Yes, which only needs to be used _inside_ the driver model
> > implementation proper.
>
> There are use cases outside of the driver model proper where you may
> want to use kobject for hierarchy.

agreed.

> > > - a consistent and easy interface to userspace through
> > > uevents/hotplug of the creation and removal of these objects.
> > > This keeps the different parts of the kernel that need this
> > > interface from having to create it every time on their own.
> >
> > Things can be much easier than now. Also, the above paragraph is
> > inconsistent with the rest of your argument or am I misunderstanding
> > what you mean by the above paragraph?
>
> I see uevents as a notifier for changes in the kobject hierarchy, so
> they belong to that layer. However, the layering between kobjects and
> sysfs (path names etc.) could probably be made cleaner.

also agreed.

> > > - a way to easily create and export attributes in sysfs
> > > automatically.
> >
> > This is and should be the function of the driver model not kobjects.
>
> I agree, attributes should belong to the driver model.
>
> > Removing kobject from the interface doesn't change anything about this.
>
> Hm. Currently you have a file<->attribute correlation. This would
> change if you allow non-attribute files.

I don't want to have non-attribute files, that's my main point here.

> > > - a way to provide working reference counting for a variety of
> > > different objects.
> >
> > To me, this just feels wrong and does more harm than it helps. I really
> > think we shouldn't have multi-role flash light at the core of our driver
> > model (inside driver model proper, no problem, but not as exported
> > interface).
>
> And I still think that this is the purpose of krefs :)

Ok, yes, at the very base level, it is, you are correct :)

> > > All of those are still needed for the kernel.
> >
> > For #1 and #3, I agree if you limit the scope to driver model proper but
> > I am not arguing kobject and all its friends should be abolished. I'm
> > arguing that there is no reason to export it as API because it doesn't
> > add any value exported.
>
> I see the value for those code paths that want to provide a hierarchy
> of kernel objects outside the driver model proper.

Yes.

> > >> The rest isn't great in number and, much more importantly, many of those
> > >> suffer from the current interface which is painful to use independently.
> > >> For example, kernel/module.c does all the kobject dances including
> > >> defining a subsystem just to ignore everything else and use it as an
> > >> opaque token to sysfs (kset_find_obj doesn't count, a generic map or
> > >> sysfs with sysfs_dirent interface can do that just as well).
> > >
> > > I will not deny that the current use of kobjects/ksets/ktypes (subsystem
> > > is now gone) is difficult and extremely painful. I am currently working
> > > to fix this issue. But don't think that the reason this is hard to use
> > > means that it should be abolished altogether.
> > >
> > > Rather, it means that this interface to using kobjects needs to be fixed
> > > and made easier, not circumvented.
>
> Yes, an easier-to-use interface to the kobject stuff would be helpful
> for everyone :)
>
> > The thing is that functionality-wise, kobject and its friends don't
> > serve anything anymore outside of driver model implementation proper
> > (I'll talk about uevent later) and thus there is no reason to use it
> > outside of driver model implementation anymore in the long term.
>
> I disagree. A hierarchy of kernel objects has uses beyond the driver
> model.

i agree.

thanks,

greg k-h

2007-10-09 22:59:01

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Fri, Oct 05, 2007 at 05:00:48PM +0900, Tejun Heo wrote:
> Hello, Greg.
>
> I think this definitely needs more discussion, so here we go...

heh :)

> Greg KH wrote:
> >> 1. What is a kobject?
> [--snip--]
> >> The functionality served by kobject can be broken down into the
> >> following two.
> >>
> >> F-a. To serve as an entity both subsystems can share lifespan
> >> management. ie. both subsystems reference count on kobject.
> >>
> >> F-b. To serve as an entity both subsystems can base their internal
> >> representation on. (base object in OO term).
> >
> > Yes, those are two functions, I can agree with.
> >
> [--snip--]
> >> 3. Where does that leave kobject?
> >>
> >> If you combine #1 and #2, both functionalities become questionable.
> >>
> >> F-a. sysfs no longer plays role in lifespan management of driver model
> >> object. This functionality is exactly what's killed by #2.
> >
> > Yes, but a kobject is still needed internally for the lifespan
> > management.
>
> Yes, exactly - "internally" to the driver model (or drivers which ride
> along). To sysfs, it has no function other than being an opaque token.

I agree.

> [--snip--]
> >> IMHO, both L-a and L-b contribute only to obfuscation of the driver
> >> model and sysfs.
> >
> > No. I think you are missing a number of things that kobjects provide
> > and allow:
> > - a structural hierarchy of devices. Combine kobjects with
> > ksets and ktypes, and you have a very powerful system to
> > categorize objects and their representation to each other.
>
> Yes, which only needs to be used _inside_ the driver model
> implementation proper.

And for kobjects too. They need a way to show categories and hierarchy.

> > - a consistent and easy interface to userspace through
> > uevents/hotplug of the creation and removal of these objects.
> > This keeps the different parts of the kernel that need this
> > interface from having to create it every time on their own.
>
> Things can be much easier than now. Also, the above paragraph is
> inconsistent with the rest of your argument or am I misunderstanding
> what you mean by the above paragraph?

I think you must be misunderstanding what I mean here.

In the past, to add something like "userspace notification of hotplug"
to different kernel subsystems, we had to add it all over the place.
Now, with a unified way to represent objects in the kernel (kobjects and
then 'struct device') we only have to add the logic in one place for the
core code, and all subsystems have automatic access to it. That's one
of the main benefits of this core code.
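
As a rough userspace sketch of the "add the logic in one place" point above: every object carries only a name and a parent pointer, and a single core function formats the notification for any of them. The names struct obj, obj_path() and obj_notify() are hypothetical, and the "action@path" string is only an illustrative message format, not a claim about what the kernel sends.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for an object in a kobject-like hierarchy. */
struct obj {
    const char *name;
    const struct obj *parent;   /* NULL for the root of the tree */
};

/*
 * Build "/root/child/leaf" by walking the parent pointers.
 * Returns the path length, or -1 if buf is too small.
 */
static int obj_path(const struct obj *o, char *buf, size_t len)
{
    int n = 0;
    int w;

    if (o->parent) {
        n = obj_path(o->parent, buf, len);
        if (n < 0)
            return -1;
    }
    w = snprintf(buf + n, len - (size_t)n, "/%s", o->name);
    if (w < 0 || (size_t)w >= len - (size_t)n)
        return -1;
    return n + w;
}

/*
 * The one place where notifications are formatted. Subsystems only
 * link their objects into the tree and never build messages themselves.
 * Returns 0 on success, -1 if the message does not fit.
 */
static int obj_notify(const struct obj *o, const char *action,
                      char *msg, size_t len)
{
    char path[256];
    int w;

    if (obj_path(o, path, sizeof(path)) < 0)
        return -1;
    w = snprintf(msg, len, "%s@%s", action, path);
    return (w < 0 || (size_t)w >= len) ? -1 : 0;
}
```

With a three-level tree named devices, pci0 and eth0, calling obj_notify() on the leaf with the action "add" fills the buffer with "add@/devices/pci0/eth0".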

> > - a way to easily create and export attributes in sysfs
> > automatically.
>
> This is and should be the function of the driver model not kobjects.
> Removing kobject from the interface doesn't change anything about this.

No, kobjects need to do this, not just the driver model. What about
things that do not currently use the driver model (like block devices).
Are you going to tell them they can't have attributes? :)

> > - a way to provide working reference counting for a variety of
> > different objects.
>
> To me, this just feels wrong and does more harm than it helps. I really
> think we shouldn't have multi-role flash light at the core of our driver
> model (inside driver model proper, no problem, but not as exported
> interface).

Well, that's what struct kref does, as Cornelia pointed out. That's
even lower down than a kobject.

> > All of those are still needed for the kernel.
>
> For #1 and #3, I agree if you limit the scope to driver model proper but
> I am not arguing kobject and all its friends should be abolished. I'm
> arguing that there is no reason to export it as API because it doesn't
> add any value exported.

I think the block layer would disagree with you :)

> For #4, I don't know. This can be a matter of taste but I don't think
> #4 alone stands as justification for kobject as external API.

Sorry, struct kref does that. You want a kobject to show hierarchy and
classification of types.

> For #2, I might be misunderstanding. Please elaborate if I am. For
> uevent issue, I'll talk about more later.
>
> So, I honestly don't think the above four arguments successfully counter
> the original arguments. If I'm missing something, feel free to hammer
> me into enlightenment. :-)

The "original" argument was that I don't want to allow users of sysfs
who are not using a struct kobject. Perhaps we should just focus on
that issue instead of debating the relative merits of kobject and struct
device for right now.

> >> 4. So?
> >>
> >> From #3, as kobject no longer serves any valid purpose to sysfs, it's
> >> natural conclusion to try to remove kobject from sysfs, which of course
> >> brings up the question of conversion cost.
> >
> > I don't mind the removal of kobjects from sysfs in order to make sysfs
> > and kobjects work better/simpler. However the majority of the patches
> > you created to do this end up with more code overall, and are of no
> > benefit to the current users of sysfs and kobjects in the kernel.
> >
> >> 95+% of sysfs users use it through driver model which wraps sysfs
> >> interface and exports it as a part of driver model. For these,
> >> conversion only needs to happen inside the driver model, so we
> >> definitely can do that.
> >
> > But what would that benefit the driver model?
>
> There is no code reduction or functionality improvement yet because all
> of them are still using the compatibility interface. Properly
> converted, sysfs handling code all over the kernel can be _much_
> simplified and more robust. I bet there are numerous bugs in sysfs
> creation failure handling path all over mid/low level drivers. New
> interface makes those bugs much less likely.

But as long as we stick with using kobjects for the sysfs interface, it
should not really matter, right?

> >> The rest isn't great in number and, much more importantly, many of those
> >> suffer from the current interface which is painful to use independently.
> >> For example, kernel/module.c does all the kobject dances including
> >> defining a subsystem just to ignore everything else and use it as an
> >> opaque token to sysfs (kset_find_obj doesn't count, a generic map or
> >> sysfs with sysfs_dirent interface can do that just as well).
> >
> > I will not deny that the current use of kobjects/ksets/ktypes (subsystem
> > is now gone) is difficult and extremely painful. I am currently working
> > to fix this issue. But don't think that the reason this is hard to use
> > means that it should be abolished altogether.
> >
> > Rather, it means that this interface to using kobjects needs to be fixed
> > and made easier, not circumvented.
>
> The thing is that functionality-wise, kobject and its friends don't
> serve anything anymore outside of driver model implementation proper
> (I'll talk about uevent later) and thus there is no reason to use it
> outside of driver model implementation anymore in the long term.

Um, no, I strongly disagree here.

> If something is needed but bypassed, it's circumvented but that isn't
> the case here. kobject and its friends no longer have any essential
> functionality in the exported API. It's just a dead weight. (Any entity
> in the driver model can and should use what the driver model exports, so
> that part is irrelevant here.)

I don't think that things under /sys/fs/ or /sys/module/ or /sys/kernel/
or /sys/firmware/ should be forced to use the driver model. Instead,
they should use kobjects, which make much more sense there to show the
hierarchy involved.

> >> 5. Wouldn't that allow manifestation of random hierarchy all over sysfs?
> >>
> >> I really don't know whether it will or not but I don't agree interface
> >> obfuscation is the right way to prevent that. IMHO, if we need better
> >> policing under /sys than regular review process can provide, we should
> >> force it by clearly defined policies and documentation not by
> >> obfuscation, which, BTW, can't really prevent anything.
> >
> > No, I think that we have been lucky so far that it is so hard to get
> > sysfs representation working properly for "raw" kobjects. It has made
> > people really think why they want to add things there, and usually just
> > give up and go and put things into the proper place in the /sys/devices/
> > tree.
>
> I can agree to that. Unfortunately, it also sometimes distorts driver
> implementation because representing the proper picture is so difficult.
> libata attributes are under constant pressure to escape to SCSI and
> block nodes (nothing bad has actually happened yet tho), new features
> are being delayed and/or pushed to use different userland interface
> (module parameter being the most common). I know libata is a corner
> case at the moment but a bit of flexibility would have been very helpful
> for both developers and users.

Well, what is the problem with the current driver core code (besides the
whole scsi mess which I think is now cleaned up a bunch) that is keeping
libata from doing what it wants to do?

It might be easier to focus on this, than debating the relative merits
of splitting sysfs away from kobjects, as specific examples are much
better to work with.

> > Also, not everything that people keep wanting to put in /sys should go
> > there. The perfect example of that is the recent BDI stuff. It belongs
> > in the driver tree, not in a new /sys/bdi/ location. If sysfs were
> > "easier" to use, then it would be abused this way.
> >
> > The end goal for sysfs is to present a hierarchy of devices that are in
> > the kernel today. It is not a replacement for everything that people
> > feel they need to export to userspace in whatever form they want to.
> > There are rules that need to be followed in the exportation of data, as
> > userspace programs expect this. The current kobject interface tries
> > very hard to enforce those rules, and it needs to stay combined with
> > sysfs that way.
>
> Yes, fully agreed. What I'm trying to argue is that obfuscation isn't
> the optimal way to achieve that. We can do it in saner and less painful
> way.

Heh, ok :)

I really don't want the kobject interface to be such an obfuscated mess,
and am trying to fix that. It's just taking time, as it is such a gross
mess in places. But we have the framework laid out for where we want
to go, so at least we have a known direction.

> [--snip--]
> >> So, no, I don't agree to keeping kobject based interface to keep sysfs
> >> hierarchy tidy.
> >
> > I strongly object here.
> >
> > I think that if your current patches are accepted, we will see a lot of
> > new users of sysfs in ways that are not "standard" to how it is used
> > today. Things that rely on "close" happening to the sysfs file, or
> > trees created that do not emit uevents.
>
> Adding policies to prevent such usages to easy interface is the right
> thing to do. Currently, we don't even have defined policies for sysfs
> outside of driver model. The only thing is that it's difficult to
> understand and painful to use.
>
> I just don't really get how it's okay to keep kobject based interface
> just to make things more difficult and solely depend on that artificial
> difficulty for keeping the tree tidy.
>
> We can enforce stronger rules with easier interface. Just lemme know
> the rules. I'll enforce those rules in the sysfs core such that
> changing those rules will have to go through driver model review chain.
> Wouldn't that be much better?

The rules for sysfs files are the following:
- one value, in text format, per file.
- no action upon open/close
- binary files are only allowed for "pass-through" type files
that the kernel does not touch (like for firmware and pci
config space)
- directories should be associated with a kobject where it makes
sense (no nesting deep subdirectories without a kobject
present)
- when a directory is created/removed, a uevent should happen
declaring what type of device was created/removed.

There are probably some more.
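
To illustrate the first rule only, here is a hypothetical userspace check that a buffer read from an attribute file looks like a single text value. Nothing like this exists in the kernel as far as this thread says. The function name is made up, and reading "one value" as "one line of printable text with an optional trailing newline" is an assumption of the sketch, not a definition.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if buf[0..len) is one non-empty line of printable text,
 * optionally terminated by a single newline, and 0 otherwise.
 * Hypothetical helper: the exact policy is an assumption.
 */
static int is_single_text_value(const char *buf, size_t len)
{
    size_t i;

    if (len > 0 && buf[len - 1] == '\n')
        len--;                  /* allow exactly one trailing newline */
    if (len == 0)
        return 0;               /* an empty file carries no value */

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        /* embedded newlines or binary bytes mean "more than one value" */
        if (c == '\n' || !isprint(c))
            return 0;
    }
    return 1;
}
```

A buffer such as "8:16\n" passes, while "a\nb\n" or raw binary data does not. Pass-through binary files, which the third rule allows, would simply not be run through a check like this.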

The main effect of your splitting sysfs apart from kobjects would be
that you could easily create files in sysfs that are not associated with
a kobject. That is what I want to stay away from, as that is not what
sysfs is for.


For an explanation of how userspace programs should use the sysfs
layout, see the file Documentation/sysfs-rules.txt

> > A good example for why we need to keep things the same way today is the
> > SLUB code. It exports data through sysfs and automatically started
> > exporting things through uevents. People realized this and can now
> > easily write tools that watch for those events to show things happening
> > in the slab allocator.
>
> Yes and sysfs restructuring when it's finished won't change that at all.
> Things will be better toward the same goal. Remember that I said the
> next step was moving uevent over to sysfs? Uevent belongs to sysfs
> because it's by design bound to userland visible representation of
> kernel objects. The current placement is awkward - kobject carries
> uevent related fields whether it's needed or not, uevent suppression is
> in struct device not in kobject and sysfs creation / uevent
> synchronization is done in awkward way.
>
> >> 6. Conclusion
> >>
> >> I think I said enough about why kobject based interface isn't such a
> >> good idea anymore. I'll try to cover why it's a good idea to move over
> >> to new sysfs_dirent based interface.
> >>
> >> * It's a clean up and a big one at that. It makes sysfs code and
> >> interface much more straight-forward and its users will benefit too when
> >> they are converted over to new interface.
> >
> > I don't object to a clean up. What I object to is the use by other code
> > of sysfs by not using kobjects. I feel that if you really want to do
> > that, then go write a new filesystem for that kind of thing. We have
> > already done this with debugfs and securityfs. I really want to enforce
> > the kobject interface to the users of sysfs.
> >
> > Now if we can keep that enforcement of sysfs, then I have no objections
> > to cleaning up the internal interface between sysfs and kobjects, and
> > the overall fixing of the kobject/kset/ktype code. That is all good
> > things overall.
>
> I think what's missing here is why we need to enforce kobject interface.
> It certainly isn't for kobject itself's sake, right? Originally, it
> served a valid purpose for interaction with sysfs. Also, by the virtue
> of being difficult to use, it limited the usage of sysfs.
>
> My arguments here are...
>
> 1. kobject no longer has such valid purpose as far as sysfs is
> concerned, which was its biggest out-of-driver-model functionality.
> And, in the long term, I don't see any reason why kobject needs to be
> visible outside of driver model.

see my above directory examples of why the driver model would not work
for them, but kobjects are still needed.

> 2. I'm all for keeping the tree tidy but I think it's better and more
> cleanly done by well defined policies clearly stated in the
> documentation and enforced by code such that changing sysfs hierarchy
> always goes through driver model review chain.

"well defined policies" are tough to enforce over the years. Trust me,
I've found this out the hard way. People will do things with the
interface that you never notice until it's too late and the
userspace-visible interface has been there for 3 kernel versions and
you can't change it. By removing the kobject requirement, we open
ourselves up to a lot of different usages of sysfs beyond what I think
we really want/need.

> 3. In this series, all that happened was implementing new interface and
> features and reimplement original interface in terms of them. As such,
> there is no code clean up out of sysfs. In fact, sysfs gained
> considerable amount of code. Considering wide spread use of sysfs, I'm
> pretty sure the net code amount and complexity will drop considerably
> with future API user conversions.

I don't see how any of the current in-kernel usages of kobjects/sysfs
could be cleaned up by your added apis. Do you have an idea of a
current user of sysfs that could benefit from these api changes?

> Hopefully, I stated things clearer this time. If you disagree, please
> try to convince me. I'm listening and I think we really need to
> establish consensus on this subject.

I think I understand why you are doing all of this, and I hope I can
convince you that we still need kobjects, and we need to limit sysfs
usage to kobjects only. I don't see the need to remove that
requirement.

If there are things in sysfs that we can not / are not handling properly
with kobjects, perhaps they belong in a different virtual filesystem?
sysfs isn't for everything :)

thanks,

greg k-h

2007-10-09 22:59:31

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Fri, Oct 05, 2007 at 06:12:41AM -0600, Eric W. Biederman wrote:
> Greg KH <[email protected]> writes:
> >
> >> Also fun is that the dev file implementation needs to be able to
> >> report different major:minor numbers based on which mount of
> >> sysfs we are dealing with.
> >
> > Um, no, that's not going to happen. /dev/sda will _always_ have the
> > same major:minor number, as defined by the LSB spec. You can not break
> > that at all. So while you might not want to show all mounts
> > /sys/devices/block/sda/ the ones that you do, will all have the LSB
> > defined major:minor number assigned to it.
>
> Hmm. If that is in the LSB it must come from
> Documentation/devices.txt

Yes, that is the requirement.

> I'm not after changing the user visible major/minor assignments.

Oh, I misunderstood what you wrote above then.

> Let me see if a concrete example will help. Suppose I have
> a SAN with two disks: disk-1 and disk-2. I have
> two machines A and B. On machine A I get the mapping:
> sda -> disk-1, sdb ->disk-2. On machine B I wind up with
> a different probe order so I get the mapping: sda -> disk-2
> sdb ->disk-1.

Ok.

> To be very clear by sda I mean the block device with major 8 and
> minor 0, and by sdb I mean the block device with major 8 and minor
> 16.

Ok.

> So I decide I want an environment on machine B that looks just
> like the environment on machine A, so I can transfer over
> a running program or whatever. So I run around looking at UUID
> labels and what not and I discover that the machine B knows disk-1 as
> sdb and that machine A knows disk-1 as sda. So I want to say:
> /sys/devices/block/sdb show up in this other device namespace as
> /sys/devices/block/sda.

Ah, but if you do that then the "other" device namespace would have
/sys/devices/block/sda/dev be 8:16, right? And that is not valid as sda
for that namespace must always map to the device with a 8:0 major:minor
as per the LSB spec.

So, no, that's not going to be allowed, sorry.

thanks,

greg k-h

2007-10-09 22:59:44

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Fri, Oct 05, 2007 at 06:44:04AM -0600, Eric W. Biederman wrote:
> Greg KH <[email protected]> writes:
>
> > I would be interested in seeing what your patches look like.
>
> Sure.
>
> > I don't
> > think that we should take any more sysfs changes for 2.6.24 as we do
> > have a lot of them right now, and I don't think that Tejun and I agree
> > on the future direction of the outstanding ones just yet.
>
> Sounds reasonable.
>
> > But I don't think that your multiple-mount patches could make it into
> > .24, unless .23 is still weeks away.
>
> Well I have posted them all earlier. At this point it makes most
> sense to wait until after the big merge happens and everyone rebases on
> top of that. Then everyone will have network namespace support and
> it is easier to look through all of the patches. Especially since
> it looks like the merge window will open any day now.
>
> I will quickly recap the essence of what I am looking at:
> On directories of interest I tag all of their directory
> entries with which namespace they belong to. On a mount
> of sysfs I remember which namespace we were in when we
> mounted sysfs. Then I filter readdir and lookup based upon
> the namespace I captured at mount time. I do my best
> to generalize it so that the logic can work for different
> namespaces.

Ok, I have no objection to that. Let's wait for 2.6.24 to settle down
:)

thanks,

greg k-h
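
For readers following along, the scheme Eric recaps in the quoted text above (tag directory entries with a namespace, remember the namespace at mount time, filter readdir and lookup) can be sketched as a toy model. This is only an illustration in Python, not the kernel code and not Eric's patches: Dirent, Directory, SysfsMount and the namespace names "init_net" and "container_net" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dirent:
    """Toy stand-in for a sysfs directory entry (hypothetical type)."""
    name: str
    ns: object = None  # namespace tag; None means visible in every namespace


@dataclass
class Directory:
    """A tagged directory: just a list of entries."""
    entries: list = field(default_factory=list)


class SysfsMount:
    """A mount that remembers which namespace it was created in."""

    def __init__(self, root, ns):
        self.root = root
        self.ns = ns  # captured once, at mount time

    def _visible(self, ent):
        return ent.ns is None or ent.ns == self.ns

    def readdir(self):
        # Only list entries tagged with this mount's namespace.
        return [e.name for e in self.root.entries if self._visible(e)]

    def lookup(self, name):
        # Apply the same filter to lookups, so hidden names cannot be opened.
        for e in self.root.entries:
            if e.name == name and self._visible(e):
                return e
        return None


net_dir = Directory([
    Dirent("lo", ns="init_net"),
    Dirent("eth0", ns="init_net"),
    Dirent("lo", ns="container_net"),
    Dirent("veth0", ns="container_net"),
])
host = SysfsMount(net_dir, ns="init_net")
guest = SysfsMount(net_dir, ns="container_net")
```

In this model the same name ("lo") can exist once per namespace, and each mount sees only its own copy, which is the property the filtering is meant to provide.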

2007-10-09 23:20:53

by Roland Dreier

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

> > - relaying implementation details to userspace so that they cannot be
> > easily changed. We would need to allow kobjects not showing up in sysfs
> > and making symlinks a sysfs facility not relying on kobjects to help
> > there.
>
> Huh? Why would you want a kobject to not show up in sysfs?

I think the text you replied to explained it perfectly: because you
don't want internal implementation details to be exposed to userspace
and become an unchangeable part of the kernel/userspace interface.

2007-10-09 23:29:37

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Tue, Oct 09, 2007 at 04:20:39PM -0700, Roland Dreier wrote:
> > > - relaying implementation details to userspace so that they cannot be
> > > easily changed. We would need to allow kobjects not showing up in sysfs
> > > and making symlinks a sysfs facility not relying on kobjects to help
> > > there.
> >
> > Huh? Why would you want a kobject to not show up in sysfs?
>
> I think the text you replied to explained it perfectly: because you
> don't want internal implementation details to be exposed to userspace
> and become an unchangeable part of the kernel/userspace interface.

But a kobject represents "something" in the kernel. If it's there, then
it shows up in sysfs. But if it isn't, or it changes somehow, then it
no longer is in sysfs, which is fine, and your userspace tools have to
be able to handle that by virtue of the rules of how to use sysfs from
userspace.

thanks,

greg k-h

2007-10-10 09:05:43

by Cornelia Huck

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Tue, 9 Oct 2007 15:26:38 -0700,
Greg KH <[email protected]> wrote:

> > - relaying implementation details to userspace so that they cannot be
> > easily changed. We would need to allow kobjects not showing up in sysfs
> > and making symlinks a sysfs facility not relying on kobjects to help
> > there.
>
> Huh? Why would you want a kobject to not show up in sysfs?

If we consider everything that shows up in sysfs as API, we cannot go
ahead and change it without risking userspace breakage. The hierarchy
of objects may very well be an "implementation detail". Otherwise, we
would need to lay out for every user of the kobject infrastructure
which information userspace can rely on (like it is documented for the
driver model).

> And yes, I agree we could use some more "automatic" help in regards to
> symlinks in sysfs when we change kobjects around. That would make
> things simpler for the kobject core.

Yes, non-automatic symlink handling is a large source of pain and
errors. They should probably be tied to sysfs_dirents instead of
kobjects.

> > > Removing kobject from the interface doesn't change anything about this.
> >
> > Hm. Currently you have a file<->attribute correlation. This would
> > change if you allow non-attribute files.
>
> I don't want to have non-attribute files, that's my main point here.

That's what I tried to say as well :)

2007-10-10 09:11:38

by Cornelia Huck

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Tue, 9 Oct 2007 16:28:04 -0700,
Greg KH <[email protected]> wrote:

> On Tue, Oct 09, 2007 at 04:20:39PM -0700, Roland Dreier wrote:
> > > > - relaying implementation details to userspace so that they cannot be
> > > > easily changed. We would need to allow kobjects not showing up in sysfs
> > > > and making symlinks a sysfs facility not relying on kobjects to help
> > > > there.
> > >
> > > Huh? Why would you want a kobject to not show up in sysfs?
> >
> > I think the text you replied to explained it perfectly: because you
> > don't want internal implementation details to be exposed to userspace
> > and become an unchangeable part of the kernel/userspace interface.
>
> But a kobject represents "something" in the kernel. If it's there, then
> it shows up in sysfs. But if it isn't, or it changes somehow, then it
> no longer is in sysfs, which is fine, and your userspace tools have to
> be able to handle that by virtue of the rules of how to use sysfs from
> userspace.

The rules for using sysfs information are currently only laid out for
the driver model. We would need similar rules for every other user of
kobjects.

This only works if we make the following the general rules:

- You may not rely on any information in sysfs to be present.
- Exceptions to that rule are documented for that special user of sysfs.
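
From the userspace side, the first rule amounts to treating every attribute as optional. A minimal sketch in Python, assuming a hypothetical helper read_attr (not part of any existing library), with a temporary directory standing in for a sysfs directory:

```python
import os
import tempfile


def read_attr(path, default=None):
    """Read one sysfs-style attribute; return default if it is not there."""
    try:
        with open(path, "r") as f:
            return f.read().strip()
    except OSError:
        # Missing, removed in the meantime, or unreadable: not an error here.
        return default


# Demo against a scratch directory instead of a real /sys tree.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "dev"), "w") as f:
        f.write("8:0\n")
    present = read_attr(os.path.join(d, "dev"))
    missing = read_attr(os.path.join(d, "stat"), default="")
```

A tool written this way keeps working when a file disappears; it only needs the documented exceptions to be present.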

2007-10-10 13:45:57

by Eric W. Biederman

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Greg KH <[email protected]> writes:

> On Fri, Oct 05, 2007 at 06:12:41AM -0600, Eric W. Biederman wrote:
>> Greg KH <[email protected]> writes:
>> >
>> >> Also fun is that the dev file implementation needs to be able to
>> >> report different major:minor numbers based on which mount of
>> >> sysfs we are dealing with.
>> >
>> > Um, no, that's not going to happen. /dev/sda will _always_ have the
>> > same major:minor number, as defined by the LSB spec. You can not break
>> > that at all. So while you might not want to show all mounts
>> > /sys/devices/block/sda/ the ones that you do, will all have the LSB
>> > defined major:minor number assigned to it.
>>
>> Hmm. If that is in the LSB it must come from
>> Documentation/devices.txt
>
> Yes, that is the requirement.
>
>> I'm not after changing the user visible major/minor assignments.
>
> Oh, I misunderstood what you wrote above then.

My above sentence is slightly misleading. It should have been:
I am not after changing the device-name to major:minor assignments
as specified in Documentation/devices.txt.

So within a single device namespace everything is normal and as it
always has been. Weirdness only ensues when you look across device
namespaces.

>> Let me see if a concrete example will help. Suppose I have
>> a SAN with two disks: disk-1 and disk-2. I have
>> two machines A and B. On machine A I get the mapping:
>> sda -> disk-1, sdb ->disk-2. On machine B I wind up with
>> a different probe order so I get the mapping: sda -> disk-2
>> sdb ->disk-1.
>
> Ok.
>
>> To be very clear by sda I mean the block device with major 8 and
>> minor 0, and by sdb I mean the block device with major 8 and minor
>> 16.
>
> Ok.
>
>> So I decide I want an environment on machine B that looks just
>> like the environment on machine A, so I can transfer over
>> a running program or whatever. So I run around looking at UUID
>> labels and what not and I discover that the machine B knows disk-1 as
>> sdb and that machine A knows disk-1 as sda. So I want to say:
>> /sys/devices/block/sdb show up in this other device namespace as
>> /sys/devices/block/sda.

>
> Ah, but if you do that then the "other" device namespace would have
> /sys/devices/block/sda/dev be 8:16, right?

No. I would construct the "other" device namespace on machine B to
look just like the device namespace that existed on machine A, so
/sys/devices/block/sda/dev would still be 8:0.

So to be very clear, on machine B when talking about disk-1 I would have:
initial device namespace:
/sys/devices/block/sdb
/sys/devices/block/sdb/dev 8:16

"other" device namespace:
/sys/devices/block/sda
/sys/devices/block/sda/dev 8:0

Similarly, on machine B when talking about disk-2 I would have:
initial device namespace:
/sys/devices/block/sda
/sys/devices/block/sda/dev 8:0

"other" device namespace:
/sys/devices/block/sdb
/sys/devices/block/sdb/dev 8:16

So between the two device namespaces on machine B the two disks
would exchange their user visible identities.

Eric
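
The listing above can be reproduced with a small model. This is a sketch in Python under stated assumptions, not a proposed implementation: the machine_b table and the helpers sd_dev and sysfs_view are hypothetical, and sd_dev only covers whole disks sda through sdp, which use major 8 with 16 minors per disk.

```python
SD_MAJOR = 8          # major number for the first 16 SCSI disks
MINORS_PER_DISK = 16  # sda -> minor 0, sdb -> minor 16, ...


def sd_dev(name):
    """Return (major, minor) for a whole-disk name in the range sda..sdp."""
    if len(name) != 3 or not name.startswith("sd") or not "a" <= name[2] <= "p":
        raise ValueError(f"unsupported disk name: {name!r}")
    return SD_MAJOR, (ord(name[2]) - ord("a")) * MINORS_PER_DISK


# Machine B: which physical disk each name refers to, per device namespace.
machine_b = {
    "initial": {"sda": "disk-2", "sdb": "disk-1"},
    "other":   {"sda": "disk-1", "sdb": "disk-2"},
}


def sysfs_view(namespace):
    """Map each dev file path to (major:minor string, physical disk)."""
    view = {}
    for name, disk in sorted(machine_b[namespace].items()):
        major, minor = sd_dev(name)
        view[f"/sys/devices/block/{name}/dev"] = (f"{major}:{minor}", disk)
    return view
```

The number reported by a dev file depends only on the name, so the name to major:minor assignment is identical in both namespaces; what differs per namespace is which physical disk sits behind the name.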

2007-10-10 15:39:03

by Alan Stern

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Tue, 9 Oct 2007, Greg KH wrote:

> The rules for sysfs files are the following:
> - one value, in text format, per file.
> - no action upon open/close
> - binary files are only allowed for "pass-through" type files
> that the kernel does not touch (like for firmware and pci
> config space)
> - directories should be associated with a kobject where it makes
> sense (no nesting deep subdirectories without a kobject
> present)

You have to stretch this a little for the power/ subdirectory every
device gets. The only kobject it corresponds to is the one in the
device, which already corresponds to the parent directory.

Alan Stern

2007-10-10 16:17:00

by Cornelia Huck

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Wed, 10 Oct 2007 11:38:50 -0400 (EDT),
Alan Stern <[email protected]> wrote:

> On Tue, 9 Oct 2007, Greg KH wrote:
>
> > The rules for sysfs files are the following:
> > - one value, in text format, per file.
> > - no action upon open/close
> > - binary files are only allowed for "pass-through" type files
> > that the kernel does not touch (like for firmware and pci
> > config space)
> > - directories should be associated with a kobject where it makes
> > sense (no nesting deep subdirectories without a kobject
> > present)
>
> You have to stretch this a little for the power/ subdirectory every
> device gets. The only kobject it corresponds to is the one in the
> device, which already corresponds to the parent directory.

Yes, this should be "directories should be associated with a kobject or
generated by an attribute group". This way you won't get deep nesting
either.

2007-10-10 17:24:55

by Martin Bligh

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

> The rules for sysfs files are the following:
> - one value, in text format, per file.
> - no action upon open/close
> - binary files are only allowed for "pass-through" type files
> that the kernel does not touch (like for firmware and pci
> config space)
> - directories should be associated with a kobject where it makes
> sense (no nesting deep subdirectories without a kobject
> present)
> - when a directory is created/removed, a uevent should happen
> declaring what type of device was created/removed.

So you'll be removing:

/sys/devices/system/node/node?/meminfo

then?

along with:

/sys/devices/system/node/node?/distance
/sys/devices/system/node/node?/numastat

and all the other things that violate the rules?

(which I do agree with ... I just don't think sysfs works for
performance stats as we've discussed several times before ;-))

M.

2007-10-10 17:32:24

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Wed, Oct 10, 2007 at 10:24:43AM -0700, Martin Bligh wrote:
>> The rules for sysfs files are the following:
>> - one value, in text format, per file.
>> - no action upon open/close
>> - binary files are only allowed for "pass-through" type files
>> that the kernel does not touch (like for firmware and pci
>> config space)
>> - directories should be associated with a kobject where it makes
>> sense (no nesting deep subdirectories without a kobject
>> present)
>> - when a directory is created/removed, a uevent should happen
>> declaring what type of device was created/removed.
>
> So you'll be removing:
>
> /sys/devices/system/node/node?/meminfo
>
> then?
>
> along with:
>
> /sys/devices/system/node/node?/distance
> /sys/devices/system/node/node?/numastat
>
> and all the other things that violate the rules?

I would love to do that :)

And that goes to show how trying to enforce these kinds of rules is damn
hard. Things slip by that I never notice because they are only for odd
types of hardware :)

> (which I do agree with ... I just don't think sysfs works for
> performance stats as we've discussed several times before ;-))

I agree that this doesn't work either, but if it's really needed,
it can be done; just let us know about it (like
/sys/block/BLOCKDEV/stat).

thanks,

greg k-h

2007-10-10 18:26:42

by Martin Bligh

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Greg KH wrote:
> On Wed, Oct 10, 2007 at 10:24:43AM -0700, Martin Bligh wrote:
>>> The rules for sysfs files are the following:
>>> - one value, in text format, per file.
>>> - no action upon open/close
>>> - binary files are only allowed for "pass-through" type files
>>> that the kernel does not touch (like for firmware and pci
>>> config space)
>>> - directories should be associated with a kobject where it makes
>>> sense (no nesting deep subdirectories without a kobject
>>> present)
>>> - when a directory is created/removed, a uevent should happen
>>> declaring what type of device was created/removed.
>> So you'll be removing:
>>
>> /sys/devices/system/node/node?/meminfo
>>
>> then?
>>
>> along with:
>>
>> /sys/devices/system/node/node?/distance
>> /sys/devices/system/node/node?/numastat
>>
>> and all the other things that violate the rules?
>
> I would love to do that :)
>
> And that goes to show how trying to enforce these kinds of rules is damn
> hard. Things slip by that I never notice because they are only for odd
> types of hardware :)

Is there no way to enforce that in the sysfs interface?
(Haven't looked, I'll admit).

>> (which I do agree with ... I just don't think sysfs works for
>> performance stats as we've discussed several times before ;-))
>
> I agree that this doesn't work too, but also that if it's really needed,
> it can be done, just let us know about it (like
> /sys/block/BLOCKDEV/stat)

OK. Would be nice if we could get rid of /sys/devices/system at
some point, which seems like a fairly crazy path, but still ;-)

M.

2007-10-10 18:49:18

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Wed, Oct 10, 2007 at 11:26:32AM -0700, Martin Bligh wrote:
> Greg KH wrote:
>> On Wed, Oct 10, 2007 at 10:24:43AM -0700, Martin Bligh wrote:
>>>> The rules for sysfs files are the following:
>>>> - one value, in text format, per file.
>>>> - no action upon open/close
>>>> - binary files are only allowed for "pass-through" type files
>>>> that the kernel does not touch (like for firmware and pci
>>>> config space)
>>>> - directories should be associated with a kobject where it makes
>>>> sense (no nesting deep subdirectories without a kobject
>>>> present)
>>>> - when a directory is created/removed, a uevent should happen
>>>> declaring what type of device was created/removed.
>>> So you'll be removing:
>>>
>>> /sys/devices/system/node/node?/meminfo
>>>
>>> then?
>>>
>>> along with:
>>>
>>> /sys/devices/system/node/node?/distance
>>> /sys/devices/system/node/node?/numastat
>>>
>>> and all the other things that violate the rules?
>> I would love to do that :)
>> And that goes to show how trying to enforce these kinds of rules is damn
>> hard. Things slip by that I never notice because they are only for odd
>> types of hardware :)
>
> Is there no way to enforce that in the sysfs interface?
> (Haven't looked, I'll admit).

Hm, we could parse the buffer from the response and complain if we
notice spaces in it :)

>>> (which I do agree with ... I just don't think sysfs works for
>>> performance stats as we've discussed several times before ;-))
>> I agree that this doesn't work too, but also that if it's really needed,
>> it can be done, just let us know about it (like
>> /sys/block/BLOCKDEV/stat)
>
> OK. Would be nice if we could get rid of /sys/devices/system at
> some point, which seems like a fairly crazy path, but still ;-)

I agree, getting rid of the sysdev stuff is on the list of things I want
to see changed, that code is not nice...

thanks,

greg k-h
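
The check floated above (look at the buffer an attribute returns and complain about spaces) could look roughly like the following. This is a Python sketch of the heuristic only, assuming a hypothetical check_one_value helper; no such check is claimed to exist in sysfs, and it would also flag any legitimate single value that happens to contain a space.

```python
def check_one_value(attr_name, buf):
    """Return a list of complaints about the text an attribute produced."""
    complaints = []
    # A single trailing newline is conventional and not counted.
    body = buf[:-1] if buf.endswith("\n") else buf
    if any(ch.isspace() for ch in body):
        complaints.append(
            f"{attr_name}: whitespace inside value, looks like more than one value"
        )
    return complaints
```

With this heuristic, a dev file containing "8:0\n" passes, while multi-field output such as a meminfo-style dump or a row of counters is reported.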

2007-10-10 21:03:32

by Greg KH

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

On Wed, Oct 10, 2007 at 07:16:48AM -0600, Eric W. Biederman wrote:
> Greg KH <[email protected]> writes:
>
> > On Fri, Oct 05, 2007 at 06:12:41AM -0600, Eric W. Biederman wrote:
> >> Greg KH <[email protected]> writes:
> >> >
> >> >> Also fun is that the dev file implementation needs to be able to
> >> >> report different major:minor numbers based on which mount of
> >> >> sysfs we are dealing with.
> >> >
> >> > Um, no, that's not going to happen. /dev/sda will _always_ have the
> >> > same major:minor number, as defined by the LSB spec. You can not break
> >> > that at all. So while you might not want to show all mounts
> >> > /sys/devices/block/sda/ the ones that you do, will all have the LSB
> >> > defined major:minor number assigned to it.
> >>
> >> Hmm. If that is in the LSB it must come from
> >> Documentation/devices.txt
> >
> > Yes, that is the requirement.
> >
> >> I'm not after changing the user visible major/minor assignments.
> >
> > Oh, I misunderstood what you wrote above then.
>
> My above sentence is slightly misleading. That should have been.
> I am not after changing the device name to major:minor assignments
> as specified in Documentation/devices.txt.
>
> So within a single device namespace everything is normal and as it
> always has been. Weirdness only ensues when you look across device
> namespaces.
>
> >> Let me see if a concrete example will help. Suppose I have
> >> a SAN with two disks: disk-1 and disk-2. I have
> >> two machines A and B. On machine A I get the mapping:
> >> sda -> disk-1, sdb ->disk-2. On machine B I wind up with
> >> a different probe order so I get the mapping: sda -> disk-2
> >> sdb ->disk-1.
> >
> > Ok.
> >
> >> To be very clear by sda I mean the block device with major 8 and
> >> minor 0, and by sdb I mean the block device with major 8 and minor
> >> 16.
> >
> > Ok.
> >
> >> So I decide I want an environment on machine B that looks just
> >> like the environment on machine A, so I can transfer over
> >> a running program or whatever. So I run around looking at UUID
> >> labels and what not and I discover that the machine B knows disk-1 as
> >> sdb and that machine A knows disk-1 as sda. So I want to say:
> >> /sys/devices/block/sdb show up in this other device namespace as
> >> /sys/devices/block/sda.
>
> >
> > Ah, but if you do that then the "other" device namespace would have
> > /sys/devices/block/sda/dev be 8:16, right?
>
> No. The "other" device namespace I would construct on machine B to
> look just like the device namespace that existed on machine A.
> Making /sys/devices/block/sda would still be 8:0.
>
> So to be very clear on machine B when talking about disk-1 I would have.
> initial device namespace:
> /sys/devices/block/sdb
> /sys/devices/block/sdb/dev 8:16
>
> "other" device namespace:
> /sys/devices/block/sda
> /sys/devices/block/sda/dev 8:0
>
> Similarly on machine B when talking about disk-2 I would have.
> initial device namespace:
> /sys/devices/block/sda
> /sys/devices/block/sda/dev 8:0
>
> "other" device namespace:
> /sys/devices/block/sdb
> /sys/devices/block/sdb/dev 8:16
>
> So between the two devices namespaces on machine B the two disks
> would exchange their user visible identities.

Ah, ok, that makes more sense.

And seems quite difficult to do, good luck with that :)

greg k-h

2007-10-10 21:18:46

by Eric W. Biederman

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Greg KH <[email protected]> writes:

> Ah, ok, that makes more sense.
>
> And seems quite difficult to do, good luck with that :)

Thanks. At least now all I have to do is worry about the
details when we get that far instead of selling the big picture...

My gut feel is that sysfs is probably the hardest part to deal with,
and maybe half of the problem. Just intercepting the lookup by
device number is fairly simple, I think there is one spot
for block devices and another for character devices.

Anyway once the network namespace support is in with any luck that
will have solved half the sysfs problem.

Eric

2007-10-16 22:19:28

by Sukadev Bhattiprolu

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

Eric W. Biederman [[email protected]] wrote:
| Greg KH <[email protected]> writes:
|
| > On Fri, Oct 05, 2007 at 06:12:41AM -0600, Eric W. Biederman wrote:
| >> Greg KH <[email protected]> writes:
| >> >
| >> >> Also fun is that the dev file implementation needs to be able to
| >> >> report different major:minor numbers based on which mount of
| >> >> sysfs we are dealing with.
| >> >
| >> > Um, no, that's not going to happen. /dev/sda will _always_ have the
| >> > same major:minor number, as defined by the LSB spec. You can not break
| >> > that at all. So while you might not want to show all mounts
| >> > /sys/devices/block/sda/ the ones that you do, will all have the LSB
| >> > defined major:minor number assigned to it.
| >>
| >> Hmm. If that is in the LSB it must come from
| >> Documentation/devices.txt
| >
| > Yes, that is the requirement.
| >
| >> I'm not after changing the user visible major/minor assignments.
| >
| > Oh, I misunderstood what you wrote above then.
|
| My above sentence is slightly misleading. That should have been.
| I am not after changing the device name to major:minor assignments
| as specified in Documentation/devices.txt.
|
| So within a single device namespace everything is normal and as it
| always has been. Weirdness only ensues when you look across device
| namespaces.
|
| >> Let me see if a concrete example will help. Suppose I have
| >> a SAN with two disks: disk-1 and disk-2. I have
| >> two machines A and B. On machine A I get the mapping:
| >> sda -> disk-1, sdb ->disk-2. On machine B I wind up with
| >> a different probe order so I get the mapping: sda -> disk-2
| >> sdb ->disk-1.
| >
| > Ok.
| >
| >> To be very clear by sda I mean the block device with major 8 and
| >> minor 0, and by sdb I mean the block device with major 8 and minor
| >> 16.
| >
| > Ok.
| >
| >> So I decide I want an environment on machine B that looks just
| >> like the environment on machine A, so I can transfer over
| >> a running program or whatever. So I run around looking at UUID
| >> labels and what not and I discover that the machine B knows disk-1 as
| >> sdb and that machine A knows disk-1 as sda. So I want to say:
| >> /sys/devices/block/sdb show up in this other device namespace as
| >> /sys/devices/block/sda.
|
| >
| > Ah, but if you do that then the "other" device namespace would have
| > /sys/devices/block/sda/dev be 8:16, right?
|
| No. The "other" device namespace I would construct on machine B to
| look just like the device namespace that existed on machine A.
| Making /sys/devices/block/sda would still be 8:0.
|
| So to be very clear on machine B when talking about disk-1 I would have.
| initial device namespace:
| /sys/devices/block/sdb
| /sys/devices/block/sdb/dev 8:16
|
| "other" device namespace:
| /sys/devices/block/sda
| /sys/devices/block/sda/dev 8:0
|
| Similarly on machine B when talking about disk-2 I would have.
| initial device namespace:
| /sys/devices/block/sda
| /sys/devices/block/sda/dev 8:0
|
| "other" device namespace:
| /sys/devices/block/sdb
| /sys/devices/block/sdb/dev 8:16
|
| So between the two devices namespaces on machine B the two disks
| would exchange their user visible identities.

So an application that would migrate from machine A to B has to
use virtual names (like "disk-1" and "disk-2") to access the disk,
right?

|
| Eric

2007-10-16 23:55:55

by Eric W. Biederman

Subject: Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

[email protected] writes:

> | No. The "other" device namespace I would construct on machine B to
> | look just like the device namespace that existed on machine A.
> | Making /sys/devices/block/sda would still be 8:0.
> |
> | So to be very clear on machine B when talking about disk-1 I would have.
> | initial device namespace:
> | /sys/devices/block/sdb
> | /sys/devices/block/sdb/dev 8:16
> |
> | "other" device namespace:
> | /sys/devices/block/sda
> | /sys/devices/block/sda/dev 8:0
> |
> | Similarly on machine B when talking about disk-2 I would have.
> | initial device namespace:
> | /sys/devices/block/sda
> | /sys/devices/block/sda/dev 8:0
> |
> | "other" device namespace:
> | /sys/devices/block/sdb
> | /sys/devices/block/sdb/dev 8:16
> |
> | So between the two devices namespaces on machine B the two disks
> | would exchange their user visible identities.
>
> So an application that would migrate from machine A to B has to
> use virtual names (like "disk-1" and "disk-2") to access the disk,
> right?

No. It is worse: you need to access a filesystem and probably
a block device that is available on both machine A and machine B.
With care we can introduce appropriate namespaces and namespace semantics
so we can make the names be what we need.

For a classic tricky case think what it would require to migrate
a git archive with checked out files and not need to say
"git-update-index --refresh" before you work with the files.

I used names like disk-1 and disk-2 instead of UUIDs because it
was easier for me to type and think about. You do need some
kind of absolute disk or filesystem identity you can refer back to.

Eric