I assume that the (empty, currently) entries in /sys/cdev are there for
some purpose... for purposes of my own, I've been thinking that it would
be nice to be able to find which device numbers got assigned when using the
new char dev registration functions. Thus, the following patch, which adds
default "dev" and "count" attributes to /sys/cdev entries.
I have no idea whether this follows the original plan for /sys/cdev.
There *is* a plan for it, no...?
jon
Jonathan Corbet
Executive editor, LWN.net
[email protected]
diff -urN -X dontdiff test5-vanilla/fs/char_dev.c test5/fs/char_dev.c
--- test5-vanilla/fs/char_dev.c Mon Sep 8 13:50:01 2003
+++ test5/fs/char_dev.c Sat Sep 20 06:46:24 2003
@@ -342,7 +342,13 @@
int cdev_add(struct cdev *p, dev_t dev, unsigned count)
{
- int err = kobject_add(&p->kobj);
+ int err;
+
+ /* Remember for sysfs; maybe caller should set these? */
+ p->firstdev = dev;
+ p->count = count;
+
+ err = kobject_add(&p->kobj);
if (err)
return err;
err = kobj_map(cdev_map, dev, count, NULL, exact_match, exact_lock, p);
@@ -383,6 +389,61 @@
}
}
+/*
+ * Simple /sys/cdev attribute stuff, added by [email protected]. This is
+ * heavily patterned after the /sys/block code.
+ */
+struct cdev_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct cdev *, char *);
+};
+
+static ssize_t cdev_dev_read(struct cdev *cd, char *page)
+{
+ return sprintf(page, "%d:%d\n", MAJOR(cd->firstdev),
+ MINOR(cd->firstdev));
+}
+
+static ssize_t cdev_count_read(struct cdev *cd, char *page)
+{
+ return sprintf(page, "%d\n", cd->count);
+}
+
+static struct cdev_attribute cdev_attr_dev = {
+ .attr = { .name = "dev", .mode = S_IRUGO },
+ .show = cdev_dev_read
+};
+
+static struct cdev_attribute cdev_attr_count = {
+ .attr = { .name = "count", .mode = S_IRUGO },
+ .show = cdev_count_read
+};
+
+
+static struct attribute *default_cdev_attrs[] = {
+ &cdev_attr_dev.attr,
+ &cdev_attr_count.attr,
+ NULL
+};
+
+/*
+ * General show function.
+ */
+static ssize_t cdev_attr_show(struct kobject *kobj, struct attribute *attr,
+ char *page)
+{
+ struct cdev *cd = container_of(kobj, struct cdev, kobj);
+ struct cdev_attribute *cattr =
+ container_of(attr, struct cdev_attribute, attr);
+ if (cattr->show)
+ return cattr->show(cd, page);
+ return 0;
+}
+
+static struct sysfs_ops cdev_sysfs_ops = {
+ .show = cdev_attr_show
+};
+
static decl_subsys(cdev, NULL, NULL);
static void cdev_default_release(struct kobject *kobj)
@@ -400,10 +461,14 @@
static struct kobj_type ktype_cdev_default = {
.release = cdev_default_release,
+ .sysfs_ops = &cdev_sysfs_ops,
+ .default_attrs = default_cdev_attrs,
};
static struct kobj_type ktype_cdev_dynamic = {
.release = cdev_dynamic_release,
+ .sysfs_ops = &cdev_sysfs_ops,
+ .default_attrs = default_cdev_attrs,
};
static struct kset kset_dynamic = {
diff -urN -X dontdiff test5-vanilla/include/linux/cdev.h test5/include/linux/cdev.h
--- test5-vanilla/include/linux/cdev.h Mon Sep 8 13:50:09 2003
+++ test5/include/linux/cdev.h Sat Sep 20 06:12:52 2003
@@ -3,6 +3,8 @@
#ifdef __KERNEL__
struct cdev {
+ dev_t firstdev;
+ int count;
struct kobject kobj;
struct module *owner;
struct file_operations *ops;
On Fri, Sep 19, 2003 at 05:10:46PM -0600, Jonathan Corbet wrote:
> +static ssize_t cdev_dev_read(struct cdev *cd, char *page)
> +{
> + return sprintf(page, "%d:%d\n", MAJOR(cd->firstdev),
> + MINOR(cd->firstdev));
> +}
You should use the print_dev_t() call here instead of doing it by hand.
thanks,
greg k-h
On Fri, Sep 19, 2003 at 05:10:46PM -0600, Jonathan Corbet wrote:
> I assume that the (empty, currently) entries in /sys/cdev are there for
> some purpose... for purposes of my own, I've been thinking that it would
> be nice to be able to find which device numbers got assigned when using the
> new char dev registration functions. Thus, the following patch, which adds
> default "dev" and "count" attributes to /sys/cdev entries.
>
> I have no idea whether this follows the original plan for /sys/cdev.
> There *is* a plan for it, no...?
They are more of a side effect. The thing is, expected setup for character
devices is the following:
* driver has embedded struct cdev in its data structures
* it has the thing registered in global search structure - as tty
does, for example.
* open() on a character device finds the kobject of that cdev (same
as block device open() finds kobject of gendisk) and uses it for the rest of
work. Namely, file_operations() are picked from it and inode->i_cdev /
inode->i_cindex are set. Then we call ->open().
* ->open() can use ->i_cdev to get whatever data structure driver
had intended and avoid any lookups of its own (of course it's still perfectly
free to look at the minor, mess with its own arrya/lists/whatnot - it's not
mandatory anymore, but it certainly won't break).
To avoid breaking the old drivers we do the following: register_chrdev()
allocates a cdev itself, inserts it in search tree, etc., so open() on
character device registered that way will work with the new scheme. We'll
find the cdev allocated by register_chrdev(), take the file_operations from
it (register_chrdev() had stored the pointer when it had created cdev), set
->i_cindex (potentially useful for ->open()) and ->i_cdev (useless for ->open(),
since it has no relationship to any driver objects). And call ->open().
FWIW, I would be just fine with not seeing these guys in sysfs - they don't
have any intrinsic sense from the driver POV and they exist only for those
who use old scheme, so putting them to any universal use is hopeless.
Real driver-registered cdevs are different story - they should bear whatever
attributes driver wants them to bear. Note that even for tty conversion is
not complete - we still have sucky refcounting there and it has to be fixed
first. Once it's done, get_tty_driver() goes to hell and tty_driver embedded
cdev will get sane attributes.
BTW, I suspect that we need a way to say "this kobject and its children
will *never* have any attributes and will never be seen in sysfs". There
are quite a few uses when we keep kobject as a refcounting vehicle, etc.,
but have nothing meaningful to show in sysfs tree. Keep in mind that
sysfs nodes (including attributes) are not free - it's struct inode +
struct dentry at the very least. Both pinned down and permanently mapped...
On Sat, Sep 20, 2003 at 02:28:44AM +0100, [email protected] wrote:
>
> BTW, I suspect that we need a way to say "this kobject and its children
> will *never* have any attributes and will never be seen in sysfs". There
> are quite a few uses when we keep kobject as a refcounting vehicle, etc.,
> but have nothing meaningful to show in sysfs tree. Keep in mind that
> sysfs nodes (including attributes) are not free - it's struct inode +
> struct dentry at the very least. Both pinned down and permanently mapped...
Hm, I've used them this way by just calling kobject_init, kobject_get
and kobject_put. As long as you never call kobject_register or
kobject_add, the sysfs hookups never happen.
Is that what you are looking for?
thanks,
greg k-h
> BTW, I suspect that we need a way to say "this kobject and its children
> will *never* have any attributes and will never be seen in sysfs". There
> are quite a few uses when we keep kobject as a refcounting vehicle, etc.,
> but have nothing meaningful to show in sysfs tree. Keep in mind that
> sysfs nodes (including attributes) are not free - it's struct inode +
> struct dentry at the very least. Both pinned down and permanently mapped...
There is already - use kobject_init() to initialize the refcount to 1, and
kobject_put() to clean it up. (kobject_add() and kobject_del() can be used
to later add/remove it from the sysfs hierarchy).
Or, since those objects will probably never get added to sysfs, and since
few objects are ever promoted, we may just want to create a new, smaller
object purely for the refcounting vehicle..
Pat
> > I have no idea whether this follows the original plan for /sys/cdev.
>
> They are more of a side effect.
So...to be sure I have things straight.../sys/cdev isn't really meant to be
there, and char devices wanting to do things in sysfs should be working
under /sys/devices or /sys/class or /sys/somethingelse?
> * driver has embedded struct cdev in its data structures
> * ->open() can use ->i_cdev to get whatever data structure driver
> had intended and avoid any lookups of its own
I noticed there's no "private" member in the cdev structure. So drivers
should embed the structure and use container_of to get their real structure
of interest?
[Forgive my ignorance here...] If I embed a struct cdev within my own
device structure, how do I know when I can safely free said device
structure? Will there be a release method that gets exposed at the driver
level, or am I missing something obvious again?
Thanks,
jon
Jonathan Corbet
Executive editor, LWN.net
[email protected]
On Mon, Sep 22, 2003 at 10:46:09AM -0600, Jonathan Corbet wrote:
> > > I have no idea whether this follows the original plan for /sys/cdev.
> >
> > They are more of a side effect.
>
> So...to be sure I have things straight.../sys/cdev isn't really meant to be
> there, and char devices wanting to do things in sysfs should be working
> under /sys/devices or /sys/class or /sys/somethingelse?
<nod>
For block devices we really have non-trivial properties associated with them -
struct gendisk and request queue, for one thing. And these are uniform across
the entire set.
Character devices per se have *no* meaningful properties common to the entire
class. To put it another way, while block devices do form a natural class,
character ones do not.
> > * driver has embedded struct cdev in its data structures
>
> > * ->open() can use ->i_cdev to get whatever data structure driver
> > had intended and avoid any lookups of its own
>
> I noticed there's no "private" member in the cdev structure. So drivers
> should embed the structure and use container_of to get their real structure
> of interest?
Yes, if they bother with cdev of their own in the first place.
> [Forgive my ignorance here...] If I embed a struct cdev within my own
> device structure, how do I know when I can safely free said device
> structure? Will there be a release method that gets exposed at the driver
> level, or am I missing something obvious again?
Umm... Any kobject has ->release() method, obviously. NOTE: as always,
embedding kobject into one of your objects requires serious thinking about
lifetime rules for your objects. Until they are cleaned up you'd better
*not* mess with that. OTOH, in a lot of cases cleanup is needed anyway -
e.g. if you declare per-object sysctls/procfs entries/whatnot, you have
the same sort of problems already (plus you need to deal with the same
sort of races in your code that finds object by major:minor; many drivers
are FUBAR in that area).
Forgive my insistence here...I'm trying to figure out what LDD3 is going to
say on this subject. If we get it right, hopefully that will head off a
lot of questions in the future...
> > If I embed a struct cdev within my own
> > device structure, how do I know when I can safely free said device
> > structure? Will there be a release method that gets exposed at the driver
> > level, or am I missing something obvious again?
>
> Umm... Any kobject has ->release() method, obviously.
Actually, as I read it, each kobject has a kobj_type pointer, and in *that*
structure is a release() method. I had found it...:)
The struct cdev which I, as a driver author, can embed within my own
structure has a kobject in it. If it's an embedded cdev, its ktype pointer
will be aimed at ktype_cdev_default, which sets up cdev_default_release()
as its release function.
If I understand things correctly, as long as references to the embedded
struct cdev remain, I cannot free the driver-specific structure in which
the struct cdev is embedded. So, somehow, I need to know when that struct
cdev's release() method is called. I could do that by changing its ktype
field to my own special type, and remembering to call cdev_purge() in my
own release function. Somehow, however, that doesn't feel like the right
approach. It seems to me like we need a release() method in struct cdev
that is called from the struct cdev's own release() method - at least, in
the non-dynamic case. No? What am I missing here?
Thanks,
jon
Jonathan Corbet
Executive editor, LWN.net
[email protected]
On Mon, Sep 22, 2003 at 03:15:11PM -0600, Jonathan Corbet wrote:
> The struct cdev which I, as a driver author, can embed within my own
> structure has a kobject in it. If it's an embedded cdev, its ktype pointer
> will be aimed at ktype_cdev_default, which sets up cdev_default_release()
> as its release function.
>
> If I understand things correctly, as long as references to the embedded
> struct cdev remain, I cannot free the driver-specific structure in which
> the struct cdev is embedded. So, somehow, I need to know when that struct
> cdev's release() method is called. I could do that by changing its ktype
> field to my own special type, and remembering to call cdev_purge() in my
> own release function. Somehow, however, that doesn't feel like the right
> approach. It seems to me like we need a release() method in struct cdev
> that is called from the struct cdev's own release() method - at least, in
> the non-dynamic case. No? What am I missing here?
Mostly the fact that current cdev_init() is a temporary measure. Note that
the only embedded cdev use is in tty_driver and it has tons of problems in
refcounting/lifetime rules (had them since way back, actually.
FWIW, I hope to get the following (and pretty soon, since dev_t mess appears
to be mostly over):
* real classes (e.g. tty drivers) get their own subtrees in sysfs,
complete with real attributes, etc. They also get ktypes of their own, with
->release() doing all their freeing, yadda, yadda. One of the things that
should be done is cdev_purge().
* that transition is done on subsystem-by-subsystem basis, with
usual work on lifetime rules. Again, it's a non-trivial work and subsystems
that are not up to that are free to keep doing exactly what they did in 2.4.
* Freeing stuff in the converted subsystems *must* be done from
->release(). We can empty the object (i.e. mark it dead and junk most of
its fields) when we unregister the sucker, but we can't free it until it's
released.
One more thing: our support of device removal is atrocious. That's probably
the most important lesson I've learnt from Linux - there are objects that
should stay alive as long as they are wanted, there are objects that should
stay alive until they want to die and there should be a clear boundary between
them - preferably designed to be there from the very beginning, since moving
it around afterwards is a royal PITA. Filesystem objects tend to belong to
the first category, network ones to the second. Hardware, sad as it is,
obviously belongs to the second category - after all, if it decides to let
the magic smoke out, no amount of "but I still need it" will help you.
Somewhere on the path from opened file to the piece of hardware there should
be a boundary.
A *lot* of grotty mess had come out of the fact that we never had a clear
boundary there. Note that so-called "module removal" issues are of the same
origin - modules themselves belong to the first class (you can keep them
loaded as long as they are used; no problems with that) and that masks the
real problem. Namely, device removal appears to be tied to the removal of
1st category object (== rmmod) and that gives an illusion that there's a
single place where we could deal with all these issues. That's what Rusty
is trying to do, BTW - put a lot of smarts into rmmod in hope to deal with
that mess.
Unfortunately, it doesn't work. Not only because there is hotplug, USB,
SCSI, etc. to deal with, but because there are object classes that contain
both hardware (which we could pretend to be non-removable) and non-hardware.
Trivial example: network interfaces. Whatever you do, you *must* deal with
ppp link going up and down. It's equivalent to inserting/removing NIC and
there is no hope in hell to deal with that anywhere near rmmod.
IOW, at some point we will have to admit that "device disappears" is a normal
event that should be handled on generic basis - not by ad-hackery in individual
driver. If anything, rmmod should be "atomically check that no exported
objects are in use and initiate removal on all of them" sort of event.
Now, for block devices there is some breathing space - we have block_device,
gendisk and queue to play with; somewhere among those a boundary can be
set and it's not too hard to do. It's messy (and in part it's my fault -
basically, back in 2.3 I'd considered block_device to be permanently
associated with device number and not with the driver objects; that turned
out to be a bad mistake), but at least here I have a hope to get the things
right before 2.6.0 with minimal PITA.
For character devices we have *nothing*. It goes from file to device number
and there drivers do whatever they do. Removal issues are there - in spades.
TTY hangup, USB device removal, yadda, yadda. Some subsystems try to handle
that with ad-hackery, faith and a bit of duct-tape. Some do not even pretend
to (look at drivers/input and try to figure out what protects the lists of
objects; the answer's "nothing").
A very big part of motivation for embedded cdevs was to get the real driver
objects into the game in hope to localize the mess. It *definitely* won't
be finished for all subsystems by the time of 2.6.0, so we have no hope in
hell to get it rigth until 2.7. There will be need of follow-up work on
device removal issues in that area. And it won't really happen until the
next branch. What we can do is to get the things into more or less sane
shape without making them worse. Full solution will have to wait. And
I suspect that we'll have to go through interface changes before it's over.
The bottom line:
* embedded cdevs are mostly subsystem business right now and this
work just begins
* eventually it will filter down to drivers
* for now any driver can regsiter regions as it used to - the way
it works involves cdevs, but that's 100% transparent from the driver POV.
* we'll certainly need to change cdev-related interfaces several
times before they settle down; hopefully we'll have a clear picture by the
2.6.0 time.
* in 2.7 there will be a lot of fun work related to hotplug/device
removal/insertion/invalidation; hopefully by that time we'll have to deal
only with generic subsystem code and not with individual drivers.
On Mon, Sep 22, 2003 at 11:37:39PM +0100, [email protected] wrote:
> On Mon, Sep 22, 2003 at 03:15:11PM -0600, Jonathan Corbet wrote:
>
> > The struct cdev which I, as a driver author, can embed within my own
> > structure has a kobject in it. If it's an embedded cdev, its ktype pointer
> > will be aimed at ktype_cdev_default, which sets up cdev_default_release()
> > as its release function.
> >
> > If I understand things correctly, as long as references to the embedded
> > struct cdev remain, I cannot free the driver-specific structure in which
> > the struct cdev is embedded. So, somehow, I need to know when that struct
> > cdev's release() method is called. I could do that by changing its ktype
> > field to my own special type, and remembering to call cdev_purge() in my
> > own release function. Somehow, however, that doesn't feel like the right
> > approach. It seems to me like we need a release() method in struct cdev
> > that is called from the struct cdev's own release() method - at least, in
> > the non-dynamic case. No? What am I missing here?
>
> Mostly the fact that current cdev_init() is a temporary measure. Note that
> the only embedded cdev use is in tty_driver and it has tons of problems in
> refcounting/lifetime rules (had them since way back, actually.
>
> FWIW, I hope to get the following (and pretty soon, since dev_t mess appears
> to be mostly over):
> * real classes (e.g. tty drivers) get their own subtrees in sysfs,
> complete with real attributes, etc. They also get ktypes of their own, with
> ->release() doing all their freeing, yadda, yadda. One of the things that
> should be done is cdev_purge().
Most classes do have their own subtree in sysfs. That's what struct
class is for. Now I do have a bunch of patches sitting in my tree to
fix up those classes that do not currently have this, but I've been
waiting for things to die down a bit before submitting them. I can dig
them out for anyone to play with if you like.
The big issue is that char devices do not all work alike. Infact there
are almost no "raw" char devices that do not fit into some kind of class
today. So I think that if we use cdev in the core char layer, that's
great, and needed. But to move that functionality up the levels, into
the different classes probably isn't needed, and would just get messy
quickly (sound devices don't have much in common with tty devices.)
Yes, they both need a reference counting object, hence the struct class
usage. Now if you think we can merge struct class with cdev, that might
be interesting...
> One more thing: our support of device removal is atrocious.
We have gotten much better at this in 2.6. USB devices should be sane
now. PCI devices, well, it's gotten better... Block devices are a lot
better thanks to the work you've done already (I think the scsi people
have some minor issues left here...)
> IOW, at some point we will have to admit that "device disappears" is a normal
> event that should be handled on generic basis - not by ad-hackery in individual
> driver.
Agreed.
> * we'll certainly need to change cdev-related interfaces several
> times before they settle down; hopefully we'll have a clear picture by the
> 2.6.0 time.
Wait, isn't it 2.6.0 time any day now? :)
thanks,
greg k-h