LinuxLists.cc - Switching from IOCTLs to a RAMFS

2002-10-24 16:20:14

Subject: Switching from IOCTLs to a RAMFS

Based on the feedback and comments regarding
the use of IOCTLs in EVMS, we are switching to
the more preferred method of using a ram based
fs. Since we are going through this effort, I
would like to get it right now, rather than
having to switch to another ramfs system later
on. The question I have is: should we roll our
own fs, (a.k.a. evmsfs) or should we use sysfs
for this purpose? My initial thoughts are that
sysfs should be used. However, recent discussions
about device mapper have suggested a custom ramfs.
Which is the *best* choice?

Thanks,
Mark

2002-10-24 16:26:41

by Patrick Mochel

Subject: Re: Switching from IOCTLs to a RAMFS

On Thu, 24 Oct 2002, Mark Peloquin wrote:

>
> Based on the feedback and comments regarding
> the use of IOCTLs in EVMS, we are switching to
> the more preferred method of using a ram based
> fs. Since we are going through this effort, I
> would like to get it right now, rather than
> having to switch to another ramfs system later
> on. The question I have is: should we roll our
> own fs, (a.k.a. evmsfs) or should we use sysfs
> for this purpose? My initial thoughts are that
> sysfs should be used. However, recent discussions
> about device mapper have suggested a custom ramfs.
> Which is the *best* choice?

Use sysfs, please. Coming out of the kernel summit, the goal was to move
as much stuff as possible to a ramfs-based system, rather than ioctl and
procfs, and Linus explicitly said to try and put them all in the same
filesystem.

-pat

2002-10-24 17:40:41

by Jeff Garzik

Subject: Re: Switching from IOCTLs to a RAMFS

Mark Peloquin wrote:
> Based on the feedback and comments regarding
> the use of IOCTLs in EVMS, we are switching to
> the more preferred method of using a ram based
> fs. Since we are going through this effort, I
> would like to get it right now, rather than
> having to switch to another ramfs system later
> on. The question I have is: should we roll our
> own fs, (a.k.a. evmsfs) or should we use sysfs
> for this purpose? My initial thoughts are that
> sysfs should be used. However, recent discussions
> about device mapper have suggested a custom ramfs.
> Which is the *best* choice?

(cc'd viro and mochel, as I feel they are 'owners' in the subject area)

Let's jump back a bit, for a second. Why is procfs bad news? There are
minor issues with the implementation of single-page output and lack of
pure file operations, but the big issue is lack of a sane namespace.
sysfs is no better than procfs if we keep heaving junk into it without
thinking about proper namespace organization.

I personally prefer a separate filesystem for what you describe. That
gives the EVMS team control over their own portion of the namespace,
while giving complete flexibility. I do _not_ see sysfs as simply a
procfs replacement -- sysfs IMO is more intended as a way to organize
certain events and export internal kernel structure.

To tangent a bit, WRT a private evmsfs, make sure that (a) you prefer
ASCII over binary interfaces where reasonable, and (b) any binary
interfaces you have are fixed-endian and 64-bit safe from the get-go.
Consider crazy cases like someone exporting evmsfs over NFS, from a
32-bit IA32 server to a big-endian 64-bit client.
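
A minimal sketch of point (b), assuming a hypothetical two-field record
(vol_info and its fields are illustrative names, not anything EVMS
defines). The wire format is written byte by byte in little-endian order
with fixed-width types, so the result does not depend on the host's
endianness, word size, or struct padding:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory record; names and fields are assumptions. */
struct vol_info {
    uint64_t size_sectors;
    uint32_t flags;
};

/* Wire layout: 8 bytes size_sectors (LE), then 4 bytes flags (LE). */
#define VOL_INFO_WIRE_SIZE 12

/* Store the low n bytes of v at p, least significant byte first. */
static void put_le(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Load n bytes from p, least significant byte first. */
static uint64_t get_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

void vol_info_encode(const struct vol_info *in,
                     uint8_t out[VOL_INFO_WIRE_SIZE])
{
    put_le(out, in->size_sectors, 8);
    put_le(out + 8, in->flags, 4);
}

void vol_info_decode(const uint8_t in[VOL_INFO_WIRE_SIZE],
                     struct vol_info *out)
{
    out->size_sectors = get_le(in, 8);
    out->flags = (uint32_t)get_le(in + 8, 4);
}
```

Because the struct is never copied to the wire directly, a 32-bit
little-endian writer and a 64-bit big-endian reader would see the same
12 bytes.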

Jeff

2002-10-24 18:05:41

by Patrick Mochel

Subject: Re: Switching from IOCTLs to a RAMFS

On Thu, 24 Oct 2002, Jeff Garzik wrote:

> Mark Peloquin wrote:
> > Based on the feedback and comments regarding
> > the use of IOCTLs in EVMS, we are switching to
> > the more preferred method of using a ram based
> > fs. Since we are going through this effort, I
> > would like to get it right now, rather than
> > having to switch to another ramfs system later
> > on. The question I have is: should we roll our
> > own fs, (a.k.a. evmsfs) or should we use sysfs
> > for this purpose? My initial thoughts are that
> > sysfs should be used. However, recent discussions
> > about device mapper have suggested a custom ramfs.
> > Which is the *best* choice?
>
>
> (cc'd viro and mochel, as I feel they are 'owners' in the subject area)
>
> Let's jump back a bit, for a second. Why is procfs bad news? There are
> minor issues with the implementation of single-page output and lack of
> pure file operations, but the big issue is lack of a sane namespace.
> sysfs is no better than procfs if we keep heaving junk into it without
> thinking about proper namespace organization.

That's one of my personal goals: to mandate some amount of sanity in the
namespace organization. Without it, sysfs is basically just a modernized
procfs.

> I personally prefer a separate filesystem for what you describe. That
> gives the EVMS team control over their own portion of the namespace,
> while giving complete flexibility. I do _not_ see sysfs as simply a
> procfs replacement -- sysfs IMO is more intended as a way to organize
> certain events and export internal kernel structure.

I do not view those as necessarily competing goals. The mission statement
of sysfs is to "export kernel objects, their attributes, and their
relation to other objects".

EVMS, like any other subsystem, has a set of objects and methods to
operate on them, as exported via attributes. They have their
own object hierarchy, and in no way do I want to dilute that (or pollute
anything else ;). sysfs should be able to handle this. It does today,
though it's not as seamless as I would prefer it.

I would rather mature the API and consolidate the common code, than have N
copies of the same filesystem, each with a slightly different purpose, in
existence. There are so many benefits:

- Less code duplication, and less places to fix identical bugs.

- It makes it easier to write for; instead of having to copy n' paste a
new filesystem to export your subsystem's objects, you can add a field
to a structure and call a function.

- It's easier for the user to mount one filesystem and get everything,
instead of trying to figure out what fs has what info.

- It's easier to associate objects between subsystems, since you can
internally create relative symlinks between two objects (and soon with a
single call).
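
The "add a field to a structure and call a function" idea can be
modelled in plain userspace C. This is only a sketch under assumptions:
it is not the sysfs API, and struct volume, struct attribute, attr_read()
and attr_write() are hypothetical names. Each attribute is one small
ASCII value with a show routine and an optional store routine, and
exporting a new value means adding one row to a table:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical object a subsystem might want to export. */
struct volume {
    unsigned long long size_sectors;
    int read_only;
};

/* One named ASCII attribute; store is NULL for read-only values. */
struct attribute {
    const char *name;
    int (*show)(const struct volume *v, char *buf, size_t len);
    int (*store)(struct volume *v, const char *buf);
};

static int size_show(const struct volume *v, char *buf, size_t len)
{
    return snprintf(buf, len, "%llu\n", v->size_sectors);
}

static int ro_show(const struct volume *v, char *buf, size_t len)
{
    return snprintf(buf, len, "%d\n", v->read_only);
}

static int ro_store(struct volume *v, const char *buf)
{
    if (buf[0] != '0' && buf[0] != '1')
        return -1;
    v->read_only = buf[0] - '0';
    return 0;
}

/* Exporting another value is one more row here. */
static const struct attribute volume_attrs[] = {
    { "size",      size_show, NULL     },
    { "read_only", ro_show,   ro_store },
};

static const struct attribute *find_attr(const char *name)
{
    size_t n = sizeof volume_attrs / sizeof volume_attrs[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(volume_attrs[i].name, name) == 0)
            return &volume_attrs[i];
    return NULL;
}

/* Returns the number of characters produced, or -1 if no such attribute. */
int attr_read(const struct volume *v, const char *name,
              char *buf, size_t len)
{
    const struct attribute *a = find_attr(name);
    return a ? a->show(v, buf, len) : -1;
}

/* Returns 0 on success, -1 if the attribute is missing or not writable. */
int attr_write(struct volume *v, const char *name, const char *buf)
{
    const struct attribute *a = find_attr(name);
    return (a && a->store) ? a->store(v, buf) : -1;
}
```

The generic read and write paths never change when a subsystem adds an
attribute, which is the code-sharing argument made above.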

-pat

2002-10-24 21:34:11

by Jeff Garzik

Subject: Re: Switching from IOCTLs to a RAMFS

Patrick Mochel wrote:

> On Thu, 24 Oct 2002, Jeff Garzik wrote:
>
>> Mark Peloquin wrote:
>>
>>> Based on the feedback and comments regarding
>>> the use of IOCTLs in EVMS, we are switching to
>>> the more preferred method of using a ram based
>>> fs. Since we are going through this effort, I
>>> would like to get it right now, rather than
>>> having to switch to another ramfs system later
>>> on. The question I have is: should we roll our
>>> own fs, (a.k.a. evmsfs) or should we use sysfs
>>> for this purpose? My initial thoughts are that
>>> sysfs should be used. However, recent discussions
>>> about device mapper have suggested a custom ramfs.
>>> Which is the *best* choice?
>>
>>
>> (cc'd viro and mochel, as I feel they are 'owners' in the subject area)
>>
>> Let's jump back a bit, for a second. Why is procfs bad news? There are
>> minor issues with the implementation of single-page output and lack of
>> pure file operations, but the big issue is lack of a sane namespace.
>> sysfs is no better than procfs if we keep heaving junk into it without
>> thinking about proper namespace organization.
>
>
> That's one of my personal goals: to mandate some amount of sanity in the
> namespace organization. Without it, sysfs is basically just a modernized
> procfs.

Is there a namespace doc or guideline we can look at?
(for existing nodes, sure, but more guidelines for future nodes)

>
>> I personally prefer a separate filesystem for what you describe. That
>> gives the EVMS team control over their own portion of the namespace,
>> while giving complete flexibility. I do _not_ see sysfs as simply a
>> procfs replacement -- sysfs IMO is more intended as a way to organize
>> certain events and export internal kernel structure.
>
>
> I do not view those as necessarily competing goals. The mission statement
> of sysfs is to "export kernel objects, their attributes, and their
> relation to other objects".
>
> EVMS, like any other subsystem, has a set of objects and methods to
> operate on them, as exported via attributes. They have their
> own object hierarchy, and in no way do I want to dilute that (or pollute
> anything else ;). sysfs should be able to handle this. It does today,
> though it's not as seamless as I would prefer it.

I hope that sysfs imposes some sort of structure on random sysfs users?

>
> I would rather mature the API and consolidate the common code, than have N
> copies of the same filesystem, each with a slightly different purpose, in
> existence. There are so many benefits:
>
> - Less code duplication, and less places to fix identical bugs.

Not in this argument :) libfs.c handles this quite nicely. And it's
just a matter of moving more code into libfs.c for things like this.

In fact it looks like some of the sysfs/inode.c code could be moved to
libfs.c or should be using libfs.c code ;-)

Further, looking at current sysfs/inode.c code, it seems that ->read and
->write ops provided are severely lacking in flexibility. If you let
users provide their own file_operations directly, that would be nice.
Calling __get_free_page and having users send data to that page is easy
-- and kills quite a lot of flexibility that would push one towards
creating a private 'meta' filesystem. Having that page provided for you
is IMO really only useful for spitting out status data...

>
> - It makes it easier to write for; instead of having to copy n' paste a
> new filesystem to export your subsystem's objects, you can add a field
> to a structure and call a function.

This is a function of any API. copy-n-paste is not an argument against
a private filesystem -- see libfs.c counter-argument above.

> - It's easier for the user to mount one filesystem and get everything,
> instead of trying to figure out what fs has what info.

agreed

> - It's easier to associate objects between subsystems, since you can
> internally create relative symlinks between two objects (and soon with a
> single call).

agreed

So let users provide their own file_operations, and have guidelines for
new users, and I'll be happier :)

Jeff

2002-10-24 22:39:34

by Patrick Mochel

Subject: Re: Switching from IOCTLs to a RAMFS

> Is there a namespace doc or guideline we can look at?
> (for existing nodes, sure, but more guidelines for future nodes)

There's nothing official, at least not yet. I know that's a pretty crappy
thing to say, but I've been this -><- close to doing it for some time now.

> > I would rather mature the API and consolidate the common code, than have N
> > copies of the same filesystem, each with a slightly different purpose, in
> > existence. There are so many benefits:
> >
> > - Less code duplication, and less places to fix identical bugs.
>
> Not in this argument :) libfs.c handles this quite nicely. And it's
> just a matter of moving more code into libfs.c for things like this.

ACK.

> In fact it looks like some of the sysfs/inode.c code could be moved to
> libfs.c or should be using libfs.c code ;-)

Ok, you got me there. :)

> Further, looking at current sysfs/inode.c code, it seems that ->read and
> ->write ops provided are severely lacking in flexibility. If you let
> users provide their own file_operations directly, that would be nice.
> Calling __get_free_page and having users send data to that page is easy
> -- and kills quite a lot of flexibility that would push one towards
> creating a private 'meta' filesystem. Having that page provided for you
> is IMO really only useful for spitting out status data...

Agreed. In many cases, that page should suffice. But, for data that you
want to poll(2) or select(2) for, it doesn't work. Now, it's easy to add
an API in which the caller can pass the file_operations for a file that
they're creating.
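
From the userspace side, the case that a single pre-filled page cannot
serve looks roughly like the sketch below: the caller sleeps in poll(2)
until the file has something to deliver and only then reads it. The
helper names wait_readable() and read_event() are hypothetical; poll(2)
and read(2) are the standard calls. Whether a given kernel file supports
this depends on the file_operations behind it, which is the point under
discussion:

```c
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error or hang-up. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Wait for data, then read it.
 * Returns bytes read, 0 on timeout or EOF, -1 on error. */
ssize_t read_event(int fd, char *buf, size_t len, int timeout_ms)
{
    int rc = wait_readable(fd, timeout_ms);

    if (rc <= 0)
        return rc;
    return read(fd, buf, len);
}
```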

But, then we border on what exactly we want in sysfs. The original intent
was to provide a simple ASCII-based interface, and _not_ provide device
nodes.

As things have matured, and more people want to move stuff out of procfs
and replace ioctls with custom filesystem interfaces, I've questioned the
original intent. In the end, I'm really ambivalent about whether to use
sysfs or a custom filesystem for custom interfaces beyond the simple
ASCII-based one. I'm not trying to either save or conquer the world with
sysfs, and it would make my life a whole lot easier if no one at all used
it. ;)

But, I have arguments for both sides:

If sysfs is used, there are the arguments I presented last time: one fs
to mount, one API to create files in different subsystems, easier
association between objects.

While it's easy to create your own filesystem, either using fs/libfs.c
helpers and/or fs/nfsd/nfsctl.c as your base, you still end up with a lot
of replicated code. There will be copy-n-paste no matter what. I know it's
not a really solid argument, but how much overhead is it going to incur if
every subsystem and/or every object that belongs to each subsystem is
exporting a filesystem instance?

If sysfs isn't used, then everyone has their own freedom in how they read
and write files. There are no mutations to the sysfs API, and we don't let
it deviate from its original purpose. It's like procfs, only with N APIs.
;)

So, I'm in the middle. I don't want to convolute the API, but I'd rather
have something simple and something central for subsystems and drivers to
use; of course done right. :)

-pat

2002-10-26 21:54:46

by Jeff Garzik

Subject: Re: Switching from IOCTLs to a RAMFS

Patrick Mochel wrote:

>>Is there a namespace doc or guideline we can look at?
>>(for existing nodes, sure, but more guidelines for future nodes)
>>
>>
>
>There's nothing official, at least not yet. I know that's a pretty crappy
>thing to say, but I've been this -><- close to doing it for some time now.
>
hehe, that's cool :)

I'll just continue to point people to [email protected] when they ask
about a procfs alternative :)

>But, then we border on what exactly we want in sysfs. The original intent
>was to provide a simple ASCII-based interface, and _not_ provide device
>nodes.
>
>As things have matured, and more people want to move stuff out of procfs
>and replace ioctls with custom filesystem interfaces, I've questioned the
>original intent. In the end, I'm really ambivalent about whether to use
>sysfs or a custom filesystem for custom interfaces beyond the simple
>ASCII-based one. I'm not trying to either save or conquer the world with
>sysfs, and it would make my life a whole lot easier if no one at all used
>it. ;)
>
>But, I have arguments for both sides:
>
>If sysfs is used, there are the arguments I presented last time: one fs
>to mount, one API to create files in different subsystems, easier
>association between objects.
>
>While it's easy to create your own filesystem, either using fs/libfs.c
>helpers and/or fs/nfsd/nfsctl.c as your base, you still end up with a lot
>of replicated code. There will be copy-n-paste no matter what. I know it's
>not a really solid argument, but how much overhead is it going to incur if
>every subsystem and/or every object that belongs to each subsystem is
>exporting a filesystem instance?
>
>

Like I touched on in IRC, there is room for both sysfs and per-driver
filesystems.

I think just about everyone agrees that ioctls are a bad idea and a huge
maintenance annoyance. So, what is the alternate solution? IMO your
choices are presenting a device node for control via read(2), write(2),
and poll(2), or exporting a bunch of ASCII-controlled interfaces. While
I certainly agree with the overall strategy of sysfs, I can't see it as
being the best interface for wholesale replacement of groups of ioctls.
So that leaves per-driver filesystems, which have a bunch of benefits...
* allows for implementation of true character devices (chardevs),
something which sysfs was never intended to do
* solves module unloading problem, because filesystems must be mounted
before accessing and umounted before removal, which implies that there
will be no races at the open(2) level
* we all admit that sysfs doesn't attempt to solve the same problems as
devfs, so if one wants to do a sane device filesystem, a custom fs is
needed
* mount options are the current best solution for providing decent
default file permissions for dynamically instantiated file nodes. It
keeps policy in userspace while still providing dynamic file nodes in
the kernel. per-driver filesystems give you the granularity needed to
accomplish this.
* as fs/nfsd/nfsctl.c and libfs.c show, you don't have to worry about
code bloat, so the only real overhead is the structures that are
involved in superblock/vfsmount operations
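
To make the mount-options point concrete, here is a rough userspace
sketch of how a per-driver filesystem might turn an option string into
default ownership and permissions for the nodes it creates. The option
names uid, gid and mode, and the names fs_opts and parse_fs_opts(), are
assumptions for illustration; a real filesystem would do this in its
mount path with the kernel's own helpers:

```c
#include <stdlib.h>
#include <string.h>

/* Defaults applied to dynamically created nodes (hypothetical). */
struct fs_opts {
    unsigned uid;
    unsigned gid;
    unsigned mode;
};

/* Parse len characters at s as an unsigned number in the given base.
 * Returns 0 on success, -1 on empty, oversized, or malformed input. */
static int parse_num(const char *s, size_t len, int base, unsigned *out)
{
    char tmp[16];
    char *end;
    unsigned long v;

    if (len == 0 || len >= sizeof tmp)
        return -1;
    memcpy(tmp, s, len);
    tmp[len] = '\0';
    if (tmp[0] < '0' || tmp[0] > '9')
        return -1;
    v = strtoul(tmp, &end, base);
    if (*end != '\0')
        return -1;
    *out = (unsigned)v;
    return 0;
}

/* Parse a comma-separated string such as "uid=0,gid=6,mode=0660".
 * Fields not mentioned keep their current values.
 * Returns 0 on success, -1 on an unknown or malformed option. */
int parse_fs_opts(const char *opts, struct fs_opts *out)
{
    while (*opts) {
        size_t n = strcspn(opts, ",");
        int rc = -1;

        if (n > 4 && strncmp(opts, "uid=", 4) == 0)
            rc = parse_num(opts + 4, n - 4, 10, &out->uid);
        else if (n > 4 && strncmp(opts, "gid=", 4) == 0)
            rc = parse_num(opts + 4, n - 4, 10, &out->gid);
        else if (n > 5 && strncmp(opts, "mode=", 5) == 0)
            rc = parse_num(opts + 5, n - 5, 8, &out->mode);
        if (rc != 0)
            return -1;
        opts += n;
        if (*opts == ',')
            opts++;
    }
    return 0;
}
```

The policy (who owns the nodes, with what mode) stays in the mount
command, while the kernel side only applies whatever it was given.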

There were a couple other benefits I have forgotten by now :)

Jeff

2002-10-27 22:52:30

by Peter Chubb

Subject: Re: Switching from IOCTLs to a RAMFS

>>>>> "Jeff" == Jeff Garzik <[email protected]> writes:

Jeff> Like I touched on in IRC, there is room for both sysfs and per-driver
Jeff> filesystems.

Jeff> I think just about everyone agrees that ioctls are a bad idea and a huge
Jeff> maintenance annoyance.

I note that the P1003.26 ballot has just been announced...

Title: P1003.26: Information Technology -- Portable Operating
System Interface (POSIX) -- Part 26: Device Control
Application Program Interface (API) [C Language]

Scope: This work will define an application program interface to
device drivers. The interface will be modeled on the
traditional ioctl() function, but will have enhancements
designed to address issues such as "type safety" and
reentrancy.

It may be worth looking at what the draft standard says before
committing to yet another interface specification.

Peter C

2002-10-28 00:12:48

by Jeff Garzik

Subject: Re: Switching from IOCTLs to a RAMFS

Peter Chubb wrote:

>>>>>> "Jeff" == Jeff Garzik <[email protected]> writes:
>
>Jeff> Like I touched on in IRC, there is room for both sysfs and per-driver
>Jeff> filesystems.
>
>Jeff> I think just about everyone agrees that ioctls are a bad idea and a huge
>Jeff> maintenance annoyance.
>
>I note that the P1003.26 ballot has just been announced...
>
> Title: P1003.26: Information Technology -- Portable Operating
> System Interface (POSIX) -- Part 26: Device Control
> Application Program Interface (API) [C Language]
>
> Scope: This work will define an application program interface to
> device drivers. The interface will be modeled on the
> traditional ioctl() function, but will have enhancements
> designed to address issues such as "type safety" and
> reentrancy.
>
>
>It may be worth looking at what the draft standard says before
>committing to yet another interface specification.
>
>

Already looked at it. It's awful, and retains many of the problems that
ioctl(2) presents to kernel maintainers.

I sent a comment in to the only email address I could find describing
the issues (politely!), but as a mere peon I doubt it will have much
effect. The best we can do is ignore this POSIX junk and hope it goes
away...

Jeff

2002-10-28 07:50:54

by Brad Hards

Subject: Kernel/userspace interfaces (was: Switching from IOCTLs to a RAMFS)

On Mon, 28 Oct 2002 11:18, Jeff Garzik wrote:
<stuff about posix_devctl() snipped>
> I sent a comment in to the only email address I could find describing
> the issues (politely!), but as a mere peon I doubt it will have much
> effect. The best we can do is ignore this POSIX junk and hope it goes
> away...
I'd like something more positive than "hope it goes away"... :-)

This allows me to do my irregular "does anyone care about the kernel ABI / API
definition?" song'n'dance.

Currently we use standard unix semantics - char devices, block devices,
sockets, etc. However there is no definition for what that interface actually
does. We have (most of ?) SUSv3, but there are a lot of other things we're
doing. Some (many?) of the features aren't getting used because:
1. Not known by userspace programmers.
2. Non-standard semantics and no tutorial / example material.
3. Random changes to features and lack of versioning.

We also have serious problems with management of header files. "Use the
headers that came with your glibc" misses ioctl() definitions, which are
inherently kernel interfaces, not glibc interfaces.

I'll again offer to moderate a BoF, at LCA (http://www.linux.conf.au) in
Perth. I don't have anything like the answers, so it's only worth doing if
someone with a clue is interested. LCA BoFs don't have much time or many
slots, so if there's a decent amount of interest, it might be worth doing on
the Tuesday (say for a couple of hours, in the vacant third mini-conf slot).

Does anyone care?

Brad
- --
http://linux.conf.au. 22-25Jan2003. Perth, Aust. I'm registered. Are you?