2003-03-09 12:46:16

by Dave Jones

[permalink] [raw]
Subject: Fwd: struct inode size reduction.

Third retry, perhaps it'll make it through to the list
now that vger is sending mail again...


Attachments:
(No filename) (94.00 B)
(No filename) (68.26 kB)
Download all attachments

2003-03-09 17:02:37

by Andries Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Sun, Mar 09, 2003 at 12:54:03PM -0100, Dave Jones wrote:
> Third retry, perhaps it'll make it through to the list
> now that vger is sending mail again...

> I've been running with this patch, with no untoward consequences,
> seems to pass basic testing here, any objections before I push
> this Linuswards ? Al ? Christoph ?
>
> --- bk-linus/include/linux/fs.h 2003-03-08 08:57:57.000000000 -0100
> +++ inode/include/linux/fs.h 2003-03-08 08:57:20.000000000 -0100
> @@ -382,12 +382,12 @@ struct inode {
> struct address_space *i_mapping;
> struct address_space i_data;
> struct dquot *i_dquot[MAXQUOTAS];
> - /* These three should probably be a union */
> struct list_head i_devices;
> - struct pipe_inode_info *i_pipe;
> - struct block_device *i_bdev;
> - struct char_device *i_cdev;
> -
> + union {
> + struct pipe_inode_info *i_pipe;
> + struct block_device *i_bdev;
> + struct char_device *i_cdev;
> + } type;

Not really any objection, but this is half work where
more can be done. The comment is right: also i_devices
can go into the union.
(And i_cdev can be deleted altogether, but that is an
independent matter.)

Andries


2003-03-09 19:25:47

by Dave Jones

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Sun, Mar 09, 2003 at 06:13:14PM +0100, Andries Brouwer wrote:

> > - /* These three should probably be a union */
> > struct list_head i_devices;
> > - struct pipe_inode_info *i_pipe;
> > - struct block_device *i_bdev;
> > - struct char_device *i_cdev;
> > -
> > + union {
> > + struct pipe_inode_info *i_pipe;
> > + struct block_device *i_bdev;
> > + struct char_device *i_cdev;
> > + } type;
>
> Not really any objection, but this is half work where
> more can be done. The comment is right: also i_devices
> can go into the union.

The different size types threw me, and I figured it
was a misplaced comment. It certainly made more sense
that way when it mentioned 'these three' rather than
'these four'. looking at bd_acquire I'm not so sure
it's as simple a job as the other three were.

> (And i_cdev can be deleted altogether, but that is an
> independent matter.)

There still seems to be some users of that, so I'll
leave that to a follow up patch, (or someone else who
really knows whats going on there).

Dave

2003-03-09 19:45:19

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Sun, Mar 09, 2003 at 07:33:59PM -0100, Dave Jones wrote:
> There still seems to be some users of that, so I'll
> leave that to a follow up patch, (or someone else who
> really knows whats going on there).

No, there are no users yet. But as people seem to be eager to make
dev_t bigger we'll need when we tackle the "CIDR" thing for character
devices, too.

2003-03-09 20:21:11

by Andries Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Sun, Mar 09, 2003 at 07:55:55PM +0000, Christoph Hellwig wrote:
> On Sun, Mar 09, 2003 at 07:33:59PM -0100, Dave Jones wrote:
> > There still seems to be some users of that, so I'll
> > leave that to a follow up patch, (or someone else who
> > really knows whats going on there).
>
> No, there are no users yet. But as people seem to be eager to make
> dev_t bigger we'll need when we tackle the "CIDR" thing for character
> devices, too.

No, I already wrote "CIDR" for character devices.
(Will submit it when 2.5.65 appears.)

There is no use for i_cdev or struct char_device today,
and I will not use it for character device registration.
So unless you or someone else decides to start using it,
it can just be thrown out.

Andries


[I already submitted the patch throwing it out, but someone,
maybe it was Roman Zippel, complained that that was going
in the wrong direction. In the very long run that may be true
(or not), but for 2.6 I do not see the point of this dead code.]

2003-03-09 22:07:51

by Roman Zippel

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

Hi,

On Sun, 9 Mar 2003, Andries Brouwer wrote:

> [I already submitted the patch throwing it out, but someone,
> maybe it was Roman Zippel, complained that that was going
> in the wrong direction. In the very long run that may be true
> (or not), but for 2.6 I do not see the point of this dead code.]

My main question here is whether that code hurts in any way? Does it
prevent other cleanups? Sure this code needs more work to be really
useful, but as long as it only wastes a bit of space, I'd prefer to keep
it.

bye, Roman

2003-03-09 22:35:16

by J.A. Magallon

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.


On 03.09, Dave Jones wrote:
> Third retry, perhaps it'll make it through to the list
> now that vger is sending mail again...
>

If you do not need gcc-2.95 support, you can use anonymous unions:

+ union {
+ struct pipe_inode_info *i_pipe;
+ struct block_device *i_bdev;
+ struct char_device *i_cdev;
+ };

so you don't even have to change the accesses. Zero cost optimization patch.

I have tested on gcc-3.0, 3.2.2, RH gcc-2.96-98, and 2.95.2. Only 2.95.2
fails.

I do not know the compiler requirements for 2.5, or if there is any agreement
in kernel developement for not doing this. If this is the case, sorry for
the noise.

--
J.A. Magallon <[email protected]> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.21-pre5-jam0 (gcc 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk))

2003-03-09 23:03:13

by Dave Jones

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Sun, Mar 09, 2003 at 11:45:52PM +0100, J.A. Magallon wrote:

> If you do not need gcc-2.95 support, you can use anonymous unions:
> I have tested on gcc-3.0, 3.2.2, RH gcc-2.96-98, and 2.95.2. Only 2.95.2
> fails.

What about the magick ancient version that sparc uses ?

Dave

2003-03-09 22:57:56

by Andries Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Sun, Mar 09, 2003 at 11:18:24PM +0100, Roman Zippel wrote:
> On Sun, 9 Mar 2003, Andries Brouwer wrote:
>
> > [I already submitted the patch throwing it out, but someone,
> > maybe it was Roman Zippel, complained that that was going
> > in the wrong direction. In the very long run that may be true
> > (or not), but for 2.6 I do not see the point of this dead code.]
>
> My main question here is whether that code hurts in any way? Does it
> prevent other cleanups? Sure this code needs more work to be really
> useful, but as long as it only wastes a bit of space, I'd prefer to keep
> it.

Yes, dead code always hurts.
In a global change - should this dead code also be updated?
To do what?

Andries


=====
if (driver->flags & TTY_DRIVER_INSTALLED)
return 0;

- error = register_chrdev(driver->major, driver->name, &tty_fops);
+ error = register_chrdev_region(driver->major, driver->minor_start,
+ driver->num, driver->name, &tty_fops);
if (error < 0)
return error;
else if(driver->major == 0)
=====
+int register_chrdev(unsigned int major, const char *name,
+ struct file_operations *fops)
+{
+ return register_chrdev_region(major, 0, 256, name, fops);
+}
=====

2003-03-09 23:49:01

by J.A. Magallon

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.


On 03.10, Dave Jones wrote:
> On Sun, Mar 09, 2003 at 11:45:52PM +0100, J.A. Magallon wrote:
>
> > If you do not need gcc-2.95 support, you can use anonymous unions:
> > I have tested on gcc-3.0, 3.2.2, RH gcc-2.96-98, and 2.95.2. Only 2.95.2
> > fails.
>
> What about the magick ancient version that sparc uses

That was my point, I do not know if some arch still requires 2.95 in 2.6.
For 2.4, it states 2.95.3 as minimun.

There is not anything newer for sparc ? Curious...

--
J.A. Magallon <[email protected]> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.21-pre5-jam0 (gcc 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk))

2003-03-10 02:13:14

by Alexander Viro

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Mon, Mar 10, 2003 at 12:08:24AM +0100, Andries Brouwer wrote:
> =====
> +int register_chrdev(unsigned int major, const char *name,
> + struct file_operations *fops)
> +{
> + return register_chrdev_region(major, 0, 256, name, fops);
> +}

That's Wrong API(tm). Why do you need to keep separation between
major and first minor? Just pass dev_t start and unsigned len.

BTW, I'd like to take a look at your CIDR for chrdev - I've got one
to resurrect and it might make sense to compare-and-merge. And
yes, I'm finally back - hopefully for good.

2003-03-10 04:56:35

by Miles Bader

[permalink] [raw]
Subject: Re: struct inode size reduction.

"J.A. Magallon" <[email protected]> writes:
> That was my point, I do not know if some arch still requires 2.95 in 2.6.
> For 2.4, it states 2.95.3 as minimun.

Even if no arch `requires' gcc 2.95 (I don't really know, though until
recently my arch didn't even _have_ a [working] gcc 3.x port!), there
are certainly a buttload of people _using_ it to build the kernel, so
requiring gcc 3.2, would be kind of obnoxious.

-Miles
--
Come now, if we were really planning to harm you, would we be waiting here,
beside the path, in the very darkest part of the forest?

2003-03-10 09:20:38

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

Ha Al!
A pleasure to see you back.

> That's Wrong API(tm).
> Why do you need to keep separation between major and first minor?

In the long run things may be different, but for today it is
meaningless to have all callers pack major and minor into a dev_t
and then have this routine unpack the dev_t again.
The source gets larger, the object code gets larger, the object
code gets slower. No advantages today.

> I'd like to take a look at your CIDR for chrdev
> I've got one to resurrect and it might make sense to compare-and-merge.

Yes, maybe.

Remember my design goals. Not to make a perfect 2.8, but to give
Linux 2.6 a larger dev_t. And to do so with minimal changes -
no global restructuring - that is for 2.7 - just the minimum
to get things to work.

> I'd like to take a look

The patch on char_dev.c is a bit large, but I showed the two
essential fragments, that should be enough to see what happened.

Let me give a somewhat more explicit commented version.

Many character drivers cannot handle more than 256 minors,
so keeping the old register call and letting it mean that
one wants to register 256 minors saves lots of changes.
Only few character drivers need to be changed.

+int register_chrdev(unsigned int major, const char *name,
+ struct file_operations *fops)
+{
+ return register_chrdev_region(major, 0, 256, name, fops);
+}

The old char_dev.c contained mysterious tty related code:
- isa_tty_dev / need_serial
- in register_chrdev the test chrdevs[major].fops != fops
- in get_chrfops the test that someone commented with
/* Force request_module anyway, but what for? */

All this is because the same major is registered several
times in the tty code, and there was no mechanism to
specify minors. Today I see

# cat /proc/devices | head
Character devices:
1 mem
2 pty
3 ttyp
4 vc/0
4 vc/%d
4 ttyS%d
5 tty
5 console
5 ptmx

that majors 4 and 5 have been registered three times.

The tty code is the only code that does this, so it is
the only place where the old register_chrdev call has to
be replaced by something specifying the minor range:

@@ -2123,7 +2123,8 @@
if (driver->flags & TTY_DRIVER_INSTALLED)
return 0;

- error = register_chrdev(driver->major, driver->name, &tty_fops);
+ error = register_chrdev_region(driver->major, driver->minor_start,
+ driver->num, driver->name, &tty_fops);
if (error < 0)
return error;
else if(driver->major == 0)

Then the registration has to store some stuff

+static struct char_device_struct {
+ struct char_device_struct *next;
+ unsigned int major;
+ unsigned int baseminor;
+ int minorct;
+ const char *name;
+ struct file_operations *fops;
+} *chrdevs[MAX_PROBE_HASH];

where the index in the hash is simply done by

+static inline int major_to_index(int major)
{
+ return major % MAX_PROBE_HASH;
}

with completely arbitrary MAX_PROBE_HASH - it could be 1.

+#define MAX_PROBE_HASH 255 /* random */

and the getfops must check minor

static struct file_operations *
lookup_chrfops(unsigned int major, unsigned int minor)
{
struct char_device_struct *cd;
struct file_operations *ret = NULL;
int i;

i = major_to_index(major);

read_lock(&chrdevs_lock);
for (cd = chrdevs[i]; cd; cd = cd->next) {
if (major == cd->major &&
minor - cd->baseminor < cd->minorct) {
ret = fops_get(cd->fops);
break;
}
}
read_unlock(&chrdevs_lock);

return ret;
}

static struct file_operations *
get_chrfops(unsigned int major, unsigned int minor)
{
struct file_operations *ret = NULL;

if (!major)
return NULL;

ret = lookup_chrfops(major, minor);

#ifdef CONFIG_KMOD
if (!ret) {
char name[32];
sprintf(name, "char-major-%d", major);
request_module(name);

ret = lookup_chrfops(major, minor);
}
#endif
return ret;
}

Finally the more-or-less obvious register_chrdev_region.
That is about it.

Result: no more occurrences of MAX_BLKDEV and MAX_CHRDEV
in the kernel source. There are a few other minor things
to polish, and then the type of dev_t can be changed.

This is getting a bit long, so let me continue in another letter.

Andries

2003-03-10 09:22:49

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

So the previous letter was the constructive part.
The destructive part was the deletion of i_cdev
and related code since it does not do anything today.

Comments?

Andries

2003-03-10 10:47:42

by Roman Zippel

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

Hi,

On Mon, 10 Mar 2003, Andries Brouwer wrote:

> > My main question here is whether that code hurts in any way? Does it
> > prevent other cleanups? Sure this code needs more work to be really
> > useful, but as long as it only wastes a bit of space, I'd prefer to keep
> > it.
>
> Yes, dead code always hurts.

It's not really dead code, it's not yet used code and if it stays there as
a reminder to actually do something about it, it's good that it hurts.

>
> - error = register_chrdev(driver->major, driver->name, &tty_fops);
> + error = register_chrdev_region(driver->major, driver->minor_start,
> + driver->num, driver->name, &tty_fops);

Are that much parameters really needed?
When I look through the character device list, I basically see two usages.
1. A character device is mapped to n device numbers (where n is <= 8). In
this case it should be enough to register a really available character
device with a single device number. More can be configured e.g. through a
sysfs interface. Currently we have here misc devices users, which is
running out of number space and the other users which are often wasting a
complete major number for a few devices.
2. A large number of dynamic virtual devices (e.g. terminals), these want
a complete major anyway and currently they have to register multiple of
them.
These are the two cases a new character device core should be able to
handle. On top of this we can still think about a small compatibility
layer.

bye, Roman


2003-03-10 11:55:01

by Andries Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Mon, Mar 10, 2003 at 11:58:17AM +0100, Roman Zippel wrote:

> > - error = register_chrdev(driver->major, driver->name, &tty_fops);
> > + error = register_chrdev_region(driver->major, driver->minor_start,
> > + driver->num, driver->name, &tty_fops);
>
> Are that much parameters really needed?

Yes.

If you want to follow Al you would write
MKDEV(driver->major, driver->minor_start)
instead of this
driver->major, driver->minor_start
and you would have one parameter less.

Andries


[I wrote Al and wanted to cc l-k but typoed. Let me copy this letter
for information purposes and to show that we do not want to write MKDEV.]

-------------------------------------------------------------------------
Some more comments on types and their structure.

As you know very well, I had a setup in the ancient times
with a kdev_t that was a pointer, and

#define MINOR(dev) (dev)->minor

That was nice and well - you thought that a refcount
was needed, I thought that with the given rules for use
it was possible to prove that no refcount was needed,
and the discussion was never settled.
If we ever meet FTF we can fight that fight.

In the below I will use kdev_t again, but now with an
entirely different meaning, not a pointer but an
arithmetic type.

User space (programs, file systems) has many dev_t
instances that must remain valid. So, when dev_t
changes size the 16-bit values must keep their old
major and minor. This means that for a dev_t the
operations MINOR, MAJOR, MKDEV are not completely trivial.
They involve tests on whether the value fits into 16 bits.
E.g.

#define MKDEV(ma,mi) ((dev_t)(((((ma) & ~0xff) == 0) && (((mi) & ~0xff) == 0
)) ? (((ma) << 8) | (mi)) : (((ma) << DEV_MINOR_BITS) | (mi))))

This is fine for communication with user space, but inside
the kernel itself we like something that is smaller and faster,
like

#define __mkdev(major, minor) (((major) << KDEV_MINOR_BITS) + (minor))

So inside the kernel we must use kdev_t where user space
uses dev_t. Now both are arithmetic, but the bits are packed
differently.

Now you understand why I would like to remove the MKDEV
that you have in every call of blk_register_region.

Andries

2003-03-10 16:16:53

by Roman Zippel

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

Hi,

On Mon, 10 Mar 2003, Andries Brouwer wrote:

> > > - error = register_chrdev(driver->major, driver->name, &tty_fops);
> > > + error = register_chrdev_region(driver->major, driver->minor_start,
> > > + driver->num, driver->name, &tty_fops);
> >
> > Are that much parameters really needed?
>
> Yes.

Why? Problems are hardly solved by adding more parameters.
If going to a larger number space means, that we have to add crappy
interfaces, we should rather keep it as it is.
Why do you need to partition the number space like this? I looked at the
users in the last mail for a reason. If we're going to change the
interface, it should reflect what we will need in the future.

bye, Roman

2003-03-10 17:21:42

by Andries Brouwer

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

On Mon, Mar 10, 2003 at 05:25:49PM +0100, Roman Zippel wrote:

> On Mon, 10 Mar 2003, Andries Brouwer wrote:
>
> > > > - error = register_chrdev(driver->major, driver->name, &tty_fops);
> > > > + error = register_chrdev_region(driver->major, driver->minor_start,
> > > > + driver->num, driver->name, &tty_fops);
> > >
> > > Are that much parameters really needed?
> >
> > Yes.
>
> Why? Problems are hardly solved by adding more parameters.
> If going to a larger number space means, that we have to add crappy
> interfaces, we should rather keep it as it is.
> Why do you need to partition the number space like this? I looked at the
> users in the last mail for a reason. If we're going to change the
> interface, it should reflect what we will need in the future.

Maybe I should not react, but let me answer once more.
You do not understand the part about "small steps".

You see a future and ask why I don't jump to the future you
see. The answer is that I take small steps.

Look at the current junk in char_dev.c and cry:
if (ret && isa_tty_dev(major)) {
lock_kernel();
if (need_serial(major,minor)) {
/* Force request_module anyway, but what for? */
fops_put(ret);
ret = NULL;
}
unlock_kernel();
}
Then be happy that it is gone.

You want a different interface? My changes make it easier for you
to get there. Please go ahead.

Andries

2003-03-10 18:28:42

by Roman Zippel

[permalink] [raw]
Subject: Re: Fwd: struct inode size reduction.

Hi,

On Mon, 10 Mar 2003, Andries Brouwer wrote:

> You see a future and ask why I don't jump to the future you
> see. The answer is that I take small steps.

Oh, I agree with small steps, but you should be damned careful with
temporary interfaces, they have the bad habit of spreading and the more we
have to cleanup later.
First we need a clean core, than we can start removing the current crap
properly.

bye, Roman