2002-01-04 00:16:32

by Eric S. Raymond

Subject: Re: LSB1.1: /proc/cpuinfo

Joerg Schilling <[email protected]>:
> The way /proc works was introduced by Plan 9 in the first half
> of the 80s. What Linux added as an abuse of the /proc filesystem in
> principle is a Plan 9 idea too. It makes sense to have something
> similar, but please please _not_ inside the /proc tree.
>
> Sun is planning to have /sys with similar background in a future
> version of Solaris so it would make sense to talk to the Solaris
> kernel hackers to have a common way to go for the new /sys tree.

Well, hell. If the "/proc is a blight on the face of the planet" ranting that
I've been hearing is just about the *name* /proc, then let's separate the
name issue from the content issue.

The kind of non-per-process information that is now in /proc needs to still
be there for many purposes; autoconfiguration is the one that is bugging
me right now, but cluster management is just as important.

If moving /proc/cpuinfo to /sys/cpuinfo means people will stop trying to make
the cpuinfo information go away, then by all means let's move it.

I want /sys/dmi, too.

I'm willing to write up a proposal for /sys that would migrate the `unclean'
/proc stuff over to /sys, and I'm willing to write the kernel patches to
implement the renaming.

I'm motivated to attack this right now because it touches the work I'm
doing on kernel autoconfiguration.

(Copied to the linux-kernel mailing list because of a parallel argument
happening there...)
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The common argument that crime is caused by poverty is a kind of
slander on the poor.
-- H. L. Mencken


2002-01-04 00:57:13

by Alexander Viro

Subject: Re: LSB1.1: /proc/cpuinfo



On Thu, 3 Jan 2002, Eric S. Raymond wrote:

> Well, hell. If the "/proc is a blight on the face of the planet" ranting that
> I've been hearing is just about the *name* /proc, then let's separate the
> name issue from the content issue.

It's more than just a name.
a) granularity. Current "all or nothing" policy in procfs has
a lot of obvious problems.
b) tree layout policy (lack thereof, to be precise).
c) horribly bad layout of many, many files. Any file exported by
kernel should be treated as user-visible API. As it is, common mentality
is "it's a common dump; anything goes here". Inconsistent across
architectures for no good reason, inconsistent across kernel versions,
just plain stupid, chock-full of buffer overruns...

Fixing these problems will _hurt_. Badly. We have to do it, but it
won't be fast and it certainly won't happen overnight.

2002-01-04 01:06:53

by Eric S. Raymond

Subject: Re: LSB1.1: /proc/cpuinfo

Alexander Viro <[email protected]>:
> It's more than just a name.
> a) granularity. Current "all or nothing" policy in procfs has
> a lot of obvious problems.
> b) tree layout policy (lack thereof, to be precise).
> c) horribly bad layout of many, many files. Any file exported by
> kernel should be treated as user-visible API. As it is, common mentality
> is "it's a common dump; anything goes here". Inconsistent across
> architectures for no good reason, inconsistent across kernel versions,
> just plain stupid, chock-full of buffer overruns...
>
> Fixing these problems will _hurt_. Badly. We have to do it, but it
> won't be fast and it certainly won't happen overnight.

I'm willing to work on this. Is there anywhere I can go to read up on
current proposals before I start coding?
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"When all government ...in little as in great things... shall be drawn to
Washington as the center of all power; it will render powerless the checks
provided of one government on another, and will become as venal and oppressive
as the government from which we separated." -- Thomas Jefferson, 1821

2002-01-04 02:01:47

by Timothy Covell

Subject: Re: LSB1.1: /proc/cpuinfo

On Thursday 03 January 2002 18:56, Alexander Viro wrote:
> On Thu, 3 Jan 2002, Eric S. Raymond wrote:
> > Well, hell. If the "/proc is a blight on the face of the planet" ranting
> > that I've been hearing is just about the *name* /proc, then let's
> > separate the name issue from the content issue.
>
> It's more than just a name.
> a) granularity. Current "all or nothing" policy in procfs has
> a lot of obvious problems.
> b) tree layout policy (lack thereof, to be precise).
> c) horribly bad layout of many, many files. Any file exported by
> kernel should be treated as user-visible API. As it is, common mentality
> is "it's a common dump; anything goes here". Inconsistent across
> architectures for no good reason, inconsistent across kernel versions,
> just plain stupid, chock-full of buffer overruns...
>
> Fixing these problems will _hurt_. Badly. We have to do it, but it
> won't be fast and it certainly won't happen overnight.


Talking from the SysAdmin point of view, procfs is one of the truly
cool things which separates Linux from the others. I'd rather tune
/proc/sys stuff than use sysctl or Solaris' funky /etc/system and
ndd crap. It's the next best thing to "point and click" without going
over to the dark side.

Sure, /system is a better name (extra typing because we can't have
/sys/sys, can we?).

And while you all are at it, why not take a look at some of the naming
conventions that BeOS uses too. I'm _not_ being sarcastic.


Example 1: Excellent devfs layout.

Example 2: BeOS root directory is a ramfs off of which the
other drives/filesystems are mounted. (I haven't thought
this one out too much but I could imagine that it would make
some things easier.)

Example 3: Kernel Modules are in the directory:
/boot/beos/system/add-ons/kernel
Perhaps we could have directories something like:

/boot/kernel
/boot/grub
/boot/lilo
/dev using devfs !
/etc
/home
/system/config/sys
/net
/system/modules/kernelversion/ (modules in devfs similar tree)
/system/info (for cpuinfo, ioports, meminfo, filesystems, etc.)
/sbin (or even in /system/bin ???)
/tmp
/usr
/var

Example 4: BeOS moves /usr/local stuff to more of a per user
configuration where each user has a $HOME/config directory.
Of course, we would put things like .Xdefaults, kde, gnome, etc.
directories here which vary according to user while still keeping
/usr/local for all users.

My ~/config contains things like this (output of "find ~/config -type d", hand edited some):

config/add-ons/media/decoders
config/add-ons/media/encoders
config/add-ons/media/extractors
config/add-ons/media/writers
config/add-ons/net_server
config/be/Applications
config/be/Demos
config/be/Preferences
config/bin
config/boot (Things like my personal boot/login preferences)
config/doc
config/doc/postgresql
config/doc/postgresql/html
config/documentation
config/etc
config/fonts
config/include
config/include/openssl
config/include/postgresql
config/include/postgresql/lib
config/include/postgresql/libpq
config/lib
config/lib/perl5
config/lib/perl5/5.00503
config/lib/perl5/site_perl
config/lib/perl5/site_perl/5.005/BePC-beos
config/man
config/man/man1
config/servers
config/settings
config/settings/beos_mime
config/settings/beos_mime/application
config/settings/beos_mime/audio
config/settings/beos_mime/image
config/settings/beos_mime/message
config/settings/beos_mime/text
config/settings/beos_mime/video
config/share
config/share/postgresql
config/ssl
config/ssl/certs
config/ssl/lib
config/ssl/man
config/ssl/man/man1


Of course, on heavily user subscribed systems, some sort of
NT like COW technique might be necessary if too many file duplications
occur in the ~/config directories. Having a good /usr/local would
prevent much of this growth, at least in theory. As would strict
quotas. :-)



Just some thoughts.

--
[email protected].

2002-01-04 08:18:26

by Erik Andersen

Subject: Re: LSB1.1: /proc/cpuinfo

On Thu Jan 03, 2002 at 07:52:07PM -0500, Eric S. Raymond wrote:
> Alexander Viro <[email protected]>:
> > It's more than just a name.
> > a) granularity. Current "all or nothing" policy in procfs has
> > a lot of obvious problems.
> > b) tree layout policy (lack thereof, to be precise).
> > c) horribly bad layout of many, many files. Any file exported by
> > kernel should be treated as user-visible API. As it is, common mentality
> > is "it's a common dump; anything goes here". Inconsistent across
> > architectures for no good reason, inconsistent across kernel versions,
> > just plain stupid, chock-full of buffer overruns...
> >
> > Fixing these problems will _hurt_. Badly. We have to do it, but it
> > won't be fast and it certainly won't happen overnight.
>
> I'm willing to work on this. Is there anywhere I can go to read up on
> current proposals before I start coding?

I once wrote up /dev/ps and /dev/mounts drivers to eliminate proc
for embedded systems (pointer available if you care). It was not
warmly received, but I did form some opinions in the process.

The main things to think about are
1) machine readability
Generally speaking the kernel gods have decided that
ASCII is good, binary structures and such are bad (think
endianness, nfs exports, and similar oddness).
2) typing
Right now, if some /proc file prints a number, user space
has to go digging about in the kernel sources to find
what type that thing is -- int, uint, long, long long, etc.
Can't tell without digging in the source. And what if
someone then changes the type next week -- userspace
then overflows.
3) field length
When copying a string from /proc (say /proc/mounts),
userspace has to go digging in the kernel source to
find the field length. So if I copy things into a
static buffer, I may be fine. Till someone changes
the kernel to print out a bit more stuff. Then I've
either got a buffer overflow (if I can't code) or a
truncated string. Either way, it's a problem.

So what is needed is a kernelfs virtual filesystem that provides
kernel info to user space.

It needs a format that provides information as an organized
directory hierarchy, with each directory and filename
identifying the nature of the provided information. Files should
provide information in ASCII with one value per file (to avoid
all the tedious parsing), but also provide along with that bit
of information type and/or length information.

In some cases I guess we may also need more complex classes of
information (lists of key-value stuff, for example).
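
A minimal sketch of the one-value-per-file idea above, in Python. The message does not say how type or length information would be attached, so the `type:value` encoding (for example `u32:1400`), the tag names, the directory layout, and the `read_value` helper are all assumptions made up for illustration.

```python
import os
import tempfile

# Hypothetical type tags; nothing in the thread defines these.
PARSERS = {"u32": int, "u64": int, "str": str}
BITS = {"u32": 32, "u64": 64}


def read_value(path):
    """Read one 'type:value' file and return (type_tag, parsed_value)."""
    with open(path, "r", encoding="ascii") as f:
        text = f.read().rstrip("\n")  # whole file, no fixed-size buffer
    tag, sep, raw = text.partition(":")
    if not sep or tag not in PARSERS:
        raise ValueError(f"unrecognised entry in {path!r}: {text!r}")
    value = PARSERS[tag](raw)
    bits = BITS.get(tag)
    # The declared width lets the reader detect a value that would not fit.
    if bits is not None and not 0 <= value < 2 ** bits:
        raise ValueError(f"{path!r}: {value} does not fit in {tag}")
    return tag, value


def demo():
    """Build a throwaway tree with two single-value files and read it back."""
    with tempfile.TemporaryDirectory() as root:
        cpu = os.path.join(root, "cpu0")
        os.mkdir(cpu)
        for name, content in (("mhz", "u32:1400\n"),
                              ("vendor", "str:GenuineIntel\n")):
            with open(os.path.join(cpu, name), "w", encoding="ascii") as f:
                f.write(content)
        return {name: read_value(os.path.join(cpu, name))
                for name in sorted(os.listdir(cpu))}
```

With the type carried next to the value, a reader no longer has to look in the kernel source to learn the width, and a later change of type shows up as a different tag rather than a silent overflow.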

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

2002-01-04 12:33:14

by Eric S. Raymond

Subject: Re: LSB1.1: /proc/cpuinfo

Erik Andersen <[email protected]>:
> I once wrote up /dev/ps and /dev/mounts drivers to eliminate proc
> for embedded systems (pointer available if you care). It was not
> warmly received, but I did form some opinions in the process.

Sure, I'd like to see this work.

> The main things to think about are
> 1) machine readability
> Generally speaking the kernel gods have decided that
> ASCII is good, binary structures and such are bad (think
> endianness, nfs exports, and similar oddness).

I agree with this decision. Binary structures would be false economy,
trading away readability and flexibility for a marginal speed gain.

> 2) typing
> Right now, if some /proc file prints a number, user space
> has to go digging about in the kernel sources to find
> what type that thing is -- int, uint, long, long long, etc.
> Can't tell without digging in the source. And what if
> someone then changes the type next week -- userspace
> then overflows.

I'm not very worried about this. On modern machines int == long
and the only case that's a potential headache is long long. If
longer than int-size data is labeled, we'll be OK.

> 3) field length
> When copying a string from /proc (say /proc/mounts),
> userspace has to go digging in the kernel source to
> find the field length. So if I copy things into a
> static buffer, I may be fine.

I think the right answer to this is usually "don't use a language that
has static buffers". :-)

> So what is needed is a kernelfs virtual filesystem that provides
> kernel info to user space.

I don't care what it's called. I've seen `sys', 'system', and 'archfs'
thrown around.

> It needs a format that provides information as an organized
> directory hierarchy, with each directory and filename
> identifying the nature of the provided information. Files should
> provide information in ASCII with one value per file (to avoid
> all the tedious parsing), but also provide along with that bit
> of information type and/or length information.
>
> In some cases I guess we may also need more complex classes of
> information (lists of key-value stuff, for example).

One value per *file*? That seems excessively fine-grained. Sometimes
you want multiple values per file because the information is a functional
unit for reporting to humans.
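
As a sketch of what multiple values per file asks of a program reading it, the Python below splits /proc/cpuinfo-style text into one record per processor. It assumes `key : value` lines with a blank line between processors; that layout varies across architectures, and `parse_records` and the sample text are made up for illustration.

```python
def parse_records(text):
    """Split 'key : value' text into a list of dicts, one per blank-line-separated block."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon are ignored
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


# Invented sample, shaped like the x86 cpuinfo layout.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
)
```

Calling `parse_records(SAMPLE)` gives two dictionaries, one per processor block. Every value comes back as a string, so the reader still has to know which fields are numeric.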
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Americans have the will to resist because you have weapons.
If you don't have a gun, freedom of speech has no power.
-- Yoshimi Ishikawa, Japanese author, in the LA Times 15 Oct 1992

2002-01-04 13:11:39

by Andreas Schwab

Subject: Re: LSB1.1: /proc/cpuinfo

"Eric S. Raymond" <[email protected]> writes:

|> I'm not very worried about this. On modern machines int == long

You mean alpha, ia64, ppc64, s390x, x86-64 are not modern machines?

Andreas.

--
Andreas Schwab "And now for something
[email protected] completely different."
SuSE Labs, SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5

2002-01-04 13:17:41

by Eric S. Raymond

Subject: Re: LSB1.1: /proc/cpuinfo

Andreas Schwab <[email protected]>:
> |> I'm not very worried about this. On modern machines int == long
>
> You mean alpha, ia64, ppc64, s390x, x86-64 are not modern machines?

Well, S390 certainly isn't! :-)

If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
compiler ports on those machines used the native register size same as
everybody else.
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

All forms of government are pernicious, including good government.
-- Edward Abbey

2002-01-04 13:25:31

by Andreas Schwab

Subject: Re: LSB1.1: /proc/cpuinfo

"Eric S. Raymond" <[email protected]> writes:

|> Andreas Schwab <[email protected]>:
|> > |> I'm not very worried about this. On modern machines int == long
|> >
|> > You mean alpha, ia64, ppc64, s390x, x86-64 are not modern machines?
|>
|> Well, S390 certainly isn't! :-)

s390x is the new 64 bit architecture.

|> If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
|> compiler ports on those machines used the native register size same as
|> everybody else.

On all those architectures the ABI used on Linux has int == 32 bits and
long == 64 bits. LP64 is more useful in most cases than ILP64.
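
A quick way to check which model a given system uses, sketched in Python with the standard `ctypes` module. It reports the sizes for the C ABI that the running Python interpreter was built against. The names ILP32 and LLP64 (32-bit int and long with 64-bit pointers) do not appear in this thread and are added here as the usual labels; `c_type_sizes` and `data_model` are illustrative helper names.

```python
import ctypes


def c_type_sizes():
    """Return sizes in bytes of some C types for the ABI Python was built with."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "long long": ctypes.sizeof(ctypes.c_longlong),
        "void *": ctypes.sizeof(ctypes.c_void_p),
    }


def data_model(sizes):
    """Name the data model from the (int, long, pointer) sizes."""
    names = {
        (4, 4, 4): "ILP32",  # typical 32-bit systems
        (4, 8, 8): "LP64",   # int 32 bits, long and pointers 64 bits
        (8, 8, 8): "ILP64",  # int, long and pointers all 64 bits
        (4, 4, 8): "LLP64",  # int and long 32 bits, pointers 64 bits
    }
    key = (sizes["int"], sizes["long"], sizes["void *"])
    return names.get(key, "unknown")


if __name__ == "__main__":
    sizes = c_type_sizes()
    print(sizes, data_model(sizes))
```

Code that relies on int == long gives the same answers under ILP32 and quietly different ones under LP64, which is why the assumption is worth testing rather than trusting.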

Andreas.

--
Andreas Schwab "And now for something
[email protected] completely different."
SuSE Labs, SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5

2002-01-04 13:27:31

by Andreas Jaeger

Subject: Re: LSB1.1: /proc/cpuinfo

"Eric S. Raymond" <[email protected]> writes:

> Andreas Schwab <[email protected]>:
>> |> I'm not very worried about this. On modern machines int == long
>>
>> You mean alpha, ia64, ppc64, s390x, x86-64 are not modern machines?
>
> Well, S390 certainly isn't! :-)

s390x - the zSeries - is newer ;-)

> If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
> compiler ports on those machines used the native register size same as
> everybody else.

All the ports Andreas mentioned use 32-bit int and 64-bit longs.

Andreas
--
Andreas Jaeger
SuSE Labs [email protected]
private [email protected]
http://www.suse.de/~aj

2002-01-04 13:36:31

by Christoph Hellwig

Subject: Re: LSB1.1: /proc/cpuinfo

In article <[email protected]> you wrote:
> If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
> compiler ports on those machines used the native register size same as
> everybody else.

Any Linux port to a 64-bit machine uses the LP64 programming model, which
means that long != int.

Christoph

--
Of course it doesn't work. We've performed a software upgrade.

2002-01-04 15:35:47

by Luigi Genoni

Subject: Re: LSB1.1: /proc/cpuinfo



On Fri, 4 Jan 2002, Eric S. Raymond wrote:

> Andreas Schwab <[email protected]>:
> > |> I'm not very worried about this. On modern machines int == long
> >
> > You mean alpha, ia64, ppc64, s390x, x86-64 are not modern machines?
>
> Well, S390 certainly isn't! :-)
>
> If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
> compiler ports on those machines used the native register size same as
> everybody else.
No, and the last troubles I had with reiserFS on sparc64 were exactly
because of this.
In 2.4.17 s_properties is declared as unsigned int, while it should be
an unsigned long. On x86 that is not a problem at all; on all 64-bit CPUs
reiserFS is unusable unless you apply a little patch.


2002-01-04 15:46:37

by Jeff Garzik

Subject: Re: LSB1.1: /proc/cpuinfo

"Eric S. Raymond" wrote:
> I'm not very worried about this. On modern machines int == long

I have been attempting to hammer this incorrect assumption out of
people's brains for years, and have submitted many patches to Linus [1]
over time, removing such crud from the kernel.

Such an assumption is blatantly non-portable, rendering your code
fragile.

Jeff, longtime alpha owner



[1] and other userland maintainers

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno

2002-01-04 16:52:03

by Alan

Subject: Re: LSB1.1: /proc/cpuinfo

> If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
> compiler ports on those machines used the native register size same as
> everybody else.

Nobody I am aware of uses 64bit int default types on a 64bit platform. It's
a waste of memory, bus bandwidth and instruction bandwidth. In almost
all cases a 32bit int is quite adequate and since size_t can be 64bit when
int is 32bit life works out nicely.

Alan

2002-01-04 18:45:16

by Eric S. Raymond

Subject: Re: LSB1.1: /proc/cpuinfo

Alan Cox <[email protected]>:
> Nobody I am aware of uses 64bit int default types on a 64bit platform. It's
> a waste of memory, bus bandwidth and instruction bandwidth. In almost
> all cases a 32bit int is quite adequate and since size_t can be 64bit when
> int is 32bit life works out nicely.

Thanks for the education.
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A wise and frugal government, which shall restrain men from injuring
one another, which shall leave them otherwise free to regulate their
own pursuits of industry and improvement, and shall not take from the
mouth of labor the bread it has earned. This is the sum of good
government, and all that is necessary to close the circle of our
felicities.
-- Thomas Jefferson, in his 1801 inaugural address

2002-01-04 19:35:31

by Erik Andersen

Subject: Re: LSB1.1: /proc/cpuinfo

On Fri Jan 04, 2002 at 07:19:40AM -0500, Eric S. Raymond wrote:
> Erik Andersen <[email protected]>:
> > I once wrote up /dev/ps and /dev/mounts drivers to eliminate proc
> > for embedded systems (pointer available if you care). It was not
> > warmly received, but I did form some opinions in the process.
>
> Sure, I'd like to see this work.

http://busybox.net/cgi-bin/cvsweb/busybox/examples/kernel-patches/

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

2002-01-04 21:45:12

by Ville Herva

Subject: Re: LSB1.1: /proc/cpuinfo

On Fri, Jan 04, 2002 at 05:02:34PM +0000, you [Alan Cox] claimed:
> > If the PPC etc. have 32-bit ints then I stand corrected, but I thought the
> > compiler ports on those machines used the native register size same as
> > everybody else.
>
> Nobody I am aware of uses 64bit int default types on a 64bit platform. It's
> a waste of memory, bus bandwidth and instruction bandwidth. In almost
> all cases a 32bit int is quite adequate and since size_t can be 64bit when
> int is 32bit life works out nicely.

I *think* long is 32 bit on Windows XP 64bit, though. I imagine they went
with this hack to ensure backward compatibility or something. Can't tell for
sure since the IA64 box lying around hasn't got a bootable Windows on it
yet, just Linux :).

http://msdn.microsoft.com/library/en-us/win64/64bitwin_4d0z.asp?frame=true


-- v --

[email protected]

2002-01-04 22:20:15

by H. Peter Anvin

Subject: Re: LSB1.1: /proc/cpuinfo

Followup to: <[email protected]>
By author: Ville Herva <[email protected]>
In newsgroup: linux.dev.kernel
> >
> > Nobody I am aware of uses 64bit int default types on a 64bit platform. It's
> > a waste of memory, bus bandwidth and instruction bandwidth. In almost
> > all cases a 32bit int is quite adequate and since size_t can be 64bit when
> > int is 32bit life works out nicely.
>
> I *think* long is 32 bit on Windows XP 64bit, though. I imagine they went
> with this hack to ensure backward compatibility or something. Can't tell for
> sure since the IA64 box lying around hasn't got a bootable Windows on it
> yet, just linux :).
>
> http://msdn.microsoft.com/library/en-us/win64/64bitwin_4d0z.asp?frame=true
>

Yes, 'doze uses int == long == 32 bits, long long == void * == 64
bits. This is because the 'doze API has a bunch of really bogus
assumptions hard-coded in it, back from the days when "portable" in
the M$ world meant "don't use int; use `short' for 16 bits and `long'
for 32 bits."

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-01-07 03:08:01

by Rusty Russell

Subject: Re: LSB1.1: /proc/cpuinfo

On Thu, 3 Jan 2002 19:56:51 -0500 (EST)
Alexander Viro <[email protected]> wrote:

> It's more than just a name.
> a) granularity. Current "all or nothing" policy in procfs has
> a lot of obvious problems.
> b) tree layout policy (lack thereof, to be precise).
> c) horribly bad layout of many, many files. Any file exported by

As usual, Al has hit the high points (five lines vs. >> 1000 msgs of proc
flamewars over time). At risk of boring regular readers, I shall expand:

There is /proc, and /proc/sys. /proc is a pain to use in the kernel (seq_*
made this better recently, but far from perfect), but is flexible.
/proc/sys (aka sysctl) is easier to use, but a PITA for dynamic entries.

The "manual formatting" nature of /proc entries has led to (c) mentioned by
Al. This can be alleviated by making the simplest method of exporting data
the correct one (i.e. more like /proc/sys).

The tree layout issues are more complicated. In particular, the following
namespaces should be equivalent:
Boot command line: 3c509.debug=1
Module parameter: insmod 3c509 debug=1
proc entry: echo 1 > .../3c509/debug
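
The equivalence above can be sketched as a single (module, parameter, value) triple rendered three ways. This Python is illustrative only: `parameter_spellings` is a made-up helper, and the proc path prefix is kept as the elided `...` from the message because the thread does not say where the entry would live.

```python
def parameter_spellings(module, param, value, proc_root="..."):
    """Return the three spellings of module.param=value listed in the message.

    proc_root defaults to the literal '...' used in the message; the real
    location is not given there.
    """
    return {
        "boot command line": f"{module}.{param}={value}",
        "module parameter": f"insmod {module} {param}={value}",
        "proc entry": f"echo {value} > {proc_root}/{module}/{param}",
    }


if __name__ == "__main__":
    for kind, text in parameter_spellings("3c509", "debug", 1).items():
        print(f"{kind}: {text}")
```

If all three forms are generated from the same triple, a parameter only has to be named once and cannot drift between the boot, module, and proc namespaces.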

Finally, I consider the granularity issue a red herring: if it's in the
kernel, it should be in a logical location.

Now, I have a sample patch for a simple "/proc/sys" replacement which follows
the "one value per file" (similar to the current proc/sys) and
"dynamic is easy" principle (required for widespread use). I also have
module loader rewrite and boot param unification patches.

http://www.kernel.org/pub/linux/kernel/people/rusty

Important to realize that we will be stuck with the current interfaces for
another stable kernel version (backwards compatibility is a WIP).

Once Linus accepts general patches again I shall start pushing things to
him.

Hope that helps,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.