2001-04-24 01:08:26

by Tim Jansen

Subject: Device Registry (DevReg) Patch 0.2.0

The Linux Device Registry (devreg) is a kernel patch that adds a device
database in XML format to the /proc filesystem. It collects all information
about the system's physical devices, creates persistent device ids and
provides them in the file /proc/devreg.

Devreg has three purposes:
- collect all configuration data from drivers so the user can browse his
hardware configuration.
- allow an application to display all devices that provide a certain interface
(for example all mice) so the user can choose one.
- allow an application to find the device that the user has selected after a
reboot or a hotplug action: the device files in /dev do not offer stable
names, they depend on the order in which the devices have been plugged in or
powered on.

Changes since last release (0.1.1):
- converted file format to XML
- bus-specific information from pci and usb added
- fixed locking

The patch (for 2.4.3) can be found at
http://www.tjansen.de/devreg/devreg-2.4.3-0.2.0.diff.gz
To test it, apply the patch, select CONFIG_DEVFS_FS and CONFIG_DEVREG and
compile. Note that the patch will break binary drivers.

Supported hardware in version 0.2.0: PCI subsystem, USB subsystem, most PCI
sound cards, USB HID devices, USB hubs, USB printers

Other information and a user-space library can be found at
http://www.tjansen.de/devreg


2001-04-24 09:53:24

by Martin Dalecki

Subject: Re: Device Registry (DevReg) Patch 0.2.0

Tim Jansen wrote:
>
> The Linux Device Registry (devreg) is a kernel patch that adds a device
> database in XML format to the /proc filesystem. It collects all information
OH SHIT!! ^^^

<IRONY>
Why don't you just add postscript output to /proc?
</IRONY>

> about the system's physical devices, creates persistent device ids and
> provides them in the file /proc/devreg.

2001-04-24 11:44:40

by Tim Jansen

Subject: Re: Device Registry (DevReg) Patch 0.2.0

On Tuesday 24 April 2001 11:40, Martin Dalecki wrote:
> Tim Jansen wrote:
> > The Linux Device Registry (devreg) is a kernel patch that adds a device
> > database in XML format to the /proc filesystem. It collects all
> OH SHIT!! ^^^
> Why don't you just add postscript output to /proc?

XML wasn't my first choice. The 0.1.x versions used simple name/value pairs;
I gave this up after trying to fit the complex USB
configuration/interface/endpoint data into name/value pairs. Thinking about
text file formats that allow me to display hierarchical information, XML was
the obvious choice for me. Are there alternatives to get complex and
extendable information out to user space? (see
http://www.tjansen.de/devreg/devreg.output.txt for an example of /proc/devreg
output)
My other ideas were:
- using a simple binary format, just dump structs. This would break all
applications every time somebody changes the format, and this would happen
very often because of the nature of the format
- using a complicated, extendable binary format, for example chunk-based like
(a|r)iff file formats. This would add more code in the kernel than XML
output, is difficult to understand and requires more work in user space
(because XML parsers are already available)
- making up a new text-based format with properties similar to XML because I
knew that many people don't like the idea of XML output in the kernel. I
really thought about it, but it does not make much sense.

The actual code overhead of XML output compared to a format like
/proc/bus/usb/devices is almost zero; XML is only a little bit more verbose.
I agree that XML is not perfect for this kind of data, but it is simple to
generate, well known and I don't see a better alternative.

bye..

2001-04-24 16:50:39

by mirabilos

Subject: Re: Device Registry (DevReg) Patch 0.2.0

> > > The Linux Device Registry (devreg) is a kernel patch that adds a device
> > > database in XML format to the /proc filesystem. It collects all
> > OH SHIT!! ^^^
> > Why don't you just add postscript output to /proc?
>
> XML wasn't my first choice. The 0.1.x versions used simple name/value pairs,
> I gave this up after trying to fit the complex USB
> configuration/interface/endpoint data into name/value pairs. Thinking about
> text file formats that allow me to display hierarchical information, XML was
> the obvious choice for me. Are there alternatives to get complex and
> extendable information out to user space? (see
> http://www.tjansen.de/devreg/devreg.output.txt for a example /proc/devreg
> output)
> My other ideas were:
> - using a simple binary format, just dump structs. This would break all
> applications every time somebody changes the format, and this should happen
> very often because of the nature of the format
> - using a complicated, extendable binary format, for example chunk-based like
> (a|r)iff file formats. This would add more code in the kernel than XML
> output, is difficult to understand and requires more work in user space
> (because XML parsers are already available)
> - making up a new text-based format with properties similar to XML because I
> knew that many people dont like the idea of XML output in the kernel.. I
> really thought about it, but it does not make much sense.

What about indenting? I think of 0 spaces before the device name,
1 space before properties which belong to the device. (Is one level
enough? I'm currently offline so didn't check the sample)
Structure per entry:
[Space] Name colon property
It also could be an equals sign, but then we could use no
indentation at all and [] for the sections, which leaves us
at .INI format, which after all is still a lot more readable after
cat than XML.
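
A minimal sketch of how a user-space reader for such an indented format might look, assuming the nesting depth is simply the number of leading spaces (so it also covers more than one level). The struct and function names are hypothetical and not part of devreg.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of the hypothetical indented format. */
struct entry {
    int  level;       /* number of leading spaces = nesting depth */
    char name[64];    /* text before the colon */
    char value[128];  /* text after the colon; empty for a bare name */
};

/*
 * Parse "name: value" or a bare "name", preceded by zero or more spaces.
 * Returns 0 on success, -1 if the line contains no name at all.
 */
int parse_indented(const char *line, struct entry *e)
{
    assert(line && e);

    e->level = (int)strspn(line, " ");
    line += e->level;
    e->name[0] = '\0';
    e->value[0] = '\0';

    /* The value part is optional, so one successful conversion is enough. */
    int n = sscanf(line, "%63[^:\n]: %127[^\n]", e->name, e->value);
    return n >= 1 ? 0 : -1;
}
```

A line such as " vendor: 0x046d" yields level 1, name "vendor" and value "0x046d"; a line without a colon yields a name with an empty value.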

> The actual code overhead of XML output compared to a format like
> /proc/bus/usb/devices is almost zero, XML is only a little bit more verbose.
> I agree that XML is not perfect for this kind of data, but it is simple to
> generate, well known and I dont see a better alternative.
>
> bye..

-mirabilos


2001-04-24 16:51:49

by Martin Dalecki

Subject: Re: Device Registry (DevReg) Patch 0.2.0

Tim Jansen wrote:
>
> On Tuesday 24 April 2001 11:40, Martin Dalecki wrote:
> > Tim Jansen wrote:
> > > The Linux Device Registry (devreg) is a kernel patch that adds a device
> > > database in XML format to the /proc filesystem. It collects all
> > OH SHIT!! ^^^
> > Why don't you just add postscript output to /proc?
>
> XML wasn't my first choice. The 0.1.x versions used simple name/value pairs,
> I gave this up after trying to fit the complex USB
> configuration/interface/endpoint data into name/value pairs. Thinking about
> text file formats that allow me to display hierarchical information, XML was
> the obvious choice for me. Are there alternatives to get complex and
> extendable information out to user space? (see
> http://www.tjansen.de/devreg/devreg.output.txt for a example /proc/devreg
> output)

Yes, filesystem structures. Or just simple parsing of plain binary data
in user space.

> My other ideas were:
> - using a simple binary format, just dump structs. This would break all
> applications every time somebody changes the format, and this should happen
> very often because of the nature of the format
> - using a complicated, extendable binary format, for example chunk-based like
> (a|r)iff file formats. This would add more code in the kernel than XML
> output, is difficult to understand and requires more work in user space
> (because XML parsers are already available)
> - making up a new text-based format with properties similar to XML because I
> knew that many people dont like the idea of XML output in the kernel.. I
> really thought about it, but it does not make much sense.
>
> The actual code overhead of XML output compared to a format like
> /proc/bus/usb/devices is almost zero, XML is only a little bit more verbose.
> I agree that XML is not perfect for this kind of data, but it is simple to
> generate, well known and I dont see a better alternative.
>
> bye..

--
- phone: +49 214 8656 283
- job: eVision-Ventures AG, LEV .de (MY OPINIONS ARE MY OWN!)
- langs: de_DE.ISO8859-1, en_US, pl_PL.ISO8859-2, last ressort:
ru_RU.KOI8-R

2001-04-24 18:27:09

by Tim Jansen

Subject: Re: Device Registry (DevReg) Patch 0.2.0

On Tuesday 24 April 2001 18:39, Martin Dalecki wrote:
> Are there alternatives to get complex and extendable information out to
> user space?
> Yes filesystem structures.


How exactly can this work? A single value per file is not very helpful if you
have a thousand values. You could cluster them (for example one level in the
XML hierarchy == one file), but this will soon get very complicated. It's much
more work to implement in the kernel, it's painful in user-space and you can't
just use a text editor to look at it (because you always have to look at 10
files per device).
IMHO only a single XML file per physical device is an option, but I do not
know how to name the files...


> Or just simple parsing in the user space plain binary data.

This would be a compatibility nightmare and hard to maintain. Once you have
decided on a binary format you cannot change or extend it without breaking
user-space apps. This may save a few lines of code, but not many. All you need
to add a line to XML output is a sprintf and a call to devreg_write_line().

One of the ideas of devreg is that while it has a common format for generic
information, like the name and topology of physical devices, every driver can
add additional data (this is why XML namespaces are used). Currently only the
USB and PCI subsystems add data to devreg, but in future versions the device
driver itself or other subsystems should do this, too.

bye...

2001-04-24 18:57:21

by Tim Jansen

Subject: Re: Device Registry (DevReg) Patch 0.2.0

On Tuesday 24 April 2001 18:43, mirabilos wrote:
> What about indenting? I think of 0 spaces before the device name,
> 1 space before properties which belong to the device.
> Structure per entry:
> [Space] Name colon property

But what is the advantage? It's not less work in the kernel, and in user-space
you need to write a parser for this. You would have made a new format for
hierarchical data that no one else uses only to avoid using XML in the
kernel.


> Is one level enough? I'm currently offline so didn't check the sample

No, for example for USB you have the levels
devices/configurations/interfaces/endpoints.

bye...

2001-04-25 17:09:46

by Dan Kegel

Subject: Re: Device Registry (DevReg) Patch 0.2.0

Tim Jansen wrote:
> On Tuesday 24 April 2001 18:39, Martin Dalecki wrote:
> >> Are there alternatives to get complex and extendable information out to
> >> user space?
> > Yes filesystem structures.
>
> How exactly can this work? A single value per file is not very helpful if you
> have a thousand values. You could cluster them (for example one level in the
> XML hierarchy == one file), but this will soon get very complicated. Its much
> more work to implement in the kernel, its painful in user-space and you cant
> just use a text editor to look at it (because you always have to look at 10
> files per device).

The command
more foo/* foo/*/*
will display the values in the foo subtree nicely, I think.

Think of the /proc tree as the XML parse tree already exploded for you.

The only problem with /proc as it stands is that there is no formal
syntax for its entries. Some of them are hard to parse.

Before we add a new /proc entry that generates XML which summarizes
the rest of /proc, it might make sense to standardize /proc entries
and write a regression test to verify they are formatted correctly.
It would then be trivial to write a /proc to XML converter which
ran solely in userspace.
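
A userspace converter of the kind described above could look roughly like the sketch below. Nothing in it comes from an existing tool: the function name tree_to_xml, the <dir> and <entry> element names, and the choice to take only the first line of each file as its value are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Write s to out, escaping the characters that XML reserves. */
static void xml_escape(FILE *out, const char *s)
{
    for (; *s; s++) {
        switch (*s) {
        case '<': fputs("&lt;", out);   break;
        case '>': fputs("&gt;", out);   break;
        case '&': fputs("&amp;", out);  break;
        case '"': fputs("&quot;", out); break;
        default:  fputc(*s, out);
        }
    }
}

/*
 * Dump the tree rooted at path as nested XML. Directories become <dir>,
 * regular files become <entry> holding the file's first line.
 * Returns 0 on success, -1 if path itself cannot be read.
 */
int tree_to_xml(const char *path, const char *name, FILE *out, int depth)
{
    struct stat st;

    assert(path && name && out);
    if (lstat(path, &st) != 0)
        return -1;

    if (S_ISREG(st.st_mode)) {
        char line[256] = "";
        FILE *in = fopen(path, "r");
        if (!in)
            return -1;
        if (fgets(line, sizeof line, in))
            line[strcspn(line, "\n")] = '\0';
        fclose(in);
        fprintf(out, "%*s<entry name=\"", depth * 2, "");
        xml_escape(out, name);
        fputs("\">", out);
        xml_escape(out, line);
        fputs("</entry>\n", out);
        return 0;
    }
    if (!S_ISDIR(st.st_mode))
        return 0;               /* skip symlinks, devices, sockets */

    struct dirent **list;
    int n = scandir(path, &list, NULL, alphasort);
    if (n < 0)
        return -1;

    fprintf(out, "%*s<dir name=\"", depth * 2, "");
    xml_escape(out, name);
    fputs("\">\n", out);
    for (int i = 0; i < n; i++) {
        const char *child = list[i]->d_name;
        if (strcmp(child, ".") != 0 && strcmp(child, "..") != 0) {
            char sub[1024];
            /* Unreadable or over-long children are silently skipped. */
            if (snprintf(sub, sizeof sub, "%s/%s", path, child) < (int)sizeof sub)
                tree_to_xml(sub, child, out, depth + 1);
        }
        free(list[i]);
    }
    free(list);
    fprintf(out, "%*s</dir>\n", depth * 2, "");
    return 0;
}
```

Calling tree_to_xml("/proc/sys/kernel", "kernel", stdout, 0) would print one <entry> per readable file. File names go into an attribute rather than an element name because not every file name is a valid XML name.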

See
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0101.0/0506.html
and
http://marc.theaimsgroup.com/?l=linux-kernel&s=%2Fproc+xml

for prior discussion on the matter.

I don't want to dismiss the reasons you want to use XML for this,
but tread carefully, lest you duplicate lots of code and introduce
cruft. Better to factor the XML part out to a userspace library...

- Dan

2001-04-25 18:09:45

by H. Peter Anvin

Subject: Re: Device Registry (DevReg) Patch 0.2.0

Followup to: <[email protected]>
By author: Dan Kegel <[email protected]>
In newsgroup: linux.dev.kernel
>
> The only problem with /proc as it stands is that there is no formal
> syntax for its entries. Some of them are hard to parse.
>

/proc/sys is probably the method to follow. Every item is a datum of
a simple datatype.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt

2001-04-25 18:56:10

by Tim Jansen

Subject: /proc format (was Device Registry (DevReg) Patch 0.2.0)

On Wednesday 25 April 2001 19:10, you wrote:
> The command
> more foo/* foo/*/*
> will display the values in the foo subtree nicely, I think.

Unfortunately it displays only the values. Dumping numbers and strings
without knowing their meaning (and probably not even the order) is not very
useful.

> Better to factor the XML part out to a userspace library...

But the one-value per file approach is MORE work. It would be less work to
create XML and factor out the directory structure in user-space :)
Devreg collects its data from the drivers, each driver should contribute the
information that it can provide about the device.
Printing a few values in XML format using the functions from xmlprocfs is as
easy as writing
proc_printf(fragment, "<usb:topology port=\"%d\" portnum=\"%d\"/>\n",
get_portnum(usbdev), usbdev->maxchild);

Extending the devreg output with driver-specific data means registering a
callback function that prints the driver's data. The driver should use its
own XML namespace, so whatever the driver adds will not break any
(well-written) user-space applications. The data is created on-demand, so the
values can be dynamic and do not waste any space when devreg is not used.

The code is easy to read and not larger than a solution that creates static
/proc entries, and holding the data completely static would take much more
memory. And it takes less code than a solution that would create the values
in /proc dynamically because this would mean one callback per file or a
complicated way to specify several values with a single callback.
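
The callback scheme described above can be illustrated in plain user-space C. This is only a sketch of the idea, not the devreg API: struct registry, registry_add, registry_emit, the <device> wrapper element and the usb_port struct are all hypothetical, and fprintf stands in for the kernel's proc_printf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PRINTERS 8

/* A callback writes one driver's XML fragment to out when asked. */
typedef void (*fragment_printer)(FILE *out, void *driver_data);

struct registry {
    fragment_printer fn[MAX_PRINTERS];
    void *data[MAX_PRINTERS];
    int count;
};

void registry_init(struct registry *r)
{
    memset(r, 0, sizeof *r);
}

/* Register a callback; returns -1 when the table is full. */
int registry_add(struct registry *r, fragment_printer fn, void *data)
{
    assert(r && fn);
    if (r->count == MAX_PRINTERS)
        return -1;
    r->fn[r->count] = fn;
    r->data[r->count] = data;
    r->count++;
    return 0;
}

/* Generate the output on demand: nothing is stored between reads. */
void registry_emit(const struct registry *r, FILE *out)
{
    fputs("<device>\n", out);
    for (int i = 0; i < r->count; i++)
        r->fn[i](out, r->data[i]);
    fputs("</device>\n", out);
}

/* Example driver data and callback, modelled on the usb:topology line. */
struct usb_port { int port; int maxchild; };

void usb_print(FILE *out, void *data)
{
    const struct usb_port *p = data;
    fprintf(out, "  <usb:topology port=\"%d\" portnum=\"%d\"/>\n",
            p->port, p->maxchild);
}
```

Because each callback reads the current state when it runs, the values stay dynamic and cost no memory while nobody reads the file.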

bye...


2001-04-25 19:18:39

by Dan Kegel

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

Tim Jansen wrote:
>
> On Wednesday 25 April 2001 19:10, you wrote:
> > The command
> > more foo/* foo/*/*
> > will display the values in the foo subtree nicely, I think.
>
> Unfortunately it displays only the values. Dumping numbers and strings
> without knowing their meaning (and probably not even the order) is not very
> useful.

The meanings should be implied by the filenames, which are displayed (try it).
The order is alphabetical by filename.

> But the one-value per file approach is MORE work. It would be less work to
> create XML and factor out the directory structure in user-space :)
> Devreg collects its data from the drivers, each driver should contribute the
> information that it can provide about the device.
> Printing a few values in XML format using the functions from xmlprocfs is as
> easy as writing
> proc_printf(fragment, "<usb:topology port=\"%d\" portnum=\"%d\"/>\n",
> get_portnum(usbdev), usbdev->maxchild);

The corresponding one-value-per-file approach can probably be made to
be a single call per value. IMHO that's more useful; it means that
(once we agree on definitions) programs don't need to parse XML to
access this data; they can go straight to the node in the document object
model tree ( = /proc ). Think of /proc as a preparsed XML tree
that hasn't been standardized yet.

> The code is easy to read and not larger than a solution that creates static
> /proc entries, and holding the data completely static would take much more
> memory. And it takes less code than a solution that would create the values
> in /proc dynamically because this would mean one callback per file or a
> complicated way to specify several values with a single callback.

... but XML parsing is something we don't want to force on people
when we can provide the same data in a pre-parsed, much easier to access
form, IMHO.

Have you bothered to go back and read the old discussions on this topic?

> The driver should use its
> own XML namespace, so whatever the driver adds will not break any
> (well-written) user-space applications.

Are you trying to avoid writing a DTD? IMHO it would be better to
have a single DTD for the entire tree, rather than a separate
anything-goes namespace for each driver. Yes, this is more work,
but all the Linux drivers are tightly integrated into the kernel
source tree, we may as well have a tightly-integrated DTD documenting
what each block, serial, synch, etc. driver must provide.

I think we both agree that there needs to be an easy, standardized way
to access this data. IMHO there's a lot of standardizing that needs
to happen before you can start writing code -- otherwise your new code
won't help, and we'll be in the same mess we're in now.

The DTD can apply to both the existing /proc form and any proposed XML form
of config info exported by the kernel; there should be an easy transformation
between them. And it has to come first!

- Dan

2001-04-25 19:37:49

by Jesse Pollard

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

--------- Received message begins Here ---------

>
> On Wednesday 25 April 2001 19:10, you wrote:
> > The command
> > more foo/* foo/*/*
> > will display the values in the foo subtree nicely, I think.
>
> Unfortunately it displays only the values. Dumping numbers and strings
> without knowing their meaning (and probably not even the order) is not very
> useful.
>
> > Better to factor the XML part out to a userspace library...
>
> But the one-value per file approach is MORE work. It would be less work to
> create XML and factor out the directory structure in user-space :)
> Devreg collects its data from the drivers, each driver should contribute the
> information that it can provide about the device.
> Printing a few values in XML format using the functions from xmlprocfs is as
> easy as writing
> proc_printf(fragment, "<usb:topology port=\"%d\" portnum=\"%d\"/>\n",
> get_portnum(usbdev), usbdev->maxchild);
>
> Extending the devreg output with driver-specific data means registering a
> callback function that prints the driver's data. The driver should use its
> own XML namespace, so whatever the driver adds will not break any
> (well-written) user-space applications. The data is created on-demand, so the
> values can be dynamic and do not waste any space when devreg is not used.
>
> The code is easy to read and not larger than a solution that creates static
> /proc entries, and holding the data completely static would take much more
> memory. And it takes less code than a solution that would create the values
> in /proc dynamically because this would mean one callback per file or a
> complicated way to specify several values with a single callback.

Personally, I think

proc_printf(fragment, "%d %d",get_portnum(usbdev), usbdev->maxchild);

(or the string "dddd ddd" with d representing a digit)

is shorter (and faster) to parse with

fscanf(input,"%d %d",&usbdev,&maxchild);

Than it would be to try parsing

<usb:topology port="ddddd" portnum="dddd">

with an XML parser.

Sorry - XML is good for some things. It is not designed to be an
interface language between a kernel and user space.

I am NOT in favor of "one file per value", but structured data needs
to be written in a reasonable, concise manner. XML is intended for
communication between disparate systems in an extremely precise manner
to allow some self documentation to be included when the communication
fails.

Even Lisp S expressions are easier :-)

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: [email protected]

Any opinions expressed are solely my own.

2001-04-25 20:07:27

by Dan Kegel

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

Jesse Pollard wrote:
> Personally, I think
> proc_printf(fragment, "%d %d",get_portnum(usbdev), usbdev->maxchild);
> (or the string "dddd ddd" with d representing a digit)
>
> is shorter (and faster) to parse with
> fscanf(input,"%d %d",&usbdev,&maxchild);
>
> Than it would be to try parsing
> <usb:topology port="ddddd" portnum="dddd">
> with an XML parser.
>
> Sorry - XML is good for some things. It is not designed to be a
> interface language between a kernel and user space.
>
> I am NOT in favor of "one file per value", but structured data needs
> to be written in a reasonable, concise manner. XML is intended for
> communication between disparate systems in an exreemly precise manner
> to allow some self documentation to be included when the communication
> fails.

Agreed.

But one thing XML provides (potentially) is a DTD that defines meanings and formats.
IMHO the kernel needs something like this for /proc (though not in DTD format!).

Has anyone ever tried to write a formal syntax for all the entries
in /proc? We have bits and pieces of /proc documentation in
/usr/src/linux/Documentation, but nothing you could feed directly
into a parser generator. It'd be neat to have a good definition for /proc
in the LSB, and have an LSB conformance test that could look in
/proc and say "Yup, all the entries there conform to the spec and can
be parsed properly."

(http://www.pathname.com/fhs/2.2-beta/fhs-2.2-beta.txt mentions /proc,
but doesn't standardize any of it, except to suggest that /etc/mtab
can be a symbolic link to /proc/mounts.)
- Dan

2001-04-25 20:40:49

by Tim Jansen

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

On Wednesday 25 April 2001 21:37, you wrote:
> Personally, I think
>> proc_printf(fragment, "%d %d",get_portnum(usbdev), usbdev->maxchild);
> is shorter (and faster) to parse with
> fscanf(input,"%d %d",&usbdev,&maxchild);

Right, but what happens if you need to extend the format? For example
somebody adds support for USB 2.0 to the kernel and you need to add some new
values. Then you would have the choice between changing the format and
breaking applications, or keeping the format and not providing the additional
information.
With XML (or single-value-per-file) it is easy to tell applications to ignore
unknown tags (or files). When you just list values you will be damned sooner
or later, unless you make up additional rules that say how apps should handle
these cases. And then your approach is no longer simple, but possibly even
more complicated.

bye...

2001-04-25 21:09:28

by Dan Kegel

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

Jesse Pollard wrote:
> > But one thing XML provides (potentially) is a DTD that defines meanings and formats.
> > IMHO the kernel needs something like this for /proc (though not in DTD format!).
> >
> > Has anyone ever tried to write a formal syntax for all the entries
> > in /proc? We have bits and pieces of /proc documentation in
> > /usr/src/linux/Documentation, but nothing you could feed directly
> > into a parser generator. It'd be neat to have a good definition for /proc
> > in the LSB, and have an LSB conformance test that could look in
> > /proc and say "Yup, all the entries there conform to the spec and can
> > be parsed properly."...
>
> From one point of view (that of the /proc entries...) each file
> is by definition in the proper format. That format is specified
> (in the /proc interface to the driver). Using "proc_printf" is a
> specification for the output.

When two different distributions ship different forks of
the kernel source, which differ in the arguments passed to proc_printf,
which one is right?
There's no way to tell. That's why saying "the source is the spec" doesn't cut it.

Also, the source is not a specification a parser generator can use.

A formal spec for /proc entries maintained by e.g. the LSB is needed;
it has to be separate from the source code (to avoid forking problems),
and it should be machine-readable (so we can build parsers from it).

> That DOES NOT mean that no improvements are possible. If the formats
> used by the various modules/drivers has some variation in format from
> access to acess, then the determination of that format must also be
> included. From what I've seen (via "cat /proc/....") the files all
> have a fixed format. Sometimes the number of entries varies, but then
> the count should ALSO be included in the file (in a known place of
> course). The multi-entry files I've looked at (/proc/net) reach the
> EOF to end the list. This is not unreasonable.

Yeah, there's a general style that seems to work; it just needs to be
formalized.

> I'm not sure of the usefullness of the title lines that are printed. If
> looked at in raw form, yes the titles are nice. But the utilities
> that are aimed at examining the values should not have to discard them, nor
> should the drivers have to generate them.

I think they're good; they're a little bit like the XML tags you're proposing.

> I can live with them anyway, since they are already there.....
>
> The biggest problem I know of is being able to retrieve structure
> in an atomic manner. Not easy (in any system, not just Linux).

Something SNMP doesn't deal well with, either. People seem to cope,
though.

- Dan

2001-04-25 21:16:33

by Jesse Pollard

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

Tim Jansen <[email protected]>:
> On Wednesday 25 April 2001 21:37, you wrote:
> > Personally, I think
> >> proc_printf(fragment, "%d %d",get_portnum(usbdev), usbdev->maxchild);
> > is shorter (and faster) to parse with
> > fscanf(input,"%d %d",&usbdev,&maxchild);
>
> Right, but what happens if you need to extend the format? For example
> somebody adds support for USB 2.0 to the kernel and you need to some new
> values. Then you would have the choice between changing the format and
> breaking applications or keeping the format and dont provide the additional
> information.
> With XML (or single-value-per-file) it is easy to tell application to ignore
> unknown tags (or files). When you just list values you will be damned sooner
> or later, unless you make up additional rules that say how apps should handle
> these cases. And then your approach is no longer simple, but possibly even
> more complicated

Not necessarily. If the "extended data" is put following the current data
(since the data is currently record oriented) just making the output
format longer will not/should not cause problems in reading the data.
Just look at FORTRAN for an example of an extensible input :-) More data
on the record will/should just be ignored. The only coding change might
be to use an fgets to read a record, followed by a sscanf to get the known
values.
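
A minimal sketch of that fgets-then-sscanf pattern, assuming a record that starts with two integers as in the proc_printf example. The function names are hypothetical; the 256-byte line buffer is an arbitrary choice, and a longer record would be split across reads.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Extract the two leading integers from one record.
 * Fields appended later by a newer kernel are simply left unread.
 * Returns 0 on success, -1 if the record does not start with two integers.
 */
int parse_topology(const char *line, int *port, int *maxchild)
{
    assert(line && port && maxchild);
    return sscanf(line, "%d %d", port, maxchild) == 2 ? 0 : -1;
}

/* Read one whole record from in, then parse only the known fields. */
int read_topology(FILE *in, int *port, int *maxchild)
{
    char line[256];

    assert(in);
    if (!fgets(line, sizeof line, in))
        return -1;              /* EOF or read error */
    return parse_topology(line, port, maxchild);
}
```

Reading the whole line first matters: a bare fscanf would leave the extra fields in the stream and misread them as the start of the next record.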

Alternatively, you can always put one value per record:
tag:value
tag2:value2...

This is still simpler than XML to read, and to generate.
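
A sketch of a reader for such a tag:value format that skips tags it does not know, assuming integer values. The tag names "port" and "maxchild" and the struct are hypothetical, borrowed from the example earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct topology {
    int port;
    int maxchild;
};

/*
 * Parse newline-separated "tag:value" records from text.
 * Unknown tags and malformed lines are ignored.
 * Returns the number of known tags that were found.
 */
int parse_tagged(const char *text, struct topology *t)
{
    int found = 0;

    assert(text && t);
    while (*text) {
        char line[128], tag[32];
        int value;
        size_t len = strcspn(text, "\n");
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;

        /* Copy one line so sscanf cannot run on into the next record. */
        memcpy(line, text, n);
        line[n] = '\0';

        if (sscanf(line, "%31[^:]:%d", tag, &value) == 2) {
            if (strcmp(tag, "port") == 0) {
                t->port = value;
                found++;
            } else if (strcmp(tag, "maxchild") == 0) {
                t->maxchild = value;
                found++;
            }
            /* any other tag is skipped on purpose */
        }

        text += len;
        if (*text == '\n')
            text++;
    }
    return found;
}
```

A reader written this way keeps working when new tags are added, but not when a tag it depends on is renamed or removed.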

The problem with this and XML is the same - If the tag is no longer relevant
(or changes its name), then the output must either continue to include it, or
break applications that depend on that tag.

In all cases, atomic extraction of the structured data will be problematic
since there may be buffering issues in output. XML is very verbose, and the
tagged format better; but a series of values goes even farther...

Try them out - Just go through the /proc/net formats and stick in the
XML... Just don't count on the regular utilities to decode them. It would
give some actual results to compare with the current structure.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: [email protected]

Any opinions expressed are solely my own.

2001-04-25 21:50:35

by J.A. Magallon

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)


On 04.25 Jesse Pollard wrote:
>
> Alternatively, you can always put one value per record:
> tag:value
> tag2:value2...
>
> This is still simpler than XML to read, and to generate.
>

Just my two cents.

It looks clear that /proc is for programs, not for humans. So the best format
for proc is just binary values. So programs can read it quickly, even in
a chunk if they know the format. But sometimes it is useful to do a cat on
a /proc entry.

Question: is it possible to redirect the same fs call (say read) to different
implementations, based on the open mode of the file descriptor? So, if
you open the entry in binary, you just get the number chunk, if you open
it in ascii you get a pretty printed version, or a format description like
Bus: %d
Device: %h
..
to 'vprintf' the values.

--
J.A. Magallon # Let the source
mailto:[email protected] # be with you, Luke...

Linux werewolf 2.4.3-ac14 #1 SMP Wed Apr 25 02:07:45 CEST 2001 i686

2001-04-25 21:58:56

by Doug McNaught

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

"J . A . Magallon" <[email protected]> writes:

> Question: it is possible to redirect the same fs call (say read) to different
> implementations, based on the open mode of the file descriptor ? So, if
> you open the entry in binary, you just get the number chunk, if you open
> it in ascii you get a pretty printed version, or a format description like

There is no distinction between "text" and "binary" modes on a file
descriptor. The distinction exists in the C stdio layer, but is a
no-op on Unix systems.
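
This can be checked with a small user-space experiment: write a few bytes that a text mode would normally translate (CR LF, Ctrl-Z), then read them back once with "r" and once with "rb". The function name and the scratch-file approach are just an illustration; on a POSIX system both reads are expected to return identical bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if reading path in "r" and "rb" mode yields the same bytes
 * that were written, 0 if they differ, -1 on I/O error.
 * The file at path is created, overwritten and removed.
 */
int stdio_modes_agree(const char *path)
{
    /* CR LF and Ctrl-Z are what a translating text mode would alter. */
    static const char data[] = "a\r\nb\n\x1a" "c";
    const size_t len = sizeof data - 1;
    char text[16], binary[16];
    size_t nt, nb;
    FILE *f;

    assert(path);
    if (!(f = fopen(path, "wb")))
        return -1;
    if (fwrite(data, 1, len, f) != len) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    if (!(f = fopen(path, "r")))
        return -1;
    nt = fread(text, 1, sizeof text, f);
    fclose(f);

    if (!(f = fopen(path, "rb")))
        return -1;
    nb = fread(binary, 1, sizeof binary, f);
    fclose(f);

    remove(path);
    return nt == len && nb == len && memcmp(text, binary, len) == 0;
}
```

Since the "b" flag never reaches open(), the kernel side of a /proc entry has no way to see which of the two modes the program asked for.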

-Doug
--
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time... --Dylan

2001-04-25 22:03:57

by J.A. Magallon

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)


On 04.25 Doug McNaught wrote:
> "J . A . Magallon" <[email protected]> writes:
>
> > Question: it is possible to redirect the same fs call (say read) to different
> > implementations, based on the open mode of the file descriptor ? So, if
> > you open the entry in binary, you just get the number chunk, if you open
> > it in ascii you get a pretty printed version, or a format description like
>
> There is no distinction between "text" and "binary" modes on a file
> descriptor. The distinction exists in the C stdio layer, but is a
> no-op on Unix systems.
>

Yep, realized after the post, fopen() is a wrapper for open(). The idea
is to (somehow) set the proc entry in verbose vs fast-binary mode for
reads. Perhaps an ioctl() or an fcntl() or something similar.
So the verbose mode gives the field names, and the binary mode just
gives the numbers. Applications that know what they are reading can just
read binary data, and fast.

--
J.A. Magallon # Let the source
mailto:[email protected] # be with you, Luke...

Linux werewolf 2.4.3-ac14 #1 SMP Wed Apr 25 02:07:45 CEST 2001 i686

2001-04-25 22:09:19

by Marko Kreen

Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

On Thu, Apr 26, 2001 at 12:03:25AM +0200, J . A . Magallon wrote:
>
> On 04.25 Doug McNaught wrote:
> > "J . A . Magallon" <[email protected]> writes:
> >
> > > Question: it is possible to redirect the same fs call (say read) to different
> > > implementations, based on the open mode of the file descriptor ? So, if
> > > you open the entry in binary, you just get the number chunk, if you open
> > > it in ascii you get a pretty printed version, or a format description like
> >
> > There is no distinction between "text" and "binary" modes on a file
> > descriptor. The distinction exists in the C stdio layer, but is a
> > no-op on Unix systems.
> >
>
> Yep, realized after the post, fopen() is a wrapper for open(). The idea
> is to (somehow) set the proc entry in verbose vs fast-binary mode for
> reads. Perhaps an ioctl() or an fcntl() or something similar.
> So the verbose mode gives the field names, and the binary mode just
> gives the numbers. Applications that know what they are reading can just
> read binary data, and fast.

Eh. Search in archives for "ascii is tough"...

--
marko

2001-04-25 22:25:21

by Mark Hahn

[permalink] [raw]
Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

> > Question: it is possible to redirect the same fs call (say read) to different
> > implementations, based on the open mode of the file descriptor ? So, if
> > you open the entry in binary, you just get the number chunk, if you open
> > it in ascii you get a pretty printed version, or a format description like
>
> There is no distinction between "text" and "binary" modes on a file
> descriptor. The distinction exists in the C stdio layer, but is a
> no-op on Unix systems.

of course. but we could trivially define O_PROC_BINARY,
or an ioctl/fcntl, or even do something fancy like use lseek().

pardon my stream of consciousness here, but:

I think it's well-established that proc exists for humans,
and that there's no real sympathy for the eternal whines of
how terribly hard it is to parse. it's NOT hard to parse,
but would be more trivial if it were more consistent.

the main goal at this point is to make kernel proc-related
code more efficient, easy-to-use, etc. a purely secondary goal
is to make user-space tools more robust, efficient, and simpler.

there are three things that need to be communicated through the proc
interface, for each chunk of data: its type, its name and its value.
it's critical that data be tagged in some way, since that's the only
way to permit back-compatibility. that is, a tool looking for a particular
tag will naturally ignore new data with other tags.

/proc/sys is an attempt to provide tagged data; it works well, is
easy to comprehend, but requires an open for each datum, and provides
no hints about type.

/proc/cpuinfo is another attempt: "tag : data", with no attempt to
provide types. the tags have also mutated somewhat over time.

/proc/partitions is an example of a record-oriented file:
one line per record, and tags for the record members at the top.
still no typing information.
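
a rough sketch of why tagging gives back-compatibility, using the
cpuinfo-style "tag : data" layout described above. find_tag() is a
hypothetical user-space helper, not an existing interface; it simply
skips every line whose tag it was not asked for, so new tags do not
break it.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Look up `tag` in a buffer of "tag : value" lines.
 * Copies the trimmed value into `out` (truncating if needed) and
 * returns 0, or returns -1 if the tag is absent. */
int find_tag(const char *text, const char *tag, char *out, size_t outlen)
{
    size_t taglen;
    const char *line = text;

    assert(text != NULL && tag != NULL && out != NULL);
    if (outlen == 0)
        return -1;
    taglen = strlen(tag);

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);
        const char *colon = memchr(line, ':', linelen);

        if (colon) {
            /* Trim whitespace between the tag and the colon. */
            const char *tag_end = colon;
            while (tag_end > line && isspace((unsigned char)tag_end[-1]))
                tag_end--;

            if ((size_t)(tag_end - line) == taglen &&
                memcmp(line, tag, taglen) == 0) {
                const char *val = colon + 1;
                const char *val_end = line + linelen;
                size_t n;

                /* Trim whitespace around the value. */
                while (val < val_end && isspace((unsigned char)*val))
                    val++;
                while (val_end > val && isspace((unsigned char)val_end[-1]))
                    val_end--;
                n = (size_t)(val_end - val);
                if (n >= outlen)
                    n = outlen - 1;
                memcpy(out, val, n);
                out[n] = '\0';
                return 0;
            }
        }
        if (!eol)
            break;
        line = eol + 1;   /* unknown or non-matching tag: skip the line */
    }
    return -1;
}
```

a tool written against an older layout keeps working when a new line
such as "new field : 42" appears between the ones it knows about.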

I have a sense that all of these could be collapsed into a single
api where kernel systems would register hierarchies of tuples of
<type,tag,callback>, where callback would be passed the tag,
and proc code would take care of "rendering" the data into
human readable text (default), binary, or even xml. the latter
would require some signalling mechanism like O_PROC_XML or the like.
further, programs could perform a meta-query, where they ask for
the types and tags of a datum (or hierarchy), so that on subsequent
queries, they'd know how to handle binary data.

if only one piece of code handled the rendering of /proc stuff,
it could do more, without burdening all the disparate /proc producers.
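
a user-space sketch of the idea, under stated assumptions: every name
below (pv_tuple, pv_render, demo_cb, the PV_* constants) is invented
for illustration, and the real thing would live in kernel proc code
rather than in a test program. the point is only that producers
register <type,tag,callback> and one renderer decides the output form.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum pv_type   { PV_INT, PV_STRING };
enum pv_format { PV_TEXT, PV_XML };

struct pv_tuple {
    enum pv_type type;
    const char *tag;
    /* The callback is passed the tag and writes the value as text. */
    int (*cb)(const char *tag, char *buf, size_t len);
};

/* Hypothetical producer: one callback serving two tags. */
int demo_cb(const char *tag, char *buf, size_t len)
{
    if (strcmp(tag, "cpus") == 0)
        return snprintf(buf, len, "%d", 2);
    if (strcmp(tag, "vendor") == 0)
        return snprintf(buf, len, "%s", "ExampleCorp");
    return -1;
}

/* The single renderer: "tag : value" lines or simple XML elements.
 * Returns the number of bytes written, or -1 on error or overflow.
 * No XML escaping is done; values are assumed to be plain text. */
int pv_render(const struct pv_tuple *tuples, size_t n, enum pv_format fmt,
              char *out, size_t outlen)
{
    size_t used = 0, i;

    assert(tuples != NULL && out != NULL);
    if (outlen == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        char val[64];
        const char *tag = tuples[i].tag;
        const char *type = tuples[i].type == PV_INT ? "int" : "string";
        int w;

        if (tuples[i].cb(tag, val, sizeof val) < 0)
            return -1;
        if (fmt == PV_XML)
            w = snprintf(out + used, outlen - used,
                         "<%s type=\"%s\">%s</%s>\n", tag, type, val, tag);
        else
            w = snprintf(out + used, outlen - used, "%s : %s\n", tag, val);
        if (w < 0 || (size_t)w >= outlen - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

the producer never formats anything beyond its own value; adding a
binary or meta-query renderer would mean touching pv_render() only.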

regards, mark hahn.


2001-04-25 22:42:52

by Alexander Viro

[permalink] [raw]
Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)



On Thu, 26 Apr 2001, J . A . Magallon wrote:

>
> On 04.25 Doug McNaught wrote:
> > "J . A . Magallon" <[email protected]> writes:
> >
> > > Question: it is possible to redirect the same fs call (say read) to
> > different
> > > implementations, based on the open mode of the file descriptor ? So, if
> > > you open the entry in binary, you just get the number chunk, if you open
> > > it in ascii you get a pretty printed version, or a format description like
> >
> > There is no distinction between "text" and "binary" modes on a file
> > descriptor. The distinction exists in the C stdio layer, but is a
> > no-op on Unix systems.
> >
>
> Yep, realized after the post, fopen() is a wrapper for open(). The idea
> is to (somehow) set the proc entry in verbose vs fast-binary mode for
> reads. Perhaps an ioctl() or an fcntl() or something similar.
> So the verbose mode gives the field names, and the binary mode just
> gives the numbers. Applications that know what they are reading can just
> read binary data, and fast.

OK, _what_ applications spend a considerable time (and considerable
percentage of the total execution time) parsing stuff in /proc?
ps(1)? top(1)? Fine. They touch how many files outside of /proc/<pid>/* ?
Exactly.

_Please_, drop this idiotic "parsing ASCII is slow" strawman. Or show some
valid examples.

2001-04-25 22:46:32

by Tim Jansen

[permalink] [raw]
Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

On Wednesday 25 April 2001 23:16, you wrote:
> Not necessarily. If the "extended data" is put following the current data
> (since the data is currently record oriented) just making the output
> format longer will not/should not cause problems in reading the data.
> Alternatively, you can always put one value per record:
> tag:value
> tag2:value2...

Both solutions only work for simple data; they don't help for more complex
things like adding a variable-sized list of structures. Actually the first
devreg version used something like your second proposal and I gave it up
because it wasn't flexible enough to add USB configuration data.

bye...

2001-04-25 23:08:58

by Tim Jansen

[permalink] [raw]
Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

On Wednesday 25 April 2001 21:19, you wrote:
> The corresponding one-value-per-file approach can probably be made to
> be a single call per value.

Yes, the real problem is writing a callback-based filesystem (unless you want
to hold everything in memory). After thinking about it for the last two hours
I already find the one-value-per-file approach not as hard to do as I did
before, but it's still a lot of work.


> Have you bothered to go back and read the old discussions on this topic?

Yes. But my case is different from, for example, the files in /proc/sys:
- the file names in /proc/sys are static. For devreg the filenames must be
made dynamically (similar to the /proc process directories or usbdevfs)
- in /proc/sys there is just one piece of code responsible for every file or
directory and no cooperation between different parts. If devreg creates, for
example, a directory for a USB mouse it must be prepared to share this
directory with the USB subsystem, the input subsystem and the USB hid driver.
All four modules are responsible for their own files.
- files and their content should be created on demand, so there must be some
callback to tell the USB subsystem something like "the user just opened the
directory of device X, please tell me which directories or files you want to
add".

It is certainly possible to convert devreg to the one-value-per-file approach
and if this is all that it takes to get into some future (2.5) kernel I will
do it. I just doubt that this is the easiest way to implement the
functionality, because that's what I really want.


> Are you trying to avoid writing a DTD?

Yes, at least a complete DTD, because it would be a nightmare to
maintain it. Each time somebody adds a new capability to a driver the DTD
would have to be updated. And what about drivers that are not part of the
official kernel?
I thought about using a separate XML Schema definition for each namespace
though.

bye...

2001-04-26 01:08:54

by Dan Kegel

[permalink] [raw]
Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

Mark Hahn wrote:
> the main goal at this point is to make kernel proc-related
> code more efficient, easy-to-use, etc. a purely secondary goal
> is to make user-space tools more robust, efficient, and simpler.
>
> there are three things that need to be communicated through the proc
> interface, for each chunk of data: its type, its name and its value.
> it's critical that data be tagged in some way, since that's the only
> way to permit back-compatibility. that is, a tool looking for a particular
> tag will naturally ignore new data with other tags.

Agreed.

> [three example schemes in use in /proc today]
> I have a sense that all of these could be collapsed into a single
> api where kernel systems would register hierarchies of tuples of
> <type,tag,callback>, where callback would be passed the tag,
> and proc code would take care of "rendering" the data into
> human readable text (default), binary, or even xml.

Sounds reasonable to me. Relieve the modules of having to
format their /proc entries by defining standard code that does
it. And as an extra bonus, if tuples registration was table-driven,
the tables would define a grammar that could be fed to a parser
generator.

(It sounds a little bit like the snmpd code I'm working on,
actually. How eerie.)

(It also sounds a little like (gasp) the windows registry,
but hey, that's ok.)

- Dan

2001-04-26 14:09:07

by Tim Jansen

[permalink] [raw]
Subject: Re: /proc format (was Device Registry (DevReg) Patch 0.2.0)

On Thursday 26 April 2001 00:24, Mark Hahn wrote:
> I have a sense that all of these could be collapsed into a single
> api where kernel systems would register hierarchies of tuples of
> <type,tag,callback>, where callback would be passed the tag,

You also need to know the parent of the tuple to build a hierarchy. And it
should be possible to create lists.

The callback prototypes for values could look like this:

int proc_value_cb_string(char *buf, void *context); // writes string to buf,
// returns len of string or negative value for error
int proc_value_cb_int(int *value, void *context);


For parent/directory tuples you would provide two additional callbacks that
set the context for their children and maybe take care of other things like
locking (so they don't need to be done in every single value callback):

void *proc_value_cb_level_enter(void *old_context); // returns new context
void proc_value_cb_level_leave(void *old_context, void *new_context);


For tuples with a list there would be two callbacks to get the list elements:

int proc_value_cb_list_num(void *context); // returns number of elements
void *proc_value_cb_list_context(int index, void *context); // returns context
// of the element at the given index or NULL


To register such a tuple you would have the following functions:
void proc_value_register_string(parent_handle_t parent,
                                const char *name,
                                proc_value_cb_string cb);
void proc_value_register_int(parent_handle_t parent,
                             const char *name,
                             proc_value_cb_int cb);
parent_handle_t proc_value_register_parent(parent_handle_t parent,
                                           const char *name,
                                           proc_value_cb_level_enter cb1,
                                           proc_value_cb_level_leave cb2);
parent_handle_t proc_value_register_list(parent_handle_t h,
                                         proc_value_cb_list_num cbnum,
                                         proc_value_cb_list_context cbcon);

This is the simplest API that I can imagine for this. The only problem is
that you need to write a callback for each value (file). Just printing XML
still looks easier to me...
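
A rough user-space sketch of how generic code might drive the list and
value callbacks, assuming the callback names above are turned into
function-pointer typedefs. Everything prefixed demo_, the walk_list()
function, PV_BUF_SIZE and the "index/name=value" output are assumptions
made for this illustration; the real proc code would create files
instead of printing lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PV_BUF_SIZE 256  /* assumed size of the buffer passed to string callbacks */

typedef int   (*proc_value_cb_string)(char *buf, void *context);
typedef int   (*proc_value_cb_int)(int *value, void *context);
typedef int   (*proc_value_cb_list_num)(void *context);
typedef void *(*proc_value_cb_list_context)(int index, void *context);

/* Hypothetical driver data: a bus holding a list of devices. */
struct demo_dev { const char *name; int irq; };
struct demo_bus { struct demo_dev *devs; int ndevs; };

int demo_list_num(void *context)
{
    return ((struct demo_bus *)context)->ndevs;
}

void *demo_list_context(int index, void *context)
{
    struct demo_bus *bus = context;
    if (index < 0 || index >= bus->ndevs)
        return NULL;
    return &bus->devs[index];
}

int demo_name(char *buf, void *context)
{
    struct demo_dev *dev = context;
    return snprintf(buf, PV_BUF_SIZE, "%s", dev->name);
}

int demo_irq(int *value, void *context)
{
    *value = ((struct demo_dev *)context)->irq;
    return 0;
}

/* Walk every list element and render its two values into `out`.
 * Returns the number of elements, or -1 on error or overflow. */
int walk_list(void *list_context,
              proc_value_cb_list_num num_cb,
              proc_value_cb_list_context ctx_cb,
              proc_value_cb_string name_cb,
              proc_value_cb_int irq_cb,
              char *out, size_t outlen)
{
    size_t used = 0;
    int i, n;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    n = num_cb(list_context);

    for (i = 0; i < n; i++) {
        char name[PV_BUF_SIZE];
        int irq, w;
        void *elem = ctx_cb(i, list_context);  /* context of element i */

        if (!elem || name_cb(name, elem) < 0 || irq_cb(&irq, elem) < 0)
            return -1;
        w = snprintf(out + used, outlen - used,
                     "%d/name=%s\n%d/irq=%d\n", i, name, i, irq);
        if (w < 0 || (size_t)w >= outlen - used)
            return -1;
        used += (size_t)w;
    }
    return n;
}
```

Each value still needs its own callback, as noted above, but the walking
and the formatting stay in one place.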


> and proc code would take care of "rendering" the data into
> human readable text (default), binary, or even xml. the latter
> would require some signalling mechanism like O_PROC_XML or the like.

Then you can argue that once you have a single format implemented in the
kernel you can convert it to whatever you like in user-space. And it seems
like the decision for "one-value-per-file" in /proc has already been made
(please correct me if not and we start all over again), so I will try to make
a generic API like the one above for it.


> further, programs could perform a meta-query, where they ask for
> the types and tags of a datum (or hierarchy), so that on subsequent
> queries, they'd know how to handle binary data.

That would undermine the only advantage of binary data: it's easy (and
fast) to dump or read a C struct. Not that I would really care for binary
data...

bye...