2005-03-19 18:00:22

by Jan Engelhardt

Subject: Relayfs question

Hello,


according to the relayfs description on opersys.com,

|As the Linux kernel matures, there is an ever increasing number of facilities
|and tools that need to relay large amounts of data from kernel space to user
|space. Up to this point, each of these has had its own mechanism for relaying
|data. To supersede the individual mechanisms, we introduce the "high-speed
|data relay filesystem" (relayfs). As such, things like LTT, printk, EVLog,
|etc.

This sounds to me like it would obsolete most character-based devices, e.g.
random and urandom.

What do the relayfs developers say to this?


Jan Engelhardt
--


2005-03-19 19:10:17

by Baruch Even

Subject: Re: Relayfs question

Jan Engelhardt wrote:
> according to the relayfs description on opersys.com,
>
> |As the Linux kernel matures, there is an ever increasing number of facilities
> |and tools that need to relay large amounts of data from kernel space to user
> |space. Up to this point, each of these has had its own mechanism for relaying
> |data. To supersede the individual mechanisms, we introduce the "high-speed
> |data relay filesystem" (relayfs). As such, things like LTT, printk, EVLog,
> |etc.
>
> This sounds to me like it would obsolete most character-based devices, e.g.
> random and urandom.
>
> What do the relayfs developers say to this?

I'm not a relayfs developer, just a happy user...

The latest relayfs versions are slimmed down from the original and are
unlikely to be useful as a character-based device, but are much better
as a data-transport facility.

There is no longer any interface for character-based reading, so it can't
be used as a replacement for character devices.

The current method is to just manage buffers and enable applications to
mmap the buffers to read them with some signalling on when a buffer is
to be read and when the kernel can overwrite it.
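The handoff described above can be sketched as a toy, single-threaded model in plain user-space C. This is not the relayfs API: the names toy_channel, toy_produce(), toy_peek() and toy_release(), and the sizes N_SUBBUFS and SUBBUF_SIZE, are all hypothetical and chosen for illustration. It only shows the idea of two counters acting as the "signalling": one says which sub-buffers are ready to be read, the other says which ones the writer may overwrite.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_SUBBUFS   4   /* hypothetical: number of sub-buffers */
#define SUBBUF_SIZE 64  /* hypothetical: bytes per sub-buffer */

/* Toy model of a buffer split into sub-buffers (not a relayfs structure). */
struct toy_channel {
    char   data[N_SUBBUFS][SUBBUF_SIZE];
    size_t used[N_SUBBUFS]; /* bytes written into each sub-buffer */
    size_t produced;        /* sub-buffers handed to the reader so far */
    size_t consumed;        /* sub-buffers released by the reader so far */
};

/* Writer side: fill the next sub-buffer and mark it ready.
 * Returns 0 on success, -1 if every sub-buffer is still unread. */
static int toy_produce(struct toy_channel *ch, const void *src, size_t len)
{
    assert(len <= SUBBUF_SIZE);
    if (ch->produced - ch->consumed == N_SUBBUFS)
        return -1; /* reader has not released anything we could overwrite */
    size_t slot = ch->produced % N_SUBBUFS;
    memcpy(ch->data[slot], src, len);
    ch->used[slot] = len;
    ch->produced++; /* "signal": this sub-buffer is ready to be read */
    return 0;
}

/* Reader side: look at the oldest unread sub-buffer, or NULL if none. */
static const char *toy_peek(const struct toy_channel *ch, size_t *len)
{
    if (ch->consumed == ch->produced)
        return NULL;
    size_t slot = ch->consumed % N_SUBBUFS;
    *len = ch->used[slot];
    return ch->data[slot];
}

/* Reader side: release the oldest sub-buffer so it can be overwritten. */
static void toy_release(struct toy_channel *ch)
{
    if (ch->consumed < ch->produced)
        ch->consumed++; /* "signal": writer may reuse this sub-buffer */
}
```

In this model the reader never copies data through a read()-style call; it inspects the sub-buffer in place, which is the role the mmap'ed buffers play in the description above. A real implementation would also need memory barriers or locking between writer and reader, which this sketch leaves out.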

A character device is unlikely to need such an interface, since you typically
want 16 bytes of random data and not several pages of mapped random numbers.
If you really need a lot of random numbers you need something in
user-space anyway, since you'll deplete the kernel entropy pool pretty
fast.

If you have a device that needs to transfer lots of data, doesn't mind it
being batched, and doesn't really need the character device interface,
then relayfs could be useful.

Baruch

2005-03-19 19:17:13

by Jan Engelhardt

Subject: Re: Relayfs question

Hi,

>[...]
> The current method is to just manage buffers and enable applications to mmap
> the buffers to read them with some signalling on when a buffer is to be read
> and when the kernel can overwrite it.
>
> A character device is unlikely to need such interface since you do want 16
> bytes of random data and not several pages of mapped random numbers. If you
> really need a lot of random numbers you need something in user-space anyway
> since you'll deplete the kernel entropy pool pretty fast anyway.
>
> If you have a device that needs to transfer lots of data doesn't mind it being
> batched and doesn't really need the character device interface then relayfs
> could be useful.

Ok, urandom was a bad example. I have my tty logger (ttyrpld.sf.net) which
moves a lot of data (depends) to userspace. It uses a ring buffer of "fixed"
size (set at module load time). Apart from the fact that relayfs could use a
dynamically sized ring buffer, I don't see any need to move it to relayfs.
Would you?



Jan Engelhardt
--

2005-03-19 20:42:17

by Karim Yaghmour

Subject: Re: Relayfs question


Jan Engelhardt wrote:
> Ok, urandom was a bad example. I have my tty logger (ttyrpld.sf.net) which
> moves a lot of data (depends) to userspace. It uses a ring buffer of "fixed"
> size (set at module load time). Apart from that relayfs could use a dynamic
> sized ring buffer, I would not see any need to move it to relayfs, would you?

First, please note that the info on Opersys' site is out-of-date. While
it was relevant while we were still maintaining relayfs separately, it
has somewhat lost its relevance since we started posting the most up-to-
date code directly to LKML. For one thing, the dynamic resizing was
dropped very early in relayfs' inclusion review.

What relayfs does, and does very well, is move very large amounts of
data out of the kernel and make them available to user-space with very
little overhead. In the actual case of your tty logger, I've browsed
through the code briefly, and I think that with relayfs you should be
able to:
- Get rid of half the code:
  - No need to manage your own user/kernel-buffer boundary (most of the
    code in uio_*()).
  - No need to do any buffer management at all.
- Get better performance out of your logging functions.
- Get per-cpu buffers for free.

Basically, all the transport code you are doing in the kernel side of
your logger would be taken care of by relayfs. And given that there are
a lot of people doing similar ad-hoc buffering code, it just makes
sense to have one well-tested yet generic mechanism. Have a look at
Documentation/filesystems/relayfs.txt for the API details.

On a separate yet related topic:
Looking closer at rpldev.c, I believe that you'll be able to get rid of
it entirely (or very close to) once I actually get the time to refactor
the tracing code in LTT to make it generic. What I intend to do is to
obsolete the need for functions like your kio_*, and make it all
automatically generated at build time (you'll still need to add the
instrumentation, but won't need to hand-code the callbacks). This is
still on the top of my to-do list and I should be able to get to this
shortly.

Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [email protected] || 1-866-677-4546

2005-03-19 20:51:23

by Karim Yaghmour

Subject: Re: Relayfs question


Karim Yaghmour wrote:
> What relayfs does, and does very well, is move very large amounts of
> data out of the kernel and make them available to user-space with very
> little overhead. In the actual case of your tty logger, I've browsed
> through the code briefly, and I think that with relayfs you should be
> able to:

Just to avoid any confusion, note that I'm referring mainly to rpldev.c,
which is the kernel-side driver for the logger. I haven't looked at any
of the user tools.

Karim

2005-03-19 21:08:20

by Jan Engelhardt

Subject: Re: Relayfs question


>> Ok, urandom was a bad example. I have my tty logger (ttyrpld.sf.net) which
>> moves a lot of data (depends) to userspace. It uses a ring buffer [...]
>[...]
>Basically, all the transport code you are doing in the kernel side of
>your logger would be taken care of by relayfs. And given that there are
>a lot of people doing similar ad-hoc buffering code, it just makes
>sense to have one well-tested yet generic mechanism. Have a look at
>Documentation/filesystems/relayfs.txt for the API details.

Well, what about things like urandom? It also moves "a lot" of data and does
nothing else.

>[...]
>Just to avoid any confusion, note that I'm referring mainly to rpldev.c,
>which is the kernel-side driver for the logger, I haven't looked at any
>of the user tools.

The userspace daemon just read()s the device and analyzes it. Nothing to
optimize there, with respect to relayfs, I think.



Jan Engelhardt
--

2005-03-20 03:47:15

by Karim Yaghmour

Subject: Re: Relayfs question


Jan Engelhardt wrote:
> Well, what about things like urandom? It also moves "a lot" of data and does
> nothing else.

Forgive my slowness today, but I don't get the angle here:
- Relayfs is not a replacement for char devices; we've never claimed it
  to be.
- Urandom generates a lot of data, and uses copy_to_user() to get it to
  user-space, but it isn't a generalized buffering mechanism for
  transferring large amounts of data to user-space.

If what you're inquiring about is a comparison between relayfs'
mechanisms and the underlying mechanisms that urandom is using, then
I don't think there can be a comparison: the goals are different.

For example, urandom relies on a global spin lock and uses copy_to_user()
for its transfers. This is just fine for this type of application. If
you wanted to transfer a huge amount of data from the kernel to user-
space (the kind of data generated by tracing facilities, for example),
however, these mechanisms would be simply inadequate. If we're generating
the amount of data LTT can gather, for example, (say 2MB/s as was
described in the earlier thread regarding relayfs), then you need per-cpu
buffering and you need to not write anything back to user-space, but
dump it to disk ASAP, etc. This is where relayfs comes in handy.
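To illustrate why per-cpu buffering avoids a global lock, here is a minimal user-space model in C. It is only a sketch under stated assumptions: toy_cpu_buf, toy_log(), toy_total(), TOY_NR_CPUS and TOY_BUF_SIZE are hypothetical names, the CPU number is passed in explicitly, and nothing here is relayfs or kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_CPUS  4   /* hypothetical CPU count for the model */
#define TOY_BUF_SIZE 256 /* hypothetical per-CPU buffer size in bytes */

/* One private buffer per CPU (illustrative, not a kernel structure). */
struct toy_cpu_buf {
    char   data[TOY_BUF_SIZE];
    size_t len; /* bytes currently stored */
};

static struct toy_cpu_buf toy_bufs[TOY_NR_CPUS];

/* Append a record to the buffer owned by `cpu`.
 * Each CPU writes only to its own buffer, so writers on different CPUs
 * never touch the same memory and need no lock shared between them.
 * Returns 0 on success, -1 if the record does not fit. */
static int toy_log(unsigned cpu, const void *rec, size_t n)
{
    assert(cpu < TOY_NR_CPUS);
    struct toy_cpu_buf *b = &toy_bufs[cpu];
    if (n > TOY_BUF_SIZE - b->len)
        return -1;
    memcpy(b->data + b->len, rec, n);
    b->len += n;
    return 0;
}

/* Reader side: total bytes logged across all per-CPU buffers. */
static size_t toy_total(void)
{
    size_t total = 0;
    for (unsigned cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        total += toy_bufs[cpu].len;
    return total;
}
```

Contrast this with a single shared buffer, where every writer would have to take the same lock before appending. In a real kernel the writer would additionally have to guard against preemption and interrupts on its own CPU, which this model ignores.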

On the other hand, using relayfs to replace what urandom currently uses
is just the wrong thing to do. If nothing else, /dev/urandom would
behave entirely differently (API, dynamics, etc.). There would also be
no clear added benefit for using relayfs.

What character drivers do (mainly copy_to_user()) and what relayfs is
used for are entirely different. To use a slightly exaggerated example
to illustrate the difference: replacing the standard mechanisms drivers
use to transfer data to user-space with relayfs would be like renting
a supersonic jet to get your package to a foreign country instead of
just using Fedex. It works ... but it's clearly the wrong approach.

Please read relayfs.txt.

Karim

2005-03-21 07:48:04

by Theodore Ts'o

Subject: Re: Relayfs question

On Sat, Mar 19, 2005 at 10:08:13PM +0100, Jan Engelhardt wrote:
>
> Well, what about things like urandom? It also moves "a lot" of data and does
> nothing else.
>

If you're using urandom to move "a lot" of data, you're using it
wrong. That's not what it is supposed to be for; I can't think of a
valid use of /dev/urandom that would use more than, say, 256 bytes
(2048 bits), and most sanely written users of /dev/urandom only need
16 bytes (i.e., 128 bits). Anything more than that, and you should be
using a userspace PRNG or CRNG, and **not** /dev/urandom.

- Ted
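
As a rough illustration of the pattern Ted describes, the following C sketch reads exactly 16 bytes (128 bits) from /dev/urandom once and uses them to seed a generator that then runs entirely in user space. The names toy_prng, toy_prng_seed() and toy_prng_next() are hypothetical. The generator used here is xorshift128+, picked only because it is short; it is a plain PRNG and not cryptographically secure, so anything security-sensitive would need a proper cipher-based generator from a vetted library instead.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* 128-bit state for xorshift128+ (illustrative, NOT cryptographically secure). */
struct toy_prng {
    uint64_t s[2];
};

/* Seed the generator with exactly 16 bytes from /dev/urandom.
 * Returns 0 on success, -1 on any open/read failure. */
static int toy_prng_seed(struct toy_prng *g)
{
    unsigned char *p = (unsigned char *)g->s;
    size_t got = 0;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0)
        return -1;
    while (got < sizeof g->s) {
        ssize_t n = read(fd, p + got, sizeof g->s - got);
        if (n <= 0) { /* sketch: treat any error or EOF as failure */
            close(fd);
            return -1;
        }
        got += (size_t)n;
    }
    close(fd);
    if (g->s[0] == 0 && g->s[1] == 0)
        g->s[0] = 1; /* an all-zero state would make xorshift emit only zeros */
    return 0;
}

/* Produce the next 64-bit value without touching the kernel again. */
static uint64_t toy_prng_next(struct toy_prng *g)
{
    uint64_t t = g->s[0];
    const uint64_t s = g->s[1];

    g->s[0] = s;
    t ^= t << 23;
    t ^= t >> 18;
    t ^= s ^ (s >> 5);
    g->s[1] = t;
    return t + s;
}
```

After the single 16-byte read, every call to toy_prng_next() is pure arithmetic in user space, so generating megabytes of output puts no further load on the kernel's random pool.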