The following 5 patches implement relayfs, adding a dynamic channel
resizing capability to the previously posted version.
relayfs is a filesystem designed to provide an efficient mechanism for
tools and facilities to relay large amounts of data from kernel space
to user space. Full details can be found in Documentation/filesystems/
relayfs.txt. The current version can always be found at
http://www.opersys.com/relayfs.
I'll shortly be posting a version of printk that replaces the static
printk buffer with a dynamically resizing relayfs channel.
Relayfs is also used as the buffering mechanism for recent development
versions of the Linux Trace Toolkit (LTT).
These patches are for 2.6.0-test1.
Known problem: using read() to read from a resized channel may
return garbage. This doesn't affect any current clients, though,
i.e. the dynamic printk I'll be posting or the Linux Trace Toolkit.
Thanks to Rusty Russell for picking some nits in the previous version.
Comments welcome.
--
Regards,
Tom Zanussi <[email protected]>
IBM Linux Technology Center/RAS
On Tue, 2003-07-15 at 16:15, Tom Zanussi wrote:
> The following 5 patches implement relayfs, adding a dynamic channel
> resizing capability to the previously posted version.
>
> relayfs is a filesystem designed to provide an efficient mechanism for
> tools and facilities to relay large amounts of data from kernel space
> to user space. Full details can be found in Documentation/filesystems/
> relayfs.txt. The current version can always be found at
> http://www.opersys.com/relayfs.
Could this be used to replace mmap() packet socket, how does it compare?
--
// Gianni Tedesco (gianni at scaramanga dot co dot uk)
lynx --source http://www.scaramanga.co.uk/gianni-at-ecsc.asc | gpg --import
8646BE7D: 6D9F 2287 870E A2C9 8F60 3A3C 91B5 7669 8646 BE7D
Gianni Tedesco writes:
> On Tue, 2003-07-15 at 16:15, Tom Zanussi wrote:
> > The following 5 patches implement relayfs, adding a dynamic channel
> > resizing capability to the previously posted version.
> >
> > relayfs is a filesystem designed to provide an efficient mechanism for
> > tools and facilities to relay large amounts of data from kernel space
> > to user space. Full details can be found in Documentation/filesystems/
> > relayfs.txt. The current version can always be found at
> > http://www.opersys.com/relayfs.
>
> Could this be used to replace mmap() packet socket, how does it compare?
I think so - you could send high volumes of packet traffic to a bulk
relayfs channel and read it from the mmap'ed relayfs file in user
space. The Linux Trace Toolkit does the same thing with large volumes
of trace data - you could look at that code as an example
(http://www.opersys.com/relayfs/ltt-on-relayfs.html).
Tom
--
Regards,
Tom Zanussi <[email protected]>
IBM Linux Technology Center/RAS
On Tue, 2003-07-15 at 17:01, Tom Zanussi wrote:
> Gianni Tedesco writes:
> > On Tue, 2003-07-15 at 16:15, Tom Zanussi wrote:
> > > The following 5 patches implement relayfs, adding a dynamic channel
> > > resizing capability to the previously posted version.
> > >
> > > relayfs is a filesystem designed to provide an efficient mechanism for
> > > tools and facilities to relay large amounts of data from kernel space
> > > to user space. Full details can be found in Documentation/filesystems/
> > > relayfs.txt. The current version can always be found at
> > > http://www.opersys.com/relayfs.
> >
> > Could this be used to replace mmap() packet socket, how does it compare?
>
> I think so - you could send high volumes of packet traffic to a bulk
> relayfs channel and read it from the mmap'ed relayfs file in user
> space. The Linux Trace Toolkit does the same thing with large volumes
> of trace data - you could look at that code as an example
> (http://www.opersys.com/relayfs/ltt-on-relayfs.html).
What are the semantics of mmap'ing the buffer? With an mmap'ed packet
socket, the userspace read side needs no syscalls except when the
buffer is empty; it then uses poll(2) to sleep until something new
is put in the buffer. Can relayfs do something similar? poll is not
mentioned in the docs...
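To make the read-side pattern concrete, here is a minimal sketch of the
idea, not the packet socket or relayfs code. The ring layout and the
names ring_drain() and ring_wait() are made up for illustration, and a
plain file descriptor stands in for whatever the kernel would make
pollable:

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical ring layout; NOT the relayfs or packet-socket format. */
struct ring {
    volatile size_t head;   /* next slot the producer will fill */
    size_t tail;            /* next slot the consumer will read */
    int slots[8];
};

/*
 * Consume everything currently in the ring without any system call.
 * Adds the consumed values to *sum and returns how many were consumed.
 */
static size_t ring_drain(struct ring *r, long *sum)
{
    size_t n = 0;

    while (r->tail != r->head) {
        *sum += r->slots[r->tail % 8];
        r->tail++;
        n++;
    }
    return n;
}

/*
 * Called only when the ring is empty: sleep in poll(2) until wake_fd
 * becomes readable, then consume one wakeup byte.
 * Returns 0 on wakeup, -1 on timeout or error.
 */
static int ring_wait(int wake_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = wake_fd, .events = POLLIN };
    char c;

    if (poll(&pfd, 1, timeout_ms) <= 0)
        return -1;
    return read(wake_fd, &c, 1) == 1 ? 0 : -1;
}
```

The reader would loop on ring_drain() and only fall back to ring_wait()
when it returns 0, so the common case costs no syscalls.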
Thanks.
--
// Gianni Tedesco (gianni at scaramanga dot co dot uk)
lynx --source http://www.scaramanga.co.uk/gianni-at-ecsc.asc | gpg --import
8646BE7D: 6D9F 2287 870E A2C9 8F60 3A3C 91B5 7669 8646 BE7D
Gianni Tedesco writes:
> On Tue, 2003-07-15 at 17:01, Tom Zanussi wrote:
> > Gianni Tedesco writes:
> > >
> > > Could this be used to replace mmap() packet socket, how does it compare?
> >
> > I think so - you could send high volumes of packet traffic to a bulk
> > relayfs channel and read it from the mmap'ed relayfs file in user
> > space. The Linux Trace Toolkit does the same thing with large volumes
> > of trace data - you could look at that code as an example
> > (http://www.opersys.com/relayfs/ltt-on-relayfs.html).
>
> What are the semantics of the mmap'ing the buffer? With mmaped packet
> socket the userspace (read-side) requires no sys-calls apart from when
> the buffer is empty, it then uses poll(2) to sleep until something new
> is put in the buffer. Can relayfs do a similar thing? poll is not
> mentioned in the docs...
You're right - I haven't implemented poll() in the relayfs VFS code
yet. I plan on doing that next, but won't have much time for the next
couple of weeks. Currently, you'd have to do something like LTT does,
which is to have the kernel side signal the read side when data is ready.
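As a rough sketch of what the user side of such a signal protocol could
look like (this is not the LTT daemon code; the function names are
invented here, and which signal the kernel side would send is an
assumption left to the caller):

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Block `signo` so that it stays pending until wait_for_data() collects
 * it. Call this before the producer can send the first signal.
 * Returns 0 on success, -1 on error.
 */
static int reader_init(int signo)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/*
 * Sleep until the "data ready" signal arrives.
 * Returns the signal number received, or -1 on error.
 */
static int wait_for_data(int signo)
{
    sigset_t set;
    int got = 0;

    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigwait(&set, &got) != 0)
        return -1;
    return got;
}
```

The reader would call reader_init() once, then alternate between
wait_for_data() and processing whatever the mmap'ed buffer holds.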
Tom
--
Regards,
Tom Zanussi <[email protected]>
IBM Linux Technology Center/RAS
Gianni Tedesco writes:
> On Tue, 2003-07-15 at 17:01, Tom Zanussi wrote:
> > >
> > > Could this be used to replace mmap() packet socket, how does it compare?
> >
> > I think so - you could send high volumes of packet traffic to a bulk
> > relayfs channel and read it from the mmap'ed relayfs file in user
> > space. The Linux Trace Toolkit does the same thing with large volumes
> > of trace data - you could look at that code as an example
> > (http://www.opersys.com/relayfs/ltt-on-relayfs.html).
>
> What are the semantics of the mmap'ing the buffer? With mmaped packet
> socket the userspace (read-side) requires no sys-calls apart from when
> the buffer is empty, it then uses poll(2) to sleep until something new
> is put in the buffer. Can relayfs do a similar thing? poll is not
> mentioned in the docs...
Just thinking a bit more about implementing poll() - that part should
be pretty simple, but how does the read-side know how much to read,
unless it's reading fixed-size blocks? To do that without using a
system call or IOCTL or sysfs file, you could reserve a part of the
channel or a part of each sub-buffer (relayfs channels are subdivided
into a number of sub-buffers) for info the read-side would need in
order to read what's ready. The rchan_start_reserve, start_reserve,
and end_reserve parameters to relay_open() are used for this purpose.
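For illustration only, a client-defined header in the reserved space at
the start of each sub-buffer might be handled like this. The struct and
both helper names are hypothetical; relayfs itself doesn't define what
goes in the reserved area:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical header kept in the reserved start of each sub-buffer. */
struct subbuf_hdr {
    uint32_t bytes_ready;   /* valid payload bytes following the header */
};

/* Writer side: record how much payload is ready in this sub-buffer. */
static void subbuf_publish(unsigned char *subbuf, uint32_t bytes_ready)
{
    struct subbuf_hdr h = { bytes_ready };

    memcpy(subbuf, &h, sizeof h);
}

/*
 * Reader side: return a pointer to the payload and store its length in
 * *len. Returns NULL if the header claims more data than the
 * sub-buffer can hold.
 */
static const unsigned char *subbuf_payload(const unsigned char *subbuf,
                                           size_t subbuf_size, size_t *len)
{
    struct subbuf_hdr h;

    if (subbuf_size < sizeof h)
        return NULL;
    memcpy(&h, subbuf, sizeof h);
    if (h.bytes_ready > subbuf_size - sizeof h)
        return NULL;
    *len = h.bytes_ready;
    return subbuf + sizeof h;
}
```

With something like this the read side learns how much to read from the
mapped memory itself, with no system call involved.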
The one complication here is that if an event won't fit into the
current sub-buffer, the remainder will be filled with filler, and the
event will be put into the following sub-buffer. The amount of filler
is available from within the buffer_end() callback, and this value
should also be part of the data the read-side would need in order to
process the event. The size lost to filler shouldn't be an issue if
you use a large enough sub-buffer size and should be pretty easy to
account for in your user-side application.
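The filler rule can be sketched roughly as follows. This is a
simplified model for illustration, not the code in fs/relayfs/relay.c;
struct chan_pos and chan_reserve() are names made up for this example:

```c
#include <stddef.h>

/* Hypothetical write cursor over a channel of equal-sized sub-buffers. */
struct chan_pos {
    size_t subbuf_size;   /* size of each sub-buffer in bytes */
    size_t n_subbufs;     /* number of sub-buffers in the channel */
    size_t cur;           /* index of the current sub-buffer */
    size_t offset;        /* bytes already used in the current one */
};

/*
 * Reserve `len` bytes for one event. If the event doesn't fit in what
 * is left of the current sub-buffer, the remainder is counted as
 * filler and the event starts at the next sub-buffer (wrapping around).
 * Returns the byte offset within the channel where the event goes and
 * stores the amount of filler in *filler. Returns (size_t)-1 if the
 * event is larger than a whole sub-buffer.
 */
static size_t chan_reserve(struct chan_pos *p, size_t len, size_t *filler)
{
    size_t start;

    *filler = 0;
    if (len > p->subbuf_size)
        return (size_t)-1;
    if (len > p->subbuf_size - p->offset) {
        *filler = p->subbuf_size - p->offset;
        p->cur = (p->cur + 1) % p->n_subbufs;
        p->offset = 0;
    }
    start = p->cur * p->subbuf_size + p->offset;
    p->offset += len;
    return start;
}
```

The filler value returned here is the number the read side would need
alongside the sub-buffer in order to know where the valid data ends.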
LTT makes use of all of the above, but as a 'bulk' client, meaning it
processes complete sub-buffers, doesn't need to be super-efficient on
sub-buffer boundaries, and can get away with using a signal/syscall
protocol. Dynamic printk, another relayfs client, is a 'packet'
client, meaning its readers are notified after each completed write,
but its userspace app (klogd) basically loops on read(2). What I
think you want is to be a 'packet' client, and to implement your
protocol using space reserved within the mmapped buffer. I think
relayfs has all the elements you'd need to do this pretty easily, with
the exception of the poll(2) support, which is at the top of my todo
list.
Hope this helps,
Tom
--
Regards,
Tom Zanussi <[email protected]>
IBM Linux Technology Center/RAS
Shouldn't relayfs be in the "Pseudo filesystems" part of Kconfig?
Also, the module name in the help text doesn't need the .ko suffix.
diff -Nru a/fs/Kconfig b/fs/Kconfig
--- a/fs/Kconfig Wed Jul 16 14:49:58 2003
+++ b/fs/Kconfig Wed Jul 16 14:49:58 2003
@@ -881,6 +881,26 @@
say M here and read <file:Documentation/modules.txt>. The module
will be called ramfs.
+config RELAYFS_FS
+ tristate "Relayfs file system support"
+ ---help---
+ Relayfs is a high-speed data relay filesystem designed to provide
+ an efficient mechanism for tools and facilities to relay large
+ amounts of data from kernel space to user space. It's not useful
+ on its own, and should only be enabled if other facilities that
+ need it are enabled, such as for example dynamic printk or the
+ Linux Trace Toolkit.
+
+ See <file:Documentation/filesystems/relayfs.txt> for further
+ information.
+
+ This file system is also available as a module ( = code which can be
+ inserted in and removed from the running kernel whenever you want).
+ The module is called relayfs. If you want to compile it as a
+ module, say M here and read <file:Documentation/modules.txt>.
+
+ If unsure, say N.
+
endmenu
menu "Miscellaneous filesystems"
@@ -1220,26 +1240,6 @@
will be called sysv.
If you haven't heard about all of this before, it's safe to say N.
-
-config RELAYFS_FS
- tristate "Relayfs file system support"
- ---help---
- Relayfs is a high-speed data relay filesystem designed to provide
- an efficient mechanism for tools and facilities to relay large
- amounts of data from kernel space to user space. It's not useful
- on its own, and should only be enabled if other facilities that
- need it are enabled, such as for example dynamic printk or the
- Linux Trace Toolkit.
-
- See <file:Documentation/filesystems/relayfs.txt> for further
- information.
-
- This file system is also available as a module ( = code which can be
- inserted in and removed from the running kernel whenever you want).
- The module is called relayfs.ko. If you want to compile it as a
- module, say M here and read <file:Documentation/modules.txt>.
-
- If unsure, say N.
config UFS_FS
tristate "UFS file system support (read only)"
Since relay_open takes a pathname, it should be "const char *".
diff -Nru a/fs/relayfs/relay.c b/fs/relayfs/relay.c
--- a/fs/relayfs/relay.c Wed Jul 16 15:23:36 2003
+++ b/fs/relayfs/relay.c Wed Jul 16 15:23:36 2003
@@ -660,7 +660,7 @@
* locking scheme can use buffers of any size, but is hardcoded at 2.
*/
static struct rchan *
-rchan_create(char *chanpath,
+rchan_create(const char *chanpath,
int bufsize_lockless,
int nbufs_lockless,
int bufsize_locking,
@@ -829,11 +829,11 @@
* to create the file.
*/
static int
-rchan_create_dir(char * chanpath,
- char **residual,
+rchan_create_dir(const char * chanpath,
+ const char **residual,
struct dentry **topdir)
{
- char *cp = chanpath, *next;
+ const char *cp = chanpath, *next;
struct dentry *parent = NULL;
int len, err = 0;
@@ -867,12 +867,12 @@
* Returns 0 if successful, negative otherwise.
*/
static int
-rchan_create_file(char * chanpath,
+rchan_create_file(const char * chanpath,
struct dentry **dentry,
struct rchan * data)
{
int err;
- char * fname;
+ const char * fname;
struct dentry *topdir;
err = rchan_create_dir(chanpath, &fname, &topdir);
@@ -1239,7 +1239,7 @@
* cause the channel to wrap around continuously.
*/
int
-relay_open(char *chanpath,
+relay_open(const char *chanpath,
int bufsize_lockless,
int nbufs_lockless,
int bufsize_locking,
@@ -1556,7 +1556,7 @@
int err = 0;
int try_bufcount, cur_bufno = 0, include_nbufs = 1;
u32 cur_idx, buf_size;
- size_t avail_count, avail_in_buf;
+ size_t avail_count = 0, avail_in_buf;
int unused_bytes = 0;
if (rchan->bufs_produced < rchan->n_bufs)
diff -Nru a/include/linux/relayfs_fs.h b/include/linux/relayfs_fs.h
--- a/include/linux/relayfs_fs.h Wed Jul 16 15:23:36 2003
+++ b/include/linux/relayfs_fs.h Wed Jul 16 15:23:36 2003
@@ -531,7 +531,7 @@
* High-level relayfs kernel API, fs/relayfs/relay.c
*/
extern int
-relay_open(char *chanpath,
+relay_open(const char *chanpath,
int bufsize_lockless,
int nbufs_lockless,
int bufsize_locking,
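
To show why the const qualifier matters for callers, here is a small
standalone sketch. split_chanpath() is a hypothetical helper loosely
modelled on the residual out-parameter of rchan_create_dir(), not the
relayfs code, and the example paths are made up:

```c
#include <string.h>

/*
 * Split a channel path into its directory part and final component
 * without modifying it. Returns the length of the directory part and
 * points *residual at the final component inside chanpath.
 */
static size_t split_chanpath(const char *chanpath, const char **residual)
{
    const char *slash = strrchr(chanpath, '/');

    if (slash == NULL) {
        /* No directory part: the whole path is the file name. */
        *residual = chanpath;
        return 0;
    }
    *residual = slash + 1;
    return (size_t)(slash - chanpath);
}
```

Because the parameter is const, a caller can pass a string literal or
any other read-only path without a cast, and the compiler checks that
the function never writes through the pointer.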
Stephen Hemminger wrote:
> Shouldn't relayfs be in the "Pseudo filesystems" part of Kconfig.
> Also don't need the .ko suffix.
>
Yeah, I think that makes more sense. And thanks for your const
filename patch too.
Tom