LinuxLists.cc - read() on relayfs channel returns premature 0

2005-03-23 09:12:17

Subject: read() on relayfs channel returns premature 0

Hi

I'm using relayfs to relay data from a kernel module to user space on
a SuSE 2.6.5 kernel. I'm not absolutely sure what version of relayfs
has been back ported to it.

While reading data from the channel I've been seeing read() return 0
prematurely. However, the 0 does not signify that the file is being
closed for there is still data available afterwards.

I've noticed that zeros occur when roughly one page of data has been
read. I suspect that they occur whenever there is a read across the
relayfs sub-buffers.

Now I understand that this is not the latest release of relayfs (there
are the redux patches, which I have yet to try). Nonetheless I'd like
to know whether this behaviour is deliberate. Is it?

Thanks,
--
Kingsley

P.S. I've been able to get around this by deliberately modifying
do_read() with the attached patch. I'm not absolutely sure its
correct but it seems to work.

Attachments:

(No filename) (909.00 B)
relayfs-premature-zero.patch (1.31 kB)
Download all attachments

2005-03-23 15:31:21

by Tom Zanussi

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

[email protected] writes:
>
> Now I understand that this is not the latest release of relayfs (there
> are the redux patches, which I have yet to try). Nonetheless I'd like
> to know whether this behaviour is deliberate. Is it?

Nope, looks like you've found a bug - thanks for the patch. Any
chance you can send me your module so I can easily reproduce the
problem and test the fix?

Thanks,

Tom

2005-03-24 01:39:53

by Kingsley Cheung

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

On Wed, Mar 23, 2005 at 09:29:12AM -0600, Tom Zanussi wrote:
> [email protected] writes:
> >
> > Now I understand that this is not the latest release of relayfs (there
> > are the redux patches, which I have yet to try). Nonetheless I'd like
> > to know whether this behaviour is deliberate. Is it?
>
> Nope, looks like you've found a bug - thanks for the patch. Any
> chance you can send me your module so I can easily reproduce the
> problem and test the fix?
>
> Thanks,
>
> Tom
>

Tom,

Yes, well a cut down version of the kernel module anyway. Its in the
attached tar ball. The kernel module requires pagg as well as relayfs
to work. Basically the module tracks kernel events - in this case
process execs.

I've tested it on 2.6.10 with the pagg and relayfs patches from

http://www.opersys.com/ftp/pub/relayfs/patch-relayfs-2.6.10-050113

and

ftp://oss.sgi.com/projects/pagg/download/linux-2.6.10-pagg.patch-4

read() gives me a zero still once about a page of data has been read.

Many thanks,
--
Kingsley

Attachments:

(No filename) (1.01 kB)
RelayfsModule.tar.bz2 (2.57 kB)
Download all attachments

2005-03-24 06:57:05

by Jan Engelhardt

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

1>http://www.opersys.com/ftp/pub/relayfs/patch-relayfs-2.6.10-050113
2>ftp://oss.sgi.com/projects/pagg/download/linux-2.6.10-pagg.patch-4

I guess (2) is already in 2.6.11, because I do not get any compile and load
errors (good so!), is it?

Jan Engelhardt
--

2005-03-24 19:12:45

by Tom Zanussi

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

Kingsley Cheung writes:
> On Wed, Mar 23, 2005 at 09:29:12AM -0600, Tom Zanussi wrote:
> > [email protected] writes:
> > >
> > > Now I understand that this is not the latest release of relayfs (there
> > > are the redux patches, which I have yet to try). Nonetheless I'd like

[...]

>
> I've tested it on 2.6.10 with the pagg and relayfs patches from
>
> http://www.opersys.com/ftp/pub/relayfs/patch-relayfs-2.6.10-050113
>
> and
>
> ftp://oss.sgi.com/projects/pagg/download/linux-2.6.10-pagg.patch-4
>
> read() gives me a zero still once about a page of data has been read.
>

Thanks - I've tried it out and haven't immediately been able to
reproduce the problem yet - I'll do some more pounding on it though
when I get the chance.

BTW, I just want to point out that there aren't any problems with
read() in the version of relayfs included in the -mm tree (i.e. the
'redux' version), since of course it doesn't support read().

I went ahead and did a quick port of your stuff to the new version of
relayfs, which you can find here:

http://prdownloads.sourceforge.net/dprobes/RelayfsModule-new.tar.bz2?download

There's a README in the tarball with some notes on building and
running it. It includes both the kernel and user-side code, which makes
use of the relay-apps code and documentation found here (I've included
the necessary files in the RelayfsModule-new tarball so you don't have
to actually get this):

http://prdownloads.sourceforge.net/dprobes/relay-apps-0.2.tar.gz?download

Hopefully the new version will still be useful for what you're trying
to do, but it does differ in a couple important ways from the old
version, and points up the fact that the new relayfs really is now
much more specialized for high-volume applications -

- your old version used 'packet' mode with read(). The new relayfs
only supports 'bulk' mode with mmap(), which means it's not really
useful for notification of single events. I'm not sure how important
that is for your application - if you're mainly interested in
collecting data, you can certainly use it even for low-volume
applications, but the events will be sent to user space in 'blocks'.
You can modify the user space app to process blocks of events as they
come in, and play around with buffer sizes to get them more often, but
it's not quite the same thing.

- your old version used a single buffer, while the new relayfs only
supports per-cpu buffers, which means you'd have to sort out the
events in user space and impose some ordering using timestamps for
instance if your data doesn't already have a natural ordering (BTW,
the new relayfs doesn't have an option any longer to do automatic
timestamping either).

I'll continue maintaining the old relayfs for existing users (so
thanks for the patch and the test code) so if the new version doesn't
fit your needs, you'll still have the old version to fall back on.

Thanks,

Tom

2005-03-24 19:49:44

by Jan Engelhardt

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

>BTW, I just want to point out that there aren't any problems with
>read() in the version of relayfs included in the -mm tree (i.e. the
>'redux' version), since of course it doesn't support read().

Hm? Relayfs does not support a `cat /dev/relay/AChannelName` anymore?

>I'll continue maintaining the old relayfs for existing users (so
>thanks for the patch and the test code) so if the new version doesn't
>fit your needs, you'll still have the old version to fall back on.

Do you have the "new relayfs" as a "normal" file (outside any revision
control system), e.g. a diff patch?

>Hopefully the new version will still be useful for what you're trying to do
>- your old version used 'packet' mode with read(). The new relayfs
>only supports 'bulk' mode with mmap()
>- your old version used a single buffer, while the new relayfs only
>supports per-cpu buffers
>(BTW, the new relayfs doesn't have an option any longer to do automatic
>timestamping either).

I wanted to port the kernel side of my keylogger over to relayfs, but given
that API and functionality have now changed I am somewhat reluctant to do it.

Jan Engelhardt
--

2005-03-25 12:16:54

by Karim Yaghmour

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

Jan Engelhardt wrote:
> Hm? Relayfs does not support a `cat /dev/relay/AChannelName` anymore?

This was a requirement for it to be included.

Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [email protected] || 1-866-677-4546

2005-03-28 23:55:56

by Kingsley Cheung

[permalink] [raw]

Subject: Re: read() on relayfs channel returns premature 0

On Thu, Mar 24, 2005 at 01:11:39PM -0600, Tom Zanussi wrote:
> Kingsley Cheung writes:
> > On Wed, Mar 23, 2005 at 09:29:12AM -0600, Tom Zanussi wrote:
> > > [email protected] writes:
> > > >
> > > > Now I understand that this is not the latest release of relayfs (there
> > > > are the redux patches, which I have yet to try). Nonetheless I'd like
>
> [...]
>
> >
> > I've tested it on 2.6.10 with the pagg and relayfs patches from
> >
> > http://www.opersys.com/ftp/pub/relayfs/patch-relayfs-2.6.10-050113
> >
> > and
> >
> > ftp://oss.sgi.com/projects/pagg/download/linux-2.6.10-pagg.patch-4
> >
> > read() gives me a zero still once about a page of data has been read.
> >
>
> Thanks - I've tried it out and haven't immediately been able to
> reproduce the problem yet - I'll do some more pounding on it though
> when I get the chance.

Ok. It should be fairly easy to reproduce. On a dual 400MHz PII I've
been able to get the zero by running one reader of the channel in
parallel with a simple shell script like "while :; do ps -ef >
/dev/null; done".

>
> BTW, I just want to point out that there aren't any problems with
> read() in the version of relayfs included in the -mm tree (i.e. the
> 'redux' version), since of course it doesn't support read().

Ok.

> I went ahead and did a quick port of your stuff to the new version of
> relayfs, which you can find here:
>
> http://prdownloads.sourceforge.net/dprobes/RelayfsModule-new.tar.bz2?download
>
> There's a README in the tarball with some notes on building and
> running it. It includes both the kernel and user-side code, which makes
> use of the relay-apps code and documentation found here (I've included
> the necessary files in the RelayfsModule-new tarball so you don't have
> to actually get this):
>
> http://prdownloads.sourceforge.net/dprobes/relay-apps-0.2.tar.gz?download
>
> Hopefully the new version will still be useful for what you're trying
> to do, but it does differ in a couple important ways from the old
> version, and points up the fact that the new relayfs really is now
> much more specialized for high-volume applications -
>
> - your old version used 'packet' mode with read(). The new relayfs
> only supports 'bulk' mode with mmap(), which means it's not really
> useful for notification of single events. I'm not sure how important
> that is for your application - if you're mainly interested in
> collecting data, you can certainly use it even for low-volume
> applications, but the events will be sent to user space in 'blocks'.
> You can modify the user space app to process blocks of events as they
> come in, and play around with buffer sizes to get them more often, but
> it's not quite the same thing.
>
> - your old version used a single buffer, while the new relayfs only
> supports per-cpu buffers, which means you'd have to sort out the
> events in user space and impose some ordering using timestamps for
> instance if your data doesn't already have a natural ordering (BTW,
> the new relayfs doesn't have an option any longer to do automatic
> timestamping either).

Right. Thanks for all your work on it Tom . I really appreciate it.
I have been considering a move over to the redux version recently.
Your port will make it easier for me to test both versions out. I'll
keep your pointers in mind.

> I'll continue maintaining the old relayfs for existing users (so
> thanks for the patch and the test code) so if the new version doesn't
> fit your needs, you'll still have the old version to fall back on.
>
> Thanks,
>
> Tom

Ok. Thanks again!
--
Kingsley

2005-04-18 01:46:16

by Kingsley Cheung

[permalink] [raw]

Subject: Relayfs Question: Use of relay_reset(). Potential race?

On Wed, Mar 23, 2005 at 08:02:54PM +1100, [email protected] wrote:
> Hi
>
> I'm using relayfs to relay data from a kernel module to user space on
> a SuSE 2.6.5 kernel. I'm not absolutely sure what version of relayfs
> has been back ported to it.

Hi Tom,

Could you please have a look at the following use of relay_reset() in
a kernel module as follows (compiled against pre-redux relayfs):

static int
exec_fileop_notify(int rchan_id, struct file *filp, enum relay_fileop op)
{
if (unlikely(rchan_id != exec_cid)) {
printk(KERN_ERR "%s - bad file number\n", __FUNCTION__);
return -EBADF;
}

switch (op) {
case RELAY_FILE_OPEN:
atomic_inc(&exec_client_cnt);
break;
case RELAY_FILE_CLOSE:
if (atomic_dec_and_test(&exec_client_cnt) == 0)
relay_reset(exec_cid); <---
break;
default:
/* do nothing */
break;
}

return 0;
}

Is that legitimate? The reason I ask is because I've been seeing
garbled oopses with keventd and I've narrowed it to two things:

1) Inadequate locking on my part in the kernel module, which I have
addressed separately.

2) A race with relay_reset() and keventd, which is probably of
interest to you if you're still maintaining the pre-redux patches.

The race is due to the use of INIT_WORK in _reset_relay():

INIT_WORK(&rchan->wake_readers, NULL, NULL);
INIT_WORK(&rchan->wake_writers, NULL, NULL);

However, at the time relay_reset() is called, it is possible that
these work structures are still being used by keventd when under heavy
loads. The workaround I've used to fix this is to call
flush_scheduled_work() before calling reset_relay() in the kernel
module. Perhaps that needs to be called in relay_reset() or
_relay_reset()?

As well I'm not sure about the uses of INIT_WORK in
_relay_realloc_buffer() and relay_release() - perhaps they need
attention too. I understand, however, that flush_schedule_work()
blocks and thus it probably shouldn't be used in certain areas of the
relayfs code.

My thanks,
--
Kingsley

2005-04-18 15:57:19

by Tom Zanussi

[permalink] [raw]

Subject: Re: Relayfs Question: Use of relay_reset(). Potential race?

Kingsley Cheung writes:
> On Wed, Mar 23, 2005 at 08:02:54PM +1100, [email protected] wrote:
> > Hi
> >
> > I'm using relayfs to relay data from a kernel module to user space on
> > a SuSE 2.6.5 kernel. I'm not absolutely sure what version of relayfs
> > has been back ported to it.
>
> Hi Tom,
>
> Could you please have a look at the following use of relay_reset() in
> a kernel module as follows (compiled against pre-redux relayfs):
>
> static int
> exec_fileop_notify(int rchan_id, struct file *filp, enum relay_fileop op)
> {
> if (unlikely(rchan_id != exec_cid)) {
> printk(KERN_ERR "%s - bad file number\n", __FUNCTION__);
> return -EBADF;
> }
>
> switch (op) {
> case RELAY_FILE_OPEN:
> atomic_inc(&exec_client_cnt);
> break;
> case RELAY_FILE_CLOSE:
> if (atomic_dec_and_test(&exec_client_cnt) == 0)
> relay_reset(exec_cid); <---
> break;
> default:
> /* do nothing */
> break;
> }
>
> return 0;
> }
>
> Is that legitimate? The reason I ask is because I've been seeing

Yes, you should be able to reset the channel here, since at that point
it's been closed.

> garbled oopses with keventd and I've narrowed it to two things:
>
> 1) Inadequate locking on my part in the kernel module, which I have
> addressed separately.
>
> 2) A race with relay_reset() and keventd, which is probably of
> interest to you if you're still maintaining the pre-redux patches.
>
> The race is due to the use of INIT_WORK in _reset_relay():
>
> INIT_WORK(&rchan->wake_readers, NULL, NULL);
> INIT_WORK(&rchan->wake_writers, NULL, NULL);
>
> However, at the time relay_reset() is called, it is possible that
> these work structures are still being used by keventd when under heavy
> loads. The workaround I've used to fix this is to call
> flush_scheduled_work() before calling reset_relay() in the kernel
> module. Perhaps that needs to be called in relay_reset() or
> _relay_reset()?

Yes, flush_scheduled_work() should probably be called from
__relay_reset() - thanks for catching this and suggesting the fix.
BTW, I've updated the old relayfs patch with your previous fixes and
ported it to 2.6.11 - it hasn't appeared yet on the opersys website,
so let me know if you want it and I'll send it, or I can just send you
a new version after I make these changes...

>
> As well I'm not sure about the uses of INIT_WORK in
> _relay_realloc_buffer() and relay_release() - perhaps they need
> attention too. I understand, however, that flush_schedule_work()

I'll take a look at this too - looks like there might be a potential
problem if a channel was closed while a resize was in progress. I
don't think anyone's ever actually used resizing, but it should be
fixed nonetheless.

Thanks,

Tom

2005-04-19 01:16:15

by Kingsley Cheung

[permalink] [raw]

Subject: Re: Relayfs Question: Use of relay_reset(). Potential race?

On Mon, Apr 18, 2005 at 10:56:57AM -0500, Tom Zanussi wrote:
> Kingsley Cheung writes:
> > On Wed, Mar 23, 2005 at 08:02:54PM +1100, [email protected] wrote:
> > > Hi
> > >
> > > I'm using relayfs to relay data from a kernel module to user space on
> > > a SuSE 2.6.5 kernel. I'm not absolutely sure what version of relayfs
> > > has been back ported to it.
> >
> > Hi Tom,
> >
> > Could you please have a look at the following use of relay_reset() in
> > a kernel module as follows (compiled against pre-redux relayfs):
(snip)
> > Is that legitimate? The reason I ask is because I've been seeing
>
> Yes, you should be able to reset the channel here, since at that point
> it's been closed.
>
> > garbled oopses with keventd and I've narrowed it to two things:
> >
> > 1) Inadequate locking on my part in the kernel module, which I have
> > addressed separately.
> >
> > 2) A race with relay_reset() and keventd, which is probably of
> > interest to you if you're still maintaining the pre-redux patches.
> >
> > The race is due to the use of INIT_WORK in _reset_relay():
> >
> > INIT_WORK(&rchan->wake_readers, NULL, NULL);
> > INIT_WORK(&rchan->wake_writers, NULL, NULL);
> >
> > However, at the time relay_reset() is called, it is possible that
> > these work structures are still being used by keventd when under heavy
> > loads. The workaround I've used to fix this is to call
> > flush_scheduled_work() before calling reset_relay() in the kernel
> > module. Perhaps that needs to be called in relay_reset() or
> > _relay_reset()?

Tom,

Thanks for the prompt response.

> Yes, flush_scheduled_work() should probably be called from
> __relay_reset() - thanks for catching this and suggesting the fix.

My pleasure.

> BTW, I've updated the old relayfs patch with your previous fixes and
> ported it to 2.6.11 - it hasn't appeared yet on the opersys website,
> so let me know if you want it and I'll send it, or I can just send you
> a new version after I make these changes...

Ok, good to hear. I've been using the old patch against an earlier
version of the kernel, so I've no rush for a patch against 2.6.11 -
you can take your time with these new changes :)

> > As well I'm not sure about the uses of INIT_WORK in
> > _relay_realloc_buffer() and relay_release() - perhaps they need
> > attention too. I understand, however, that flush_schedule_work()
>
> I'll take a look at this too - looks like there might be a potential
> problem if a channel was closed while a resize was in progress. I
> don't think anyone's ever actually used resizing, but it should be
> fixed nonetheless.
>
> Thanks,
>
> Tom
>

Right. Thanks again for looking at it.

--
Kingsley

2005-05-04 09:05:19

by Kingsley Cheung

[permalink] [raw]

Subject: Re: Relayfs Question: Use of relay_reset(). Potential race?

On Tue, Apr 19, 2005 at 10:40:23AM +1000, Kingsley Cheung wrote:
> On Mon, Apr 18, 2005 at 10:56:57AM -0500, Tom Zanussi wrote:
> > Kingsley Cheung writes:
> > > On Wed, Mar 23, 2005 at 08:02:54PM +1100, [email protected] wrote:
> > > > Hi
> > > >
> > > > I'm using relayfs to relay data from a kernel module to user space on
> > > > a SuSE 2.6.5 kernel. I'm not absolutely sure what version of relayfs
> > > > has been back ported to it.
> > >
> > > Hi Tom,
> > >
> > > Could you please have a look at the following use of relay_reset() in
> > > a kernel module as follows (compiled against pre-redux relayfs):
> (snip)
> > > Is that legitimate? The reason I ask is because I've been seeing
> >
> > Yes, you should be able to reset the channel here, since at that point
> > it's been closed.
> >
> > > garbled oopses with keventd and I've narrowed it to two things:
> > >
> > > 1) Inadequate locking on my part in the kernel module, which I have
> > > addressed separately.
> > >
> > > 2) A race with relay_reset() and keventd, which is probably of
> > > interest to you if you're still maintaining the pre-redux patches.
> > >
> > > The race is due to the use of INIT_WORK in _reset_relay():
> > >
> > > INIT_WORK(&rchan->wake_readers, NULL, NULL);
> > > INIT_WORK(&rchan->wake_writers, NULL, NULL);
> > >
> > > However, at the time relay_reset() is called, it is possible that
> > > these work structures are still being used by keventd when under heavy
> > > loads. The workaround I've used to fix this is to call
> > > flush_scheduled_work() before calling reset_relay() in the kernel
> > > module. Perhaps that needs to be called in relay_reset() or
> > > _relay_reset()?
>
> Tom,
>
> Thanks for the prompt response.
>
> > Yes, flush_scheduled_work() should probably be called from
> > __relay_reset() - thanks for catching this and suggesting the fix.

Tom,

Sorry about this, but the fix only works if schedule_work() is used
instead of schedule_delayed_work() for the pre-redux relayfs patch. I
only realised this recently since I have been using the "flush" fix on
a pre-redux relayfs port to 2.4 which doesn't doesn't have an
equivalent of schedule_delayed_work() and have only tried it on 2.6
today.

Using flush_scheduled_work() with schedule_delayed_work() doesn't work
since schedule_delay_work() uses a timer to queue work at the
appropriate time. Although all uses of schedule_delay_work() in
relayfs adds work to the queue a tick later, this time delay is enough
for flush_schedule_work() to flushes what is presently on the queue
before the timer actually expires. eventd would then attempt to use a
work_struct that has then been initialised by relay_reset().

With schedule_work() is instead of schedule_delayed_work() I haven't
seen the race happen.

Thanks,
--
Kingsley