LinuxLists.cc - Problems implementing poll call

2002-10-17 11:06:49

Subject: Problems implementing poll call

--
----------------------------------------
Constantine Gavrilov
Linux Leader
Optibase Ltd
7 Shenkar St, Herzliya 46120, Israel
Phone: (972-9)-970-9140
Fax: (972-9)-958-6099
----------------------------------------

Attachments:

let.txt (2.47 kB)

2002-10-20 16:15:02

by Constantine Gavrilov


Subject: Re: Problems implementing poll call

Thanks a lot. I get it now.

As far as races are concerned, it should be OK. This is because my
condition uses testbit() and interrupt handler uses setbit(). These
operations are atomic even on SMP configurations. Right?

What do you think is of lower latency -- a poll() hook or an ioctl() hook?

One advantage of the ioctl implementation was that I knew in kernel
space that a timeout had occurred. With poll(), I do not know how to
catch an error, because the poll function is called many times and I
do not know which return is the last one. But it is not critical for me.

Dan Maas wrote:

>Hi Constantine, this is in response to your post on the Linux kernel
>list of 17 October...
>
>I think you should take a look at the poll() implementation of other
>drivers. You should call poll_wait() before testing the bit flag,
>regardless of whether you actually need to block or not.
>
>Note that poll_wait() does not block; it merely adds your device's
>wait queue to a list of items the user is waiting on. The actual
>blocking occurs in other kernel code after your poll() method
>returns... select() and poll() can make several passes over the file
>descriptors; wait_table may be NULL if the kernel knows for sure it is
>not going to block and is just checking the status of your device.
>
>I think you also need to be a little more careful about race
>conditions in your sleeping code. If you write this:
>
>if(need_to_sleep) {
> interruptible_sleep_on(...);
>}
>
>You create a potential race condition where need_to_sleep is changed
>by an interrupt between the if() check and the sleep_on(). To
>eliminate these races you need to set the task state to
>TASK_INTERRUPTIBLE, then check the 'need_to_sleep' condition, then
>schedule() to actually go to sleep. This way if the interrupt handler
>changes need_to_sleep after your if() statement, you will wake up
>immediately instead of staying asleep forever. For an example of this
>technique see the comments around line 1680 in my driver
>drivers/ieee1394/dv1394.c
>
>Regards,
>Dan
>


2002-10-20 20:26:03

by Dan Maas


Subject: Re: Problems implementing poll call

* Constantine Gavrilov wrote:
> As far as races are concerned, it should be OK. This is because my
> condition uses testbit() and interrupt handler uses setbit(). These
> operations are atomic even on SMP configurations. Right?

Yes, test/setbit() are atomic, but that's not the point - the "do I
need to sleep?" check must be atomic with respect to
sleep_on(). Consider the following case:

if(testbit(...)) {
    // bit is TRUE now

    // now the interrupt handler runs:
    //     setbit(FALSE)
    //     wake_up_interruptible(...)

    interruptible_sleep_on(...);
}

Since wake_up() is called before sleep_on(), it will not wake you
up. If this case occurs, you will remain asleep forever.

You will probably find a few places in the kernel that suffer this
condition; the race window is very small so it's not likely to happen
in practice. But if I were developing a driver I'd definitely pay
attention to it...

A good description of the problem and solution is on page 287 of
Rubini's "Linux Device Drivers", 2nd ed. (I highly recommend this book
btw).

I just learned that there are macros in sched.h that encapsulate the
race-free solution (wait_event() and wait_event_interruptible()):
instead of the if() statement, you can write wait_event(..., !testbit());
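
For anyone who wants to experiment with the pattern outside the kernel,
here is a rough user-space analogy using POSIX threads. It is not kernel
code and not what wait_event() expands to; it only shows the same idea:
the "do I need to sleep?" check and the act of going to sleep happen
under one lock, so a wakeup that arrives in between cannot be lost. The
names need_to_sleep, fake_interrupt() and wait_for_device() are made up
for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t data_ready = PTHREAD_COND_INITIALIZER;
static bool need_to_sleep = true;   /* hypothetical "device busy" flag */

/* Stands in for the interrupt handler: clear the flag, then wake the waiter. */
static void *fake_interrupt(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    need_to_sleep = false;
    pthread_cond_signal(&data_ready);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 0 once the flag has been cleared, -1 if the thread could not start. */
int wait_for_device(void)
{
    pthread_t irq;

    pthread_mutex_lock(&lock);
    need_to_sleep = true;
    pthread_mutex_unlock(&lock);

    if (pthread_create(&irq, NULL, fake_interrupt, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    /* The check and the sleep are atomic with respect to fake_interrupt():
     * pthread_cond_wait() releases the lock only as it goes to sleep, so
     * the flag cannot change unnoticed between the test and the wait. */
    while (need_to_sleep)
        pthread_cond_wait(&data_ready, &lock);
    assert(!need_to_sleep);
    pthread_mutex_unlock(&lock);

    pthread_join(irq, NULL);
    return 0;
}
```

Calling wait_for_device() returns whether the "interrupt" fires before
or after the waiter reaches the loop; with an unlocked if() followed by
a separate sleep, the early case is the one that hangs.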

> What do you think is of lower latency -- a poll() hook or an ioctl()
> hook?

Probably ioctl(). But ioctl() has the disadvantage that you can only
wait on one file descriptor. It's easy to provide both...
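
To illustrate the "more than one file descriptor" point from the user
side, here is a small sketch that waits on two descriptors with a single
poll() call. It uses two pipes as stand-ins for device files, since I
cannot assume anything about your device node; wait_on_two_pipes() is a
made-up name.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns the index (0 or 1) of the pipe that became readable,
 * or -1 on timeout or error. Only the second pipe is given data. */
int wait_on_two_pipes(void)
{
    int a[2], b[2];
    int ready = -1;

    if (pipe(a) != 0)
        return -1;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    /* Make only the second pipe readable. */
    ssize_t written = write(b[1], "x", 1);

    struct pollfd fds[2] = {
        { .fd = a[0], .events = POLLIN },
        { .fd = b[0], .events = POLLIN },
    };

    /* One call watches both descriptors; timeout is 1000 ms. */
    if (written == 1 && poll(fds, 2, 1000) > 0) {
        assert(fds[0].revents || fds[1].revents);
        if (fds[0].revents & POLLIN)
            ready = 0;
        else if (fds[1].revents & POLLIN)
            ready = 1;
    }

    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return ready;
}
```

A blocking ioctl() would have to pick one of the two descriptors in
advance; poll() returns as soon as either has something to report.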

> With poll(), I do not know how to catch an error, because the poll
> function is called many times and I do not know which return is
> last. But it is not critical for me.

poll() should not return an error unless there is something wrong with
poll() itself. poll() merely tells the user that "some activity has
occurred on this file descriptor" - where "some activity" may include
an error. Device-specific errors should be returned when the user
subsequently calls read(), write(), or ioctl() on your device.
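
Seen from user space, this looks roughly like the sketch below. Again it
uses a pipe instead of a real device, and the behaviour shown for the
closed pipe (POLLERR on the write end, then EPIPE from write()) is what
Linux does; other systems may differ. The function names and the
last_revents variable are made up for the example. Note that a timeout
is reported to the caller simply as a return value of 0 from poll().

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

int last_revents = 0;   /* revents seen by the most recent poll_then_write() */

/* Poll the read end of an empty pipe: nothing happens, so poll() returns 0. */
int poll_empty_pipe(int timeout_ms)
{
    int p[2];

    assert(timeout_ms >= 0);
    if (pipe(p) != 0)
        return -1;

    struct pollfd pfd = { .fd = p[0], .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    close(p[0]);
    close(p[1]);
    return n;
}

/* Close the reader, poll the writer, then try to write.
 * Returns the errno from write(), or 0 if the write succeeded. */
int poll_then_write(void)
{
    int p[2];
    char c = 'x';

    if (pipe(p) != 0)
        return -1;
    signal(SIGPIPE, SIG_IGN);   /* get EPIPE instead of being killed */
    close(p[0]);                /* the reader goes away */

    struct pollfd pfd = { .fd = p[1], .events = POLLOUT };
    int n = poll(&pfd, 1, 1000);
    last_revents = (n > 0) ? pfd.revents : 0;

    /* poll() itself succeeded; the actual error shows up here. */
    errno = 0;
    ssize_t w = write(p[1], &c, 1);
    int err = (w < 0) ? errno : 0;

    close(p[1]);
    return err;
}
```

poll() only flags that something happened on the descriptor; the
specific error code comes back from the following write().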

Regards,
Dan